analysis

Angle calculation


source

dataset_angles

 dataset_angles (trajs:list)

Given a set of trajectories, calculate all angles between displacements

Type Details
trajs list set of trajectories from which to calculate angles
Returns list list of angles between displacements
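The sketch below illustrates the quantity being computed (it is not the library's internal implementation): for a single hypothetical 2D trajectory, it builds the displacement vectors and measures the angle between consecutive displacements with numpy.

# Illustrative sketch: angles between consecutive displacements of one 2D trajectory
traj = np.cumsum(np.random.randn(100, 2), axis = 0)   # hypothetical Brownian trajectory
disp = np.diff(traj, axis = 0)                        # displacement vectors
theta = np.arctan2(disp[:, 1], disp[:, 0])            # orientation of each displacement
angles = np.diff(theta)                               # angle between consecutive displacements
angles = (angles + np.pi) % (2*np.pi) - np.pi         # wrap to (-pi, pi]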

source

get_angle

 get_angle (a:tuple, b:tuple, c:tuple)

Calculates the angle between the segments generated by three points

Type Details
a tuple 2d position point A
b tuple 2d position point B
c tuple 2d position point C
Returns tuple angle between segments AB and BC
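As an illustrative numpy sketch of the underlying geometry (a hypothetical helper, not necessarily following the exact conventions of get_angle), the angle between segments AB and BC can be obtained from the two displacement vectors:

def angle_abc(a, b, c):
    # Hypothetical helper: signed angle between segments AB and BC
    ab = np.array(b) - np.array(a)
    bc = np.array(c) - np.array(b)
    return np.arctan2(ab[0]*bc[1] - ab[1]*bc[0], ab[0]*bc[0] + ab[1]*bc[1])

angle_abc((0, 0), (1, 0), (1, 1))  # pi/2 for a left turn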

MSD based analysis


source

msd_analysis

 msd_analysis ()

Contains mean squared displacement (MSD) based methods to analyze trajectories.


source

msd_analysis.tamsd

 msd_analysis.tamsd (trajs:numpy.ndarray, t_lags:numpy.ndarray, dim=1)

Calculates the time-averaged mean squared displacement (TA-MSD) of a trajectory at various time lags.

Type Default Details
trajs ndarray Set of trajectories of dimensions NxTxD (N: number of trajectories, T: length, D: dimension)
t_lags ndarray Time lags used for the TA-MSD
dim int 1 Dimension of the trajectories (currently only 1 and 2 supported)
Returns np.array TA-MSD of each trajectory at each t_lag
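For reference, the TA-MSD of a trajectory \(x(t)\) of length \(T\) at lag \(\Delta\) is \(\overline{\delta^2}(\Delta) = \frac{1}{T-\Delta}\sum_{t=0}^{T-\Delta-1} |x(t+\Delta)-x(t)|^2\). The following sketch shows how this definition can be computed with plain numpy for a single 1D trajectory (illustrative only, not the library's code):

def tamsd_1d(x, t_lags):
    # Illustrative TA-MSD of a single 1D trajectory x at the given time lags
    return np.array([np.mean((x[dt:] - x[:-dt])**2) for dt in t_lags])

x = np.cumsum(np.random.randn(1000))   # hypothetical Brownian trajectory
tamsd_1d(x, t_lags = [1, 2, 4, 8])     # grows roughly linearly with the lag for Brownian motion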

source

msd_analysis.get_diff_coeff

 msd_analysis.get_diff_coeff (trajs:numpy.ndarray, t_lags:list=None)

Calculates the diffusion coefficient of a trajectory by means of the linear fitting of the TA-MSD.

Type Default Details
trajs ndarray Set of trajectories
t_lags list None Time lags used for the TA-MSD.
Returns np.array Diffusion coefficient of the given trajectory.

Here we show an example for the calculation of the diffusion coefficient of a 2D Brownian motion trajectory. We create trajectories from Gaussian displacements of unit variance (\(\sigma^2 = 1\)), which results in a diffusion coefficient \(D = 0.5\).

pos = np.cumsum(np.random.randn(1070, 100, 2), axis = 1)
D = msd_analysis().get_diff_coeff(pos)

plt.hist(D, bins = 30);
plt.axvline(np.mean(D), c = 'k', label = f'Mean of predictions = {np.round(np.mean(D), 2)}')                                                                  
plt.axvline(0.5, c = 'r', ls = '--', label = 'Expected')
plt.legend()


source

msd_analysis.get_exponent

 msd_analysis.get_exponent (trajs:numpy.ndarray, t_lags:list=None)

Calculates the anomalous diffusion exponent of a trajectory by means of the linear fitting of the TA-MSD in log-log scale.

Type Default Details
trajs ndarray Set of trajectories
t_lags list None Time lags used for the TA-MSD.
Returns np.array Anomalous exponent of each trajectory.

To showcase this function, we generate fractional Brownian motion trajectories with \(\alpha = 0.5\) and calculate their exponent:

trajs, _ = models_phenom().single_state(N = 1001, T = 100, alphas = 0.5, dim = 2)
alpha = msd_analysis().get_exponent(trajs.transpose(1,0,2))

plt.hist(alpha, bins = 30);
plt.axvline(np.mean(alpha), c = 'k', label = f'Mean of predictions = {np.round(np.mean(alpha), 2)}')
plt.axvline(0.5, c = 'r', ls = '--', label = 'Expected')
plt.legend()

Velocity Autocorrelation Function (VACF)


source

vacf

 vacf (trajs, delta_t:int|list|numpy.ndarray=1,
       taus:bool|list|numpy.ndarray=None)

Calculates the velocity autocorrelation function for the given set of trajectories.

Type Default Details
trajs np.array NxT matrix containing N trajectories of length T.
delta_t int | list | numpy.ndarray 1 Time lag(s) over which the velocities (displacements) are computed.
taus bool | list | numpy.ndarray None Time windows at which the VACF is calculated.
Returns np.array VACF of the given trajectories and the given time windows.
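For reference, the VACF correlates velocities defined over a time interval \(\delta\), \(v_\delta(t) = x(t+\delta) - x(t)\), at a separation \(\tau\). A minimal numpy sketch for a single 1D trajectory (illustrative only; normalization conventions may differ from those of vacf):

def vacf_1d(x, delta_t = 1, taus = None):
    # Illustrative normalized VACF of a single 1D trajectory x
    v = x[delta_t:] - x[:-delta_t]                     # velocities over a window delta_t
    taus = np.arange(len(v)) if taus is None else taus
    return np.array([np.mean(v[tau:] * v[:len(v) - tau]) for tau in taus]) / np.mean(v**2)

vacf_1d(np.cumsum(np.random.randn(500)), delta_t = 1, taus = np.arange(20))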

We show here an example of the VACF for FBM trajectories at various time lags, showing that they all coincide (as expected for this diffusion model).

deltats = np.arange(1, 5).tolist()
taus = np.arange(0, 100)
trajs, _ = models_phenom().single_state(N = 200, T = 200, alphas = 0.5)
trajs = trajs.transpose(1, 0, 2)[:,:,0]

for deltat in deltats:  
    v = vacf(trajs, deltat, taus)        
    plt.plot(taus/deltat, v.flatten(), 'o-', alpha = 0.4)    
plt.xlim(-1, 10)
plt.ylabel('VACF'); plt.xlabel(r'$\tau / \delta$')

Convex hull analysis


source

CH_changepoints

 CH_changepoints (trajs, tau:int=10, metric:{'volume','area'}='volume')

Computes the change points of a multistate trajectory based on the convex hull approach proposed in PRE 96, 022144 (2017).

Type Default Details
trajs np.array NxT matrix containing N trajectories of length T.
tau int 10 Time window over which the CH is calculated.
metric {‘volume’, ‘area’} volume Calculate change points w.r.t. area or volume of CH.
Returns list Change points of the given trajectory.

We showcase the use of the convex hull on a Brownian motion trajectory with two distinct diffusion coefficients, obtained by making the displacements in the central segment ten times larger:

# Generate trajectories and plot change points
T = 100; on = 40; off = 60; tau = 5
traj = np.random.randn(T, 2)
traj[on:off, :] = traj[on:off, :]*10
traj = traj.cumsum(0)
plt.axvline(on-tau, c = 'k')
plt.axvline(off-tau, c = 'k', label = 'True change points')

# Calculate variable Sd 
Sd = np.zeros(traj.shape[0]-2*tau)
for k in range(traj.shape[0]-2*tau):       
    Sd[k] = ConvexHull(traj[k:(k+2*tau)], ).volume  

# Compute change points both with volume and area
CPs = CH_changepoints([traj], tau = tau)[0].flatten()-tau
CPs_a = CH_changepoints([traj], tau = tau, metric = 'area')[0].flatten()-tau


# Plot everything
label_cp = 'CH Volume'
for cp in CPs:
    plt.axvline(cp, c = 'g', alpha = 0.8, ls = '--', label = label_cp)
    label_cp = ''
label_cp = 'CH Area'    
for cp in CPs_a:
    plt.axvline(cp, alpha = 0.8, ls = '--', c = 'orange', label = label_cp)
    label_cp = ''
    
plt.plot(Sd, '-o')
plt.axhline(Sd.mean(), label = 'CH Volume mean', c = 'g',)
plt.legend()
plt.xlabel('n'); plt.ylabel(r'$S_d(n)$')

Cramér-Rao lower bound

Expressions of the CRLB for the estimation of the diffusion coefficient of trajectories without noise, for dimensions 1 and 2, at different trajectory lengths.


source

CRLB_D

 CRLB_D (T:int, dim:int=1)

Returns the Cramér-Rao lower bound \(S(D)/D\) for Brownian motion without noise, namely

\(S(D)/D = \sqrt{2/(d\,T)}\)

Type Default Details
T int Length of the trajectory
dim int 1 Dimension of the trajectory
Returns float Cramér-Rao bound
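A minimal sketch of the expression above, assuming CRLB_D simply evaluates \(\sqrt{2/(d\,T)}\) (hypothetical reimplementation, shown only to make the formula explicit):

def crlb_D_sketch(T, dim = 1):
    # Relative Cramér-Rao bound S(D)/D for noiseless Brownian motion
    return np.sqrt(2 / (dim * np.asarray(T, dtype = float)))

crlb_D_sketch(np.array([10, 100, 1000]), dim = 1)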

Here we show how, for Brownian motion trajectories, the MSD estimator is nearly optimal and follows the same scaling as the CRLB (i.e. \(S(D)/D = \sqrt{6/T}\)).

Ts = np.logspace(1, 3,20).astype(int) # Trajectory lengths
Ds = np.arange(0.5, 2.1, 0.5) # We will test on different diffusion coefficients

# Saving the standard deviations
std_ds = np.zeros((len(Ds), len(Ts)))

for idxD, D in enumerate(tqdm(Ds)):
    
    # Generate the BM trajectories with given parameters
    trajs = (np.sqrt(2*D)*np.random.randn(int(1e4), Ts[-1], 1)).cumsum(1) 

    for idxL, T in enumerate(tqdm(Ts)):
        
        std_ds[idxD, idxL] = np.array(msd_analysis().get_diff_coeff(trajs = trajs[:, :T], t_lags = [1,2])).std()
fig, axs = plt.subplots(1,4, figsize = (4*3,3), tight_layout = True)

for idx, (ax, std_d, D) in enumerate(zip(axs.flatten(), std_ds, Ds)):
    ax.scatter(Ts, std_d/D, label = r'MSD$_{1-2}$')
    ax.loglog(Ts, np.sqrt(6/Ts), label = 'Predicted std for MSD', alpha = 0.9, c = 'C1')
    ax.loglog(Ts, CRLB_D(Ts), label = 'Cramer-Rao bound', alpha = 0.9, c = 'k')
    ax.set_title(fr'$D = {D}$')
axs[0].legend()
plt.setp(axs[0], ylabel = 'std(D) / D')
plt.setp(axs, xlabel = 'Traj. length');

Bounds for \(\alpha\) and \(K_\alpha\) in FBM


source

derivative_covariance_matrix_K_alpha

 derivative_covariance_matrix_K_alpha (alpha, K_alpha, N, delta_t=1.0)

Calculate the derivative of the covariance matrix with respect to K_alpha.


source

derivative_covariance_matrix_alpha

 derivative_covariance_matrix_alpha (alpha, K_alpha, N, delta_t=1.0)

Calculate the derivative of the covariance matrix with respect to alpha using vectorized operations.


source

generate_covariance_matrix

 generate_covariance_matrix (alpha, K_alpha, N, delta_t=1.0)

Generate the covariance matrix for FBM using vectorized operations.
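Assuming the common FBM convention \(\mathrm{MSD}(t) = 2 K_\alpha t^{\alpha}\) (an assumption; the library may use a different prefactor), the position covariance reads \(\Sigma_{ij} = \mathrm{Cov}(X(t_i), X(t_j)) = K_\alpha \left( t_i^{\alpha} + t_j^{\alpha} - |t_i - t_j|^{\alpha} \right)\) with \(t_i = i\,\Delta t\). A vectorized numpy sketch of this matrix (illustrative, hypothetical helper name):

def fbm_covariance_sketch(alpha, K_alpha, N, delta_t = 1.0):
    # Position covariance of FBM under the MSD = 2*K_alpha*t^alpha convention
    t = np.arange(1, N + 1) * delta_t
    return K_alpha * (t[:, None]**alpha + t[None, :]**alpha - np.abs(t[:, None] - t[None, :])**alpha)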


source

fisher_information_matrix

 fisher_information_matrix (alpha, K_alpha, N, delta_t=1.0)

Calculate the full Fisher Information Matrix and its inverse.
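For a zero-mean Gaussian process with covariance \(\Sigma(\theta)\), the Fisher information follows the standard expression \(I_{ij} = \tfrac{1}{2}\,\mathrm{tr}\left[\Sigma^{-1}\partial_{\theta_i}\Sigma\,\Sigma^{-1}\partial_{\theta_j}\Sigma\right]\). A generic numpy sketch of this formula (not necessarily the exact implementation used here):

def gaussian_fisher_sketch(cov, dcovs):
    # cov: covariance matrix; dcovs: list of derivatives of cov w.r.t. each parameter
    inv = np.linalg.inv(cov)
    n = len(dcovs)
    fim = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            fim[i, j] = 0.5 * np.trace(inv @ dcovs[i] @ inv @ dcovs[j])
    return fim, np.linalg.inv(fim)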


source

fisher_information_cross

 fisher_information_cross (alpha, K_alpha, N, delta_t=1.0)

Calculate the Fisher Information cross term I_alpha,K_alpha.


source

fisher_information_K_alpha

 fisher_information_K_alpha (K_alpha, N, delta_t=1.0)

Calculate the Fisher Information component I_K_alpha,K_alpha.

In this case, it can be calculated analytically as \(N/(2 K_\alpha^2)\).
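This follows directly from the general Gaussian expression above (a short derivation sketch): since the covariance is proportional to \(K_\alpha\), one has \(\partial_{K_\alpha}\Sigma = \Sigma/K_\alpha\), so that \(I_{K_\alpha K_\alpha} = \tfrac{1}{2}\,\mathrm{tr}\left[(\Sigma^{-1}\Sigma/K_\alpha)^2\right] = \tfrac{1}{2}\,\mathrm{tr}\left[\mathbb{1}_N\right]/K_\alpha^2 = N/(2K_\alpha^2)\).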


source

fisher_information_alpha

 fisher_information_alpha (alpha, N, delta_t=1.0)

Calculate the Fisher Information component I_alpha,alpha.

Example: CRLB of \(\alpha\) for unknown and known \(K_\alpha\)

# Example parameters
alphas = np.linspace(0.1, 1.9, 50) # anomalous exponent to check
T = 100  # Length of the trajectory
K_alpha = 1 # diffusion coefficient (note that these results are independent of K_alpha)

# Variance with known K_alpha (i.e. inverse of diagonal term of the Fisher information matrix)
var_a_k = np.array([1/fisher_information_alpha(alpha, T) for alpha in alphas])

# Variance with unknown K_alpha (i.e. diagonal term of the inverse of the full Fisher information matrix)
var_a_uk = np.array([fisher_information_matrix(alpha, K_alpha, T)[1][0,0] for alpha in alphas])


plt.plot(alphas, var_a_k, c = '#003f5c', lw = 2, label = r'Known $K_\alpha$')
plt.plot(alphas, var_a_uk, c = '#ffbf00', lw = 2, label = r'Unknown $K_\alpha$')
plt.plot(alphas, var_a_uk-var_a_k, c = '#a3a3a3', ls = '--', label = 'Difference')
plt.grid(alpha = 0.3)
plt.xlabel(r'$\alpha$')
plt.legend()

p-variation


source

p_variation

 p_variation (traj:np.array, m:int, p:int)

Given a trajectory, computes the p-variation for the given lag m and variation p.

Type Details
traj array Trajectory for which to compute the p-variation
m int Time lag between positions considered in the p-variation (see Eq. (8) of https://doi.org/10.1103/PhysRevE.82.021130)
p int Integer defining the p-variation
Returns array p-variation of the current trajectory
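Conceptually, the sample p-variation at lag m accumulates the p-th power of the trajectory increments over non-overlapping blocks of size m up to each time t. A rough numpy sketch of this running sum (illustrative; details such as block alignment may differ from the library's implementation):

def p_variation_sketch(traj, m, p):
    # Running sum of |increments over blocks of size m|^p, evaluated at every time step
    incs = np.abs(traj[m::m] - traj[:-m:m])**p          # non-overlapping increments of lag m
    cumulative = np.concatenate(([0], np.cumsum(incs)))
    return cumulative[np.arange(len(traj)) // m]        # value at time t: sum over complete blocks before t

p_variation_sketch(np.cumsum(np.random.randn(1000)), m = 2, p = 2)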

source

p_variation_FBM

 p_variation_FBM (H:float, sigma:float, t:np.array)

Given a Hurst exponent and the standard deviation of an FBM process with \(\mathrm{MSD} = \sigma^2 t^{2H}\), computes the predicted p-variation:

\(p^{1/H}(t) = E\left(|B(1)|^{1/H}\right)\)

See more at Eq. (8) in https://doi.org/10.1103/PhysRevE.82.011129

Type Details
H float Hurst exponent
sigma float Standard deviation of the FBM process
t array Times for which to compute \(p^{1/H}(t)\)
Returns array Theoretical prediction for \(p^{1/H}(t)\)

Example 1: p-variation scaling for p = 2

In the case of p = 2, based on [1]:

  • For FBM: the p-variation should increase as the resolution gets finer (in the paper this corresponds to increasing n in Eq. (5); in the discrete version used here it corresponds to decreasing m, see Eq. (8) of [2]).
  • For CTRW: the p-variation should stabilize to the value given in [1] (i.e. smaller m should start converging to this stable value).
T = 5000
H = 0.2

fbm_traj = models_theory().fbm(alpha = H*2, T = T)
ctrw_traj = models_theory().ctrw(alpha = H*2, T = T)

fig, ax = plt.subplots(1, 2, figsize = (7,3))

for m in [1,2,3,4,5]:
   
    pvar_fbm = p_variation(fbm_traj, p = 2, m = m)
    pvar_ctrw = p_variation(ctrw_traj, p = 2, m = m)

    ax[0].plot(np.arange(T), pvar_fbm, label = m)
    ax[1].plot(np.arange(T), pvar_ctrw, label = m)
    
ax[0].legend(title = 'm')
ax[0].set_ylabel('p-variation (p = 2)')
plt.setp(ax, xlabel = 'Time');

Example 2: Theoretical estimation of p-variation for FBM and p = 1/H

In the case of p = 1/H, with H being the Hurst exponent, we can theoretically calculate the value of the p-variation for an FBM process (see Eq. (6) of [1]).

T = 5000
pvar_alpha = []

Hs = np.linspace(0.2,0.5,5)
pvars = []
pvars_t = []
Ds_est = []

min_idx = 1000
for idxH, H in enumerate(Hs): 

    # We create a trajectory with fixed D = 1
    traj = np.cumsum(models_phenom().disp_fbm(D = 1, alpha = H*2, T = T))
    
    pvar = p_variation(traj, p = 1/H, m = 1)
    pvar_theory = p_variation_FBM(H = H, 
                                  sigma = np.sqrt(2), # Note that the definition of the diffusion coefficient differs between
                                                      # models_phenom and this function, hence the sqrt(2) factor
                                  t = np.arange(T))

    plt.loglog(np.arange(T)[min_idx:], pvar[min_idx:], label = 'Numerical calculation' if idxH == 0 else '')
    plt.plot(np.arange(T)[min_idx:], pvar_theory[min_idx:], label = 'Theoretical prediction' if idxH == 0 else '',
             c = f'C{idxH}', ls = '--', alpha = 0.6)
plt.legend()
plt.ylabel('p-variation (p = 1/H)')
plt.xlabel('Time')

Power spectral density


source

psd

 psd (traj:numpy.ndarray)

Calculates the approximate power spectral density based on scipy.signal.periodogram

Type Details
traj ndarray Trajectory of size (T,) (i.e. 1D) or (T, dim)
Returns tuple (f, psd): frequencies f and power spectral density. Sizes f: (floor(T/2)), psd: (floor(T/2))
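A minimal usage sketch (assuming psd is available in the current namespace, as in the rest of this page):

traj = np.cumsum(np.random.randn(1000))   # hypothetical 1D Brownian trajectory
freqs, S = psd(traj)
plt.loglog(freqs[freqs > 0], S[freqs > 0])
plt.xlabel('f'); plt.ylabel('PSD');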

Example: from clean to noisy trajectories

In this example we show how to use the PSD and also how the noise present in the trajectories can affect this assessment. This is based on the result of “Towards a robust criterion of anomalous diffusion”, V. Sposini et al..

T = int(1e5)
N = int(1e4)

# Hurst exponents we will consider
Hs = np.array([1/3, 1, 3/2])/2

# Storing variables
# mean and std of the PSD for the clean trajectories
muH = np.zeros((len(Hs), int(T/2)))
stdH = np.zeros_like(muH)
# mean and std of the PSD for the noise trajectories
muH_noise = np.zeros_like(muH)
stdH_noise = np.zeros_like(muH)

sigma_noise = 1e2

for idxH, H in enumerate(tqdm(Hs)):
    
    psdH = np.zeros((N, int(T/2)))
    psdH_noise = np.zeros((N, int(T/2)))
    
    for idxN in (range(N)):
        
        # Here we generate the clean trajectories and compute their PSD
        X = models_phenom().disp_fbm(D = 1, alpha = H*2, T = T).cumsum()
        f, psd = periodogram(X)
        psdH[idxN] = psd[1:]

        # Now we noise the trajectories and compute their new PSD
        X_noise = X + np.random.randn(X.shape[0])*np.sqrt(sigma_noise)
        _, psd_noise = periodogram(X_noise)
        psdH_noise[idxN] = psd_noise[1:]

    
    muH[idxH] = psdH.mean(0)
    stdH[idxH] = psdH.std(0)

    muH_noise[idxH] = psdH_noise.mean(0)
    stdH_noise[idxH] = psdH_noise.std(0)

We can now show what we previously calculated. The solid lines represent the clean trajectories while the shaded ones are the noisy ones. The dashed line in the left plot shows the expected scaling for subdiffusive trajectories, while the dash-dotted one is the expected scaling for normal and superdiffusive trajectories. The dashed lines in the rightmost plot show the expected theoretical convergence for each type of diffusion. See more about this in “Towards a robust criterion of anomalous diffusion”, V. Sposini et al..

fig, ax = plt.subplots(1, 2, figsize = (14,6), tight_layout = True)

for idxH, H in enumerate((Hs)):
    ax[0].loglog(f[1:], muH[idxH]/muH[idxH,0], label = fr'$\alpha$ = {np.round(H,2)*2}')
    
    ax[0].loglog(f[1:], muH_noise[idxH]/muH_noise[idxH,0], c=f'C{idxH}', alpha = 0.5)
    if idxH == 0:
        ax[0].plot(np.linspace(1e-4, 1e-2, 10), 1e-6*np.linspace(1e-4, 1e-2, 10)**(-H*2-1), c = 'k', ls = '--')
    elif idxH == 1:
        ax[0].plot(np.linspace(1e-4, 1e-2, 10), 1e-11*np.linspace(1e-4, 1e-2, 10)**(-H*2-1), c = 'k', ls = '-.')

    ax[1].semilogx(f[1:-1], stdH[idxH, :-1]/ muH[idxH, :-1])
    ax[1].semilogx(f[1:-1], stdH_noise[idxH, :-1]/ muH_noise[idxH, :-1], c=f'C{idxH}', alpha = 0.6)


for line in [1, np.sqrt(2), np.sqrt(5)/2]: ax[1].axhline(line, c='k', ls = '--', alpha = 0.8)

plt.setp(ax[0], 
         xlabel = 'f [Hz]',
         ylabel = 'PSD / PSD(f = 0)'
        )

plt.setp(ax[1], 
         xlabel = 'f [Hz]',
         ylabel = r'$\gamma$'
        )

ax[0].legend()