Tools and classes for the identification of behavioural bouts
A histogram of log-transformed frequencies of x with a chosen bin width
and upper limit forms the basis for models. Histogram bins following empty
ones have their frequencies averaged over the number of previous empty bins
plus one. Models attempt to discern the number of random Poisson
processes generating the underlying distribution of log-transformed
frequencies, and to estimate the parameters of each process.
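To make the binning and empty-bin adjustment concrete, here is a minimal sketch assuming NumPy; the function name, default bin width, and the choice to drop empty bins from the fitted frequencies are illustrative, not the package's implementation.

```python
import numpy as np


def log_freq_histogram(x, bw=0.5, xmax=None):
    """Log-transformed frequency histogram with empty-bin averaging.

    Frequencies of bins that follow empty ones are averaged over the
    number of preceding empty bins plus one (illustrative sketch).
    """
    xmax = np.max(x) if xmax is None else xmax
    edges = np.arange(0, xmax + bw, bw)
    counts, _ = np.histogram(x, bins=edges)
    adjusted = []
    empty_run = 0
    for c in counts.astype(float):
        if c == 0:
            empty_run += 1
            adjusted.append(np.nan)          # empty bins are dropped below
        else:
            adjusted.append(c / (empty_run + 1))
            empty_run = 0
    adjusted = np.array(adjusted)
    mids = edges[:-1] + bw / 2
    keep = ~np.isnan(adjusted)
    return mids[keep], np.log(adjusted[keep])
```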
Currently, the model describing the histogram built in this way is
implemented in the BoutsNLS class. For the case of a mixture of two
Poisson processes, this class sets up the model:

\[y = \log\left(N_f \lambda_f e^{-\lambda_f t} + N_s \lambda_s e^{-\lambda_s t}\right) \qquad (1)\]
where \(N_f\) and \(N_s\) are the number of events belonging to
process \(f\) and \(s\), respectively; and \(\lambda_f\) and
\(\lambda_s\) are the probabilities of an event occurring in each
process. Mixtures of more processes can also be added to the model.
The bout-ending criterion (BEC) corresponding to equation (1) is:

\[BEC = \frac{1}{\lambda_f - \lambda_s} \log\left(\frac{N_f \lambda_f}{N_s \lambda_s}\right)\]
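For instance, given fitted coefficients of the model above, the BEC could be computed along these lines (a minimal NumPy sketch, not the class's bec() method):

```python
import numpy as np


def bec_two_process(n_f, lambda_f, n_s, lambda_s):
    """Bout-ending criterion (BEC) for a two-process mixture.

    Returns the interval length at which an event is equally likely to
    belong to the fast or the slow process, following the equation above.
    """
    return np.log((n_f * lambda_f) / (n_s * lambda_s)) / (lambda_f - lambda_s)
```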
Modelling the mixture via maximum likelihood estimation, described
below, is the preferred approach, as it does not rely on the subjective
construction of a histogram. The histogram is only used to generate
reasonable starting values; the underlying parameters of the model are
obtained via maximum likelihood, so the approach is more robust.
For the case of a mixture of two processes, as above, the log likelihood of
all the \(N_t\) events in the mixture can be expressed as:

\[\log L_2 = \sum_{i=1}^{N_t} \log\left[p \lambda_f e^{-\lambda_f t_i} + (1 - p) \lambda_s e^{-\lambda_s t_i}\right]\]

where \(p\) is the proportion of events belonging to the fast process and
\(t_i\) is the \(i\)-th interevent interval.
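A minimal NumPy sketch of this log likelihood, with illustrative argument names (t being the array of interevent intervals):

```python
import numpy as np


def loglik2(p, lambda_f, lambda_s, t):
    """Log likelihood of a two-process exponential mixture.

    p is the mixing proportion of the fast process; t holds the N_t
    observed interevent intervals.
    """
    dens = (p * lambda_f * np.exp(-lambda_f * t) +
            (1 - p) * lambda_s * np.exp(-lambda_s * t))
    return np.sum(np.log(dens))
```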
Abstract base class for models of log-transformed frequencies
This is a base class for other classes to build on and perform the model
fitting. Bouts is an abstract base class to set up bout
identification procedures. Subclasses must implement fit and bec
methods, or re-use the default NLS methods in Bouts.
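As a rough illustration of this layout, here is a minimal sketch of an abstract base class with fit and bec hooks, assuming Python's abc module; the class name and attributes are illustrative, not the package's implementation (which, as noted above, also provides default NLS versions of these methods).

```python
import abc


class BoutsSketch(abc.ABC):
    """Illustrative skeleton of a bout-identification base class."""

    def __init__(self, x, bw):
        # A real implementation would build the binned log-frequency
        # histogram of `x` here, using bin width `bw`.
        self.x = x
        self.bw = bw

    @abc.abstractmethod
    def fit(self, start):
        """Fit the model, given starting values for the coefficients."""

    @abc.abstractmethod
    def bec(self, coefs):
        """Return the bout-ending criterion for fitted coefficients."""
```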
Find starting values for mixtures of random Poisson processes
Starting values are calculated using the “broken stick” method.
Parameters:
x_break (array_like) – One- or two-element array with values determining the break(s)
for the broken stick model, such that \(x < x\_break_0\) is the
first process, \(x >= x\_break_0\) & \(x < x\_break_1\)
is the second process, and \(x >= x\_break_1\) is the third one.
plot (bool, optional) – Whether to plot the broken stick model.
ax (matplotlib.Axes, optional) – An Axes instance to use as target. Default is to create one.
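As a rough illustration of the idea for a two-process model, the sketch below fits a straight line to the binned log frequencies on each side of a single break (such as those produced by the histogram sketch earlier) and converts slopes and intercepts into tentative coefficients; the function and variable names are illustrative, not the package's init_pars() implementation.

```python
import numpy as np


def broken_stick_start(x_mid, log_freq, x_break):
    """Rough two-process starting values via the broken stick method.

    Fits separate straight lines to the log frequencies below and above
    x_break; the slopes give tentative -lambda values and the intercepts
    give log(N * lambda) for each process (illustrative sketch).
    """
    fast = x_mid < x_break
    coef_f = np.polyfit(x_mid[fast], log_freq[fast], 1)    # [slope, intercept]
    coef_s = np.polyfit(x_mid[~fast], log_freq[~fast], 1)
    lambda_f, lambda_s = -coef_f[0], -coef_s[0]
    n_f = np.exp(coef_f[1]) / lambda_f
    n_s = np.exp(coef_s[1]) / lambda_s
    return {"N_f": n_f, "lambda_f": lambda_f, "N_s": n_s, "lambda_s": lambda_s}
```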
Maximum Likelihood estimation for models of Poisson process mixtures
Methods for modelling log-frequency data as a mixture of Poisson
processes via maximum likelihood estimation [2], [3]. Mixtures of
two or three Poisson processes are supported.
Even in these relatively simple cases, it is very important to provide
good starting values for the parameters.
One useful strategy to get good starting parameter values is to proceed
in 4 steps. First, fit a broken stick model to the log frequencies of
binned data (see init_pars()), to obtain estimates of 4
parameters in a 2-process model [1], or 6 in a 3-process model.
Second, calculate parameter(s) \(p\) from the \(\alpha\)
parameters obtained by fitting the broken stick model, to get tentative
initial values as in [2]. Third, obtain MLE estimates for these
parameters, but using a reparameterized version of the -log L2
function. Lastly, obtain the final MLE estimates for the three
parameters by using the estimates from step 3, un-transformed back to
their original scales, and minimizing the original parameterization of the
-log L2 function.
init_pars() can be used to perform step 1. Calculation of
the mixing parameters \(p\) in step 2 is trivial from these
estimates. Method negMLEll() calculates the negative
log-likelihood for a reparameterized version of the -log L2 function
given by [1], so it can be used for step 3. This uses a logit
transformation of the mixing parameter \(p\), and log
transformations for density parameters \(\lambda\). Method
negMLEll() is used again to compute the -log L2 function
corresponding to the un-transformed model for step 4.
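The transformations themselves are straightforward; a small sketch, assuming SciPy's logit/expit helpers and purely illustrative values:

```python
import numpy as np
from scipy.special import expit, logit

# Step 3: map tentative estimates onto an unconstrained scale
p, lambda_f, lambda_s = 0.7, 0.05, 0.005          # illustrative values
theta = np.array([logit(p), np.log(lambda_f), np.log(lambda_s)])

# Step 4: map the optimized values back to their original scales
p_hat = expit(theta[0])
lambda_f_hat, lambda_s_hat = np.exp(theta[1:])
```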
The fit() method performs the main job of minimizing the -log L2
functions, and is essentially a wrapper around
minimize(). It only takes the -log L2 function,
a DataFrame of starting values, and the variable to be modelled, all
of which are passed to minimize() for
optimization. Additionally, any other arguments are passed on to
minimize(), giving fine control over fitting any of the
-log L2 functions.
In practice, step 3 does not pose major problems using the
reparameterized -log L2 function, but it might be useful to use method
‘L-BFGS-B’ with appropriate lower and upper bounds. Step 4 can be a
bit more problematic, because the parameters are usually on very
different scales and there can be multiple minima. Therefore, it is
almost always advisable to use method ‘L-BFGS-B’, again bounding the
parameter search and supplying other settings to control the
optimization.
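A self-contained sketch of steps 3 and 4 for a two-process model, using scipy.optimize.minimize with method ‘L-BFGS-B’; the likelihood function, simulated data, starting values, and bounds are illustrative, not the package's fit() implementation.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, logit


def negloglik2(params, t, transformed=True):
    """-log L2 for a two-process mixture, optionally on the transformed scale."""
    if transformed:
        p = expit(params[0])                    # logit(p) -> p
        lam = np.exp(np.asarray(params[1:]))    # log(lambda) -> lambda
    else:
        p, lam = params[0], np.asarray(params[1:])
    dens = (p * lam[0] * np.exp(-lam[0] * t) +
            (1 - p) * lam[1] * np.exp(-lam[1] * t))
    return -np.sum(np.log(dens))


# Simulated interevent intervals and broken-stick style starting values
rng = np.random.default_rng(1)
t = np.concatenate([rng.exponential(1 / 0.05, 700),
                    rng.exponential(1 / 0.005, 300)])
p0, lf0, ls0 = 0.7, 0.05, 0.005

# Step 3: unconstrained fit on the logit/log scale
fit1 = minimize(negloglik2, x0=[logit(p0), np.log(lf0), np.log(ls0)],
                args=(t, True), method="L-BFGS-B")

# Step 4: refit on the original scale, bounding the parameter search
start2 = np.concatenate([[expit(fit1.x[0])], np.exp(fit1.x[1:])])
fit2 = minimize(negloglik2, x0=start2, args=(t, False), method="L-BFGS-B",
                bounds=[(1e-6, 1 - 1e-6), (1e-8, None), (1e-8, None)],
                options={"maxiter": 500})
print(fit2.x)   # p, lambda_f, lambda_s on their original scales
```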
Parameters:
start (pandas.DataFrame) – DataFrame with starting values for coefficients of each process
in columns. These can come from the “broken stick” method as
in Bouts.init_pars(), and will be transformed before minimizing
the first (reparameterized) -log L2 function.
fit1_opts, fit2_opts (dict) – Dictionaries with keywords to be passed to
scipy.optimize.minimize() for the first and second fits, respectively.
Returns:
fit1, fit2 – Objects with the optimization results from the first and second
fit, each having an x attribute with the coefficients of the solution.
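For example, the option dictionaries can be assembled from ordinary scipy.optimize.minimize() keyword arguments; the bounds shown below and the bouts_mle instance in the commented call are hypothetical.

```python
# Keyword dictionaries forwarded to scipy.optimize.minimize() for the
# first (reparameterized) and second (original scale) fits; the bounds
# shown here are purely illustrative.
fit1_opts = dict(method="L-BFGS-B",
                 bounds=((-5, 5), (-10, 0), (-15, -2)))   # logit(p), log(lambda)
fit2_opts = dict(method="L-BFGS-B",
                 bounds=((1e-2, 1 - 1e-2), (1e-4, 1), (1e-6, 1e-1)),
                 options={"maxiter": 500})

# Hypothetical call, assuming `bouts_mle` is an instance of the
# maximum-likelihood class described above and `start` comes from
# Bouts.init_pars():
# fit1, fit2 = bouts_mle.fit(start, fit1_opts=fit1_opts, fit2_opts=fit2_opts)
```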
Negative log likelihood function of parameters given observed data
Parameters:
params (array_like) – 1-D array with parameters to fit. Currently must be either
of length 3, with mixing parameter \(p\) and density parameters
\(\lambda_f\) and \(\lambda_s\), in that order, or
of length 5, with \(p_f\), \(p_{fs}\), \(\lambda_f\),
\(\lambda_m\), and \(\lambda_s\).
x (array_like) – Independent data array described by model with parameters
params.
istransformed (bool) – Whether params are transformed and need to be un-transformed
to calculate the likelihood.
Generate samples from mixture of exponential distributions
Simulate a mixture of two or three random exponential distributions.
This uses a special definition for the probabilities (\(p_i\)). In
the two-process case, \(p\) represents the proportion of “fast” to
“slow” events in the mixture. In the three-process case, \(p_0\)
represents the proportion of “fast” to “slow” events, and \(p_1\)
represents the proportion of “slow” to “slow” and “very slow” events.
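A minimal NumPy sketch of the two-process case under this convention; the function and argument names are illustrative, not the package's own.

```python
import numpy as np


def rand_mixexp2(n, p, lambda_f, lambda_s, seed=None):
    """Draw n samples from a two-process exponential mixture.

    p is the proportion of "fast" events: each sample comes from the
    fast process with probability p, otherwise from the slow one.
    """
    rng = np.random.default_rng(seed)
    is_fast = rng.random(n) < p
    scale = np.where(is_fast, 1 / lambda_f, 1 / lambda_s)
    return rng.exponential(scale)
```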