A Guide to Product-multinomial and Extended Product-multinomial Likelihoods for Capture-Mark-Recapture Models

Using R and Stan

1 Overview

The purpose of this guide is to facilitate the implementation of extended product-multinomial likelihoods (EPM) for transient mixture models in R and Stan. Although the emphasis is on EPM likelihoods, we include an introduction to product-multinomial (PM) likelihoods and their implementation. Most of the programming effort goes into the reduction of capture histories to m-arrays and the calculation of associated multinomial cell probabilities. Having done this, only minor extensions are required to include transients. All R and Stan files mentioned in this guide are available in this repository.

We consider the situation where one has an initial base model for which a PM likelihood is available, and one wants to extend the model to account for transients. Three base models of increasing complexity are considered:

  1. A single-state Cormack-Jolly-Seber model (Section 2).
  2. A multi-state model of survival and movement between two sites (Section 3).
  3. A multi-state model of survival and movement between two sites, incorporating trap-dependence (Section 4).

Each section follows the same format, first specifying the base model, then explaining the m-array data summary, the PM likelihood for the m-array, and the calculation of cell probabilities for the PM likelihood. Following that, we discuss implementations, first of the base model’s PM likelihood, then the EPM likelihood for the extended model including transients.

Prerequisites and further resources

Some familiarity with Stan would be helpful, but the intention and algorithmic details of the code should be clear even if the syntax is unfamiliar. Excellent Stan resources are available here.

We have also attempted to provide a self-contained introduction to product-multinomial likelihoods for single state models (in Section 2.2) and multi-state models (in Section 3.2). For further reading we recommend Williams et al. (2002, chap. 17) or Schaub and Kéry (2021, chap. 4.5).

2 Single state model

2.1 Base model specification

The model has a single live state, and capture histories consist of ones and zeros, indicating whether or not an individual was captured on that occasion. The model parameters consist of

  • \(\phi_t\): the probabilities of surviving from time \(t\) to \(t+1\),

  • \(p_t\): the probability of being detected at time \(t\) given that an individual is alive then.

2.2 PM likelihood for the base model

2.2.1 Constructing the m-array

The m-array summarises capture histories by the number of individuals that are released on a given occasion and next recaptured at some later occasion, or never recaptured. Table 1 and Table 2 illustrate the construction of an m-array from capture histories for a study with five individuals over four sampling occasions.

The first step, illustrated in Table 1, is to decompose the capture histories into segments spanning from a release to the next recapture, or to the end of the capture history if the individual is never recaptured. For example, the history \(1011\) describes an individual who was released on occasion \(1\) and next recaptured on occasion \(3\), then released on occasion \(3\) and next recaptured on occasion \(4\). Accordingly, \(1011\) is split into two segments, \(101*\) and \(**11\), corresponding to these two release-next recapture occasions. In contrast, an individual with history \(0100\) was released on occasion \(1\) but never recaptured, resulting in one segment \(*100\). We use the wildcard symbol, ‘\(*\)’, to convey that a segment will be found in any capture history containing the given pattern, e.g \(**10\) is common to all of \(0010, 0110, 1010, 1110\).

Table 1: Some complete capture histories decomposed into release - next recapture segments
Complete capture history Release at t=1 Release at t=2 Release at t=3
\(1011\) \(101*\) - \(**11\)
\(1110\) \(11**\) \(*11*\) \(**10\)
\(1001\) \(1001\) - -
\(0110\) - \(*11*\) \(**10\)
\(0100\) - \(*100\) -

The m-array counts the number of times a segment appears in the capture histories and arranges the totals in rows corresponding to release occasions and columns corresponding to occasions of next recapture or never recaptured. Table 2 presents the m-array for these data. For example, the segment \(101*\) involves a release on occasion \(1\) and next recapture on occasion \(3\), so its frequency in the data appears in the row labelled 1 and column labelled 3. The segment \(**10\) appears twice in the data; it involves release on occasion three without subsequent recapture, hence the ‘\(2\)’ in row 3, column never.

Table 2: m-array summary of the capture histories in Table 1: row names indicate release occasions and column names the occasion of next recapture, or never recaptured.
2 3 4 never
1 \(1\) \(1\) \(1\) \(0\)
2 \(0\) \(2\) \(0\) \(1\)
3 \(0\) \(0\) \(1\) \(2\)

2.2.2 Product-multinomial likelihood

Every individual released on a given occasion must be recaptured at some later occasion, or never recaptured. The number of individuals realising these possibilities are distributed according to a multinomial distribution. That is, each row \(\mathbf{m}_i\) of the m-array \(\mathbf{m}\) in Table 2 are data for a multinomial distribution: \[ \mathbf{m}_i \sim \text{multinomial}\left(\btheta_i\right) \hspace{5pt} \text{ for } \hspace{5pt} i = 1,2,3. \] where the cell probabilities \(\btheta_i\) are simplices (vectors of non-negative numbers that sum to one) parametrised by survival and detection probabilities. The likelihood for the full m-array is the product of these - hence the name product-multinomial likelihood.

2.2.3 Calculating cell probabilities

Now we calculate the cell probabilities \(\btheta_i\) in terms of survival (\(\phi_t\)) and detection (\(p_t\)) probabilities. For convenience let \(q_t = 1 - p_t\) be the probability of not being captured at time \(t\). We need to calculate the probabilities of all possible segments; this is done differently for those segments involving recapture and those where the individual was never recaptured. In the former case, the following two examples should make the general pattern clear:

  • \(\text{Pr}(11**) = \phi_1p_2\) since it involves surviving from \(t=1\) to \(2\) and being recaptured at \(t=2\),
  • \(\text{Pr}(*101) = \phi_2q_3 \cdot \phi_3p_4\), corresponding to survival from \(t=2\) to \(3\), not being recaptured at \(t=3\), then surviving from time \(3\) to \(4\) and being recaptured at \(t=4\).

In the case of segments where an individual was released at time \(t\) and never recaptured, the probabilities are one minus the probability of being released at time \(t\) and recaptured at any later occasion. That is, if \(\xi_t\) is the probability of never being recaptured given release at time \(t\), then \(\xi_t\) is calculated as one minus the sum of all the other entries in the same row. The \(\xi_t\) can be written out in terms of survival and detection probabilities but the expressions are lengthy and not used directly, so we opt not do so.

Table 3: Cell probabilities for the m-array in Table 2: row names indicate release occasions and column names the occasion of next recapture, or never recaptured.
2 3 4 never
1 \(\phi_1p_2\) \(\phi_1q_2 \phi_2p_3\) \(\phi_1q_2 \phi_2q_3 \phi_3p_4\) \(\xi_1\)
2 \(0\) \(\phi_2p_3\) \(\phi_2q_3 \phi_3p_4\) \(\xi_2\)
3 \(0\) \(0\) \(\phi_3p_4\) \(\xi_3\)

Each row of this table provides the cell probabilities, \(\boldsymbol{\theta}_i\), for the multinomial likelihood for the corresponding row, \(\textbf{m}_i\), of the m-array.

2.3 Implementing the PM likelihood

2.3.1 Preparing the data in R

We construct the m-array from the capture histories in R, using a function we have written to do this, get_marray. Others are available, e.g. in R2ucare. For the single-state model and a study of \(T\) capture occasions, the m-array is a \(T-1 \times T\) matrix.

2.3.2 Stan model code

Stan code specifies how to calculate the log-likelihood given data (here an m-array) and parameters. The code is arranged into blocks for functions, data, parameters and model. We’ll briefly explain each of these in turn.

The most involved part of the code is calculating the cell probabilities for the multinomial distributions. We bundle this into a function get_multinomial_probs in the functions code block, which we explain in detail shortly:

functions {
  matrix get_multinomial_probs( data int T, vector p, vector phi ) { 
  // ... function body ...
  }
}

The study length and m-array are declared in the data block:

data {
  int<lower=2> T;                     // number of years
  array[T-1, T] int<lower=0> marr;    // m-array 
}

We have declared the m-array as a two-dimensional array of non-negative integers; one can think of it as a matrix, or declare it as such if desired.

The survival and detection parameters are declared in the parameters block. The MCMC will return samples from the joint posterior distribution over these.

parameters {
  vector<lower=0,upper=1>[T] p;         // detection probability
  vector<lower=0,upper=1>[T-1] phi;     // survival probability
} 

We constrain these probabilities with <lower=0,upper=1> in the declaration. Although there is no data with which to estimate p[1], and survival phi[T-1] and detection p[T] are not separately identifiable, we prefer to use vectors of this length for p and phi so that the indices into these vectors correspond to sampling occasions.

In the model block we define priors for the parameters, call our function to calculate the multinomial probabilities, and evaluate the log-likelihood.

model {
  // multinomial probabilities
  matrix[T-1, T] pr;      
  pr = get_multinomial_probs(T, p ,phi);
  
  // priors
  p ~ beta(1,1);
  phi ~ beta(1,1);
  
  // likelihood
  for (t in 1:(T - 1)) {
    marr[t] ~ multinomial(pr[t]');
  }
}

The m-array has \(T-1\) rows, and the cell probabilities for each row are stored in the corresponding row of the matrix pr. Stan’s multinomial() function accepts a column vector argument, so we transpose the row vector pr[t] (the \(t^{\text{th}}\) row of pr) in the function call (with pr[t]'). One does not need to supply the total number of ‘trials’ to multinomial(), the function calculates it as the sum over cells.

Lastly, here is the Stan code for the function get_multinomial_probs that calculates cell probabilities for the multinomial likelihoods, as described in Section 2.2.3.

matrix get_multinomial_probs(
    data int T,
    vector p,
    vector phi
  ) {
    // matrix to store multinomial probabilities
    matrix[T-1, T] pr;
    // convenience parameter
    vector[T] q = 1 - p;
    // define multinomial probabilities
    for (t in 1:(T-1)) {
      for (j in 1:(t-1)) {
        pr[t][j] = 0;
      }
      for (j in t:(T - 1)) {
        pr[t][j] = prod(phi[t:j]) * prod(q[(t+1):j]) * p[j+1] ;
      }
      pr[t][T] = 1 - sum(pr[t][1:(T-1)]);
    }
    return pr;
  }

In the notation of Section 2.2.3, pr[i] is \(\btheta_i\) and pr[i][T] is \(\xi_i\).

The file 01_cjs_pm.stan in the folder stan contains the following complete Stan program for the single-state model with PM likelihood:

functions {
  matrix get_multinomial_probs(
    data int T,
    vector p,
    vector phi
  ) {
    // matrix to store multinomial probabilities
    matrix[T-1, T] pr;
    // convenience parameter
    vector[T] q = 1 - p;
    // define multinomial probabilities
    for (t in 1:(T-1)) {
      for (j in 1:(t-1)) {
        pr[t][j] = 0;
      }
      for (j in t:(T - 1)) {
        pr[t][j] = prod(phi[t:j]) * prod(q[(t+1):j]) * p[j+1] ;
      }
      pr[t][T] = 1 - sum(pr[t][1:(T-1)]);
    }
    return pr;
  }
}

data {
  int<lower=2> T;                      // number of years
   array[T-1, T] int<lower=0> marr;    // m-array 
}

parameters {
  vector<lower=0,upper=1>[T] p;         // detection probability for 'residents'
  vector<lower=0,upper=1>[T-1] phi;     // survival probability
} 

model {
  // multinomial probabilities
  matrix[T-1, T] pr;      
  pr = get_multinomial_probs(T, p ,phi);
  
 // priors
  p ~ beta(1,1);
  phi ~ beta(1,1);
  
  // likelihood...
  for (t in 1:(T - 1)) {
    marr[t] ~ multinomial(pr[t]');
  }
}

2.4 Implementing the EPM likelihood

The base model assumes that all individuals are residents. Now we turn to the extension to account for transients. This requires some minor changes to

  • the construction of m-arrays in R, and

  • the Stan model code.

2.4.1 Preparing the data in R

The following steps are required to prepare the data for the extended product-multinomial likelihood.

  1. Split the capture histories into single-encounter and multiple-encounter histories.
  2. Construct an m-array from the multiple-encounter capture histories alone. It is crucial that all single-encounter histories are removed before the m-array is constructed.
  3. Count the number of single-encounter and multiple-encounter capture histories in each cohort.

R code to perform these steps can be found in the script 01a_single-state_transients_epm.R in the folder R.

2.4.2 Stan code

Fortunately, the most complicated part of the Stan code, namely the function get_multinomial_probs requires no changes, so the functions code block remains unchanged.

In the data block, we add the counts of single- and multi-encounter capture histories for each cohort:

data {
  int<lower=2> T;                      // number of years
   array[T-1, T] int<lower=0> marr;    // m-array from multi-encounter histories only 
   array[T-1] int<lower=0> Nsingle;    // no. single encounter histories per cohort 
   array[T-1] int<lower=0> Nmult;      // no. multiple encounter histories per cohort
}
Caution

The m-array here must be constructed from only multiple-encounter capture histories. This is easy to forget as the Stan code declaring the m-array is unchanged from the initial model.

We consider time-varying residency probabilities \(\pi_t\) and add these to the parameters block:

parameters {
  vector<lower=0,upper=1>[T] p;         // detection probability for 'residents'
  vector<lower=0,upper=1>[T-1] phi;     // survival probability
  vector<lower=0,upper=1>[T-1] pi_r;    // proportion 'residents' in each cohort
}

In the model block one adds priors for the new parameters, and increments the log-posterior with the extra terms resulting from transients:

model {
  // multinomial probabilities
  matrix[T-1, T] pr;
  pr = get_multinomial_probs(T, p ,phi);

  // priors
  p ~ beta(1,1);
  phi ~ beta(1,1);
  pi_r ~ beta(1,1);
  
  // likelihood...
  // ... for indiv recaptured at least once
  for (t in 1:(T-1)) {
    marr[t] ~ multinomial(pr[t]');
    target += Nmult[t] * log(pi_r[t]); 
  }
  // ... for indiv never recaptured
  for (t in 1:(T-1)) {
    target += Nsingle[t] * log(pr[t][T] * pi_r[t] + (1 - pi_r[t]));
  }
}

The tilde in statements like marr[t] ~ multinomial(pr[t]') tells Stan to increment the log-posterior with the corresponding log-density term. We can also increment this log-posterior directly using target += statements, and we use this to adjust the log-posterior with the extra terms that arise in the EPM likelihood for the transient mixture model, as described in the main text.

The file 01a_cjs_transients_epm.stan in the folder stan contains the full Stan program for this model.

3 Multi-state model for two sites

3.1 Base model specification

The model has two live states corresponding to two sites. We assume that capture histories consist of zeros, ones and twos, indicating whether an individual was captured, and what site they were at if captured. The model parameters consist of

  • \(\phi_{i,t}\): probability that an individual alive at site \(i\) at time \(t\) survives to time \(t+1\),

  • \(p_{i,t}\): the probability that an individual alive at site \(i\) at time \(t\) is detected,

  • \(m_{i,j}\): the probability that an individual moves from site \(i\) to site \(j\) in one time interval, given survival.

The movement probabilities are conditional on survival so for each source site they sum to one (e.g. \(m_{1,1} + m_{1,2} = 1\)).

3.2 PM likelihood for the base model

The m-array for models with more than one (live) state is still based on release-next recapture segments but incorporates the state at release and recapture as well. If there are \(S\) states and \(T\) capture occasions it consists of an \(S(T-1) \times S(T-1) + 1\) array.

3.2.1 Constructing the m-array

Table 4 presents some example capture histories and the decomposition into release-next recapture (or release-never recapture) segments.

Table 4: Some multi-state capture histories decomposed into release - next recapture segments
Complete capture history Release at t=1 Release at t=2 Release at t=3
\(1021\) \(102*\) - \(**21\)
\(2110\) \(21**\) \(*11*\) \(**10\)
\(1001\) \(1001\) - -
\(0210\) - \(*21*\) \(**10\)
\(0100\) - \(*100\) -

The m-array for these data are presented in Table 5. Now there are \(S = 2\) rows for each release occasion and for each recapture occasion: the occasions are typset in bold and the state at release and recapture in italics. The entries in the table once again represent the frequency of the corresponding segment in the data.

Table 5: m-array summary of the capture histories in Table 4: row names indicate release occasions (bold) and states (italic) and column names indicate the occasion (bold) and state (italic) of next recapture, or never recaptured.
2 2 3 3 4 4 never
1 2 1 2 1 2
1 1 \(0\) \(0\) \(0\) \(1\) \(1\) \(0\) \(0\)
1 2 \(1\) \(0\) \(0\) \(0\) \(0\) \(0\) \(0\)
2 1 \(1\) \(0\) \(0\) \(0\) \(1\)
2 2 \(1\) \(0\) \(0\) \(0\) \(0\)
3 1 \(0\) \(0\) \(2\)
3 2 \(1\) \(0\) \(0\)

3.2.2 Product-multinomial likelihood

As before, the totals in each row are distributed according to a multinomial distribution. However, the cell probabilities for these multinomial distributions are easiest to calculate in \(S \times S\) blocks using matrix multiplication.

3.2.3 Calculating cell probabilities

Consider the probability that an individual released from site \(1\) at time \(t\) is next recaptured at site \(2\) at time \(t+2\). Because we do not know where it was at time \(t+1\), we have to average over the possibilities, yielding

\[ \underbrace{\phi_{1,t}m_{1,1}q_{1,t+1} \cdot \phi_{1,t+1}m_{1,2}p_{2,t+2}}_{\text{undetected at site } 1 \text{ at time } t+1 } + \underbrace{\phi_{1,t}m_{1,2}q_{2,t+1} \cdot \phi_{2,t+1}m_{2,2}p_{2,t+2}}_{\text{undetected at site } 2 \text{ at time } t+1 } \] We also have to calculate the analogous probabilities for the other three combinations of site at release and next recapture. It turns out all the probabilities can be calculated at once as the product of suitable matrices:

\[ \begin{pmatrix} \phi_{1,t}m_{1,1} & \phi_{1,t}m_{1,2} \\ \phi_{2,t}m_{2,1} & \phi_{2,t}m_{2,2} \end{pmatrix} \begin{pmatrix} q_{1,t+1} & 0 \\ 0 & q_{2,t+1} \end{pmatrix} \begin{pmatrix} \phi_{1,t+1}m_{1,1} & \phi_{1,t+1}m_{1,2} \\ \phi_{2,t+1}m_{2,1} & \phi_{2,t+1}m_{2,2} \end{pmatrix} \begin{pmatrix} p_{1,t+2} & 0 \\ 0 & p_{2,t+2} \end{pmatrix}. \]

The \(i,j^{\text{th}}\) entry of the resulting matrix is the probability that an individual released from site \(i\) at time \(t\) is next recaptured at site \(j\) at time \(t+2\). Therefore this matrix contains the cell probabilities for the \(2 \times 2\) block of Table 5 corresponding to release occasion 1 and next recapture occasion 3.

The blocks corresponding to other release and recapture occasions follow a similar pattern. Define

\[ \bg_t = \begin{pmatrix} \phi_{1,t}m_{1,1} & \phi_{1,t}m_{1,2} \\ \phi_{2,t}m_{2,1} & \phi_{2,t}m_{2,2} \end{pmatrix}, \hspace{10pt} \bo_t = \begin{pmatrix} p_{1,t} & 0 \\ 0 & p_{2,t} \end{pmatrix} \hspace{5pt} \text{ and } \hspace{5pt} \boc_t = \begin{pmatrix} q_{1,t} & 0 \\ 0 & q_{2,t} \end{pmatrix}. \tag{1}\]

Then the cell probabilities for the m-array form a matrix defined in blocks as

Table 6: Cell probabilities for the multistate m-array in Table 5 arranged in state blocks: row names indicate release occasions and column names the occasion of next recapture, or never recaptured.
2 3 4 never
1 \(\bg_1 \bo_2\) \(\bg_1 \boc_2 \bg_2 \bo_3\) \(\bg_1 \boc_2 \bg_2 \boc_3 \bg_3 \bo_4\) \(\bx_1\)
2 \(\bz\) \(\bg_2 \bo_3\) \(\bg_2 \boc_3\bg_3 \bo_4\) \(\bx_2\)
3 \(\bz\) \(\bz\) \(\bg_3 \bo_4\) \(\bx_3\)

As in the single-state case, the \(\bx_i\) are determined by the constraint that each row sum to one (but now they are column vectors of length \(2\)). The cell probabilities for each multinomial distribution are contained in the rows of this matrix.

3.3 Implementing the PM likelihood

3.3.1 Preparing the data in R

In R we construct the multi-state m-array with our custom function get_marray. The m-array now has dimensions \(2(T-1) \times 2(T-1)+1\) for a study of length \(T\).

3.3.2 Stan model code

The structure of the Stan code is very similar to the single-state model (without transients), except that the function to compute multinomial cell probabilities is a little more complicated. Treating this function as a black box for now, the Stan code is:

functions {
  matrix get_multinomial_probs(
    data int T,
    array[] vector p,
    array[] vector phi,
    array[] vector m
  ) {
    // ... function body ...
  }
}

data {
  int<lower=2> T;                                 // number of years
  array[2*(T-1), 2*(T-1)+1] int<lower=0> marr;    // m-array
}

parameters {
  array[2] vector<lower=0,upper=1>[T] p;      // detection probabilities
  array[2] vector<lower=0,upper=1>[T-1] phi;  // survival probabilities
  array[2] simplex[2] m;                      // movement probabilities 
}

model {
  // calculate multinomial probabilities
  matrix[2*(T-1), 2*(T-1)+1] pr;
  pr = get_multinomial_probs( T, p, phi, m );
  // m-array capture-recapture likelihood
  for (i in 1:(2*(T-1))) {
    marr[i] ~ multinomial(pr[i]');
  }
}

We have declared survival and detection probabilities as arrays of vectors, with the array index corresponding to site, and the vector index to time. For example, phi is an array with two elements, phi[1] is a vector of survival probabilities at site 1 for occasions \(t = 1, \cdots, T-1\), and phi[2] contains those for site 2. The movement probabilities are conditional on survival, thus they must sum to one for each source site. Stan’s simplex data type conveniently handles this constraint. For example, m[1] is a simplex (a vector of non-negative numbers that sums to one) containing the movement probabilities \(m_{1,1}, m_{1,2}\).

The function get_multinomial_probs implements the calculation of cell probabilities in blocks using matrix multiplication, as described in Section 3.2.3.

matrix get_multinomial_probs(
  data int T,
  array[] vector p,
  array[] vector phi,
  array[] vector m
) {
  // matrix to store multinomial probabilities
  matrix[2*(T-1), 2*(T-1)+1] pr = rep_matrix(0.0, 2*(T-1), 2*(T-1)+1);
  // auxillary matrices to define multinomial probabilities
  array[T-1] matrix[2,2] gamma;   // state transition probabilities
  array[T] matrix[2,2] omega;     // state-dependent detection probabilities
  for (t in 1:(T-1)) {
    gamma[t][1] = [phi[1][t]*m[1][1], phi[1][t]*m[1][2]];
    gamma[t][2] = [phi[2][t]*m[2][1], phi[2][t]*m[2][2]];
  }
  for (t in 1:T) {
    omega[t][1] = [p[1][t], 1-p[1][t]];
    omega[t][2] = [p[2][t], 1-p[2][t]];
  }
  // define entries of pr using matrix multiplication
  for (rI in 1:(T-1)) { // block row index, reverse order
    int I = T - rI;     // I runs from T-1 to 1
    // row indices for 2x2 block
    int i1 = 1 + 2*(I-1);
    int i2 = 2*I;
    // diagonal block 
    pr[i1:i2, i1:i2] = diag_post_multiply(gamma[I], omega[I+1][:,1]); 
    // above-diagonal blocks
    if (I < T-1) {
      matrix[2,2] temp = diag_post_multiply(gamma[I], omega[I+1][:,2]);
      for (J in (I+1):(T-1)) {
        // column indices for 2x2 block
        int j1 = 1 + 2*(J-1);
        int j2 = 2*J;
        pr[i1:i2, j1:j2] = temp * pr[(i1 + 2):(i2 + 2), j1:j2];
      }
    }
  }
  // enforce row-sum-to-one constraint
  for (i in 1:(2*(T-1))){
    pr[i, 2*(T-1)+1] = 1 - sum(pr[i, 1:2*(T-1)]);
  }
    return pr;
  }

The \(2 \times 2\) matrices gamma[t] in the code are the \(\bg_t\) of Section 3.2.3, but we have abused notation with the variables omega[t]. The code’s omega[t] are not the \(\bo_t\) of the text, but instead contain the diagonal entries of both \(\bo_t\) and \(\boc_t\) as their first and second column, respectively. Then \(\bg_t \bo_{t+1}\) is calculated with diag_post_multiply(gamma[t], omega[t+1][:,1]) and \(\bg_t \boc_{t+1}\) with diag_post_multiply(gamma[t], omega[t+1][:,2]). These functions use \(\mathcal{O}(S^2)\) operations as opposed to the \(\mathcal{O}(S^3)\) required if one constructs the full diagonal matrix first, then multiplies.

We have also arranged the calculation of blocks to avoid repeating any matrix multiplications. A naive calculation of all the blocks in table Table 6 involves \(\mathcal{O}\left(T^3\right)\) multiplications of \(\bg_t \boc_{t+1}\)’s with each other and \(\bg_t \bo_{t+1}\)’s, each at cost \(\mathcal{O}(S^3)\). However, we can take advantage of the fact that the above-diagonal blocks can be obtained from the entries below them by pre-multiplying by one matrix of the form \(\bg_t\boc_{t+1}\). This reduces the number of \(\mathcal{O}(S^3)\) multiplications to \(\mathcal{O}\left(T^2\right)\).

Here is how the code works in in the case of Table 6:

  1. Start on the diagonal block for the bottom row by calculating \(\bg_3 \bo_4\). There are no above-diagonal blocks so this completes the calculation for the bottom row.

  2. Next we move up one row, and start by calculating \(\bg_2 \bo_3\) for the diagonal block. For the above-diagonal block on the second row, we first calculate \(\bg_2 \boc_3\) and store it as temp, then pre-multiply it to \(\bg_3 \bo_4\), which we calculated in the previous step, to get \(\bg_2 \boc_3 \bg_3 \bo_4\).

  3. Again we move up one row to the first row, and start by calculating \(\bg_1 \bo_2\) for the diagonal block. For the above-diagonal blocks on the first row, we first calculate \(\bg_1 \boc_2\) and store it as temp, then we pre-multiply it to \(\bg_2 \bo_3\) and to \(\bg_2 \boc_3 \bg_3 \bo_4\), which we calculated in the previous step, to get \(\bg_1\boc_2\bg_2 \bo_3\) and \(\bg_1 \boc_2\bg_2 \boc_3 \bg_3 \bo_4\).

3.4 Implementing the EPM likelihood

3.4.1 Preparing the data in R

The data preparation steps are almost identical to those in the single-state case:

  1. Split the capture histories into single-encounter and multiple-encounter histories.
  2. Construct an m-array from the multiple-encounter capture histories alone. It is crucial that all single-encounter histories are removed before the m-array is constructed.
  3. For each occasion and site, count the number of single-encounter and multiple-encounter capture histories that were first captured on that occasion at that site.

R code to perform these steps can be found in the script 02a_multistate_transients_epm.R in the folder R.

3.4.2 Stan code

The changes to the Stan code are also minor, since the function to compute cell probabilities remains unchanged from the base model.

In the data block, we add the number of individuals that were recaptured or were never recaptured per cohort and site (i.e. state).

data {
  int<lower=2> T;                                 // number of years
  array[2*(T-1), 2*(T-1)+1] int<lower=0> marr;    // m-array (see above)
  array[2, T-1] int<lower=0> Nmult;       // no. indiv per cohort and state that were ever recaptured
  array[2, T-1] int<lower=0> Nsingle;     // no. indiv per cohort and state that were never recaptured
}
Caution

The m-array here must be constructed from only multiple-encounter capture histories. This is easy to forget as the Stan code declaring the m-array is unchanged from the initial model.

We add the residency probabilities \(\pi_{i,t}\), which are now allowed to vary by site \(i\) as well as occasion \(t\), to the parameters block:

parameters {
  array[2] vector<lower=0,upper=1>[T] p;        // detection probabilities
  array[2] vector<lower=0,upper=1>[T-1] phi;    // survival probabilities
  array[2] vector<lower=0,upper=1>[T-1] pi_r;   // residency probabilities
  array[2] simplex[2] m;                        // movement probabilities 
}

The model block increments the log-posterior with the extra terms resulting from transients:

model {
  // calculate multinomial probabilities
  matrix[2*(T-1), 2*(T-1)+1] pr;
  pr = get_multinomial_probs( T, p, phi, m );

  // m-array capture-recapture likelihood...
  // ... for indiv recaptured at least once
  for (i in 1:(2*(T-1))) {
    marr[i] ~ multinomial(pr[i]');
  }
  for (t in 1:(T-1)) {
    for (s in 1:2) {
      target += Nmult[s,t] * log(pi_r[s,t]);
    }
  }
  // ... for indiv never recaptured 
  for (t in 1:(T-1)) {
    for (s in 1:2) {
      int i = s+2*(t-1);   // indexes row s in block t
      target += Nsingle[s,t] * log( pr[i][2*(T-1)+1] * pi_r[s,t] + (1 - pi_r[s,t]) );
    }
  }
}
Caution

Care is required to get the indexing into pr correct, so that pr[i][2*(T-1) + 1] is the conditional probability that an individual first captured on occasion \(t\) in state \(s\) is never recaptured, given that they are resident.

The file 02a_ms_transients_epm.stan in the folder stan contains the full Stan program for this model.

4 Multi-state model for two sites with trap-dependence

4.1 Base model specification

We model trap-dependence using the approach of Pradel and Sanz-Aguiler (2012). When an individual is captured on a certain occasion their state becomes ‘trap-aware’, and when they are not captured they become ‘trap-unaware’. The transitions between these states involves detection probabilities which depend on the state.

In the multi-site trap-aware model their are four (live) states:

  1. alive at site \(1\), trap-aware,

  2. alive at site \(1\), trap-unaware,

  3. alive at site \(2\), trap-aware,

  4. alive at site \(2\), trap-unaware.

Survival and movement parameters are the same as the two-site model (Section 3):

  • \(\phi_{i,t}\): probability that an individual alive at site \(i\) at time \(t\) survives to time \(t+1\),

  • \(m_{i,j}\) the probability that an individual moves from site \(i\) to site \(j\) in one time interval, given survival

Now detection probabilities are site- and trap-dependent:

  • \(p^A_{i,t}\): the probability that a trap-aware individual is detected at site \(i\) at time \(t\),

  • \(p^U_{i,t}\): the probability that a trap-unaware individual is detected at site \(i\) at time \(t\).

We assume that survival and movement occur over the interval from \(t\) to \(t+1\), whilst detection occurs at \(t+1\) (Pradel and Sanz-Aguilar 2012). For example, the probability of transitioning from the state alive at site 1, trap-aware at time \(t\) to alive at site 2, trap-unaware at \(t+1\) is \[ \phi_{1, t} \cdot m_{1,2} \cdot \left(1 - p^A_{2, t+1}\right). \tag{2}\]

4.2 PM likelihood for the base model

The model can be implemented as a multi-state model with some modification to the state-transition and observation matrices introduced in Equation 1. The state-transition matrices \(\bg_t\) are \(4 \times 4\) matrices whose entries include detection probabilities, as in Equation 2. Introducing convenience parameters \(q^A = 1 -p^A\) and \(q^U = 1 - p^U\), the transition matrices are:

\[ \bg_t = \begin{pmatrix} \phi_{1,t}m_{1,1}p^A_{1,t+1} & \phi_{1,t}m_{1,1}q^A_{1,t+1} & \phi_{1,t}m_{1,2}p^A_{2,t+1} & \phi_{1,t}m_{1,2}q^A_{2,t+1} \\ \phi_{1,t}m_{1,1}p^U_{1,t+1} & \phi_{1,t}m_{1,1}q^U_{1,t+1} & \phi_{1,t}m_{1,2}p^U_{2,t+1} & \phi_{1,t}m_{1,2}q^U_{2,t+1} \\ \phi_{2,t}m_{2,1}p^A_{1,t+1} & \phi_{2,t}m_{2,1}q^A_{1,t+1} & \phi_{2,t}m_{2,2}p^A_{2,t+1} & \phi_{2,t}m_{2,2}q^A_{2,t+1} \\ \phi_{2,t}m_{2,1}p^U_{1,t+1} & \phi_{2,t}m_{2,1}q^U_{1,t+1} & \phi_{2,t}m_{2,2}p^U_{2,t+1} & \phi_{2,t}m_{2,2}q^U_{2,t+1} \\ \end{pmatrix} \]

With the detection probabilities contained in the state-transition matrices, the observation matrices \(\bo_t\) and \(\boc_t\) just express the fact that recaptures are, by definition, of trap-aware states, and non-captures are, again by defintion, of trap-unaware states:

\[ \bo_t = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \hspace{10pt} \text{ and } \hspace{10pt} \boc_t = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \]

With these definitions the reduction of capture histories to m-arrays and calculation of cell probabilities proceeds exactly as it did for the two-state model.

4.3 Implementing the PM likelihood

The Stan implementation of the product-multinomial likelihood for this four-state model is similar to that for the two-state model (see Section 3.3.2). The function get_multinomial_probs incorporates the changes to the state-transition and observation matrices discussed above. The zeroes in the observation matrix cause problems for Stan’s automatic differentiation, so we use the hack of replacing them with a very small positive number eps, which is defined in a transformed data block:

functions {
  matrix get_multinomial_probs(
    data int T,
    data real eps,
    array[] vector pA,
    array[] vector pU,
    array[] vector phi,
    array[] vector m
  ) {
    // matrix to store multinomial probabilities
    matrix[4*(T-1), 4*(T-1)+1] pr = rep_matrix(0.0, 4*(T-1), 4*(T-1)+1); 
    
    // convenience parameters
    array[2] vector[T] qA;
    array[2] vector[T] qU;
    for (i in 1:2) {
      qA[i] = 1 - pA[i];
      qU[i] = 1 - pU[i];
    }
    
    // auxillary matrices to define multinomial probabilities
    array[T-1] matrix[4,4] gamma;   // state transition probabilities
    array[T] matrix[4,2] omega;     // state-dependent detection probabilities
    // define gamma
    for (t in 1:(T-1)) {
      // define 2x2 site blocks
      for (I in 1:2) {      // I indexes source site
        int first_i = 2*(I-1) + 1;
        int last_i = 2*I;
        for (J in 1:2) {    // J indexes destination site
          int first_j = 2*(J-1) + 1;
          int last_j = 2*J;
          gamma[t][first_i:last_i, first_j:last_j] = [
            [phi[I][t]*m[I][J]*pA[J][t+1], phi[I][t]*m[I][J]*qA[J][t+1]],
            [phi[I][t]*m[I][J]*pU[J][t+1], phi[I][t]*m[I][J]*qU[J][t+1]]
          ];
        }
      }
    }
    // define omega
    // note: here omega's are the same for all t, so one could just define
    // omega as a single matrix; we retain this code for ease of generalisation
    for (t in 1:T) {
      omega[t] = [
        [1-eps,   eps],
        [  eps, 1-eps],
        [1-eps,   eps],
        [  eps, 1-eps]
      ];
    }
    // define entries of pr using matrix multiplication
    for (rI in 1:(T-1)) { // block row index, reverse order
      int I = T - rI;     // I runs from T-1 to 1
      // row indices to for 4x4 block
      int i1 = 1 + 4*(I-1);
      int i2 = 4*I;
      // diagonal block
      pr[i1:i2, i1:i2] = diag_post_multiply(gamma[I], omega[I+1][:,1]); 
      if (I < T-1) {
        matrix[4,4] temp = diag_post_multiply(gamma[I], omega[I+1][:,2]);
        for (J in (I+1):(T-1)) {
          // column indices for 4x4 block
          int j1 = 1 + 4*(J-1);
          int j2 = 4*J;
          pr[i1:i2, j1:j2] = temp * pr[(i1 + 4):(i2 + 4), j1:j2];
        }
      }
    }
    // enforce row-sum-to-one constraint
    for (i in 1:(4*(T-1))){
      pr[i, 4*(T-1)+1] = 1 - sum(pr[i, 1:4*(T-1)]);
    }
    return pr;
  }
}

data {
  int<lower=2> T;                           // number of years
  array[4*(T-1), 4*(T-1)+1] int<lower=0> marr;    // m-array
}

transformed data {
  real<lower=0> eps = pow(10, -15);
}

parameters {
  // detection probabilities
  array[2] vector<lower=0,upper=1>[T] pA;     // for trap-aware indiv
  array[2] vector<lower=0,upper=1>[T] pU;     // for trap-unaware indiv
  array[2] vector<lower=0,upper=1>[T-1] phi;  // survival probabilities
  array[2] simplex[2] m;                      // movement probabilities 
}

model {
  // calculate multinomial probabilities
  matrix[4*(T-1), 4*(T-1)+1] pr;
  pr = get_multinomial_probs( T, eps, pA, pU, phi, m );
  // m-array capture-recapture likelihood
  for (i in 1:(4*(T-1))) {
    marr[i] ~ multinomial(pr[i]');
  }
}

4.4 Implementing the EPM likelihood

Having implemented the PM likelihood for the base model, the steps to implement the EPM likelihood for the transient mixture model are the same as in two-site model (Section 3.4). Code for this case is in 03a_multistate-plus-trapdep_transients_epm.Rof the R folder, and 03a_ms-plus-td_transients_epm.stan of the stan folder.

References

Pradel, Roger, and Ana Sanz-Aguilar. 2012. “Modeling Trap-Awareness and Related Phenomena in Capture-Recapture Studies.” PloS One 7 (3): e32666.
Schaub, Michael, and Marc Kéry. 2021. Integrated Population Models: Theory and Ecological Applications with R and JAGS. Academic Press.
Williams, Byron K, James D Nichols, and Michael J Conroy. 2002. Analysis and Management of Animal Populations. Academic press.