**1. Simulated data**

We use the simulated dataset provided in Exercise 6 'Single season occupancy mixture models' in Donovan, T. M. and J. Hines (2007) Exercises in occupancy modeling and estimation.

These data were simulated with heterogeneous time-dependent detection probabilities using the following parameter values:

| Parameter | p1 | p2 | p3 | p4 | psi | N | pi |
|---|---|---|---|---|---|---|---|
| Mixture 1 | 0.8 | 0.7 | 0.6 | 0.5 | 0.8 | 100 | 0.4 |
| Mixture 2 | 0.4 | 0.35 | 0.3 | 0.25 | 0.8 | 150 | 0.6 |

Number of sites R = 250.
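To make the simulation design concrete, here is a minimal Python sketch that generates detection histories from the parameter values above. It is not the code used by Donovan and Hines; the function name `simulate_histories`, the seed, and the choice to fix the class sizes at exactly 100 and 150 sites (rather than drawing class membership at random with probability pi) are assumptions made for illustration.

```python
import numpy as np

def simulate_histories(seed=0):
    """Simulate 0/1 detection histories for R = 250 sites and 4 occasions.

    Assumption: class sizes are fixed at N = 100 and N = 150 sites.
    """
    rng = np.random.default_rng(seed)
    psi = 0.8                                   # occupancy probability (both classes)
    p = np.array([[0.8, 0.7, 0.6, 0.5],         # detection, mixture 1, occasions 1..4
                  [0.4, 0.35, 0.3, 0.25]])      # detection, mixture 2, occasions 1..4
    n_sites = [100, 150]                        # sites per mixture
    cls = np.repeat([0, 1], n_sites)            # class index of each site
    occupied = rng.random(cls.size) < psi       # latent occupancy state
    detected = rng.random((cls.size, 4)) < p[cls]
    # A species can only be detected at an occupied site.
    histories = (detected & occupied[:, None]).astype(int)
    return histories, cls, occupied

histories, cls, occupied = simulate_histories()
print(histories[:5])
```

Each row of `histories` is one site, each column one sampling occasion.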

The data are available here.

We consider a model with 2 mixtures and 4 states (5 in E-SURGE): unoccupied in class (or group) 1, unoccupied in class 2, occupied in class 1 and occupied in class 2. There are two observations: species undetected (0) or detected (1).

**2. Identifiability issue**

The vector of initial state probabilities is:

[pi * (1-psi1), (1-pi) * (1-psi2), pi * psi1, (1-pi) * psi2].

In GEPAT, the vector of initial state probabilities can be simply written as

*Initial State* `p p p *`

Summing the first and third components gives pi; dividing the first component by pi gives 1-psi1, hence psi1, and so on. However, when this model is fitted, E-SURGE detects that it is non-identifiable: in the output file, the 'Informations about parameter identifiability' section reports that the parameters labeled 1, 2 and 3 are not estimable. These are the initial state probabilities, which contain the occupancy and class membership probabilities.

To make these parameters estimable, we impose the constraint psi1 = psi2, which ensures identifiability of the occupancy probability. This requires disentangling i) the assignment of sites to a class of heterogeneity and ii) the occupancy process. Here again, the ability of E-SURGE to specify the state or the observation process in several steps proves useful. More precisely, we would like to write the vector of initial state probabilities above as the (matrix) product of X and Y, where:

X = [pi 1-pi]

Y =

[1-psi1 0 psi1 0

0 1-psi2 0 psi2]

then force the psi's to be the same. This is accomplished in GEPAT by considering 2 steps (this needs to be specified under the Initial states box) while specifying the initial state probabilities:

*Initial State*, step 1

`p *`

*Initial State*, step 2

`* - p -`

`- * - p`
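The two-step decomposition can be checked numerically. The sketch below, with a hypothetical helper `initial_state_vector`, multiplies X by Y and compares the result with the vector of initial state probabilities written out directly, using the simulated values pi = 0.4 and psi1 = psi2 = 0.8.

```python
import numpy as np

def initial_state_vector(pi, psi1, psi2):
    """Return X @ Y: step 1 assigns the class, step 2 the occupancy state."""
    X = np.array([[pi, 1.0 - pi]])
    Y = np.array([[1.0 - psi1, 0.0, psi1, 0.0],
                  [0.0, 1.0 - psi2, 0.0, psi2]])
    return (X @ Y).ravel()

pi, psi = 0.4, 0.8
product = initial_state_vector(pi, psi, psi)   # constraint psi1 = psi2
direct = np.array([pi * (1 - psi), (1 - pi) * (1 - psi),
                   pi * psi, (1 - pi) * psi])
print(product)
print(np.allclose(product, direct))
```

The state order is the one used in the text: unoccupied in class 1, unoccupied in class 2, occupied in class 1, occupied in class 2.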

**3. Analysis in E-SURGE**

The matrix of transition probabilities is the identity matrix (diagonal of ones with zeros elsewhere).

The matrix of observation probabilities is:

[1 0

1 0

1-p(class1) p(class1)

1-p(class2) p(class2)

1 0]
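To see how the initial state vector, the identity transition matrix and the observation matrix combine, the following sketch computes the probability of a single detection history by a forward pass over the states. It is an illustration and not E-SURGE's implementation. It assumes the four states described in Section 1 and leaves out the fifth E-SURGE state, on the assumption that it receives zero initial probability. The function name `history_probability` and its default values (taken from the simulation table) are hypothetical.

```python
import itertools
import numpy as np

# Simulated detection probabilities: rows = classes, columns = occasions.
P = np.array([[0.8, 0.7, 0.6, 0.5],
              [0.4, 0.35, 0.3, 0.25]])

def history_probability(history, pi=0.4, psi=0.8, p=P):
    """Probability of a 0/1 detection history under the 2-mixture model."""
    # States: unoccupied class 1, unoccupied class 2, occupied class 1, occupied class 2.
    alpha = np.array([pi * (1 - psi), (1 - pi) * (1 - psi),
                      pi * psi, (1 - pi) * psi])
    for t, obs in enumerate(history):
        # Detection column of the observation matrix at occasion t.
        p_detect = np.array([0.0, 0.0, p[0, t], p[1, t]])
        event = p_detect if obs == 1 else 1.0 - p_detect
        # The transition matrix is the identity, so states never change.
        alpha = alpha * event
    return alpha.sum()

all_histories = list(itertools.product([0, 1], repeat=4))
total = sum(history_probability(h) for h in all_histories)
print(history_probability((0, 0, 0, 0)), total)
```

The probabilities of the 16 possible histories sum to 1, which is a quick sanity check on the matrices.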

- Start » New session
- Data » Load data (Mark); click OK in the window that pops up asking 'How many columns do we extract from the data?'
- In the 'DATA' section in the main window, click the 'Modify' button and use 5 states and 1 age class
- Models » Markovian states only » Occupancy
- In the 'Advanced Numerical' section in the main window, tick the 'Compute C-I (Hessian)' box to get confidence intervals
- In the 'COMPUTE A MODEL' section in the main window, click on the 'Gepat' yellow button and use the following Matrix Patterns:

*Initial State*: see previous section

*Transition*

`* - - - -`

`- * - - -`

`- - * - -`

`- - - * -`

`- - - - *`

*Event*

`* -`

`* -`

`* b`

`* b`

`* -`

- Click Exit to go to the next step.
- In the 'COMPUTE A MODEL' section in the main window, click on the 'Gemaco' green button and use the following syntax in the Model definition dialog box:

*Init state* (step 1) `i`

*Init state* (step 2) `i` (or `from` if psi1 is not equal to psi2)

*Transition*:

*Event* `from.t` to get a time effect and heterogeneity in detection

- Gemaco » Call Gemaco (all phrases) or Ctrl+G, then click Exit
- In the 'COMPUTE A MODEL' section in the main window, click on the 'IVFV' pink button. Exit.
- In the 'COMPUTE A MODEL' section in the main window, click on the 'RUN' red button to fit the model to the simulated dataset.
- When the dialog box pops up, modify the model name if needed, then click OK
- In the 'Output' section of the main window, click on 'Selected Model Results (.out)' to get the results. More precisely, check the 'Reduced set of parameters' section in the output file. The lines below list the initial state parameters (pi and psi) followed by the event (detection) parameters. Each line gives the maximum likelihood estimate, the lower and upper limits of the 95% confidence interval, and the SE:

Par# 1# pi( 1, 1)( 1, 1)( 1 1) **pi** | 0.670223435 0.028072182 0.993055896 0.479678224

Par# 11# psi( 1, 3)( 1, 1)( 1 2) **psi** | 0.787743021 0.631495211 0.889349310 0.065920804

Par# 45# E( 3, 2)( 1, 1)( 1 1) **p class 1 time 1** | 0.435233946 0.109171372 0.828946242 0.230593683

Par# 46# E( 4, 2)( 1, 1)( 1 1) **p class 2 time 1** | 0.840010447 0.078600488 0.996915042 0.282487123

Par# 52# E( 3, 2)( 2, 1)( 1 1) **p class 1 time 2** | 0.380863004 0.101195997 0.770693663 0.204299633

Par# 53# E( 4, 2)( 2, 1)( 1 1) **p class 2 time 2** | 0.734923975 0.184464632 0.971415448 0.249093088

Par# 59# E( 3, 2)( 3, 1)( 1 1) **p class 1 time 3** | 0.331877158 0.100845127 0.687499500 0.168355720

Par# 60# E( 4, 2)( 3, 1)( 1 1) **p class 2 time 3** | 0.618910836 0.217770947 0.904526206 0.212229864

Par# 66# E( 3, 2)( 4, 1)( 1 1) **p class 1 time 4** | 0.272369993 0.082276082 0.609819509 0.144511430

Par# 67# E( 4, 2)( 4, 1)( 1 1) **p class 2 time 4** | 0.524283674 0.190940453 0.837308016 0.196108023
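If you want to tabulate these estimates outside E-SURGE, the lines above can be parsed with a few lines of Python. This is a sketch that assumes the layout shown on this page, that is a label, a `|` separator, then four numbers (estimate, lower limit, upper limit, SE); the actual `.out` file may be laid out differently, and `parse_estimate_line` is a hypothetical helper, not part of E-SURGE.

```python
def parse_estimate_line(line):
    """Split one 'Par# ... | estimate lower upper se' line into a dict."""
    label, _, numbers = line.partition("|")
    estimate, lower, upper, se = (float(x) for x in numbers.split())
    return {"label": label.strip(), "estimate": estimate,
            "ci_lower": lower, "ci_upper": upper, "se": se}

line = ("Par# 11# psi( 1, 3)( 1, 1)( 1 2) | "
        "0.787743021 0.631495211 0.889349310 0.065920804")
result = parse_estimate_line(line)
# Check whether the simulated value psi = 0.8 lies inside the interval.
covers_truth = result["ci_lower"] <= 0.8 <= result["ci_upper"]
print(result["estimate"], covers_truth)
```

Here the 95% confidence interval for psi contains the value 0.8 used in the simulation.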

**4. Check the results with program PRESENCE**