Inverse Modelling

The second underlying question in this chapter is how we can estimate the time of oviposition for a maggot of a given length, with known larval stage and drug treatment. This is called inverse prediction, and for linear regression models, existing formulae can be used to estimate the value of the explanatory variables (see also Wells and LaMotte 1995). However, for GAMs and models with multiple variances, there are no existing equations (to the best of our knowledge). Therefore, we use the bootstrapping scheme of the previous section. The solid line in Fig. 8.11 shows the predicted values for the observations that did not receive a drug treatment (this is the panel labelled APE = 000 in Fig. 8.10). The question we ask ourselves is as follows: suppose we have a 7 mm long maggot that has reached larval stage 2; what was the time of oviposition? Fig. 8.11 contains a dotted horizontal line which intersects the vertical axis at length = 7. Intuitively, we look up where this horizontal line intersects the smoother, and the corresponding Series value gives us the answer. This is the value of Series labelled S7 in the graph. We would also like to have a confidence interval around S7. Its limits are taken from the intersections of the 95% confidence bands for the population with the horizontal dotted line. In the graph, these are denoted by L7 (lower confidence band) and U7 (upper confidence band). Draper and Smith called these the "fiducial limits". They advise treating the interval as the inverse confidence limits for X, given a Y.

Fig. 8.11 Estimated model for the observations that did not receive any drug combination. The solid line represents the estimated fitted values and was obtained by bootstrapping. The dotted lines around the smoother are 95% confidence intervals for the population. For a maggot of length 7 and larval stage 2, inverse prediction using bootstrapping gives a value of Series = 2.92, with a lower confidence limit of 2.22 and an upper confidence limit of 3.47
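The graphical look-up can be sketched in a few lines of Python. This is a minimal illustration and not the code used for Fig. 8.11: the straight-line curve, the constant-width bands, and the function names `crossing` and `inverse_lookup` are all assumptions made for the example. It only shows how S7, L7 and U7 follow from the points where a horizontal line at a given length meets the fitted curve and its bands.

```python
def crossing(xs, ys, target):
    """First x at which the piecewise-linear curve through (xs, ys) equals target.

    Returns None if the curve never reaches target on the grid.
    """
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if y0 == target:
            return x0
        if (y0 - target) * (y1 - target) < 0:
            # Linear interpolation inside the bracketing grid cell.
            return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    return xs[-1] if ys[-1] == target else None


def inverse_lookup(series, fit, lower, upper, length):
    """Return (L, S, U): Series values where upper band, fit and lower band hit length."""
    # For an increasing curve the upper band reaches the length first, so it
    # gives the lower limit in time; the lower band gives the upper limit.
    return (crossing(series, upper, length),
            crossing(series, fit, length),
            crossing(series, lower, length))


# Made-up increasing curve and bands, for illustration only (not the fitted GAM).
series = [0.5 * i for i in range(17)]          # Series grid: 0, 0.5, ..., 8
fit = [2.0 + 1.7 * s for s in series]
lower = [f - 1.0 for f in fit]
upper = [f + 1.0 for f in fit]

L7, S7, U7 = inverse_lookup(series, fit, lower, upper, 7.0)
print(L7, S7, U7)
```

If a band never reaches the chosen length on the grid, `crossing` returns `None`, which corresponds to an interval limit that cannot be determined from the curve.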

There are a couple of potential problems. First, if the chosen length is too large (say 17), we end up with an inverse confidence interval between approximately 6 and infinity. In biological terms, this means that the time of oviposition cannot be determined accurately. The second problem is how to determine the lower (L7) and upper (U7) confidence bands. We are still working with the model in Eq. 8.11a.

The difference with the situation in the previous section is that we now know the value of the larval stage, and we can therefore more easily get the population confidence interval. We again created 1,000 similar data sets for each drug treatment using the bootstrap approach. For each of the 1,000 data sets, we fitted the GAM and predicted the smoother along the entire time axis. Because we know that larval stage is 2, we can easily create the 95% population confidence bands. For each of these models, we determined L7, S7 and U7. This gives us 1,000 realisations of L7, S7 and U7 for each drug treatment. The median L7, S7 and U7 for this group are 2.227, 2.926, and 3.476 respectively. For the observations that received the three drug treatments, we have 2.312, 2.985, and 3.530. We can use these median values as estimators of L7, S7 and U7.
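The bootstrap-and-median step can be sketched as follows. This is a rough illustration under several assumptions that are not taken from the chapter: simulated data, a case-resampling bootstrap, an ordinary least-squares straight line as a stand-in for the GAM, normal-theory pointwise bands for the mean, and 200 rather than 1,000 replicates. All function names are hypothetical.

```python
import math
import random
import statistics


def crossing(xs, ys, target):
    """First x where the piecewise-linear curve (xs, ys) crosses target, else None."""
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if (y0 - target) * (y1 - target) <= 0 and y0 != y1:
            return x0 + (target - y0) * (x1 - x0) / (y1 - y0)
    return None


def fit_line_with_band(x, y, grid):
    """OLS straight line (stand-in for the GAM) with pointwise 95% bands for the mean."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    sigma = math.sqrt(rss / (n - 2))
    fit = [intercept + slope * g for g in grid]
    half = [1.96 * sigma * math.sqrt(1 / n + (g - xbar) ** 2 / sxx) for g in grid]
    lower = [f - h for f, h in zip(fit, half)]
    upper = [f + h for f, h in zip(fit, half)]
    return fit, lower, upper


def bootstrap_inverse(x, y, length, n_boot=200, seed=1):
    """Median (L, S, U) over bootstrap replicates for a given length."""
    rng = random.Random(seed)
    grid = [0.05 * i for i in range(161)]      # Series grid from 0 to 8
    n = len(x)
    triples = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # resample cases with replacement
        fit, lower, upper = fit_line_with_band(
            [x[i] for i in idx], [y[i] for i in idx], grid)
        triple = (crossing(grid, upper, length),
                  crossing(grid, fit, length),
                  crossing(grid, lower, length))
        if None not in triple:                 # skip replicates without all crossings
            triples.append(triple)
    return tuple(statistics.median(col) for col in zip(*triples))


# Simulated data, for illustration only.
data_rng = random.Random(0)
x = [data_rng.uniform(0, 8) for _ in range(100)]
y = [2.0 + 1.7 * xi + data_rng.gauss(0, 1) for xi in x]

L7, S7, U7 = bootstrap_inverse(x, y, 7.0)
print(round(L7, 2), round(S7, 2), round(U7, 2))
```

Replicates in which a band never reaches the chosen length are simply dropped here; that is a convenience of the sketch and would need more thought when the interval can be unbounded.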

A serious problem is that the L7-U7 interval is not a real confidence interval. For a given length and larval stage, it just provides a range of plausible time values. What we really want is a probability distribution that would allow us to make statements along the lines of:

Or something like: In 95% of the cases the oviposition was between 2 and 3 days ago. For this, we need to derive a probability distribution for Series, given length and stage. This is illustrated in Fig. 8.12. The curve shows the fitted values for the observations without any drug treatment. The dotted lines are population confidence intervals, and the density curves on top of the smoother show the probability of other length values at a certain time. Hence, a maggot of length 7 may be from Series = 2.92, but it is also possible that it is from Series = 2, or even from Series = 1, albeit with a small probability. This probability is visualised as vertical lines along the x-axis. Note that it is not a discrete but continuous distribution.

Fig. 8.12 Illustration of probability distribution for time. The solid line shows the fitted values for observations with no drug treatment, and the dotted lines are its population confidence intervals. On top of the fitted line, we have sketched Gaussian density curves, and these show the range of likely length values at different time points. Hence, a maggot of length 7 is most likely from Series = 2.9, but there is also a small probability that it is from Series = 4. The vertical thick lines show the probability distribution for the Series values, for a maggot of length 7

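To obtain the probability distribution, we need Bayes' theorem:

$$P(\text{Series} \mid \text{Length}, \text{Stage}) = \frac{P(\text{Length}, \text{Stage} \mid \text{Series}) \, P(\text{Series})}{P(\text{Length}, \text{Stage})}$$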

To calculate these probabilities for a range of Series values, various debatable choices were made. Discussing these choices requires concepts like MCMC and Bayesian statistics, and is outside the scope of this chapter. Further research is required. It is interesting to note that the 95% confidence interval obtained by this approach is 2.19-3.43, which is similar to the interval from our bootstrapping approach.
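To give a feel for what such a calculation involves, here is a minimal grid-based sketch of Bayes' theorem for Series given length. It is not the calculation behind the interval quoted above. It assumes a flat prior on Series, a Gaussian likelihood for length around a made-up straight-line curve with a made-up standard deviation, and it treats the larval stage as already fixed, so that only length enters the likelihood. The names `posterior_on_grid` and `interval` are hypothetical.

```python
import math


def posterior_on_grid(grid, mean_length, sigma, length, prior=None):
    """Normalised P(Series | Length) on a grid, via Bayes' theorem."""
    if prior is None:
        prior = [1.0] * len(grid)              # flat prior: an assumption
    # Gaussian likelihood of the observed length around the curve at each Series.
    like = [math.exp(-0.5 * ((length - mean_length(s)) / sigma) ** 2) for s in grid]
    unnorm = [l * p for l, p in zip(like, prior)]
    total = sum(unnorm)                        # plays the role of the denominator
    return [u / total for u in unnorm]


def interval(grid, post, level=0.95):
    """Equal-tailed interval read off the cumulative posterior."""
    tail = (1 - level) / 2
    cum, lo, hi = 0.0, None, None
    for s, p in zip(grid, post):
        cum += p
        if lo is None and cum >= tail:
            lo = s
        if hi is None and cum >= 1 - tail:
            hi = s
    return lo, hi


# Made-up curve and spread, for illustration only.
grid = [0.01 * i for i in range(801)]          # Series from 0 to 8
curve = lambda s: 2.0 + 1.7 * s                # hypothetical mean length at Series s
post = posterior_on_grid(grid, curve, 1.0, 7.0)

mode = grid[post.index(max(post))]
lo, hi = interval(grid, post)
print(mode, lo, hi)
```

A different prior, for example one that rules out implausibly late Series values, can be passed through the `prior` argument; the result is sensitive to that choice.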
