on carrying out the following steps:
Step 1. simulate a value of $\theta$ from the prior distribution, $p(\cdot)$;
Step 2. simulate a new sample $D^*$ from the model $M$ corresponding to this parameter value;
Step 3. accept the sampled value $\theta$ if $d(S(D^*), S_0) \le \epsilon$.
Here, $d(\cdot,\cdot)$ is a metric and $\epsilon$ is a certain nonnegative tolerance level. The simulated data are observations from $p(\theta \mid d(S(D^*), S_0) \le \epsilon)$, but not from the true posterior distribution, unless $\epsilon = 0$. If $\epsilon \to \infty$, then the sampled values of $\theta$ are simply observations from the prior distribution, and not from the posterior as intended (see Marjoram et al. [18] and Wilkinson [39]).
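As a minimal sketch of these three steps in Python (the normal prior, the normal model, the sample size, and the mean/standard-deviation summaries are hypothetical choices made purely for illustration; the distance is the Euclidean norm discussed below):

    import numpy as np

    rng = np.random.default_rng(0)

    def summaries(data):
        # S(.): here simply the sample mean and standard deviation
        return np.array([data.mean(), data.std()])

    def abc_rejection(observed, n_draws=100_000, eps=0.1, n=100):
        s0 = summaries(observed)
        accepted = []
        for _ in range(n_draws):
            theta = rng.normal(0.0, 10.0)           # Step 1: theta ~ p(.)
            sim = rng.normal(theta, 1.0, size=n)    # Step 2: D* ~ model M given theta
            if np.linalg.norm(summaries(sim) - s0) <= eps:
                accepted.append(theta)              # Step 3: keep theta if d(S(D*), S0) <= eps
        return np.array(accepted)

The accepted values form an approximate sample from the posterior of $\theta$, with the quality of the approximation governed by the tolerance, exactly as described above.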
Evidently, the performance of the algorithm depends on several factors, such as the distance function considered, the kind and quality of the summary statistics used to summarize the data, and the choice of $\epsilon$. The value of $\epsilon$ must balance computational efficiency against the precision that we want the final results to have. The Euclidean norm,
\[ d(x, y) = \|x - y\| = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}, \qquad x, y \in \mathbb{R}^n, \]
is the distance function usually considered.
The number and kind of summary statistics depend on the problem at hand.
The sample moments are natural choices. The first four moments compare the location, dispersion, asymmetry, and kurtosis of the observed sample with the corresponding features of the data simulated from the model $M$, conditional on the value of $\theta$ simulated from the prior distribution. If the tail of the underlying distribution of the observed sample is heavy, then a measure of tail-weight similarity between the simulated and observed data should also be contemplated, for example the moment estimator of the tail index (Embrechts et al. [17]).
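As a sketch of such a tail-weight summary, the following implements the moment (Dekkers-Einmahl-de Haan) estimator of the extreme-value index from the $k$ largest order statistics; the sample is assumed positive, and the choice of $k$ is left to the user:

    import numpy as np

    def moment_tail_index(x, k):
        # Moment estimator of the extreme-value index, based on the k largest
        # order statistics of a positive sample x (with k < len(x)).
        xs = np.sort(x)
        logs = np.log(xs[-k:]) - np.log(xs[-k - 1])  # log-excesses over X_{n-k,n}
        m1 = logs.mean()                             # first log-moment (Hill estimator)
        m2 = (logs ** 2).mean()                      # second log-moment
        return m1 + 1.0 - 0.5 / (1.0 - m1 ** 2 / m2)

Including the absolute difference between this estimate for the observed and the simulated data as an extra coordinate of the summary vector makes the distance $d(\cdot,\cdot)$ sensitive to tail weight.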
Regarding all these issues, there have been several developments in ABC methods in the last few years (see, for instance, Beaumont et al. [1] and Blum et al. [3]). Biau et al. [2] propose retaining the $k_N$ values of $\theta$ that are closest to $S_0$ for a certain proximity measure. Usually, $k_N$ corresponds to the 0.90 sample quantile. ABC algorithms are easy to implement but are generally computationally intensive. For instance, according to Biau et al. [2], if the vector of parameters has dimension $p = 3$ and the number of summary statistics is $m = 7$, then $N = 10^6$ samples must be simulated in order to obtain a sample of size $k_N = 1000$ of $\theta$, i.e.,
\[ k_N \approx N^{\frac{p+4}{m+p+4}}. \]
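As a quick check of this arithmetic in Python:

    p, m, N = 3, 7, 10**6                # parameter dimension, number of summaries, simulations
    k_N = N ** ((p + 4) / (m + p + 4))   # exponent (3+4)/(3+7+4) = 1/2
    print(k_N)                           # 1000.0, the sample size quoted above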