Tol and any other detected analytes, for each of the six fortified soil subsamples. All analyte concentrations were reported in µg/mL. We converted these results to soil concentrations using the masses of the subsamples. In addition, each laboratory was asked to report the date of analysis and the model and manufacturer of their mass spectrometer, purge-and-trap instrument, purge-and-trap column, gas chromatograph and column.

RESULTS

Rationale for statistical tests
The primary purpose of this study was to demonstrate that soil samples could be reproducibly vapor fortified with VOCs and distributed to laboratories for various performance evaluation purposes. After careful inspection and the use of several statistical tests, a few extreme (outlier) values were excluded from the final analysis. Knowledge of the analytical system was used in reaching decisions to retain some results despite the presence of small but statistically significant systematic errors. Evaluation of collaborative test results always seems to require compromise to avoid excessive exclusions while preventing gross distortion that can be caused by a few extreme values. When a laboratory produces several outliers in the same direction, this is strong evidence of a large systematic error, and exclusion of that data is justified.

Examination of laboratory reports
Analytical results and sample analysis dates were first reviewed to detect any obvious problems. Laboratory 13 was unable to analyze the treated soil subsamples within the specified holding time due to instrument problems, so those data were omitted. The results were excellent for the analysis of the three analytes in the QA standard supplied with each sample set. The mean values for TCE, Ben and Tol differed from the certified concentrations by an average of only 2.8%. Relative standard deviations for the three analytes varied from 7.2 to 8.8% (Table 3).

Table 3. Interlaboratory results for the certified QA standard (Volatiles Mix 2, Ultra Scientific).

Analyte              Certified (µg/mL)   Mean   Std Dev.   RSD*
Benzene              50.2                51.9   3.8        7.4%
Trichloroethylene    50.1                51.9   3.8        7.2%
Toluene              50.1                50.8   4.5        8.8%

* Relative standard deviation

All of the results provided by the 13 laboratories are given in Appendix B. Results for subsample A1 from laboratory 5 showed that two analytes were not detected, while the other two were abnormally low. It was concluded that the ampoule must not have been properly sealed. To maintain balance in the data array, both subsamples A1 and A2 from laboratory 5 were excluded. A very low recovery for the internal standard of subsample B2 from laboratory 16 was also noted, but the data were retained for further review.

Youden two-sample plots
Youden two-sample plots provide an excellent visualization of the relative amount of systematic error between laboratories compared to the amount of random error (Youden and Steiner 1975). Since the two batches of subsamples for each soil were prepared independently, we plotted the two concentration values from each laboratory against each other. Concentrations of all four analytes for a given soil were plotted on a single graph after adjusting the means for each analyte (by subtraction) to the mean of the analyte with the lowest concentration. This process leaves the absolute [...]

Figure 1 contains the soil C results prior to any data exclusion. In the presence of random error only, the points should form a circular array with approximately equal numbers of points in each of the four quadrants formed by the intersection of lines representing the means. Systematic error causes points to depart from this intersection of the means along a 45° line. Clearly, there is large systematic error for laboratories 3 and 12 for all four [...] between laboratories. Although not shown here, similar patterns were observed for soils A and B.

If we exclude data from laboratories 3 and 12 and replot the data with revised means (Fig. 2), there is still a distinct elliptical pattern characteristic of a combination of random and systematic error. Instead of an equal number of points in each quadrant, the low-low quadrant contains 15 points, the high-high has 14, and the other two quadrants contain only 5 and 6 points. It is also
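The mean adjustment and quadrant tally used for the Youden two-sample plots can be illustrated with a short Python sketch. It is not the procedure or software used in the study: the function names, the data layout and the example numbers are all hypothetical, and the adjustment assumes one offset per analyte, computed from the grand mean of both batches and subtracted from both coordinates.

```python
from statistics import mean


def shift_to_lowest_mean(pairs_by_analyte):
    """Pool all analytes after shifting each one to the lowest analyte mean.

    Assumption (not stated in the report): a single offset per analyte,
    taken from the grand mean of both batches, is subtracted from both
    coordinates of every (batch 1, batch 2) pair.
    """
    grand = {
        name: mean(value for pair in pairs for value in pair)
        for name, pairs in pairs_by_analyte.items()
    }
    lowest = min(grand.values())
    return [
        (x - (grand[name] - lowest), y - (grand[name] - lowest))
        for name, pairs in pairs_by_analyte.items()
        for x, y in pairs
    ]


def quadrant_counts(pairs):
    """Count points in the four quadrants formed by the two batch means."""
    x_mean = mean(x for x, _ in pairs)
    y_mean = mean(y for _, y in pairs)
    counts = {"high-high": 0, "low-low": 0, "high-low": 0, "low-high": 0}
    for x, y in pairs:
        if x == x_mean or y == y_mean:
            continue  # a point lying exactly on a mean line is not counted
        x_side = "high" if x > x_mean else "low"
        y_side = "high" if y > y_mean else "low"
        counts[f"{x_side}-{y_side}"] += 1
    return counts


# Hypothetical (batch 1, batch 2) results for two analytes from three laboratories.
example = {
    "analyte_low": [(12, 13), (8, 7), (9, 8)],
    "analyte_high": [(113, 112), (111, 109), (107, 111)],
}
counts = quadrant_counts(shift_to_lowest_mean(example))
print(counts)  # {'high-high': 2, 'low-low': 2, 'high-low': 1, 'low-high': 1}
```

An excess of points in the high-high and low-low quadrants, as in this made-up example, is the pattern the text attributes to systematic error between laboratories.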