Routine quality control (QC) is a necessary part of any quality assurance (QA) plan. Many articles and seminars offer guidance on how often QC events should be performed to maximize error detection. However, even with a well-designed QC program, QC events are snapshots in time. What if we could gather information on assay performance with each new patient sample? Interindividual differences in analyte concentration prohibit using any single patient result as a control point on its own, but the rolling mean of the combined patient data can be quite informative.

The idea of using the rolling or moving average of patient values was first proposed in 1965 by Hoffmann and Waid1. This strategy has been further refined in subsequent articles, which have demonstrated that the number of patient results to average and plot on what is essentially a Levey-Jennings graph varies by analyte. These articles also point out that truncating outlying patient values can either increase or decrease the sensitivity of the protocol, depending on the distribution of patient values. Although articles describing a step-by-step process have been available for almost three decades, adoption of this practice in the clinical laboratory has been hindered by the lack of software packages that can monitor the process in real time.

Now that many automated laboratory analyzers have software programs that can monitor rolling patient means, the challenge is translating theory into practice. After a year of trial and error, and now statistical modeling, to establish and optimize a moving averages program, I wonder how widely the moving averages concept can be applied in laboratories that lack the ability to test and optimize their protocols. Without a separate modeling program such as MATLAB to simulate the moving averages protocols, the user has to decide what change in the mean value constitutes a significant error, how many patient results to average per point, and where to place the truncation limits to maximize sensitivity and minimize detection of false shifts. Some of these choices are easier than others. Selecting the error limits doesn’t require a modeling process; one just needs to define what an acceptable degree of error is. Choosing how many patient results to average is also not too taxing and can be guided by a published nomogram2. The true challenge is selecting where to truncate the distribution of data.
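To make the three choices concrete, here is a minimal Python sketch of a truncated moving average. It is not the analyzer software or the MATLAB model mentioned above; the function name, parameter names, and all numeric values are hypothetical and chosen only for illustration. Results outside the truncation limits are skipped, the remaining results are averaged over a fixed window, and a point is flagged when the mean drifts from the target by more than the error limit.

```python
from collections import deque


def moving_average_flags(results, window, lower_trunc, upper_trunc,
                         target_mean, error_limit):
    """Return (index, mean, flagged) for each point once the window is full.

    All parameter names are illustrative assumptions, not a vendor API.
    """
    buf = deque(maxlen=window)  # holds the most recent accepted results
    points = []
    for i, value in enumerate(results):
        # Truncation: results outside the limits never enter the average.
        if value < lower_trunc or value > upper_trunc:
            continue
        buf.append(value)
        if len(buf) == window:
            mean = sum(buf) / window
            # Error limit: flag when the mean is too far from the target.
            flagged = abs(mean - target_mean) > error_limit
            points.append((i, mean, flagged))
    return points


# Made-up potassium-like values (mmol/L); 9.0 is truncated as an outlier.
demo = [4.0, 4.2, 3.8, 9.0, 4.0, 4.6, 4.8, 5.0]
points = moving_average_flags(demo, window=3, lower_trunc=3.0,
                              upper_trunc=6.0, target_mean=4.0,
                              error_limit=0.3)
for index, mean, flagged in points:
    print(index, round(mean, 2), "FLAG" if flagged else "ok")
```

In practice the window would be far larger than three results; the tiny window here only keeps the example readable.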

A review of the protocols I developed before using MATLAB to simulate the moving averages protocols has shown me that my truncation limit placement was not always ideal. Some protocols (e.g., potassium) fared well when tested in silico, virtually matching the computer-optimized protocol in the average number of patient samples until detection of an artificially induced error. However, other protocols (e.g., calcium and albumin) performed poorly in detecting either a positive or a negative shift.
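The kind of in silico test described here can be approximated with a small simulation. The Python sketch below is an assumption-laden stand-in, not the MATLAB model used for the protocols above: it assumes patient results are normally distributed (real analyte distributions are often skewed), and the population mean, standard deviation, protocol settings, and function names are all hypothetical. It induces a constant shift and counts how many patient samples pass before the truncated moving average crosses the error limit, averaged over many simulated runs.

```python
import random
from collections import deque


def samples_until_detection(shift, window, lower_trunc, upper_trunc,
                            target_mean, error_limit, rng,
                            pop_mean=4.2, pop_sd=0.4, max_samples=5000):
    """Count patient samples from the onset of a shift until it is flagged."""
    buf = deque(maxlen=window)
    # Fill the window with unshifted results so counting starts at the error.
    while len(buf) < window:
        value = rng.gauss(pop_mean, pop_sd)
        if lower_trunc <= value <= upper_trunc:
            buf.append(value)
    for n in range(1, max_samples + 1):
        value = rng.gauss(pop_mean, pop_sd) + shift  # induced error
        if lower_trunc <= value <= upper_trunc:
            buf.append(value)
            if abs(sum(buf) / window - target_mean) > error_limit:
                return n
    return max_samples  # never detected within the simulated run


def average_samples_until_detection(shift, trials=200, seed=1, **protocol):
    """Average the detection count over repeated simulated runs."""
    rng = random.Random(seed)
    runs = [samples_until_detection(shift, rng=rng, **protocol)
            for _ in range(trials)]
    return sum(runs) / trials


# Hypothetical protocol: 20-result window, limits at roughly +/- 3 SD.
protocol = dict(window=20, lower_trunc=3.0, upper_trunc=5.4,
                target_mean=4.2, error_limit=0.3)
for shift in (0.5, -0.5):
    print(shift, average_samples_until_detection(shift, **protocol))
```

Rerunning the last loop with different `lower_trunc` and `upper_trunc` values gives a rough sense of how truncation limit placement changes detection speed for positive and negative shifts under these assumptions.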

These results have left me wondering whether the laboratorian’s experience will be enough to generate sensitive moving averages protocols, or whether implementing this QA technique will require computer modeling to optimize error detection. Perhaps the answer depends on a further question: how sensitive do the moving averages protocols need to be to add value to an established QA program? Does the protocol need to detect an error within 5, 20, or 100 patient results to be of value to a clinical laboratory? If there is value in detecting a shift even after 100 to 200 patient results, then perhaps modeling is not necessary to establish a sensitive moving averages program.


  1. Hoffmann RG, Waid ME. The “average of normals” method of quality control. Am J Clin Pathol 1965;43:134-41.
  2. Cembrowski GS, Chandler EP, Westgard JO. Assessment of “Average of Normals” Quality Control Procedures and Guidelines for Implementation. Am J Clin Pathol 1984;81:492-9.