
**NORMAL VALUES & THE STATISTICAL EVALUATION OF DATA**

The approximate ranges of values in normal humans for some commonly measured plasma constituents are summarized in the table on the inside back cover. A worldwide attempt has been under way to convert to a single standard nomenclature by using SI (Système International) units. The system is based on the seven dimensionally independent physical quantities summarized in Table 1. Units derived from the basic units are summarized in Table 2, and the prefixes used to refer to decimal fractions and multiples of these and other units are listed in Table 3. There are a number of complexities involved in the use of these units (for example, the problem of expressing enzyme units), and they have been slow in making their way into the medical literature. In this book, the values in the text are in traditional units, but they are followed in key instances by values in SI units. In addition, values in SI units are listed beside values in more traditional units in the table on the inside back cover of this book.
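As an illustration of what converting between traditional and SI units involves, the sketch below converts a plasma glucose concentration from mg/dL to mmol/L. The function name and the example value of 90 mg/dL are assumptions chosen for illustration; the molar mass of glucose (about 180.16 g/mol) is the only constant needed.

```python
# Illustrative conversion of a plasma glucose value from traditional units
# (mg/dL) to SI units (mmol/L). The function name is hypothetical.

GLUCOSE_MOLAR_MASS_G_PER_MOL = 180.16  # molar mass of glucose, C6H12O6


def mg_per_dl_to_mmol_per_l(value_mg_dl, molar_mass_g_per_mol):
    """Convert a mass concentration (mg/dL) to a molar concentration (mmol/L)."""
    mg_per_l = value_mg_dl * 10.0           # 1 L = 10 dL
    return mg_per_l / molar_mass_g_per_mol  # mg divided by g/mol gives mmol


# Example value of 90 mg/dL, invented for illustration.
glucose_si = mg_per_dl_to_mmol_per_l(90.0, GLUCOSE_MOLAR_MASS_G_PER_MOL)
print(f"90 mg/dL glucose = {glucose_si:.2f} mmol/L")
```

The same function applies to any solute whose molar mass is known. Quantities such as enzyme activity have no molar mass to divide by, which is one reason they are harder to express in SI units.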

The accuracy of the methods used for laboratory measurements varies. It is important in evaluating any single measurement to know the possible errors in making the measurement. For chemical determinations on body fluids, these include errors in obtaining the sample and the inherent error of the chemical method. However, the values obtained by using even the most accurate methods vary from one normal individual to the next as a result of what is usually called **biologic variation.** This variation occurs because in any system as complex as a living organism or tissue there are many variables that affect the particular measurement. Variables such as age, sex, time of day, time since last meal, etc., can be taken into account. Numerous other variables cannot, and for this reason the values obtained differ from individual to individual.

The magnitude of the normal range for any given physiologic or clinical measurement can be calculated by standard statistical techniques if the measurement has been made on a suitable sample of the normal population (preferably more than 20 individuals). It is important to know not only the average value in this sample but also the extent of the deviation of the individual values from the average.

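The average **(arithmetic mean, M)** of the series of values is readily calculated:

$$M = \frac{\sum X}{n}$$

where Σ denotes the sum of, X is each individual value, and n is the number of individual values in the series.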

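The average deviation is the mean of the absolute deviations of each of the values from the arithmetic mean. From a mathematical point of view, a better measure of the deviation is based on the squared deviations of the individual values (X) from the mean (M): the square root of their sum divided by n − 1, where n is the number of values. This is called the **standard deviation of the sample (s):**

$$s = \sqrt{\frac{\sum (X - M)^2}{n - 1}}$$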

The term n − 1, rather than n, is used because the deviations are measured from the sample mean rather than from the true population mean, which leaves only n − 1 independent deviations (degrees of freedom). s should be distinguished from the standard deviation of the whole population, which is designated σ. However, if the sample is truly representative, s and σ will be comparable.
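The calculation of the mean, the average deviation, and the sample standard deviation can be sketched in a few lines of Python. The six measurements are invented for illustration, and the standard library's `statistics.stdev`, which also divides by n − 1, is used only as a cross-check.

```python
import math
import statistics

# Hypothetical measurements, invented purely for illustration.
values = [4.0, 5.0, 5.0, 6.0, 7.0, 9.0]

n = len(values)
mean = sum(values) / n  # arithmetic mean M: sum of the values divided by n

# Average deviation: mean of the absolute deviations from M.
average_deviation = sum(abs(x - mean) for x in values) / n

# Sample standard deviation s: square root of the sum of squared
# deviations from M, divided by n - 1.
sum_of_squares = sum((x - mean) ** 2 for x in values)
s = math.sqrt(sum_of_squares / (n - 1))

print(f"M = {mean:.2f}, average deviation = {average_deviation:.2f}, s = {s:.2f}")

# Cross-check against the standard library, which also uses n - 1.
assert math.isclose(s, statistics.stdev(values))
```

Dividing by n instead of n − 1 (as `statistics.pstdev` does) gives the standard deviation of a complete population rather than an estimate from a sample.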

Another commonly used index of the variation is the **standard error of the mean (SEM):**

$$\mathrm{SEM} = \frac{s}{\sqrt{n}}$$

Strictly speaking, the SEM indicates the reliability of the sample mean as representative of the true mean of the general population from which the sample was drawn.
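A minimal sketch of the SEM calculation follows, assuming the usual definition of the SEM as s divided by the square root of n. The function name and both data sets are hypothetical. The larger sample simply repeats the smaller one four times, so the spread of the values stays similar while the SEM shrinks.

```python
import math
import statistics


def standard_error_of_mean(values):
    """Return s / sqrt(n), where s is the sample standard deviation."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed to estimate s")
    return statistics.stdev(values) / math.sqrt(n)


# Hypothetical data: the larger sample repeats the smaller one four times.
small_sample = [4.0, 5.0, 5.0, 6.0, 7.0, 9.0]
large_sample = small_sample * 4

sem_small = standard_error_of_mean(small_sample)
sem_large = standard_error_of_mean(large_sample)
print(f"n = {len(small_sample)}: SEM = {sem_small:.3f}")
print(f"n = {len(large_sample)}: SEM = {sem_large:.3f}")
```

The comparison shows why s and SEM answer different questions: s describes the scatter of individual values, whereas the SEM describes how precisely the sample mean is known and therefore falls as n rises.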

A **frequency distribution** curve can be constructed from the individual values in a population by plotting the frequency with which any particular value occurs in the series against the values. If the group of individuals tested was homogeneous, the frequency distribution curve is usually symmetric (Figure 1), with the highest frequency corresponding to the mean and the width of the curve varying with σ (curve of **normal distribution**). The percentage of observations that fall within various ranges of an ideal curve of normal distribution is shown in Table 4. The mean and s of a representative sample are approximately the mean and σ of the whole population. It is therefore possible to predict from the mean and s of the sample how likely any particular value is to occur in the normal population. For example, only 1 normal value out of 20 (5 out of 100) differs from the mean by 1.96 s or more, so a value that far from the mean is unlikely, though not impossible, in a normal individual.
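The percentages for an ideal normal distribution can be computed directly rather than looked up. The sketch below uses the identity that the fraction of a normal distribution within k standard deviations of the mean equals erf(k/√2). The function names and the sample mean and s are assumptions made for illustration.

```python
import math


def fraction_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))


def reference_range(mean, s, k=1.96):
    """Interval mean +/- k*s; with k = 1.96 it covers about 95% of a normal population."""
    return mean - k * s, mean + k * s


for k in (1.0, 1.96, 2.0, 3.0):
    print(f"within {k:g} s of the mean: {100 * fraction_within(k):.1f}%")

# Hypothetical sample mean and s, invented for illustration only.
low, high = reference_range(mean=6.0, s=1.8)
print(f"approximate 95% range: {low:.2f} to {high:.2f}")
```

About 68% of values fall within 1 s of the mean, about 95% within 1.96 s, and about 99.7% within 3 s. These figures hold only if the underlying distribution is close to normal.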

Statistical analysis is also useful in evaluating the significance of the difference between two means. In physiologic and clinical research, measurements are often made on a group of animals or patients given a particular treatment. These measurements are compared with similar measurements made on a control group that ideally has been exposed to exactly the same conditions except that the treatment has not been given. If a particular mean value in the treated group is different from the corresponding mean for the control group, the question arises whether the difference is due to the treatment or to chance variation. The probability that the difference represents chance variation can be estimated in many instances by using **Student’s t test.** The value t is the ratio of the difference in the means of two series (Ma and Mb) to the uncertainty in these means. The formula used to calculate t is

$$t = \frac{M_a - M_b}{\sqrt{\dfrac{(n_a + n_b)\left[(n_a - 1)s_a^2 + (n_b - 1)s_b^2\right]}{n_a n_b (n_a + n_b - 2)}}}$$

where na and nb are the numbers of individual values in series a and b, respectively, and sa and sb are the corresponding sample standard deviations. When na = nb, the equation for t becomes simplified to

$$t = \frac{M_a - M_b}{\sqrt{(\mathrm{SEM}_a)^2 + (\mathrm{SEM}_b)^2}}$$

The higher the absolute value of t, the lower the probability that the difference represents chance variation. This probability also decreases as the number of individuals (n) in each group rises, because the greater the number of measurements, the smaller the uncertainty in the means. The probability (P) for any value of t at different values of n can be found in tables in most texts on statistics. Strictly, P is the probability that a difference at least as large as the one observed would arise from chance variation alone. Thus, for example, if the P value is 0.10, chance alone would produce such a difference 10% of the time (1 chance in 10). A P value of < 0.001 means that chance alone would produce it less than 1 time in 1000. When the P value is < 0.05, most investigators call the difference “statistically significant”; ie, it is concluded that the difference is due to the operation of some factor other than chance.

The use of t tests is appropriate only for comparison of two groups. If they are used for repeated pairwise comparisons in experiments involving more than two groups, a systematic error is introduced that makes the probability of deciding that there is a significant difference too high. In this situation, **analysis of variance (ANOVA)** is the appropriate statistical test. This and other techniques are discussed in statistics texts.
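The calculation of t can be sketched as follows, assuming the standard pooled-variance form of Student's t test for two independent groups. The function name and the treated and control values are hypothetical. The sketch returns t together with the degrees of freedom (na + nb − 2) needed to look up P in a table; it does not compute P itself.

```python
import math
import statistics


def student_t(series_a, series_b):
    """Return (t, degrees of freedom) for two independent samples, pooled variance."""
    na, nb = len(series_a), len(series_b)
    ma, mb = statistics.mean(series_a), statistics.mean(series_b)
    var_a = statistics.variance(series_a)  # sample variance, divides by n - 1
    var_b = statistics.variance(series_b)
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    standard_error = math.sqrt(pooled * (na + nb) / (na * nb))
    return (ma - mb) / standard_error, na + nb - 2


# Hypothetical measurements, invented for illustration only.
treated = [5.1, 5.9, 6.3, 6.8, 7.4]
control = [4.2, 4.8, 5.0, 5.5, 6.0]

t, df = student_t(treated, control)

# With equal group sizes, t reduces to the difference in means divided by
# the square root of the sum of the squared SEMs.
sem_a = statistics.stdev(treated) / math.sqrt(len(treated))
sem_b = statistics.stdev(control) / math.sqrt(len(control))
t_simple = (statistics.mean(treated) - statistics.mean(control)) / math.sqrt(
    sem_a ** 2 + sem_b ** 2
)

print(f"t = {t:.3f} with {df} degrees of freedom")
```

The pooled form assumes that both groups have similar variances. When that assumption is doubtful, or when more than two groups are compared, a different test is needed.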

These elementary methods and many others available for statistical analysis in the research laboratory and the clinic provide a valuable objective means of evaluation. Statistical significance does not necessarily mean physiologic significance, and physiologic significance does not guarantee statistical significance; but replacement of evaluation by subjective impression with analysis by statistical methods is an important goal in the medical sciences.

Useful books on statistics include:

Dawson B, Trapp RG: *Basic and Clinical Biostatistics,* 3rd ed. McGraw-Hill, 2001.

Rosner B: *Fundamentals of Biostatistics.* Duxbury, 1982.

*The SI for the Health Professions.* World Health Organization, 1977.

Zar JH: *Biostatistical Analysis.* Prentice-Hall, 1974.
