A Z-score (also called a Z-value, Z-ratio, or simply Z) is a statistical measurement of a number's position relative to the mean of a group of numbers. It corresponds to a point along the base of the standardized normal curve. The center point of the curve has a Z-value of 0. Z-values to the right of 0 are positive, and Z-values to the left are negative. A Z-score to the right of 0 is above the mean, and one to the left of 0 is below the mean. The distance from the mean is measured in standard deviations. A Z-score of 0 is 0 standard deviations from the mean; that is, the number equals the mean.

The Z-score is calculated by subtracting the mean (average) from the number and then dividing that difference by the standard deviation.

In symbols, Z = (X − μ) / σ, where X represents the raw number, μ represents the population mean, and σ represents the population standard deviation.
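The calculation described above can be sketched in a few lines of Python. This is only an illustration, not part of IDEA: the function name `z_score` and the sample values are assumptions chosen to keep the arithmetic easy to follow.

```python
from statistics import fmean, pstdev


def z_score(x, mean, std_dev):
    """Return how many standard deviations x lies from the mean."""
    if std_dev == 0:
        raise ValueError("standard deviation must be non-zero")
    return (x - mean) / std_dev


# Hypothetical values, chosen only so the results are round numbers.
values = [2, 4, 4, 4, 5, 5, 7, 9]
mu = fmean(values)       # population mean
sigma = pstdev(values)   # population standard deviation

z_for_9 = z_score(9, mu, sigma)  # a value above the mean
z_for_5 = z_score(5, mu, sigma)  # a value equal to the mean

print(mu, sigma, z_for_9, z_for_5)
```

With these values the mean is 5 and the population standard deviation is 2, so 9 sits two standard deviations above the mean, while 5 has a Z-score of 0.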

A Z-score is standardized. It is irrelevant whether you are comparing Canadian dollars, U.S. dollars, euros, or British pounds. In fact, the unit may be measuring height, weight, education levels, or test scores. The Z-score is always relative to the mean that is the center or designated as zero. The Z-score tells you much about the distribution of the numbers in your data set and can highlight extremes.
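A short Python sketch can illustrate why the unit of measure does not matter. The amounts and the exchange rate below are hypothetical, and `z_scores` is a helper name introduced here for illustration only.

```python
import math
from statistics import fmean, pstdev


def z_scores(values):
    """Return the Z-score of every value, using the population statistics."""
    mu = fmean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical amounts in one currency and an assumed conversion rate.
amounts_usd = [12.50, 19.99, 7.25, 45.00, 23.10]
HYPOTHETICAL_RATE = 1.35
amounts_cad = [a * HYPOTHETICAL_RATE for a in amounts_usd]

z_usd = z_scores(amounts_usd)
z_cad = z_scores(amounts_cad)

# Rescaling every amount rescales the mean and the standard deviation
# by the same factor, so the Z-scores agree up to rounding error.
same = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-9)
    for a, b in zip(z_usd, z_cad)
)
print(same)
```

Because converting the currency multiplies the mean and the standard deviation by the same factor, that factor cancels out of the Z-score.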

With the Z-score, the area under the normal curve can be determined by computer calculation or by looking at tables. You can search the Internet for these tables. An example is located at sagepub.com/fitzgerald/study/materials/appendices/app_d.pdf.

A selected portion of a Z-score table is reproduced in Table 5.2.

TABLE 5.2 Partial Z-Score Table

A Z-score of 1.50 indicates that 43.32 percent of the area under the normal curve lies between the mean and the Z-score of 1.50. How much of the area under the curve lies between Z-scores of -1.96 and +1.96? Since we are looking at both the negative and positive sides of the mean, we multiply the area between the mean and Z by 2 (.4750 × 2), resulting in 95 percent. The higher the absolute value of the Z-score, the farther the number is from the mean or the norm. The auditor may wish to examine transactions that are extreme outliers.
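Where a printed table is not at hand, the same areas can be computed directly. The following is a minimal Python sketch using the error function from the standard library; the function name `area_mean_to_z` is an assumption introduced for this illustration.

```python
import math


def area_mean_to_z(z):
    """Area under the standard normal curve between the mean (0) and z."""
    return 0.5 * math.erf(abs(z) / math.sqrt(2))


area_150 = area_mean_to_z(1.50)   # area between the mean and Z = 1.50
area_196 = area_mean_to_z(1.96)   # area between the mean and Z = 1.96
two_sided = 2 * area_196          # area between Z = -1.96 and Z = +1.96

print(f"{area_150:.4f}")   # 0.4332
print(f"{area_196:.4f}")   # 0.4750
print(f"{two_sided:.4f}")  # 0.9500
```

The printed figures match the table values quoted in the text: .4332 for a Z-score of 1.50, and .4750 doubled to 95 percent for the range from -1.96 to +1.96.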

Statistical theory states that, for normally distributed data, the Z-score will be between -3.00 and +3.00 about 99.7 percent of the time, between -2.00 and +2.00 about 95 percent of the time, and between -1.00 and +1.00 about 68 percent of the time.
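These percentages can be checked with the `NormalDist` class in Python's standard `statistics` module (Python 3.8 or later). The helper name `share_within` is an assumption for this sketch.

```python
from statistics import NormalDist

# The standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k):
    """Share of a normal population with a Z-score between -k and +k."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


coverage = {k: share_within(k) for k in (1, 2, 3)}

for k, share in coverage.items():
    print(f"within +/-{k}: {share:.2%}")
```

The computed shares are roughly 68.3, 95.4, and 99.7 percent. The commonly quoted 95 percent is a rounding of the two-standard-deviation figure; an area of exactly 95 percent corresponds to Z-scores of about -1.96 and +1.96. These shares hold only to the extent that the data follow a normal distribution.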

IDEA does not automatically calculate the Z-score for you. However, its built-in field statistics make it easy to calculate the Z-score for each transaction.

Going back to our payment tender net database, let us take a look at the field statistics for the PAYAMOUNT_SUM field.

In Figure 5.16, we note that the average value, or mean, is $19.48 and that the Pop Std Dev, or population standard deviation, is $15.17. IDEA calculates these for you.

FIGURE 5.16 Field Statistics for Payment Amounts

IDEA assigns numbers to be used with the @FieldStatistics function. The assigned number for the average or mean is 11, and the assigned number for the population standard deviation is 18, as shown in Figure 5.17.

The syntax for use of the function is @FieldStatistics("FieldName", Statistic).

We create or append a field called Z_SCORE by applying the formula for Z-score in the equation editor.

The formula obtains the average value and the population standard deviation amounts from the field statistics area to use in the equation shown in Figure 5.18.
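For readers who want to reproduce the step outside IDEA, the same calculation can be sketched in Python. This is an analogue under stated assumptions, not IDEA's own syntax: the records are hypothetical stand-ins for the payment database, and only the field names PAYAMOUNT_SUM and Z_SCORE come from the text.

```python
from statistics import fmean, pstdev

# Hypothetical records standing in for the payment tender net database.
records = [
    {"ID": 1, "PAYAMOUNT_SUM": 10.00},
    {"ID": 2, "PAYAMOUNT_SUM": 20.00},
    {"ID": 3, "PAYAMOUNT_SUM": 20.00},
    {"ID": 4, "PAYAMOUNT_SUM": 30.00},
    {"ID": 5, "PAYAMOUNT_SUM": 120.00},
]

# Equivalent of reading the average and the population standard
# deviation from the field statistics.
amounts = [r["PAYAMOUNT_SUM"] for r in records]
average = fmean(amounts)
pop_std_dev = pstdev(amounts)

# Append a Z_SCORE field to every record.
for r in records:
    r["Z_SCORE"] = (r["PAYAMOUNT_SUM"] - average) / pop_std_dev

for r in records:
    print(r["ID"], r["PAYAMOUNT_SUM"], round(r["Z_SCORE"], 2))
```

The mean and standard deviation are computed once for the whole field and then reused for every record, which is the role the field statistics play in the equation.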

When we index the Z_SCORE field in descending order, we can see in Figure 5.19 that all but 519 of the 84,718 records had Z-scores of 3.99 or below. In this particular data set, the Z-score goes as high as 23.78. The auditor needs to use judgment in deciding how many outliers to examine in detail or in selecting the Z-score cut-off point for the examination.
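Selecting records above a chosen cut-off can be sketched in Python as follows. The function name `flag_outliers`, the sample records, and their Z-scores are hypothetical; the cut-off of 3.99 is used only because it is the threshold mentioned in the text, and the right value remains a matter of auditor judgment.

```python
def flag_outliers(records, cutoff, use_absolute=False):
    """Return records whose Z_SCORE exceeds cutoff, highest first.

    With use_absolute=True, large negative Z-scores are flagged too.
    """
    def magnitude(record):
        z = record["Z_SCORE"]
        return abs(z) if use_absolute else z

    ranked = sorted(records, key=magnitude, reverse=True)
    return [r for r in ranked if magnitude(r) > cutoff]


# Hypothetical records that already carry a Z_SCORE field.
scored = [
    {"ID": 101, "Z_SCORE": 4.50},
    {"ID": 102, "Z_SCORE": -0.35},
    {"ID": 103, "Z_SCORE": 7.10},
    {"ID": 104, "Z_SCORE": 1.20},
    {"ID": 105, "Z_SCORE": 3.99},
]

flagged = flag_outliers(scored, cutoff=3.99)
flagged_ids = [r["ID"] for r in flagged]
print(flagged_ids)  # [103, 101]
```

A record with a Z-score of exactly 3.99 is not flagged here, matching the "3.99 or below" boundary. Raising or lowering the cut-off changes how many records are sent for detailed examination.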

When the Z_SCORE field is indexed in descending order, the PAYAMOUNT_SUM field is also shown from the highest amount to the lowest. While both fields put the amounts in the same order, the Z-score gives you a sense of how far each amount is from the mean so you can gauge the magnitude of the anomaly.