

The number duplication test (NDT) is not specifically a Benford's Law test; rather, it is an associated test that provides more detail on the results of Benford's Law tests. The NDT can output the specific numbers that caused the spikes in the first order tests (the first two digits test is one example) and in the summation test, which is an advanced Benford's Law test.

Spikes from any first order test are caused by numbers occurring more frequently than expected, while spikes in the summation test are usually due to high volumes of the same numbers repeating more often than normal. Mark Nigrini wrote, "The number duplication test was developed as a part of my PhD dissertation when I was looking for taxpayer amounts that were duplicated abnormally often. I believed that these abnormal duplications were there because taxpayers were inventing numbers and that, since we think alike, people would gravitate toward making up the same numbers. There were some interesting results for deduction fields such as charitable contributions."5

The NDT should be applied separately to each of the following ranges of amounts in the data set:

- Greater than or equal to 10

- From 0.01 to 9.99

- From -0.01 to -9.99

- Less than or equal to -10

The output should show the amount that was duplicated and the count for each amount.

Ranking the counts in descending order places the most frequently duplicated amounts near the top of the results.
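Outside IDEA, the same idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure the chapter goes on to describe: the function names `ndt_range` and `number_duplication` and the range labels are hypothetical, and amounts are assumed to be currency values with at most two decimal places. Each amount is assigned to one of the four ranges above, the duplicates are counted, and the counts are ranked from highest to lowest.

```python
from collections import Counter
from decimal import Decimal
from typing import Optional


def ndt_range(amount: Decimal) -> Optional[str]:
    """Assign an amount to one of the four NDT ranges (labels are hypothetical)."""
    if amount >= Decimal("10"):
        return ">= 10"
    if Decimal("0.01") <= amount <= Decimal("9.99"):
        return "0.01 to 9.99"
    if Decimal("-9.99") <= amount <= Decimal("-0.01"):
        return "-0.01 to -9.99"
    if amount <= Decimal("-10"):
        return "<= -10"
    return None  # zero, or a value with more than two decimals that falls between ranges


def number_duplication(amounts):
    """Return {range label: [(amount, count), ...]} with the highest counts first."""
    counts = {}
    for raw in amounts:
        amount = Decimal(str(raw))  # go through str to avoid binary float noise
        label = ndt_range(amount)
        if label is None:
            continue
        counts.setdefault(label, Counter())[amount] += 1
    # Rank by count descending; break ties by amount ascending for stable output.
    return {
        label: sorted(counter.items(), key=lambda item: (-item[1], item[0]))
        for label, counter in counts.items()
    }


if __name__ == "__main__":
    # Made-up amounts, for illustration only.
    sample = [174.02, 174.02, 85.00, 174.02, 85.00, 5.50, -12.00, -3.25, 0]
    for label, ranked in number_duplication(sample).items():
        print(label)
        for rank, (amount, count) in enumerate(ranked, start=1):
            print(f"  rank {rank}: {amount} appears {count} time(s)")
```

Keeping the four ranges apart means that small amounts and negative amounts (credits or reversals, for example) are ranked within their own range rather than against the larger positive amounts.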

IDEA does not have an automated process for the NDT. However, the test is not complex to perform manually in IDEA.

We will perform the NDT on a travel expense data file. The file has fields for the various types of travel expenses, such as transportation, airfare, accommodation, meals, and so on. We will test the accommodation amounts that are greater than or equal to $10.00.

1. Summarize by our selected test field of ACCOMMODATION with the criteria of ACCOMMODATION >= 10, as shown in Figure 5.11.


FIGURE 5.11 Summarizing the Test Field of Accommodations with Amounts Greater Than or Equal to $10.00

2. Create a new file using the Sort feature on the NO_OF_RECORDS field in descending order, as shown in Figure 5.12. This is our count field, and sorting it in descending order makes it easy to rank later. A sort creates a new database in which the records are physically stored in the selected order.


FIGURE 5.12 New Sorted File on Number of Records

3. Create two new fields by using the Append field feature. Name the first new field RANK and set it as a virtual numeric field with no decimals. Use the @Precno( ) function, which returns the physical record number in the file; this number remains the same even if the file is subsequently indexed differently. Because the file was sorted in descending order of count, the record number equates to the ranking that we would like to see.

In addition, we can pull out the first two digits from the accommodation amount for each record. Append a new field called FIRST_TWO set as a virtual numeric field with no decimal places. For the new field use the equation

@Val(@Left(@Str(ACCOMMODATION, 2, 2), 2))

This equation converts the ACCOMMODATION field to a character field using the @Str function, which outputs a minimum of two characters. The second 2 is the number of decimal places. We could enter 0 or any other number of decimal places, but the decimal places will be dropped because we specified that the new field would contain no decimal places. Two decimal places were used here because that is what you would normally want, and because it provides an opportunity to make this point. The numeric ACCOMMODATION field must be changed to a character field in order to perform any position operations. We use the @Left function to extract the first two characters starting from the left position. Then we use the @Val function to convert the result back to numeric, as shown in Figure 5.13.
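For readers more familiar with a general-purpose language, the same string-based extraction can be sketched in Python. This is an approximation of the IDEA equation, not a statement of how @Str, @Left, or @Val behave internally. The function name `first_two` is hypothetical, and the sketch assumes amounts of 10 or more, as in this test, so that the first two characters are always digits.

```python
def first_two(amount: float) -> int:
    """Return the first two digits of an amount that is 10 or greater."""
    if amount < 10:
        raise ValueError("this sketch only handles amounts >= 10")
    text = f"{amount:.2f}"   # number to text with two decimals, in the spirit of @Str(..., 2, 2)
    leading = text[:2]       # first two characters from the left, in the spirit of @Left(..., 2)
    return int(leading)      # back to a number, in the spirit of @Val(...)


if __name__ == "__main__":
    for value in (174.02, 10, 1234.5):
        print(value, "->", first_two(value))
```

As with the IDEA equation, the number of decimal places used in the text conversion does not affect the result here, because only the two leftmost characters are kept.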

We now have the count for each accommodation amount in the NO_OF_RECS field, so we know how many times those exact amounts appeared in the database, as shown in Figure 5.14. The top-ranked amount of $174.02 appeared 52 times. We can click on the content of the NO_OF_RECS field to display the details.

The FIRST_TWO field displays the numbers we would expect. That is, the amounts ranked near the top have low first two digits, which conforms to Benford's Law, rather than high first two digits such as 99, 98, 97, and so on.

FIGURE 5.13 Appending the Two New Fields

FIGURE 5.14 Results of the Number Duplication Test

Summarizing the data file by the FIRST_TWO field and indexing the resulting file on NO_OF_REC1 in descending order will give us the frequency of each two-digit number, which further confirms conformity to Benford's Law, as displayed in Figure 5.15.


FIGURE 5.15 Number Records for the First Two Digits
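A frequency table of the first two digits can be compared with the proportions Benford's Law predicts, where the expected proportion for a first-two-digits value d (10 to 99) is log10(1 + 1/d). The Python sketch below is illustrative only. The function names are hypothetical, the input values are made up, and no threshold for what counts as a significant spike is implied.

```python
import math
from collections import Counter


def benford_first_two_expected(d: int) -> float:
    """Expected proportion of first-two-digits value d (10..99) under Benford's Law."""
    return math.log10(1 + 1 / d)


def first_two_frequencies(first_two_values):
    """Return rows of (digits, count, actual proportion, expected proportion),
    ordered by count descending, then by digits ascending."""
    counts = Counter(first_two_values)
    total = sum(counts.values())
    rows = [
        (d, n, n / total, benford_first_two_expected(d))
        for d, n in counts.items()
    ]
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows


if __name__ == "__main__":
    # Made-up first-two-digits values, for illustration only.
    sample = [17, 17, 10, 17, 10, 45]
    for d, n, actual, expected in first_two_frequencies(sample):
        print(f"{d}: count={n} actual={actual:.3f} expected={expected:.3f}")
```

Under Benford's Law the expected proportions decline steadily from 10 through 99, which is why low first two digits ranking near the top is the pattern we would expect to see.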

The NDT is a good provider of detailed information to identify spikes from the summation test and the first two digits test. This helps determine whether further investigation is necessary.
