Data Analysis Technology for the Audit Community

 

DATAS For Excel

  4 of  5 

Relative Size Factor (Subset_Relative_Size)

This program calculates the Relative Size Factor (largest number divided by second largest number) for each subset.  The test is useful for detecting errors in the amounts recorded in a subset, because a large factor means the largest amount is far out of line with the next largest.  The output table shows the:

1.       Subset reference,

2.       Largest number for the subset,

3.       Second largest number for the subset,

4.       The Relative Size Factor (calculated as (2) divided by (3)), and

5.       Total number of items for the subset.

The RSF is only calculated for subsets with more than one entry and only takes into account numbers that are 1.00 and larger.

Your data can include field values under 1.00 and subsets with only one entry; these items/subsets will be automatically deleted during processing.
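The calculation described above can be sketched in Python as follows. This is a minimal illustration, not the DATAS program itself: the function name relative_size_factors and the (subset, amount) input layout are assumptions made for the example. It also assumes that the item count is taken after values under 1.00 are removed, and that a tie for the largest amount gives a factor of 1.

```python
from collections import defaultdict


def relative_size_factors(records):
    """Return (subset, largest, second largest, RSF, item count) rows.

    `records` is an iterable of (subset, amount) pairs (assumed layout).
    """
    by_subset = defaultdict(list)
    for subset, amount in records:
        if amount >= 1.00:  # values under 1.00 are dropped
            by_subset[subset].append(amount)

    rows = []
    for subset, amounts in by_subset.items():
        if len(amounts) < 2:  # subsets with only one entry are dropped
            continue
        ordered = sorted(amounts, reverse=True)
        largest, second = ordered[0], ordered[1]
        rows.append((subset, largest, second, largest / second, len(amounts)))
    return rows


payments = [
    ("V1", 5000.0), ("V1", 250.0), ("V1", 0.50),
    ("V2", 90.0),
    ("V3", 100.0), ("V3", 80.0),
]
for row in relative_size_factors(payments):
    print(row)
```

In this made-up data, vendor V1 has a factor of 20, which would merit a closer look, while V2 is skipped because it has only one entry.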

Round Number Subsets (Subset_Round_Numbers)

This program checks subsets for abnormal duplications of round numbers, meaning numbers divisible by 100 (one output) or divisible by 1,000 (another output).  The outputs show the following details for each subset:

1.       Subset reference,

2.       The total of the numbers for the subset,

3.       The Z-statistic, which measures the significance of the difference between the actual proportion of round numbers and 0.01 (for the multiples of 100) or 0.001 (for the multiples of 1,000),

4.       The count of all the numbers in the subset,

5.       The count of the round numbers, and

6.       The proportion of the numbers that are round ((5) divided by (4)).

The Round Number Subset statistics are only calculated for subsets with one or more items where the items are 1.00 or larger.

            Your data can include field values under 1.00; these items/subsets will be automatically deleted during processing.

            The Round Number statistics are based on the integer portion of the number only:

            - all digits to the right of the decimal point are deleted before the program checks whether the number is round or not

            - numbers such as 4,500.20 and 505.05 will be analyzed as if they were 4,500 and 505

            - both 63,400.21 and 505,000 would be multiples of 100

            - 634.00 would not be a multiple of 100

            - 500.05 would be a multiple of 100
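The truncation rule and the per-subset statistics can be sketched in Python as follows. The names is_round and round_number_stats are hypothetical. The exact Z-statistic formula used by the program is not stated here, so the sketch assumes a standard one-sample proportion test without a continuity correction; treat the Z values as illustrative only.

```python
import math
from collections import defaultdict


def is_round(amount, multiple=100):
    """True if the integer portion of `amount` is divisible by `multiple`."""
    return int(amount) % multiple == 0  # int() discards the decimal digits


def round_number_stats(records, multiple=100):
    """Return (subset, total, z, count, round count, proportion) rows.

    `records` is an iterable of (subset, amount) pairs (assumed layout).
    """
    expected = 1.0 / multiple  # 0.01 for 100, 0.001 for 1,000
    by_subset = defaultdict(list)
    for subset, amount in records:
        if amount >= 1.00:  # values under 1.00 are dropped
            by_subset[subset].append(amount)

    rows = []
    for subset, amounts in by_subset.items():
        count = len(amounts)
        round_count = sum(1 for a in amounts if is_round(a, multiple))
        proportion = round_count / count
        # Assumed formula: one-sample z-test for a proportion.
        std_error = math.sqrt(expected * (1.0 - expected) / count)
        z = (proportion - expected) / std_error
        rows.append((subset, sum(amounts), z, count, round_count, proportion))
    return rows


sample = [("V1", 500.05), ("V1", 634.00), ("V1", 0.75),
          ("V1", 63400.21), ("V1", 120.00)]
for row in round_number_stats(sample, multiple=100):
    print(row)
```

Calling round_number_stats with multiple=1000 gives the second output, with 0.001 as the expected proportion.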

The next group of Advanced tests uses more than one subset variable.  These are all especially powerful error-detecting tests.

Two Subsets (Subset_SameSameDiff)

This program finds instances where two subsets have the same entries.  It is a powerful test for errors and could, for example, find observations where:

the invoice numbers are the same,

the dollar amounts are the same, and

the subsets (vendor #s) are different.

Another test could be the same date, the same dollar amount, and different vendor numbers.  The output shows each match on two lines (more if needed) and the details shown are:

1.       The first field found to be the same,

2.       The second field found to be the same (a numeric field), and

3.       The subset reference.

            A “Yes” in the Duplicates column indicates that there were two identical entries for that observation.
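A minimal Python sketch of the same-same-different idea follows. The function name same_same_different and the (first field, amount, subset) tuple layout are assumptions for the example. The Duplicates flag is interpreted here as "the same entry appears more than once within one subset", which is one reading of the description above.

```python
from collections import defaultdict


def same_same_different(records):
    """Return (field1, amount, subset, duplicates) rows for each match.

    `records` is an iterable of (field1, amount, subset) tuples
    (assumed layout). A match is a (field1, amount) pair that occurs
    under more than one subset.
    """
    groups = defaultdict(list)
    for field1, amount, subset in records:
        groups[(field1, amount)].append(subset)

    matches = []
    for (field1, amount), subsets in groups.items():
        distinct = sorted(set(subsets))
        if len(distinct) < 2:  # same subset only: not a match
            continue
        for subset in distinct:
            # Assumed meaning of the Duplicates column.
            duplicates = "Yes" if subsets.count(subset) > 1 else ""
            matches.append((field1, amount, subset, duplicates))
    return matches


invoices = [
    ("INV-1", 100.0, "V1"),
    ("INV-1", 100.0, "V2"),
    ("INV-1", 100.0, "V2"),
    ("INV-2", 50.0, "V1"),
    ("INV-2", 50.0, "V1"),
    ("INV-3", 75.0, "V3"),
]
for row in same_same_different(invoices):
    print(row)
```

Here INV-1 is reported on two lines, one per vendor, while INV-2 is not reported because both of its entries belong to the same vendor.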

Three Subsets (Subset_SameSameSameDiff)

This program is an extension of the above and will detect instances where:

a numeric field is the same (e.g., dollar amount),

another field is the same (e.g., invoice number),

another field is the same (e.g., invoice date), and

another character field is different (e.g., vendor number).

   This test is very useful for very large data files where Two Subsets yields thousands of matches.  The output format is the same as for the Two Subsets (Same-Same-Different) program.
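To illustrate why requiring a third matching field narrows the results, here is a hedged Python sketch. The function same_fields_different, the dictionary record layout, and the field names are all assumptions made for the example, not part of the program.

```python
from collections import defaultdict


def same_fields_different(records, same_fields, diff_field):
    """Group records on `same_fields`; keep groups where `diff_field` varies.

    `records` is a list of dicts (assumed layout). Returns a dict mapping
    each matching key to the sorted list of distinct `diff_field` values.
    """
    groups = defaultdict(set)
    for record in records:
        key = tuple(record[name] for name in same_fields)
        groups[key].add(record[diff_field])
    return {key: sorted(values)
            for key, values in groups.items() if len(values) > 1}


payments = [
    {"amount": 100.0, "invoice": "A1", "date": "2024-01-05", "vendor": "V1"},
    {"amount": 100.0, "invoice": "A1", "date": "2024-01-05", "vendor": "V2"},
    {"amount": 250.0, "invoice": "B7", "date": "2024-02-01", "vendor": "V3"},
    {"amount": 250.0, "invoice": "B7", "date": "2024-03-09", "vendor": "V4"},
]

two = same_fields_different(payments, ["amount", "invoice"], "vendor")
three = same_fields_different(payments, ["amount", "invoice", "date"], "vendor")
print(len(two), "matches on two fields;", len(three), "on three fields")
```

With two matching fields both invoice groups are reported; adding the date drops the B7 pair, whose dates differ.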

Three Alike (Subset_SameSameSame)

 This program will detect matches where:

a field is the same,

another field is the same, and

another field is the same.

            This test is useful for finding unusual matches in accounts payable, employee reimbursement, inventory, and payroll files.

Four Alike (Subset_SameSameSameSame)

            This program is an extension of the above and will detect instances where:

four fields are the same for two different observations.

            This test is very useful for finding unusual matches in accounts payable, employee reimbursement, refund, sales data, inventory, and payroll files.
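The Three Alike and Four Alike tests differ only in how many fields must agree, so both can be sketched with one Python function. The name find_alike, the dictionary record layout, and the sample field names are assumptions for illustration only.

```python
from collections import defaultdict


def find_alike(records, fields):
    """Return groups of records that agree on every field in `fields`.

    `records` is a list of dicts (assumed layout). The result maps each
    shared key to the positions of the records that carry it; only keys
    shared by two or more records are kept.
    """
    groups = defaultdict(list)
    for position, record in enumerate(records):
        key = tuple(record[name] for name in fields)
        groups[key].append(position)
    return {key: positions
            for key, positions in groups.items() if len(positions) > 1}


claims = [
    {"employee": "E10", "date": "2024-03-01", "amount": 82.50, "type": "meals"},
    {"employee": "E10", "date": "2024-03-01", "amount": 82.50, "type": "meals"},
    {"employee": "E10", "date": "2024-03-01", "amount": 82.50, "type": "travel"},
    {"employee": "E22", "date": "2024-03-04", "amount": 40.00, "type": "meals"},
]

three_alike = find_alike(claims, ["employee", "date", "amount"])
four_alike = find_alike(claims, ["employee", "date", "amount", "type"])
print(three_alike)
print(four_alike)
```

In this made-up reimbursement data, three records match on three fields, and two of them still match when the fourth field is added.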

 


 

 

Mark J. Nigrini Ph.D.
606 Rockcrossing Lane, Allen, Texas 75002
Tel: (972) 359-0020  E-mail: mark_nigrini at msn dot com