Data Analysis Technology for the Audit Community

 

DATAS For Excel

  2 of  5  

DATASâ 2000 FOR EXCEL: RUNNING THE PROGRAMS

1.         In summary, all the programs are run using,

            Tools/Macro/Macros/Run

            The programs display instructions after they have been opened.  The programs indicate whether subset variables are needed and where the data should be pasted.

            You must Enable the macros to run the programs.  All programs should be opened as Read Only.

2.         Excel 97/2000 is limited to 65,536 rows.  This means that a maximum of 65,535 observations can be processed because the first row is used for headings.

3.         The programs have been tested using Excel 97 for use in the United States.  These programs also work without error in Excel 2000.

4.         The programs have somewhat different data analysis criteria.  The programs indicate, upon opening, which numbers are permissible, and which numbers will be retained for processing.

5.         By way of an example, the Basic_ProfileFirSecFir2.xls program is run as follows,

Load Excel

Open your data file (assuming that this is Census1990.txt) using:

File / Open / Census1990.txt (follow the prompts (Next, Next, Finish) to open the .txt file)

Highlight the numbers in the data column using F5 and A1:A3141.

Copy to Clipboard using, Edit / Copy

Open Basic_ProfileFirSecFir2.xls using:

File / Open / Basic_ProfileFirSecFir2.xls

Paste data in data column B with first observation in B2 using

Put cursor on B2 / Edit / Paste

Run program using:

Tools / Macro / Macros, highlight Headings_Digits_Benfords_Law_Grpahs / Click Run

Never save Basic_ProfileFirSecFir2.xls, use Save As, and give the spreadsheet a new name.

               If you do save Basic_ProfileFirSecFir2.xls and then rerun it after clearing the contents, the program might not correctly calculate the number of observations in the new data set.  The most likely result is that it analyzes the same number of observations that was in the spreadsheet when it was saved.

To view graphs and tables Click on Tabs to view Output

DATASâ 2000 FOR EXCEL: DESCRIPTION OF PROGRAMS

The tests are divided into:

1.  Basic Tests,

2.  Advanced Tests (using Subset variables), and

3.  Other Tests.

BASIC TESTS (FOUR PROGRAMS)

Data Profile and Basic Digit Tests (Basic_ProfileFirSecFir2)

A.         The first part of this program partitions the numbers in the selected numeric field as follows:

1.       Amounts equal to or larger than 10.00,

2.       Amounts from 0.01 to 9.99,

3.       Amounts equal to zero,

4.       Amounts from -0.01 to -9.99,

5.       Amounts equal to or smaller than -10.00, and

6.       Amounts from 0.01 to 50.00.

            The final table shows (1) the Count, (2) the % of Total Count, and (3) the sum of all the elements.

 

B.         The second part of this program prepares a graph and table for the following digital tests:

1.       First Digits,

2.       Second Digits, and

3.       First-Two Digits.

             The digit graphs and tables are only based on numbers equal to or greater than 10.00.  To bypass this restriction you could multiply the numbers by a suitable value (e.g. 1000) before processing.

The three graphs each include upper and lower confidence bounds showing the limits for differences at the 5 percent level of significance.  The tables show:

1.       The count,

2.       The actual proportion, and the expected proportion,

3.       The difference in proportions, and the direction of the deviation, and

4.       The Z-statistic for each digit or digit combination.

            The output is viewed by clicking on the tabs at the bottom of the spreadsheet.

First-Three Digits Tests (Basic_First3_Digits)

            This program prepares a graph and table for the first-three digits test which is run on positive numbers equal to or greater than 1.00.

The graphs include upper and lower confidence bounds showing the limits for differences at the 5 percent level of significance.  The table follows the same format as the tables in the Basic_ProfileFirSecFir2 output.

            Your data must only include numbers equal to or greater than 1.00.  If numbers less than 1.00 are included for processing the program will not produce any output.  Several of the table columns starting with Count will display #NUM errors.  This is because the program uses logs to extract the first-three digits and the logs of numbers less than 1.00 are either negative numbers or are undefined.

            Where a number has only one or two digits, such as 3 or 35, the first-three digits are calculated to be 300 or 350 by adding zeros to the end of the number.

            This program includes an optional second macro which sorts the data by first-three digits to help with the lookup of observations associated with first-three digits that occurred excessively.

            The Z-stat tab table gives a table of the first-three digit combinations that occurred abnormally often.

 

  2 of  5 

TOP

 

 

Mark J. Nigrini Ph.D.
606 Rockcrossing Lane, Allen, Texas 75002
Tel: (972) 359-0020  E-mail: mark_nigrini at msn dot com