NEW SYSTEM AND
OTHER TESTS (ONE PROGRAM)
My
Law (Other_My_Law)
This
program compares the digit patterns of two data sets.
Examples could be the output before a system (or Y2K) conversion
and the output after a system (or Y2K) conversion.
The program could also be used to compare two time periods to find
errors even when there is no system change involved.
The
user inputs the before first-two
digit proportions in column K starting at K2.
A message in M26 will signal a warning if the before
proportions do not sum to 1.00.
The
after data is entered in column
F.
The
program will produce the following output:
1.
The after graph minus the before
graph (this shows the differences between the two graphs),
2.
A detailed table of the digit frequencies.
The
after data set may only have
numbers that are 1.00 and larger. If numbers below 1.00 are included the program will fail and
many #NUM errors will be displayed.
DATASâ
2000 FOR EXCEL: DESCRIPTION OF TABLES
The
programs that produce graphs:
Basic_ProfileFirSecFir2
Basic_First3
Basic_Last2
Other_My_Law
also
include accompanying tables. The
format of these tables is the same throughout.
The format of the Basic_ProfileFirSecFir2 tables is described
below.
Contents
of Tables
The
digit programs produce tables and graphs.
The Basic_ProfileFirSecFir2 table is described below with the
paragraph letter indicating the column in the spreadsheet.
A.
(Obs#). Observation
number. The programs include
one or more columns for transaction details.
You may include only enough details to trace back to the original
transactions in your original data set.
B.
($). The data to be
analyzed should be entered in this column.
The first observation must be in cell B2 with the remaining
observations filling the column downwards.
Do not include blanks or nonnumeric data (e.g., LK6340).
For example, if the data set contains 200 observations the input
range should be B2..B201 (or B2:B201).
C.
(TwoDig). This column
converts each observation to a number in the range [10,100), by moving the
decimal point. Each number in
column C is therefore greater than (and including) 10 and less than (and
excluding) 100. This
calculation is needed to identify the first-two digits.
The formula in this column is copied from C2 downwards.
D.
(First 2). This column
contains the first two-digits of the observations.
It is therefore an integer in the range [10,100).
The formula in this column is copied from D2 downwards.
E.
(First). This column
contains the first digit of the observation which is an integer in the
range [1, 2, ...,9]. The
formula in this column is copied from E2 downwards.
F.
(Second). This column
contains the second digit of the observation which is an integer in the
range [0, 1, ...,9]. The
formula in this column is copied from F2 downwards.
G.
(Digit). This column
contains the bin ranges for the frequency counts.
G2..G10 is the bin range for the first digits and G12..G21 is the
bin range for the second digits.
H.
(Count). This column
contains the counts of the digits to the left of the count number.
H2..H10 contains the first digit frequencies. For example, the number "28" in H3 would indicate
that there are 28 first digit 2's in the data set.
H12..H21 contains the second digit frequencies.
The number "27" in H13 would indicate that there are 27
second digit 2's in the data set. The
sum of the first and second digit counts (H2..H10 and H12..H21) must equal
the number of observations.
I.
(Prop). This is the
proportion that each count represents.
For example, if the data set has 200 observations and H3 had a
value of 28, then the proportion would be 0.140 (28/200).
J.
(Ben Law). This column
contains the expected proportions of Benford's Law.
J2..J10 contains the expected first digit proportions and J12..J21
contains the expected second digit proportions.
The proportions in each row are matched horizontally with the digit
that it relates to. For
example, J3 contains the expected proportion of 2 as a first digit.
K.
(Diff). This column
shows the difference between the actual proportions (column I) and the
expected proportions (column J). A
positive difference indicates that the actual proportion (column I)
exceeds the expected proportion (column J).
L.
(Signif). This column
shows the statistical significance of the difference between the two
proportions. The Z-stat in
row 2 measures the statistical significance of the first digit 1
difference. Significance
takes into account the size of the difference (over or under), the
expected proportion, and the sample size. Scores above 1.96 are significant at the 0.05 level, and
above 2.57 are significant at the 0.01 level.
M.
(FTDigit). This column
is the bin range for the first two-digit frequency counts.
M2..M91 contains the first-two-digits 10 to 99.
N.
(Count). This column
contains the actual counts of the first two-digit combinations to the left
of the count number. For
example, the number "27" in N3 would indicate that there are 27
numbers that start with the first two-digit combination of 11.
The sum of the first-two digit counts (N2..N91) should equal the
total number of observations in the data set.
O.
(Prop). This is the
proportion that each count represents.
For example, if the data set has 200 observations and N3 equaled
27, then the proportion would be 0.135 (27 / 200).
P.
(Ben Law). This column
contains the expected proportions of Benford's Law matched rowwise with
the appropriate first-two digits. For
example, P3 contains the expected proportion of 11 as a first-two digit
combination.
Q.
(Diff). This is the
difference between the actual proportions (column O) and the expected
proportions (column P). A
positive difference indicates that the actual proportion (in column O)
exceeds the expected proportion (in column P).
R.
(Signif). This column
shows the statistical significance of the difference between the two
proportions. The Z-stat in
row 2 measures the statistical significance of the first-two digits 10
difference. Significance takes into account the size of the difference,
the expected proportion, and the sample size.
Scores above 1.96 are significant at the 0.05 level, and above 2.57
are significant at the 0.01 level.