Data Analysis Technology for the Audit Community

 

DATAS for SAS

   2 of  8  
 

DATAS 2000 FOR SAS: INSTALLING THE PROGRAMS

1.         DATAS 2000 for SAS disk #1 contains several files (programs) and two directories (Documentation and Output).  Copy the directory [Datas2000_SAS] and all its contents to your c:\ drive.  After copying, your c:\ drive should have a Datas2000_SAS directory that is an exact copy of all the files on the DATAS 2000 for SAS disk #1.

            The contents of Datas2000_SAS must be copied as is to your c:\ drive.  If you change any of the paths, the SAS programs that write their output to c:\Datas2000_SAS\Output will not work properly unless they are modified to match.

Your c:\Datas2000_SAS\Output will not contain any files after installation.  The programs that write their output to .csv files will use this directory, as will the Excel program that reads the output and creates neat graphs and tables.

2.         DATAS 2000 for SAS disk #2 contains some test data files.  DATAS 2000 will work even if disk #2 is not loaded.  However, the SAS programs are set up to run automatically against a test file on disk #2.  Copy the contents of disk #2 to your c:\ drive.  After copying, your c:\ drive should have a Testdata directory that is an exact copy of all the files on the DATAS 2000 for SAS disk #2.  If you open the SAS programs, submit the jobs, and see output similar to that in the Summary, this is a good indicator that the installation has been done properly and that both SAS and DATAS are working.

3.         For mainframe installation, copy only the .sas programs on Disk #1 to your directory.  In order to execute the programs they must be linked to JCL statements invoking SAS.

4.         Make a backup copy of the DATAS 2000 for SAS disks (for your own use as authorized in the license agreement).

5.         DATAS 2000 for SAS assumes only that the user has SAS/Base.  Since SAS is a continuously licensed product, users should always have the most up-to-date version loaded on their machines.  These programs were originally written on SAS Version 5, so they should execute on any current PC or mainframe version of SAS.
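
As a quick check on steps 1 and 2, the directory layout can be verified before opening SAS.  The Python sketch below is not part of DATAS 2000.  The function name missing_install_dirs is hypothetical, and the sketch assumes that the Documentation and Output directories sit inside Datas2000_SAS and that Testdata sits directly on the drive, as the steps above describe.

```python
from pathlib import Path

# Directories the installation steps say should exist under the drive root.
# Documentation and Output come from disk #1; Testdata comes from disk #2.
# Placing Documentation inside Datas2000_SAS is an assumption of this sketch.
EXPECTED_DIRS = (
    "Datas2000_SAS",
    "Datas2000_SAS/Documentation",
    "Datas2000_SAS/Output",
    "Testdata",
)


def missing_install_dirs(drive_root):
    """Return the expected directories that are absent under drive_root."""
    root = Path(drive_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_install_dirs("c:/")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Directory layout matches the installation steps.")
```

An empty result only confirms that the directories exist.  Submitting the SAS programs against the test file, as described in step 2, remains the real test of the installation.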

DATAS 2000 FOR SAS: PREPARING A FILE FOR ANALYSIS

The usual application of DA is on corporate disbursements.  The test data file and this Documentation therefore show this application as an example. 

The data file requirements are as follows:

1.      The file should be what is known as a flat file.  The fields (e.g., date, invoice number, amount, vendor number, and vendor name) should preferably be fixed-width and should preferably not be packed decimal.  If the file is comma delimited (.csv), the dollar amounts must not use commas as thousands separators, because each such comma is read as a field separator.  In comma delimited files, commas that are part of a field value (for example, a vendor name) sometimes cause the same problem.

2.      If the user can choose from various number formats it is preferred that negative signs be shown to the immediate left of any number, e.g., -2.204.

3.      The file should be on an invoice-by-invoice basis.  A dollar amount is required.  Other details to include are vendor number, invoice number, invoice date (optional), and vendor name (not too important).

4.      Prepare a separate data set with only the dollar amounts of the invoices in a single column in case your delimiting process does not work with DATAS.

5.      A 133 MHz Pentium computer should be able to process up to 1 million records.  A 500 MHz Pentium computer with plenty of RAM should be able to process about 5 million records.  A mainframe computer is probably at its limit with 5 million records.

6.      The file could be on a Zip disk but reading the file will take a while.

7.      The file should include credit memos as negative numbers and zero invoices (if any).

8.      The hard drive should have free space of at least three times the size of the file, mainly for the data sorts.

9.      To save processing time include only the fields that will be a part of the analysis in the file, unless the file is small (say under 250,000 records).
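
Requirements 1, 2, 4, and 8 lend themselves to a small pre-processing step.  The Python sketch below is an illustration only and is not part of DATAS 2000.  The function names, the column name "amount", and the handling of trailing minus signs and parentheses are assumptions.  It also assumes that embedded commas are enclosed in quotes in the source file, and that the sorts use the same drive that holds the data file.

```python
import csv
import os
import shutil
from decimal import Decimal


def clean_amount(text):
    """Convert an amount string to a Decimal with a leading minus sign.

    Removes thousands separators ("1,234.50") and accepts trailing minus
    signs ("2204-") and accounting-style parentheses ("(2204)").
    """
    s = text.strip().replace(",", "")
    negative = False
    if s.startswith("(") and s.endswith(")"):
        negative, s = True, s[1:-1]
    elif s.endswith("-"):
        negative, s = True, s[:-1]
    elif s.startswith("-"):
        negative, s = True, s[1:]
    value = Decimal(s)
    return -value if negative else value


def write_amounts_column(src_csv, dst_path, amount_field="amount"):
    """Write one cleaned amount per line (requirement 4); return the row count."""
    count = 0
    with open(src_csv, newline="") as src, open(dst_path, "w") as dst:
        # csv.DictReader treats commas inside quoted fields (a quoted vendor
        # name or a quoted "1,234.50") as data, not as separators.
        for row in csv.DictReader(src):
            dst.write(f"{clean_amount(row[amount_field])}\n")
            count += 1
    return count


def has_room_for_sorts(data_path, factor=3):
    """Check requirement 8: free space of at least `factor` times the file size."""
    size = os.path.getsize(data_path)
    free = shutil.disk_usage(os.path.dirname(os.path.abspath(data_path))).free
    return free >= factor * size
```

Decimal is used rather than float so that dollar amounts are written back exactly as they were read, without binary rounding.  Credit memos and zero invoices pass through unchanged, in line with requirement 7.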

 


 

Mark J. Nigrini Ph.D.
606 Rockcrossing Lane, Allen, Texas 75002
Tel: (972) 359-0020  E-mail: mark_nigrini at msn dot com