Audits of Internet Logs

Internet logs tend to be voluminous and to contain a great deal of data that is of little or no audit interest. Further, because each IP address is stored in IPv4 "dot" notation, it can be challenging to determine the origin (country) of the transaction source. For web log analysis, determining the country of origin first requires downloading current data from the five Regional Internet Registries (RIRs): LACNIC, ARIN, RIPE, AFRINIC and APNIC. These registries publish the currently assigned blocks of IP addresses and the country to which each block is assigned. This data must be extracted, then sorted and merged. The final result is a computer program that converts an IP address in dot notation (IPv4) to a standard two-character country code.
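The extract, sort, merge and lookup steps described above can be sketched in a few lines. This is only an illustration in Python, not the C++ program offered on this page. It assumes the registry files use the pipe-delimited layout registry|country|type|start|count|date|status, and the function names and the two sample records (which use documentation-only address ranges, not real assignments) are invented for the example.

```python
import bisect


def ipv4_to_int(address):
    """Convert dot notation such as '192.0.2.1' to a 32-bit integer."""
    parts = address.strip().split(".")
    if len(parts) != 4:
        raise ValueError("not an IPv4 address: %r" % address)
    value = 0
    for part in parts:
        octet = int(part)
        if not 0 <= octet <= 255:
            raise ValueError("octet out of range in %r" % address)
        value = (value << 8) | octet
    return value


def parse_rir_lines(lines):
    """Extract IPv4 records and return sorted (start, end, country) tuples.

    Assumed record layout: registry|cc|type|start|count|date|status.
    Header, summary and non-IPv4 lines are skipped.
    """
    ranges = []
    for line in lines:
        fields = line.strip().split("|")
        if len(fields) < 7 or fields[2] != "ipv4" or fields[1] in ("", "*"):
            continue
        start = ipv4_to_int(fields[3])
        count = int(fields[4])  # number of addresses in the block
        ranges.append((start, start + count - 1, fields[1].upper()))
    ranges.sort()  # merging several registries is just one combined sort
    return ranges


def country_for(address, ranges):
    """Binary-search the sorted ranges; return a country code or None."""
    ip = ipv4_to_int(address)
    starts = [r[0] for r in ranges]
    i = bisect.bisect_right(starts, ip) - 1
    if i >= 0 and ranges[i][0] <= ip <= ranges[i][1]:
        return ranges[i][2]
    return None


# Invented sample records (documentation address ranges, hypothetical countries).
SAMPLE = [
    "lacnic|BR|ipv4|198.51.100.0|256|20070427|assigned",
    "arin|US|ipv4|192.0.2.0|256|20070427|assigned",
]

ranges = parse_rir_lines(SAMPLE)
print(country_for("192.0.2.77", ranges))
print(country_for("10.0.0.1", ranges))
```

Storing each block as an integer start and end is what makes the sort and the binary search possible; comparing addresses as dot-notation text would order them incorrectly.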

Therefore, we are providing the set of tools below to assist auditors and analysts in their review of web log transactions. The set has two components: 1) "WebLog", which takes the raw data from an Internet log as input and formats it into the industry-standard TSV (tab-separated values) format, and 2) a program which converts IP addresses in IPv4 (dot notation) to their country code.
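To give a feel for the first step, here is a minimal sketch of turning one raw log line into a TSV row. It is written in Python and is not the WebLog program itself. It assumes the log is in the Apache Common Log Format; the choice of output columns, the names CLF, FIELDS and log_line_to_tsv, and the sample line are all invented for the illustration.

```python
import re

# Assumed input layout (Common Log Format):
# host ident user [timestamp] "request" status size
CLF = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)

# Columns kept in the TSV output (an arbitrary choice for this sketch).
FIELDS = ("host", "time", "request", "status", "size")


def log_line_to_tsv(line):
    """Return the selected fields joined by tabs, or None if the line does not parse."""
    match = CLF.match(line)
    if match is None:
        return None
    # Replace any embedded tabs so they cannot break the column layout.
    values = [match.group(name).replace("\t", " ") for name in FIELDS]
    return "\t".join(values)


sample = ('192.0.2.77 - - [27/Apr/2007:10:15:32 -0500] '
          '"GET /index.html HTTP/1.0" 200 2326')

print("\t".join(FIELDS))        # header row
print(log_line_to_tsv(sample))  # one data row
```

Lines that do not match are returned as None rather than dropped silently, so the reviewer can count and inspect rejected records.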

Also included is a collection of SQLite SQL commands which can be used to load the extracted data into a database, query it and perform further extractions.
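As an illustration of the load and query steps, the sketch below uses Python's built-in sqlite3 module with an in-memory database. It does not reproduce the SQL commands supplied with the tools. The table name weblog, its columns and the three sample rows are assumptions made up for the example.

```python
import sqlite3

# Invented sample rows: (host, country, path, status).
rows = [
    ("192.0.2.77", "US", "/index.html", 200),
    ("192.0.2.78", "US", "/missing", 404),
    ("198.51.100.1", "BR", "/index.html", 200),
]

conn = sqlite3.connect(":memory:")  # use a file name to keep the data
conn.execute(
    "CREATE TABLE weblog (host TEXT, country TEXT, path TEXT, status INTEGER)"
)
conn.executemany("INSERT INTO weblog VALUES (?, ?, ?, ?)", rows)
conn.commit()

# Query: number of requests per country, busiest first.
hits_by_country = conn.execute(
    "SELECT country, COUNT(*) FROM weblog "
    "GROUP BY country ORDER BY COUNT(*) DESC, country"
).fetchall()

# Further extraction: requests that ended in an error status.
errors = conn.execute(
    "SELECT host, path FROM weblog WHERE status >= 400"
).fetchall()

print(hits_by_country)
print(errors)
```

The same SELECT statements can be run from the sqlite3 command-line shell once the TSV data has been loaded into a table.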

Web Log Audit Information
Description                                                           Type    Last Date Changed

WebLog Extract program
  Web Log Analyzer source code and test data files in "setup" format  Setup   12-29-2006
  Web Log Analyzer source code and test data files in "zip" format    Zip     12-29-2006

RIR Data (As of 02-03-2007)
  Link to URL to download the data via FTP                            URL     12-01-2006
  LACNIC data (Latin America)                                         Data    04-27-2007
  AFRINIC data (Africa)                                               Data    04-27-2007
  APNIC data (Asia Pacific)                                           Data    04-27-2007
  ARIN data (North America)                                           Data    04-27-2007
  RIPE data (Europe)                                                  Data    04-27-2007

IP Address to Country Code Conversion
  Source code (C++), documentation and command files                  Setup   12-02-2006
  Source code (C++), documentation and command files                  Zip     12-02-2006
  Current lookup table (C++ code) (30,065 rows)                       Source  04-27-2007

Search Engine Statistics

All of the above software is open source, and I have contributed it to the public domain. Comments, suggestions and user experiences are welcome and can be sent to Mike.Blakley AT ezrstats.com.

Web Page last updated on 04-27-2007
© EZ-R Stats, LLC 2005-2007

Visit EZ_R Stats on the web at:

www.ezrstats.com