Here is my setup, for what it's worth, in case it helps you decide on your own.
I import the access.log file into a MySQL database every night and then use Crystal Reports to report on web usage. I run Squid on a Linux box but MySQL on Windows 2k Server, so your setup may vary.
First I reformat the log file to my liking with a VB hack:
myLine = myLine.Replace(",", ";") 'Replace any commas so they cannot clash with the field separator later
myFields = myLine.Split(" ")
myDate = CDate("01-01-1970 00:00:00") 'Initialize date at the Unix epoch
'First field should be a Unix timestamp - turn it into a date by adding it as seconds to 1970
intDate = CInt(myFields(0))
myDate = myDate.AddSeconds(intDate)
myDate = myDate.AddHours(-5) 'Adjust for timezone (fixed offset)
myFields(0) = myDate.ToString("u")
'Put the line back together
newLine = Join(myFields, ",")
'Remove duplicate commas (left behind by runs of spaces in the original line)
Dim replaceLine = newLine.Replace(",,", ",")
While replaceLine <> newLine
    newLine = replaceLine
    replaceLine = newLine.Replace(",,", ",")
End While
'Split the domain and the page into separate fields
'Find the sixth comma, which sits just before the URL field
Dim lastComma As Integer = 1
Dim domainEnd As Integer
For i As Int16 = 1 To 6
    lastComma = newLine.IndexOf(",", lastComma + 1)
Next
domainEnd = newLine.IndexOf("/", lastComma)
'Skip the "//" after the scheme (Mid is 1-based, IndexOf is 0-based)
If domainEnd <> -1 AndAlso Mid(newLine, domainEnd + 2, 1) = "/" Then
    domainEnd = newLine.IndexOf("/", domainEnd + 2)
End If
'If the URL has no path, end the domain at the next comma instead
If domainEnd = -1 OrElse newLine.IndexOf(",", lastComma + 1) < domainEnd Then
    domainEnd = newLine.IndexOf(",", lastComma + 1)
End If
'Insert a comma between the domain and the page
newLine = newLine.Insert(domainEnd, ",")
'Use lastDateReached (stored in a separate file) to make sure we haven't already imported this line
If lastDateReached < myDate Then
    'Write out the line we have edited
    lastDateReached = myDate
End If
Then I import the new file into MySQL:
"C:\Program Files\MySQL\MySQL Server 4.1\bin\mysqlimport" --user=***** --fields-optionally-enclosed-by=""" --fields-terminated-by=, --lines-terminated-by="\r\n" ***database*** \\***server***\***share***\***squid***access.log
I'm looking for suggestions about what to use for log file analysis.
We need something that can be used to drill down into the particular
pages a particular user was accessing during a given stretch of time. A
web interface would be nice but is not a requirement.
Squidalyser looks like it's close. I can't see how to specify a
date/time range for the query though.
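
With the log imported into MySQL as described above, a date/time range for one user is an ordinary WHERE clause. The following is only a rough sketch, not the actual schema used in this setup. The table name `access` assumes the imported file is called access.log (mysqlimport names the table after the file, minus its extension). Every column name is made up for illustration, and the column order assumes Squid's native log format with the URL already split into domain and page.

```sql
-- Hypothetical table matching the reformatted log line (all names are assumptions)
CREATE TABLE access (
    logged_at    DATETIME NOT NULL,   -- converted Unix timestamp
    elapsed_ms   INT,                 -- request duration
    client       VARCHAR(64),         -- client address
    result_code  VARCHAR(64),         -- e.g. TCP_MISS/200
    size_bytes   INT,
    method       VARCHAR(16),
    domain       VARCHAR(255),        -- scheme and host, split out by the VB step
    page         TEXT,                -- path and query string
    ident        VARCHAR(64),         -- user name, or "-" when not available
    peer         VARCHAR(128),        -- hierarchy code and peer host
    content_type VARCHAR(128),
    INDEX idx_client_time (client, logged_at)
);

-- Pages one client requested during a given stretch of time
-- (the address and the times are placeholder values)
SELECT logged_at, domain, page
FROM access
WHERE client = '192.168.1.25'
  AND logged_at >= '2005-03-01 08:00:00'
  AND logged_at <  '2005-03-01 12:00:00'
ORDER BY logged_at;
```

The index on (client, logged_at) is optional, but it should keep this kind of per-user, per-period lookup fast as the table grows. If proxy authentication is in use, filtering on the ident column instead of the client address would select by user name.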
