
Can TextPipe do this IIS web server logfile splitting job?

Posted: Mon Dec 08, 2003 9:27 am
by grav
I've got a slightly unusual web server logfile analysis job, and I wonder if TextPipe can help.

I want to keep only the logfile lines that represent a 'first visit' from some host or IP address. AIUI this is slightly non-trivial (one needs to specify parameters such as how much time must elapse before a request from the same IP address counts as a 'new visit', etc.)
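To make the 'first visit' rule concrete, here is a minimal sketch in Python (not TextPipe). It assumes the log is in the W3C extended format that IIS can write, with a `#Fields:` header naming `date`, `time` and `c-ip`, and that lines are in chronological order. The function name `first_visits` and the 30-minute default gap are illustrative choices, not anything specified in this thread.

```python
from datetime import datetime, timedelta


def first_visits(lines, gap=timedelta(minutes=30)):
    """Yield only the lines that start a new visit for their client IP.

    Assumes W3C extended format: a '#Fields:' header that includes
    'date', 'time' and 'c-ip', followed by space-separated records
    in chronological order.
    """
    fields = []
    last_seen = {}  # client IP -> timestamp of its most recent request
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#Fields:"):
            # Column names follow the directive, e.g. "#Fields: date time c-ip"
            fields = line.split()[1:]
            continue
        if not line or line.startswith("#") or not fields:
            continue  # skip other directives, blanks, and data before a header
        row = dict(zip(fields, line.split()))
        ip = row["c-ip"]
        stamp = datetime.strptime(row["date"] + " " + row["time"],
                                  "%Y-%m-%d %H:%M:%S")
        previous = last_seen.get(ip)
        # New visit: IP never seen, or silent for longer than the gap
        if previous is None or stamp - previous > gap:
            yield line
        last_seen[ip] = stamp
```

Note that the gap here is measured from the IP's most recent request (an inactivity timeout), not from the start of its visit; the other definition is just as defensible and would need only a small change.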

As regards why I can't just use some standard web log analysis software...

I then have to run these logfile lines through Geolyzer from Geobytes, which converts IP addresses to countries/locations (where possible - I know there are limitations to this mapping).

Finally I can import that lot into Excel and get some charts approximating the geographic distribution of visitors. (If I do this on the raw logfiles, I'm going to get useless results based on 'hits' - i.e. pages & images, rather than visitors.)

As a kludge, I could zap out request lines that aren't for .htm / .html / .asp - I'm sure Textpipe could do that, but doing so would skew the results if (say) Australian visitors looked at more pages than UK ones.

Posted: Tue Dec 09, 2003 1:10 pm
by DataMystic Support
If you can sort your logfile lines by IP address (easy if the IP address is at the start of each line), then you could sort and discard duplicates.

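You might have to reformat the IP address first to have leading zeros (for example, 192.168.1.5 becomes 192.168.001.005), so that every address has the same width and the lines sort consistently.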
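As a rough illustration of the sort-and-discard idea, here is a minimal Python sketch (done outside TextPipe). It assumes the IP address is the first space-separated token on each line; `pad_ip` and `first_line_per_ip` are hypothetical helper names. Without the padding, a plain text sort would place 10.0.0.2 before 9.0.0.1.

```python
from itertools import groupby


def pad_ip(ip):
    """Zero-pad each octet to three digits: '9.0.0.1' -> '009.000.000.001'."""
    return ".".join(octet.zfill(3) for octet in ip.split("."))


def line_key(line):
    """Sort key: the padded IP, assumed to be the first token on the line."""
    return pad_ip(line.split(" ", 1)[0])


def first_line_per_ip(lines):
    """Sort lines by padded IP and keep one line per address.

    sorted() is stable, so the line kept for each IP is the one that
    appeared earliest in the input.
    """
    ordered = sorted(lines, key=line_key)
    return [next(group) for _, group in groupby(ordered, key=line_key)]
```

This keeps exactly one line per IP address for the whole file, so a host that returns days later is still counted once; it does not apply any time-lapse rule for 'new visits'.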