Icecast Streaming Media Server Forum
Icecast is a Xiph Foundation Project
 

access.log question

 
mendocinotim



Joined: 16 Sep 2011
Posts: 3

Posted: Sun Sep 18, 2011 1:46 am    Post subject: access.log question

I am asking this because, after looking online, I have not found any analysis tools for Icecast/Nicecast that I can use with FileMaker, so I have decided to write my own.

When done, it will be a FileMaker runtime solution (a stand-alone application that does not require FileMaker) that extracts data from the Nicecast access log. Nicecast reports that the log format is yours and that I should ask you about it. I want to aggregate the pertinent information into a statistical analysis that I can use to track individual user IPs and display them on a Google map.
Though the user's GUI is not yet created (this is just a development skeleton right now), here is an example of what I am working on..

A short 3.5 minute movie..
http://issues.videvent.com/icecast/09_13_11_1/


I was wondering how to identify access.log values, as I am writing a script that will parse that information into a database.
I have been trying to figure this out over the last few days, and here's what I "think" is going on with the logs.
Let me use one log entry as an example.

NOTE: I am filtering all the log entries, to show only those that have the word "listen" in them - which seems to work for filtering out all the 127.0.0.1 entries - though based on your answers (above), I'm not so sure now that it's all that simple.

Here's the example, and my presumptions - please help me refine this, if you are able to do so.

184.36.97.2 - - [12/Sep/2011:10:17:14 -0700] "GET /listen HTTP/1.1" 200 5785 "-" "Apple Mac OS X v10.6.8 CoreMedia v1.0.0.10K549" 1

Here's how I "think" the individual columns are organized.. For clarity's sake, I will delimit these here with " -> "..

IP -> [unknown numeric value] -> [unknown numeric value] -> DATE/TIME -> The Connection Protocol -> [unknown text value] -> [player protocol] -> TIME CONNECTED

Or, displayed this way (so I can add comments)..

    IP
    [unknown numeric value]
    [unknown numeric value]
    DATE/TIME (when the listener disconnected from the stream)
    The Connection Protocol
    [unknown text value]
    PLAYER PROTOCOL (OR TYPE)
    TIME CONNECTED (in seconds)

I am using regular expressions to break these up into fields in my database, thus..

regEX = "^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) - - \[(\d{1,2}/[^/]+)/(\d{4}):(\d{2}:\d{2}:\d{2})[^\]]+\] \"([^ /]+) /([^\"]+)\" (\d{3}) (\d+) \"[^\"]+\" \"([^\"]+)\" (\d+)" ;

    The 1st set of parens = () = $1: will return the IP
    The 2nd set of parens = () = $2: will return the DAY/MONTH
    The 3rd set of parens = () = $3: will return the YEAR
    The 4th set of parens = () = $4: will return the HRS:MIN:SEC
    The 5th set of parens = () = $5: will return the PROTOCOL METHOD ("GET")
    The 6th set of parens = () = $6: will return the PROTOCOL ("listen HTTP/1.1")
    The 7th set of parens = () = $7: will return the HTTP RESPONSE CODE ("200")
    The 8th set of parens = () = $8: will return the BYTES DOWNLOADED
    The 9th set of parens = () = $9: will return the AVAILABLE PLAYER INFORMATION
    The 10th set of parens = () = $10: will return the CONNECTION TIME (expressed in seconds)

I will further parse $2 and $3 into a single date value, formatted as mm/dd/yyyy.
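
A rough Python sketch of the same parse, with capture groups laid out to follow the $1..$10 list above. The names `LINE_RE`, `MONTHS` and `parse_line` are made up for illustration, and the quoted "-" field before the player string is matched but not captured.

```python
import re

# Month abbreviations as they appear in the log's date field.
MONTHS = {name: i for i, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Groups 1..10 follow the $1..$10 list in the post.
LINE_RE = re.compile(
    r'^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}) - - '
    r'\[(\d{1,2}/[^/]+)/(\d{4}):(\d{2}:\d{2}:\d{2})[^\]]+\] '
    r'"([^ /]+) /([^"]+)" (\d{3}) (\d+) "[^"]+" "([^"]+)" (\d+)'
)

def parse_line(line):
    """Return the fields of one access.log line as a dict, or None if no match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    day, month = m.group(2).split("/")
    return {
        "ip": m.group(1),
        # $2 and $3 combined into mm/dd/yyyy
        "date": f"{MONTHS[month]:02d}/{int(day):02d}/{m.group(3)}",
        "time": m.group(4),
        "method": m.group(5),
        "request": m.group(6),
        "status": int(m.group(7)),
        "bytes": int(m.group(8)),
        "player": m.group(9),
        "seconds": int(m.group(10)),
    }

sample = ('184.36.97.2 - - [12/Sep/2011:10:17:14 -0700] "GET /listen HTTP/1.1" '
          '200 5785 "-" "Apple Mac OS X v10.6.8 CoreMedia v1.0.0.10K549" 1')
print(parse_line(sample))
```

Lines that do not fit the pattern (for example, entries without a dotted IPv4 address) come back as None, so they can be skipped or logged separately.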


Regarding the access.log..

Between $1 and $2 (above) there are the two dashes " - - ", which I was wondering about.

I assume that log columns are all space-char delimited, where numeric values and text values are combined, so that all text values are also contained within double quotes. And that the double dashes between $1 and $2 are both intended to hold numeric values. Yes?
What are these two columns for? Do you intend to use them, or are they already being used in the access.log? I cannot find any entries where these two are being used.

I also assume that single dashes represent null-data values, yes?

I was also wondering whether my assumption is correct that $2, $3 and $4 represent the moment when that person breaks off from the stream.

And whether my assumption about $8 is also correct.

And I am especially interested in $10, where I assume this is the period of time that a person was connected in that session, expressed in seconds.

I have noticed that sometimes that value is set to zero, and at other times it is set to a value of one; while still at other times the value is greater than one. I have concluded that the values greater than one are the period of time the person was connected in seconds. Assuming I am so far correct, is there any particular significance to the zero or one values I sometimes see here?

Am I missing anything else here - for example, must I account for 127.0.0.1:8000, or /listen.m3u somehow (differently)?
For example, in today's log, I found this one..

220.181.51.217 - - [13/Sep/2011:05:22:32 -0700] "GET /listen.m3u HTTP/1.0" 200 86 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)" 1

How should I interpret this? Is it any different from the ones that are listed as "/listen"?
I do not see the ":8000" mentioned in any of the logs either.

What should I look for to account for iDevice listeners?

What should I filter out? I do not care to account for other (non-listener-related) events, such as these:

-0700] "GET /admin/stats HTTP/1.1" 200 1543 "-" "nicecast-stats-loader" 1
127.0.0.1 - - [13/Sep/2011:04:38:29 -0700] "GET /admin/listclients HTTP/1.1" 200 165 "-" "nicecast-stats-loader" 0
127.0.0.1 - - [13/Sep/2011:04:38:29 -0700] "GET /admin/listmounts HTTP/1.1" 200 253 "-" "nicecast-stats-loader" 0
127.0.0.1 - - [13/Sep/2011:04:38:34 -0700] "GET /admin/stats HTTP/1.1" 200 1543 "-" "nicecast-stats-loader" 0


I assume this will take some time for you to respond to; but I would appreciate as much information as you can provide concerning the access.log; since this will give me what I require to create this analysis tool - which, I'm sure you can appreciate, would be a popular addition to the Icecast/Nicecast community.

Thanks for all your kind assistance.
I will be looking forward to your answers.
karlH
Code Warrior


Joined: 13 Jun 2005
Posts: 5476
Location: UK

Posted: Sun Sep 18, 2011 1:49 pm

The access log follows the common log format closely to maintain some compliance with existing tools, e.g. http://en.wikipedia.org/wiki/Common_Log_Format
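
As a rough illustration of that layout, here is a Python sketch that names the columns of one of the sample lines from this thread. The first seven names (host, ident, authuser, date, request, status, bytes) are the Common Log Format columns; the two quoted fields after them are assumed here to be the referer and user-agent of the widely used "combined" variant, and the trailing number is kept as a separate field. `CLF_RE` and the group names are chosen for this example only.

```python
import re

# Common Log Format columns, followed by the two quoted fields and the
# trailing number seen in the sample lines from this thread.
CLF_RE = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<authuser>\S+) '
    r'\[(?P<date>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)" (?P<trailing>\d+)'
)

line = ('220.181.51.217 - - [13/Sep/2011:05:22:32 -0700] '
        '"GET /listen.m3u HTTP/1.0" 200 86 "-" '
        '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)" 1')

fields = CLF_RE.match(line).groupdict()
for name, value in fields.items():
    print(f"{name:9} {value}")
```

In the Common Log Format the two dashes after the host are the ident and authuser columns, and a single "-" stands for a value that is not available.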

karl.
mendocinotim



Joined: 16 Sep 2011
Posts: 3

Posted: Sun Sep 18, 2011 3:40 pm

karlH wrote:
The access log follows the common log format closely to maintain some compliance with existing tools, e.g. http://en.wikipedia.org/wiki/Common_Log_Format

karl.


Yes - but there are some variations from the common log format too.

The most important, and still unanswered, question has to do with the final item, the one at the far right end of each line.
This is where I stated an assumption, which I would like confirmed, and also asked about the zero and one values sometimes found there.


Quote:
And, I am especially interested in $10 - where I assume this is the period of time that a person was connected in that session, expressed in seconds.
[please confirm (or correct) that first assumption]

(And) I have noticed that sometimes that value (on the right) is set to zero, while at other times it is set to a value of one; and still at other times the value is greater than one. I have concluded that the values greater than one are the period of time the person was connected in seconds. Assuming I am so far correct, is there any particular significance to the zero or one values I sometimes see here?


Thank you
karlH
Code Warrior


Joined: 13 Jun 2005
Posts: 5476
Location: UK

Posted: Sun Sep 18, 2011 5:41 pm

The last item is the duration, to within a second. Obviously 0 and 1 can appear, as it depends on the arrival and termination in relation to a change in seconds; in either case it was a short request, something like a 404 or an m3u.
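
A small Python sketch of why the same short request can be logged as either 0 or 1. It assumes, purely for illustration, that the duration is the difference between two whole-second clock readings; the thread does not show how the server actually computes it, and `logged_seconds` is a made-up name.

```python
import math

def logged_seconds(start, length):
    """Duration as the difference of two whole-second clock readings (assumed model)."""
    return math.floor(start + length) - math.floor(start)

# The same half-second request, arriving at two different moments:
print(logged_seconds(10.25, 0.5))  # 0: starts and ends within the same clock second
print(logged_seconds(10.75, 0.5))  # 1: the clock ticks over while the request is open
```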

karl.
mendocinotim



Joined: 16 Sep 2011
Posts: 3

Posted: Sun Sep 18, 2011 8:37 pm

karlH wrote:
The last item is the duration, to within a second. Obviously 0 and 1 can appear, as it depends on the arrival and termination in relation to a change in seconds; in either case it was a short request, something like a 404 or an m3u.

karl.


Thanks Karl - that's what I thought, but wanted to be sure it was not indicative of something special. Now I know to just omit the Zero and One valued records from the results.
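
A minimal Python sketch of that filter, assuming the duration is always the last space-separated item on the line. The function name, the `min_seconds` parameter and the third sample line (a 1800-second session) are invented for this example; the first two lines are taken from earlier in the thread.

```python
import re

# The duration is the last number on each access.log line.
TRAILING_SECONDS = re.compile(r' (\d+)\s*$')

def keep_long_sessions(lines, min_seconds=2):
    """Keep only lines whose trailing duration is at least min_seconds."""
    kept = []
    for line in lines:
        m = TRAILING_SECONDS.search(line)
        if m and int(m.group(1)) >= min_seconds:
            kept.append(line)
    return kept

lines = [
    '220.181.51.217 - - [13/Sep/2011:05:22:32 -0700] "GET /listen.m3u HTTP/1.0" '
    '200 86 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)" 1',
    '127.0.0.1 - - [13/Sep/2011:04:38:34 -0700] "GET /admin/stats HTTP/1.1" '
    '200 1543 "-" "nicecast-stats-loader" 0',
    # Hypothetical line, made up to show a session that passes the filter.
    '184.36.97.2 - - [12/Sep/2011:10:47:14 -0700] "GET /listen HTTP/1.1" '
    '200 28800000 "-" "Apple Mac OS X v10.6.8 CoreMedia v1.0.0.10K549" 1800',
]

for line in keep_long_sessions(lines):
    print(line)
```

One side effect of this cut-off is that a real listener who stayed for only a second is dropped along with the short m3u and 404 requests.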
All times are GMT