Icecast Streaming Media Server Forum :: Icecast Server
AWS cloudfront and socket timeout

 
rbalhorn



Joined: 30 Dec 2014
Posts: 4

Posted: Fri Jan 02, 2015 3:30 pm    Post subject: AWS cloudfront and socket timeout

I have two questions.

I am using Amazon EC2 for our Icecast server and CloudFront for edge distribution. My question is: how are people gathering full access statistics? The access.log shows only the CloudFront accesses and not client accesses, yet the error.log shows an accurate (I believe) concurrent listener count, which means client interaction with the server is getting through. Right now I am looking at using the CloudFront access logs for stats, but it's a pain.

The next question is that about once every 2-3 weeks I get a brief socket timeout. I increased the timeout to 60, which helps a bit, but I still get socket timeouts every now and then. Once it times out, it usually only takes 15-30 seconds to reconnect, but it drops all of our listeners, which the powers that be do not like. I created a backup server with relays, but it seems the socket timeout affects both servers since they are in the same datacenter. Would a relay even help with socket timeouts? Is there anything I can do to troubleshoot what is causing these?
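
For reference, the timeout lives in the <limits> block of icecast.xml; roughly like this (a sketch only: I'm assuming the source timeout is the one that matters for these drops, and the other values shown are just stock defaults, not necessarily what we run):
Code:
<limits>
    <!-- seconds Icecast waits on a silent source socket before dropping it;
         raised from the low default to 60 -->
    <source-timeout>60</source-timeout>
    <!-- stock values, shown only for context -->
    <header-timeout>15</header-timeout>
    <queue-size>524288</queue-size>
</limits>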

Thank you for any help!
dm8tbr



Joined: 09 Feb 2013
Posts: 45
Location: icecast.org

Posted: Fri Jan 02, 2015 10:53 pm    Post subject: Re: AWS cloudfront and socket timeout

rbalhorn wrote:
I am using Amazon EC2 for our Icecast server and CloudFront for edge distribution.

I'm not that familiar with EC2, but wouldn't just running relays in a few AZs achieve pretty much the same without adding the failure modes of a reverse proxy?
We know of some rather weird brokenness in e.g. EC2 LB when used with Icecast.
rbalhorn wrote:
My question is: how are people gathering full access statistics? The access.log shows only the CloudFront accesses and not client accesses, yet the error.log shows an accurate (I believe) concurrent listener count, which means client interaction with the server is getting through. Right now I am looking at using the CloudFront access logs for stats, but it's a pain.

Does that send some sort of http header that would contain the original IP address? Most reverse proxies do.
We have a ticket for this, but it looks like it didn't make the cut in prioritization so far.
We've recently seen additional features implemented through sponsoring one of the core developers, since the work then doesn't have to happen in their limited free time. Contact me directly if you're interested; I can forward this.
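
In the meantime, if you want to see exactly what CloudFront hands to the origin, a tiny throwaway origin will show it; this is just the Python stdlib, nothing Icecast-specific, and the port is arbitrary:
Code:
# Minimal origin that dumps whatever request headers the proxy forwards.
from http.server import BaseHTTPRequestHandler, HTTPServer

class DumpHeaders(BaseHTTPRequestHandler):
    def do_GET(self):
        print(self.requestline)
        for name, value in self.headers.items():
            print("%s: %s" % (name, value))
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok\n")

HTTPServer(("", 8000), DumpHeaders).serve_forever()

Point a test distribution at it and request any path once; every forwarded header, including any X-Forwarded-For, ends up on stdout.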
rbalhorn wrote:
The next question is that about once every 2-3 weeks, I get a brief socket timeout. I increased the timeout to 60 which helps a bit, but I still get socket timeouts every now and then. Once it times out, it usually only takes 15-30 seconds to reconnect, but it drops all of our listeners, which the powers that be do not like. I created a backup server with relays but it seems the socket timeout affects both servers since they are in the same datacenter. Would a relay even help with socket time outs? Is there anything I can do to trouble shoot what is causing these?

I'm guessing the timeout happens between the source client and the Icecast server? This might just be timing runaway due to inaccuracies in the timing of the sample rate vs. real time. The error.log might have some better details.
Have you tried configuring a fallback (even to file) for your stream? If it's source←→Icecast timeout, then a fallback would keep listeners connected.
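
A fallback is only a few lines per mount in icecast.xml; a sketch with placeholder names below, and IIRC a file fallback is looked up under the webroot and has to match the format/bitrate of the live stream:
Code:
<mount>
    <mount-name>/live</mount-name>
    <!-- listeners get moved here when the source drops -->
    <fallback-mount>/fallback.mp3</fallback-mount>
    <!-- and moved back once the live source reconnects -->
    <fallback-override>1</fallback-override>
</mount>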
rbalhorn



Joined: 30 Dec 2014
Posts: 4

Posted: Fri Jan 02, 2015 11:22 pm    Post subject: Re: AWS cloudfront and socket timeout

dm8tbr wrote:
rbalhorn wrote:
I am using Amazon EC2 for our Icecast server and CloudFront for edge distribution.

I'm not that familiar with EC2, but wouldn't just running relays in a few AZs achieve pretty much the same without adding the failure modes of a reverse proxy?
We know of some rather weird brokenness in e.g. EC2 LB when used with Icecast.


I see what you are saying; we just have a worldwide distribution model, so edge servers were the obvious choice, although, as noted, there are obvious drawbacks. CloudFront was set up not just for load balancing but for edge caching.

dm8tbr wrote:

rbalhorn wrote:
My question is: how are people gathering full access statistics? The access.log shows only the CloudFront accesses and not client accesses, yet the error.log shows an accurate (I believe) concurrent listener count, which means client interaction with the server is getting through. Right now I am looking at using the CloudFront access logs for stats, but it's a pain.

Does that send some sort of http header that would contain the original IP address? Most reverse proxies do.
We have a ticket for this, but it looks like it didn't make the cut in prioritization so far.
We've recently seen additional features implemented through sponsoring one of the core developers, since the work then doesn't have to happen in their limited free time. Contact me directly if you're interested; I can forward this.


The logs look like this:
Code:
205.251.221.187 - - [30/Dec/2014:23:21:35 +0000] "GET /new_music HTTP/1.1" 200 2700038 "-" "Amazon CloudFront" 261
54.239.140.54 - - [30/Dec/2014:23:21:37 +0000] "GET /spanish HTTP/1.1" 200 102508 "-" "Amazon CloudFront" 1
54.240.145.153 - - [30/Dec/2014:23:21:44 +0000] "GET /new_music HTTP/1.1" 200 25460336 "-" "Amazon CloudFront" 2117
54.239.135.132 - - [30/Dec/2014:23:21:47 +0000] "GET /new_music HTTP/1.1" 200 25186310 "-" "Amazon CloudFront" 2094
54.239.140.54 - - [30/Dec/2014:23:21:49 +0000] "GET /new_music HTTP/1.1" 200 63542 "-" "Amazon CloudFront" 0


The IPs are from the CloudFront edge server, not the actual client IP. The client details are not included, only that the request came from Amazon CloudFront. We first noticed a problem because there were far fewer accesses than there should be when compared to the max concurrent listeners in the error.log. The AWS CloudFront logs do contain the client IPs and an accurate count and bytes transferred; however, there is no way that I know of to pass that through, yet somehow it is getting to the error.log, since those counts are accurate. If there is a developer that can provide accurate logs, we would be happy to sponsor them, but I have to get approval.
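
For now I am pulling rough numbers out of the CloudFront logs with a small script along these lines (the column names are read from the #Fields header in each gzipped log file, and the local directory is just wherever the logs get synced to):
Code:
# Rough per-mount tallies from CloudFront access logs (gzipped, tab-separated,
# with "#Version:" and "#Fields:" header lines in every file).
import glob
import gzip
from collections import Counter, defaultdict

requests = Counter()
bytes_sent = defaultdict(int)

for path in glob.glob("cf-logs/*.gz"):
    with gzip.open(path, "rt") as fh:
        fields = []
        for line in fh:
            if line.startswith("#Fields:"):
                fields = line.split()[1:]      # column names for this file
                continue
            if line.startswith("#") or not fields:
                continue
            row = dict(zip(fields, line.rstrip("\n").split("\t")))
            mount = row.get("cs-uri-stem", "?")
            requests[mount] += 1
            sent = row.get("sc-bytes", "0")
            bytes_sent[mount] += int(sent) if sent.isdigit() else 0

for mount, count in requests.most_common():
    print("%s: %d requests, %d bytes" % (mount, count, bytes_sent[mount]))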

dm8tbr wrote:

rbalhorn wrote:
The next question is that about once every 2-3 weeks I get a brief socket timeout. I increased the timeout to 60, which helps a bit, but I still get socket timeouts every now and then. Once it times out, it usually only takes 15-30 seconds to reconnect, but it drops all of our listeners, which the powers that be do not like. I created a backup server with relays, but it seems the socket timeout affects both servers since they are in the same datacenter. Would a relay even help with socket timeouts? Is there anything I can do to troubleshoot what is causing these?

I'm guessing the timeout happens between the source client and the Icecast server? This might just be timing runaway due to inaccuracies in the timing of the sample rate vs. real time. The error.log might have some better details.
Have you tried configuring a fallback (even to file) for your stream? If it's source←→Icecast timeout, then a fallback would keep listeners connected.


So the encoders are not dropping. My gut tells me it's our company firewall occasionally getting overzealous, but our IT swears it's not. However, having the backup server in the same AWS data center makes it hard to troubleshoot. All the error.log tells me is that there is a socket timeout, exiting, a bunch of listeners drop to 0, and then a reconnect. I'll have to track down an instance, but again, I didn't notice much. There is the occasional "listener fell behind" but nothing out of the ordinary.

I am not sure what you mean about the sample rate vs. real time; could you explain?

The fallback file is a great idea, though. I will try that!

Again, thanks for your help.
dm8tbr



Joined: 09 Feb 2013
Posts: 45
Location: icecast.org

Posted: Fri Jan 02, 2015 11:57 pm    Post subject: Re: AWS cloudfront and socket timeout

rbalhorn wrote:
dm8tbr wrote:
rbalhorn wrote:
I am using Amazon EC2 for our Icecast server and CloudFront for edge distribution.

I'm not that familiar with EC2, but wouldn't just running relays in a few AZs achieve pretty much the same without adding the failure modes of a reverse proxy?
We know of some rather weird brokenness in e.g. EC2 LB when used with Icecast.


I see what you are saying; we just have a worldwide distribution model, so edge servers were the obvious choice, although, as noted, there are obvious drawbacks. CloudFront was set up not just for load balancing but for edge caching.

In the case of Icecast there is of course nothing to cache, though, as each listener starts at a different point in the stream.
I'm not saying you shouldn't use it, just that from my limited view of your setup it doesn't seem to add much and mainly causes issues.
rbalhorn wrote:

dm8tbr wrote:

rbalhorn wrote:
My question is: how are people gathering full access statistics? The access.log shows only the CloudFront accesses and not client accesses, yet the error.log shows an accurate (I believe) concurrent listener count, which means client interaction with the server is getting through. Right now I am looking at using the CloudFront access logs for stats, but it's a pain.

Does that send some sort of http header that would contain the original IP address? Most reverse proxies do.
We have a ticket for this, but it looks like it didn't make the cut in prioritization so far.
We've recently seen additional features implemented through sponsoring one of the core developers, since the work then doesn't have to happen in their limited free time. Contact me directly if you're interested; I can forward this.


The logs look like this:
Code:
205.251.221.187 - - [30/Dec/2014:23:21:35 +0000] "GET /new_music HTTP/1.1" 200 2700038 "-" "Amazon CloudFront" 261
54.239.140.54 - - [30/Dec/2014:23:21:37 +0000] "GET /spanish HTTP/1.1" 200 102508 "-" "Amazon CloudFront" 1
54.240.145.153 - - [30/Dec/2014:23:21:44 +0000] "GET /new_music HTTP/1.1" 200 25460336 "-" "Amazon CloudFront" 2117
54.239.135.132 - - [30/Dec/2014:23:21:47 +0000] "GET /new_music HTTP/1.1" 200 25186310 "-" "Amazon CloudFront" 2094
54.239.140.54 - - [30/Dec/2014:23:21:49 +0000] "GET /new_music HTTP/1.1" 200 63542 "-" "Amazon CloudFront" 0


The IPs are from the CloudFront edge server, not the actual client IP. The client details are not included, only that the request came from Amazon CloudFront.

Yes, I understood that. I was asking if you knew how they forward the real IP. I now went to the AWS documentation and found it:
http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/RequestAndResponseBehaviorS3Origin.html#RequestS3IPAddresses
This means the ticket I referenced applies; a proper access.log would be possible.
rbalhorn wrote:
We first noticed a problem because there were far fewer accesses than there should be when compared to the max concurrent listeners in the error.log.
This is an alarm bell right there. I suspect that this is the same brokenness we've seen before with EC2. A check of netstat on the Icecast machine would likely reveal a pile of CLOSE_WAIT connections - cf. http://lists.xiph.org/pipermail/icecast/2014-October/012979.html and the following posts in that thread.
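A quick check on the Icecast box:
Code:
netstat -tn | grep CLOSE_WAIT | wc -l

If that count keeps climbing while the listener numbers stay flat, it is most likely the same problem as in that thread.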
rbalhorn wrote:
The AWS CloudFront logs do contain the client IPs and an accurate count and bytes transferred; however, there is no way that I know of to pass that through, yet somehow it is getting to the error.log, since those counts are accurate.

I think this needs some further sanity checks to compare logs, but first I'd rule out the CLOSE_WAIT problem.
rbalhorn wrote:
If there is a developer that can provide accurate logs, we would be happy to sponsor them, but I have to get approval.

Please send a mail to webmaster@xiph.org (or my private mail). I'll arrange the contact.
rbalhorn wrote:

dm8tbr wrote:

rbalhorn wrote:
The next question is that about once every 2-3 weeks I get a brief socket timeout. I increased the timeout to 60, which helps a bit, but I still get socket timeouts every now and then. Once it times out, it usually only takes 15-30 seconds to reconnect, but it drops all of our listeners, which the powers that be do not like. I created a backup server with relays, but it seems the socket timeout affects both servers since they are in the same datacenter. Would a relay even help with socket timeouts? Is there anything I can do to troubleshoot what is causing these?

I'm guessing the timeout happens between the source client and the Icecast server? This might just be timing runaway due to inaccuracies in the timing of the sample rate vs. real time. The error.log might have some better details.
Have you tried configuring a fallback (even to file) for your stream? If it's source←→Icecast timeout, then a fallback would keep listeners connected.


So the encoders are not dropping. My gut tells me it's our company firewall occasionally getting overzealous, but our IT swears it's not. However, having the backup server in the same AWS data center makes it hard to troubleshoot.

Sounds like it might be a network issue. Doing packet dumps on the connection in several places (inside your corporate network, on the Icecast side) might help, but it would be tricky to set up.
rbalhorn wrote:
All the error.log tells me is that there is a socket timeout, exiting, a bunch of listeners drop to 0, and then a reconnect. I'll have to track down an instance, but again, I didn't notice much. There is the occasional "listener fell behind" but nothing out of the ordinary.

I am not sure what you mean about the sample rate vs. real time; could you explain?

So let's say your stream carries audio encoded at 44.1k samples per second. If your encoder's reference clock isn't perfectly stable, or is slightly skewed, then where you should have sent 44100 samples in one second it might actually have been 44090 or 44110. But now that I think of it, that shouldn't affect the source connection; it would actually cause problems for /really/ long-running listeners. We've seen bad sound cards that were >100 Hz off, and there it's quite pronounced and can starve the listener software in under 60 minutes, causing a disconnect.
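To put rough numbers on it (all values below are assumed, just to show the scale):
Code:
# Back-of-the-envelope: how long until a listener's buffer runs dry when the
# encoder's clock is slow.
nominal_rate = 44100      # samples/s the stream is declared as
actual_rate = 44000       # samples/s a ~100 Hz-slow sound card really delivers
buffer_seconds = 8.0      # audio the listening client has buffered ahead

deficit = (nominal_rate - actual_rate) / float(nominal_rate)  # audio lost per real second
print("buffer underrun after ~%.0f minutes" % (buffer_seconds / deficit / 60))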
So yeah, bets are on "something between Icecast and your encoders".
That's what we introduced fallbacks for.