e3dward
Joined: 14 Jul 2014 Posts: 8
Posted: Mon Feb 16, 2015 5:38 pm Post subject: Can't Pass 40 Connections without Slow Listeners + Drops
We can't seem to get above 40-45 listeners on a 128 kbps stream, even though the server has a 1 Gbit connection to the campus network. The network IT folks confirm it doesn't look like a bandwidth problem, and CPU utilization averages around 25%. Max listeners is set to 300 in the config.
From searching this forum, I think I've started to track down some clues. Running netstat (netstat -tnp | grep icecast) consistently shows nonzero Send-Q values on listener connections.
About half the time I run the command, between a third and two-thirds of the connections have a Send-Q of 1400, 2800, 7000, or sometimes more. I'm not a network specialist, but it sure seems like something is misconfigured, either in our router (a Cisco SG300) or further up the chain in the campus network. I'm at a loss as to how to troubleshoot further, or what to document when asking IT for help.
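To put numbers on that, here is a minimal sketch that summarizes Send-Q values from netstat -tn style output. The column layout and the inlined SAMPLE text are assumptions for illustration; in practice you'd feed it real netstat output.

```python
# Minimal sketch: summarize Send-Q values from `netstat -tn`-style output.
# Assumed column layout: Proto Recv-Q Send-Q Local-Address Foreign-Address State
# SAMPLE below is hypothetical data; feed in real netstat output instead.
SAMPLE = """\
tcp        0      0 35.8.90.130:8000   10.0.0.5:51234   ESTABLISHED
tcp        0   2800 35.8.90.130:8000   10.0.0.6:51235   ESTABLISHED
tcp        0   7000 35.8.90.130:8000   10.0.0.7:51236   ESTABLISHED
"""

def sendq_summary(netstat_text):
    """Return (total conns, conns with nonzero Send-Q, worst Send-Q)."""
    sendqs = [int(f[2]) for f in
              (line.split() for line in netstat_text.splitlines())
              if len(f) >= 6 and f[0].startswith("tcp")]
    return len(sendqs), sum(1 for q in sendqs if q > 0), max(sendqs, default=0)

print(sendq_summary(SAMPLE))  # (3, 2, 7000)
```

Run it repeatedly (e.g. under watch): Send-Q growing on many sockets at once points at the path between server and clients rather than at Icecast itself.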
Help me Obi Wan, you're my only hope.
-ed
dm8tbr
Joined: 09 Feb 2013 Posts: 45 Location: icecast.org
Posted: Tue Feb 17, 2015 8:05 am
For completeness it would be good to know which exact version of Icecast you are running.
If you suspect a network problem, then I'd recommend doing some synthetic load testing, as opposed to relying on your current organic listeners.
We have done load testing of Icecast in the past. This includes the scripts we used and should be a good starting point.
Test both from your local network and the remote network, so that you can compare things.
In addition, you might consider doing simple iperf and/or HTTP-server-based bandwidth and traffic tests.
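In that spirit, here is a self-contained sketch of a synthetic load test: it spins up a throwaway local HTTP server standing in for Icecast (so the example runs anywhere), then points N concurrent "listeners" at it and reports per-client throughput. For a real test you'd set URL to your actual mount and raise CLIENTS; everything else here is invented for the example.

```python
# Sketch of a synthetic load test: N concurrent HTTP "listeners".
# The dummy server below stands in for Icecast so the example is
# self-contained; for a real test, point URL at your actual mount.
import http.server, threading, time, urllib.request

CHUNK = 16 * 1024      # bytes the dummy server sends per request
CLIENTS = 5            # number of concurrent synthetic listeners

class DummyStream(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", str(CHUNK))
        self.end_headers()
        self.wfile.write(b"\x00" * CHUNK)
    def log_message(self, *args):  # silence per-request logging
        pass

srv = http.server.ThreadingHTTPServer(("127.0.0.1", 0), DummyStream)
threading.Thread(target=srv.serve_forever, daemon=True).start()
URL = "http://127.0.0.1:%d/stream" % srv.server_address[1]

results = []  # (bytes received, seconds taken) per listener
def listener():
    t0 = time.time()
    data = urllib.request.urlopen(URL).read()
    results.append((len(data), time.time() - t0))

threads = [threading.Thread(target=listener) for _ in range(CLIENTS)]
for t in threads: t.start()
for t in threads: t.join()
srv.shutdown()

for nbytes, secs in results:
    print("%d bytes in %.3f s" % (nbytes, secs))
```

Running the same script from inside and outside the campus network, as suggested above, gives you comparable numbers from both vantage points.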
e3dward
Joined: 14 Jul 2014 Posts: 8
Posted: Tue Feb 17, 2015 5:12 pm
Thanks dm8tbr, good points! We're running Icecast 2.3.3-1ubuntu1 (https://launchpad.net/ubuntu/+source/icecast2/2.3.3-1ubuntu1) installed from a repo, on a Linux Mint 16 box, which is based on Ubuntu 13.10 (Saucy).
I've done some of the load testing you referred to in the past. Based on that, it looked like we could easily support 500+ listeners. This was tested on our internal network, then on the campus WAN, then from a remote location off-campus. From what I recall, when load testing from the off-campus connection the limitation seemed to be the bandwidth of the external connection (my home cable modem); the server's connection on campus had bandwidth to spare.
I've run bandwidth and speed tests from the server in the past, and the numbers line up with what we should be seeing. I just tried running iperf3, with this output:
Code:
[ 4] local 35.8.90.130 port 50662 connected to 173.230.156.66 port 5201
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 8.19 MBytes 68.7 Mbits/sec 0 1.13 MBytes
[ 4] 1.00-2.00 sec 8.75 MBytes 73.4 Mbits/sec 134 635 KBytes
[ 4] 2.00-3.00 sec 10.0 MBytes 83.9 Mbits/sec 0 676 KBytes
[ 4] 3.00-4.00 sec 10.0 MBytes 83.9 Mbits/sec 0 704 KBytes
[ 4] 4.00-5.00 sec 11.2 MBytes 94.4 Mbits/sec 0 718 KBytes
[ 4] 5.00-6.00 sec 10.0 MBytes 83.9 Mbits/sec 0 724 KBytes
[ 4] 6.00-7.00 sec 11.2 MBytes 94.4 Mbits/sec 0 725 KBytes
[ 4] 7.00-8.00 sec 10.0 MBytes 83.9 Mbits/sec 0 725 KBytes
[ 4] 8.00-9.00 sec 11.2 MBytes 94.4 Mbits/sec 0 727 KBytes
[ 4] 9.00-10.00 sec 7.50 MBytes 62.9 Mbits/sec 2 403 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 98.2 MBytes 82.4 Mbits/sec 136 sender
[ 4] 0.00-10.00 sec 95.8 MBytes 80.4 Mbits/sec receiver
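As a sanity check on those numbers: even the worst interval here is far more than 40-45 listeners at 128 kbps need. A quick back-of-the-envelope calculation (overhead ignored, so treat it as an upper bound):

```python
# How many 128 kbps listeners should the measured throughput support?
# (Ignores TCP/HTTP overhead, so treat the result as an upper bound.)
stream_kbps = 128
measured_mbps = 80.4   # iperf3 receiver-side average from the run above
print(round(measured_mbps * 1000 / stream_kbps))  # 628
```

So raw bandwidth alone would allow hundreds of listeners, consistent with the earlier load tests; the 40-45 cap has to come from somewhere else.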
The only other obvious error I could find (after running through whatever suggested troubleshooting I could dig up) relates to tx_mac_errors on the interface:
Code:
$ ifconfig
eth0 Link encap:Ethernet HWaddr ------------
inet addr:35.8.90.130 Bcast:35.8.255.255 Mask:255.255.0.0
inet6 addr: fe80::9eb6:54ff:fe04:593c/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:355838436 errors:0 dropped:587438 overruns:0 frame:0
TX packets:502220243 errors:108 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:25475389297 (25.4 GB) TX bytes:575717134200 (575.7 GB)
Interrupt:18
$ ethtool -S eth0
NIC statistics:
tx_octets: 575864572758
tx_collisions: 0
tx_xon_sent: 0
tx_xoff_sent: 0
tx_flow_control: 0
tx_mac_errors: 108
tx_single_collisions: 0
tx_mult_collisions: 0
tx_deferred: 0
tx_excessive_collisions: 0
tx_late_collisions: 0
tx_collide_2times: 0
tx_collide_3times: 0
tx_collide_4times: 0
tx_collide_5times: 0
tx_collide_6times: 0
tx_collide_7times: 0
tx_collide_8times: 0
tx_collide_9times: 0
tx_collide_10times: 0
tx_collide_11times: 0
tx_collide_12times: 0
tx_collide_13times: 0
tx_collide_14times: 0
tx_collide_15times: 0
tx_ucast_packets: 502336624
tx_mcast_packets: 3335
tx_bcast_packets: 7995
tx_carrier_sense_errors: 0
tx_discards: 0
tx_errors: 0
dma_writeq_full: 0
dma_write_prioq_full: 0
rxbds_empty: 0
rx_discards: 0
rx_errors: 0
rx_threshold_hit: 0
dma_readq_full: 0
dma_read_prioq_full: 0
tx_comp_queue_full: 0
ring_set_send_prod_index: 0
ring_status_update: 0
nic_irqs: 0
nic_avoided_irqs: 0
nic_tx_threshold_hit: 0
mbuf_lwm_thresh_hit: 0
Any other troubleshooting or advice would be welcome. I've been after this for months and months with no real leads other than the Send-Q from netstat.
dm8tbr
Joined: 09 Feb 2013 Posts: 45 Location: icecast.org
Posted: Tue Feb 17, 2015 7:17 pm
Might be worth trying with a different/additional network card then.
If the network card has a problem and loses packets, that would at least explain the stalling connections.
Or try a different machine. The good thing is that Icecast will run on pretty much anything and will saturate the network link long before you see meaningful system load.
It might also help to get packet dumps from when you start seeing those stalling connections. Retransmissions and other anomalies might help you understand what's going on. If you want to go advanced, you could dump the traffic in parallel on the Linux machine running Icecast and externally, and compare the two.
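Short of full packet captures, per-connection retransmit counters are a cheap proxy for loss. A sketch that tallies them from ss -ti style output; note the field format varies between iproute2 versions, and the SAMPLE text here is invented:

```python
# Sketch: sum cumulative retransmit counts from `ss -ti`-style output,
# as a quick packet-loss proxy before resorting to full captures.
# The "retrans:X/Y" field format varies by iproute2 version; the
# SAMPLE text is hypothetical.
import re

SAMPLE = """\
ESTAB 0 2800 35.8.90.130:8000 10.0.0.6:51235
\t cubic rto:240 rtt:38/4 retrans:0/134 cwnd:10
ESTAB 0 0 35.8.90.130:8000 10.0.0.5:51234
\t cubic rto:210 rtt:12/2 cwnd:44
"""

def total_retransmits(ss_text):
    # Y in "retrans:X/Y" is the cumulative retransmit count.
    return sum(int(m.group(1))
               for m in re.finditer(r"retrans:\d+/(\d+)", ss_text))

print(total_retransmits(SAMPLE))  # 134
```

A count climbing steadily on listener sockets, while iperf to the same segment is clean, would point at something path-specific to the streaming traffic.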
dm8tbr
Joined: 09 Feb 2013 Posts: 45 Location: icecast.org
Posted: Tue Feb 17, 2015 7:57 pm
From a different angle: could you see if there is any correlation between how long a client has been connected and how much it's lagging?
Does the Icecast error.log list a lot of instances of clients that were disconnected by the server for "falling too far behind"?
In general, it might be worth feeding Icecast from a different machine with a known-good sound card, as a test. If that changes the behavior, you might have a clock-offset problem in the original sound card.
I've tried listening to your stream, but didn't see any suspicious drift in either direction. It would show up as a slowly increasing or decreasing buffer fill level.
e3dward
Joined: 14 Jul 2014 Posts: 8
Posted: Tue Feb 17, 2015 8:35 pm
There is definitely a decently high number of disconnections for "falling too far behind": approximately 800 in the past week, it looks like. We've tried different hardware over the years, and I think it's been a consistent issue. The only constant is the same USB sound card, a Lexicon interface (I want to say it's a Lambda?). Is it possible that is somehow related to the problem?
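Those disconnects can be bucketed by day (or hour) straight from error.log, to look for patterns against time of day or listener load. A sketch follows; the exact log wording and timestamp layout are assumptions here, so adjust the marker and the slice to match your actual log:

```python
# Sketch: count "too far behind" disconnects per day from an Icecast
# error.log. The log wording and timestamp layout in SAMPLE_LOG are
# assumptions -- adjust `marker` and the slice to your actual log.
from collections import Counter

SAMPLE_LOG = """\
[2015-02-15  14:02:11] INFO source listener has fallen too far behind, removing
[2015-02-15  19:45:03] INFO source listener has fallen too far behind, removing
[2015-02-16  09:12:40] INFO source listener has fallen too far behind, removing
"""

def drops_per_day(log_text, marker="too far behind"):
    days = Counter()
    for line in log_text.splitlines():
        if marker in line:
            days[line[1:11]] += 1   # the date inside the leading [...]
    return dict(days)

print(drops_per_day(SAMPLE_LOG))  # {'2015-02-15': 2, '2015-02-16': 1}
```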
I'll see if I can correlate the slow listeners to the length of time connected.
I don't know anything about packet capturing, but I'll see if I can figure something out on that front as well.
Thanks again for the help (and for listening to the stream!)
e3dward
Joined: 14 Jul 2014 Posts: 8
Posted: Tue Feb 17, 2015 11:39 pm
Ugh, another oddity, which has been rare but just happened again today: sometimes the stream will appear to play very slowly, like the music has been slowed down 10-20% or so (ballparking). I thought it might be a sampling-rate mismatch between source and stream, but it's not clear why that would happen suddenly and only on rare occasions. I include it in case it's useful.
dm8tbr
Joined: 09 Feb 2013 Posts: 45 Location: icecast.org
Posted: Wed Feb 18, 2015 9:34 am
I said it implicitly, but to spell it out: also try a different sound card.
TBH: We (the Icecast team) don't see anything obvious here and it's hard to diagnose.
I'd also consider long-running synthetic tests against a local network endpoint and, in parallel, against an endpoint on the other side of the campus network.
Please note that just using curl won't reveal a timing problem at stream generation. For that you'd need to do something more involved.
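One "more involved" approach: record elapsed wall-clock time and total bytes received from the stream, and compare against what the nominal bitrate predicts. A source clock running slow shows up as a steadily growing deficit. A sketch, with the example numbers invented for illustration:

```python
# Sketch: detect a slow source clock by comparing bytes received
# against what the nominal bitrate predicts over a long run.
def stream_deficit_seconds(elapsed_s, bytes_received, nominal_kbps):
    """Seconds of audio the stream is short after elapsed_s of wall time."""
    bytes_per_s = nominal_kbps * 1000 / 8
    return (elapsed_s * bytes_per_s - bytes_received) / bytes_per_s

# Invented example: after one hour of a 128 kbps stream, having
# received 57,400,000 bytes instead of the expected 57,600,000:
print(stream_deficit_seconds(3600, 57_400_000, 128))  # 12.5
```

A deficit that grows linearly with connection time matches both the "falling too far behind" disconnects and the slowed-down playback reports.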
e3dward
Joined: 14 Jul 2014 Posts: 8
Posted: Mon Feb 23, 2015 3:44 pm
Thanks again, dm8tbr. I'm working on another sound card solution without having to buy one. It makes sense that it might be something related to the sound card we've been using, as it's the only constant. I'll let you know what I come up with.
Thanks!
e3dward
Joined: 14 Jul 2014 Posts: 8
Posted: Wed Feb 25, 2015 9:06 pm
Hi All,
So I've set up a second streaming server running in parallel, with a different sound card (also running Ubuntu 14.04.1) and a different encoder (butt - heh). A couple of immediate observations: I've got both streams running in my browser in two tabs, and whereas there was initially a small delay between the two (probably on the order of 50-100 milliseconds), after about 2 hours there is now a much larger gap. And the original stream is definitely the laggard.
At the same time, someone noticed that one of the office computers running the stream was significantly behind. I timed it and the stream (which had been connected for 6 hours, 56 minutes, and 30 seconds) was approximately 1 minute, 8 seconds behind the actual broadcast.
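For what it's worth, those two numbers pin down the source's effective clock error:

```python
# Clock-skew estimate from the observation above: 1 min 8 s behind
# after 6 h 56 min 30 s connected.
elapsed = 6 * 3600 + 56 * 60 + 30   # 24990 s connected
lag = 68                            # seconds behind the live broadcast
print(round(lag / elapsed * 1e6))   # 2721 ppm, i.e. ~0.27% slow
```

If typical crystal tolerances (roughly tens of ppm) hold, an effective error of ~2700 ppm would suggest the lag also involves buffering or dropped data somewhere, not pure sample-clock drift alone.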
I've also set up our main method of connection (from our website) to alternate between the existing stream and the one I set up today, so in a day or so it should tell me whether the 'new' stream retains more listeners.
Thanks again for the assistance thus far!
dm8tbr
Joined: 09 Feb 2013 Posts: 45 Location: icecast.org
Posted: Sat Feb 28, 2015 11:32 am
That sounds quite suspicious, yes.
Really stable sound cards are hard to come by.
I'm still looking for volunteers who'd help me write some software to do proper sound card characterization, and for my other long-term project, a "hardware stream source".
_________________
I maintain the Icecast project.
e3dward
Joined: 14 Jul 2014 Posts: 8
Posted: Thu Mar 05, 2015 11:22 pm
Just as a point of reference: it seems like the problem is on the hardware or network side, and we've tried some different hardware with similar results. At this point we're escalating the issue with the campus network folks to have them troubleshoot the packet flows with us. When we track down the issue, I'll post another update here if it's relevant or useful.
Thanks again for the support, and for the continued development of a great piece of software.
e3dward
Joined: 14 Jul 2014 Posts: 8
Posted: Thu Jul 23, 2015 4:55 pm Post subject: Update: July 2015
Update as of July 2015: we've done a couple of things to keep working on this, and have seen our listener peak increase slightly. First, we ended up investing in an encoding box from Telos, hoping that offloading the encoding process from the streaming machine would help. Hard to say for sure whether it did.
We've been working mainly on the network aspect, on both the configuration and hardware sides; this is actually in process right now. Network operations did a bit of analysis on packet captures, trying to find anything in the packets that points to problems, but they didn't catch anything obvious. Ultimately, we had some Cisco-certified folks from network operations come in and clean up our config, which seemed to help a little. We're running on a Cisco SG300 SOHO switch right now.
Next, we're going to move our gigabit feed from the building comm closet on our floor to the main switch that connects to the campus fiber interconnect. We're also adding a second gigabit line from that switch, and we'll be upgrading to a switch managed by network ops.
I'll post another update here afterwards on the effects of the network upgrades, mostly in case it's useful to someone in the future.
VIVA LA ICECAST!