Tidal dropouts - probable cause

Posted by: Johnell on 14 September 2017

Apologies for starting yet another thread on this but I cannot find an open thread to post my findings on.

Tidal has been my main musical source for almost a year and it had been working perfectly well until the drop-outs started about a month ago.  I tried all the obvious things like swapping ports on the hub before a search on here showed me just how common the problem is.  Having just upgraded my Virgin 200 Mbps broadband to their 300 Mbps service I was fairly certain it wasn't bandwidth related, which was confirmed by Simon's posts stating that the problem is caused by latency.  Then the penny dropped - my problem started at the same time that Virgin changed my Superhub 2 to a Superhub 3, which was required for the 300 Mbps connection.

I spoke to Virgin and both their 1st and 2nd line support denied all knowledge of any problems with the Superhub 3.  However, I have found something on Google which explains things.  It appears that some modem manufacturers have stopped using a Broadcom chipset in their cable modems and replaced it with the Intel Puma 6, and any cable modem that uses the Intel chipset could have the problem.  There's actually an ongoing lawsuit over this and the following extract comes from a website which explains the cause:

According to the lawsuits, the root of the problem is the cable modem manufacturers' decision to swap out the Broadcom chipset in their modems with the Puma 6 chipset from Intel Corporation. Arris told online technology websites that the problem stems from Intel’s Puma 6’s chipset, which causes cable modems to suffer from significant jitter and latency on their network connections. Reports on multiple websites and forums indicate that these cable modems suffer from "latency jitter so bad it ruins online gaming and other real-time connections." Intel has also confirmed the defect, stating that the company is "aware of an issue with the Puma 6 system-on-chip software that impacts latency," but after numerous months, it has failed to release any update that fixes the issue.   

I'm not sure what the forum rules are about posting links so if you google "Netgear Arris defective cable modems" then go to the classactionlawyers link dated 21 Apr 2017 you can read the article, there's also a list of the affected cable modems, the Virgin Superhub 3 being amongst them.

As a short term fix I've spoken to Virgin about reverting to the Superhub 2 and 200 Mbps broadband but they're sending an 'engineer' out tomorrow to see if he can do anything first..........

Posted on: 21 September 2017 by Johnell

Maybe I should have used the word "possible" rather than "probable" in the thread title because as has been correctly pointed out by people in another thread, there are many factors to be taken into account.  However, the Intel chip issue is well known and documented, this is another excerpt from the webpage: 

According to The Register, "The problem appears to be that the x86 CPU in the modem is taking on too much work while processing network packets. Every couple of seconds or so, a high-priority maintenance task runs and it winds up momentarily hogging the processor, causing latency to increase by at least 200ms and, over time, about six per cent of packets to be dropped. It affects IPv4 and IPv6 – and it spoils internet gaming and other online real-time interaction that need fast response times."

Even users who do simple web browsing may be affected by the momentary high spikes in latency, causing websites to feel sluggish or not load.

We are actively investigating whether other cable modems containing the Puma 6 chipset, including modems from Linksys, Cisco, and Hitron, also suffer from the same severe network latency defect. Cable modems containing Intel's Puma 6 chipset that may be affected include:

Arris SB6190
Arris TG1672G
Arris TM1602
Super Hub 3 (Arris TG2492LG) (commonly, Virgin Media)
Hitron CGN3 / CDA / CGNV series modems:
Hitron CDA-32372
Hitron CDE-32372
Hitron CDA3-35
Hitron CGNV4
Hitron CGNM-3552 (commonly, Rogers)
Hitron CGN3 (eg CGN3-ACSMR)
Hitron CGNM-2250 (commonly, Shaw)
Linksys CM3024
Linksys CM3016
TP-Link CR7000
Netgear AC1750 C6300 AC1900
Netgear CM700
Telstra Gateway Max (Netgear AC1900 / C6300) (Australia)
Cisco DPC3848V
Cisco DPC3941B / DPC3941T (commonly, Comcast Xfinity XB3)
Cisco DPC3939
Compal CH7465-LG / Arris TG2492LG (commonly, Virgin Media Hub 3)
Samsung Home Media Server

Posted on: 21 September 2017 by Simon-in-Suffolk
Johnell posted:

You too mate....just make sure the BT modem isn't on the list!!!!!!

BT uses direct fibre or DSL technology - they don't use cable - different technology - BT also uses multicast for 4K/UHD IPTV streaming and such local processor induced delays would cause unacceptable performance glitches. BT tends to test its stuff rather rigorously in this regard. Its latest Smart Hub 6 device uses a purpose-built Broadcom BCM63137 processor running at 1 GHz with two cores.

Posted on: 23 September 2017 by Johnell

Thanks for the info Simon, it's something to bear in mind if BT ever manage to get their fibre technology up to Virgin speeds.     

As for BT testing stuff, I spent the last 16 years of my BT career at Martlesham Heath Research working in a couple of systems integration and software test teams.  They were happy days until the massive influx of.........best I stop there methinks, suffice to say it became intolerable so I opted out.  

It was during this time that I first met Alistair when he ran Signals from his home near the BT site.  

Posted on: 23 September 2017 by Simon-in-Suffolk

Hi Johnell, I guess it's not all about speed, contention has a big part to play as well... it's not so much the quantity, it's the quality... I wish Ofcom would look at this as well rather than unduly focussing on potentially misleading sync speeds.. I guess Joe Public can understand a simple 'marketing' type number more, but for higher bandwidths above ADSL2 type speeds I think it means less and less, and latency and contention matter more at the CPE demarcation from a typical user perspective.. in my humble opinion of course...

So you were in Suffolk then... by your name reference to the BT facility it sounds like a little while ago... but I suspect things have refreshingly changed for the better ... and a truly global workforce and ecosystem of tech partners is embraced on that campus  now... it's now one of the UK's top R&D facilities.

Posted on: 23 September 2017 by Johnell

I guess it's a balance between all the factors Simon as my own case proves.  On the SH2 I got connection speeds around 220 Mbps and even at busy periods everything ran smoothly and more importantly Tidal didn't drop out.  The same cannot be said of the SH3.

As for Martlesham, I left in October 2007 and as my friends who are still there have told me, it is indeed a very different place.  I actually contracted back to BT for an integration project in 2011/2012 but luckily was able to work from home.  

Posted on: 23 September 2017 by Simon-in-Suffolk

Johnell, thanks, so what is the latest from Virgin Media on this? I guess if you have worked for the R&D facility at BT you might know your onions.. can you Wireshark the media transfer TCP on Tidal and see what is happening on the line? From reading between the lines it looks like the SH3 router CPU is not up to the task in hand.... rather than anything to do with the internet access itself... and in such scenarios it would be interesting to see what is happening.. if it is throwing everything away for a period of time - I can see that being an unfair challenge for the relatively small TCP buffers on the Naim streamers.
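
To put rough numbers on the buffer point, here is a minimal Python sketch of how long a receive buffer can keep audio playing while the router delivers nothing. The buffer size and stall length are hypothetical values chosen for illustration (the actual buffer sizes in the Naim streamers are not given in this thread); the bitrate is simply uncompressed CD audio (44.1 kHz x 16 bit x 2 channels), which is an upper bound for a lossless CD-quality stream.

```python
# Rough estimate of how long a full buffer can feed playback during a stall.
# All sizes below are illustrative assumptions, not Naim specifications.

CD_PCM_BITRATE_BPS = 44_100 * 16 * 2  # uncompressed CD audio: 1,411,200 bit/s


def buffer_seconds(buffer_bytes: int, bitrate_bps: float) -> float:
    """Seconds of audio held by a full buffer at the given stream bitrate."""
    if bitrate_bps <= 0:
        raise ValueError("bitrate must be positive")
    return buffer_bytes * 8 / bitrate_bps


def survives_stall(buffer_bytes: int, bitrate_bps: float, stall_s: float) -> bool:
    """True if a full buffer outlasts a stall during which no data arrives."""
    return buffer_seconds(buffer_bytes, bitrate_bps) >= stall_s


if __name__ == "__main__":
    hypothetical_buffer = 256 * 1024  # 256 KiB, an assumed figure
    hypothetical_stall = 3.0          # seconds with no data, an assumed figure
    held = buffer_seconds(hypothetical_buffer, CD_PCM_BITRATE_BPS)
    print(f"buffer holds about {held:.2f} s of audio")
    print("survives stall:",
          survives_stall(hypothetical_buffer, CD_PCM_BITRATE_BPS, hypothetical_stall))
```

A single 200 ms spike on its own is shorter than most plausible buffers, so a drop-out would more likely need repeated spikes plus packet loss slowing the TCP transfer enough that the buffer never refills; a Wireshark capture would show whether that is what happens.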

Posted on: 23 September 2017 by Johnell

The fault does appear to lie with the Intel Puma 6 CPU, this is an excerpt from an article in ISPreview dated March 2017.  Until I read this I wasn't aware that the problem is worse on 200Mbps+ connections.

As far as I'm aware the firmware patch still hasn't been released:

A nasty Intel Puma 6 chipset (x86 SoC) bug, which has caused latency spikes and packet loss for owners of Virgin Media’s latest SuperHub 3 (ARRIS TG2492S/CE) cable broadband router in the United Kingdom, will soon be patched by a new firmware release.

At the end of last year we reported that various Intel Puma 6 based routers (e.g. Arris Surfboard SB6190, Hitron CGNV4 and Compal CH7465-LG etc.), not just the SuperHub 3, all appeared to be suffering from the aforementioned problem and this was particularly noticeable on ultrafast (200Mbps+) broadband connections.

In short, the Central Processing Unit (CPU) inside the modem component of the router was taking on too much work while processing network packets, which caused the chipset to run a high-priority maintenance task every few seconds. Owners noticed that this task was occupying the CPU a bit too much and causing momentary latency spikes (increases of 200 milliseconds+), including a little packet loss.
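
For anyone wanting to check their own line for this pattern, here is a minimal Python sketch that takes a list of ping round-trip times and flags the spikes and losses described above. The 200 ms threshold comes from the excerpt; the function name, the use of the median as the baseline and the sample data are my own assumptions for illustration, not a recognised test for the Puma 6 fault.

```python
import statistics
from typing import Optional, Sequence


def summarize_pings(samples_ms: Sequence[Optional[float]],
                    spike_ms: float = 200.0):
    """Return (baseline_ms, spike_indices, loss_fraction).

    samples_ms holds one round-trip time per ping in milliseconds,
    with None standing in for a ping that got no reply.
    A spike is any reply at least spike_ms above the median reply time.
    """
    if not samples_ms:
        raise ValueError("no samples")
    received = [s for s in samples_ms if s is not None]
    if not received:
        raise ValueError("every ping was lost")
    baseline = statistics.median(received)
    spikes = [i for i, s in enumerate(samples_ms)
              if s is not None and s - baseline >= spike_ms]
    loss = (len(samples_ms) - len(received)) / len(samples_ms)
    return baseline, spikes, loss


if __name__ == "__main__":
    # Made-up readings: mostly ~12 ms, two large spikes, one lost ping.
    readings = [12.0, 11.0, 13.0, 240.0, 12.0, None, 11.0, 12.0, 250.0, 12.0]
    baseline, spikes, loss = summarize_pings(readings)
    print(f"baseline {baseline} ms, spikes at {spikes}, loss {loss:.0%}")
```

Feeding it times collected by pinging the hub itself and then a host beyond it should help separate delay added inside the hub from delay further out on the network.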

Posted on: 09 November 2017 by Johnell

A quick update: I finally got my Superhub 2ac refitted and as expected Tidal has behaved perfectly ever since.

During the conversation with Virgin tech support they actually admitted for the first time that there is a problem with the Superhub 3 and they expect to be releasing a firmware patch sometime soon but he couldn't / wouldn't say exactly when.  As soon as I hear the patch is available I'll refit the Superhub 3.......and cross my fingers.   

Posted on: 09 November 2017 by Bigfoot

I ended up switching to Zen while they had their big deal discount. All good.

Posted on: 10 November 2017 by Simon-in-Suffolk
Johnell posted:

A quick update: I finally got my Superhub 2ac refitted and as expected Tidal has behaved perfectly ever since.

During the conversation with Virgin tech support they actually admitted for the first time that there is a problem with the Superhub 3 and they expect to be releasing a firmware patch sometime soon but he couldn't / wouldn't say exactly when.  As soon as I hear the patch is available I'll refit the Superhub 3.......and cross my fingers.   

Given what has been said here and elsewhere, the issue looked like a hardware one, and this sort of issue often is hardware rather than software related.. if so firmware can try to minimise and work around the issue, but can rarely resolve it completely... I wouldn’t be surprised, if it is hardware, if a new Virgin router appears before too long.. alas not all faults can be ‘fixed’ with firmware... and hardware faults can be a nightmare to work around, with unintended side effects and consequences.

Posted on: 10 November 2017 by Johnell

It definitely is a hardware problem as the following states but a firmware release that alters the priority or frequency of the tasks the CPU is running should cure it.......you would think..... 

In short, the Central Processing Unit (CPU) inside the modem component of the router was taking on too much work while processing network packets, which caused the chipset to run a high-priority maintenance task every few seconds. Owners noticed that this task was occupying the CPU a bit too much and causing momentary latency spikes (increases of 200 milliseconds+), including a little packet loss.

Posted on: 10 November 2017 by David Hendon

Presumably whether you can fix it in firmware will depend whether there is a) a way to change the way the chip prioritises its work by a firmware change and if so b) whether a firmware update for the router can itself update the firmware in the chip.

The fact it has taken so long and is not reported as fixed in any other cable modem with the same chip set suggests to me that the answer to both questions isn't yes and I also think a Superhub 4 (or maybe they will call it something else like "Superhub 3 gaming version" so as not to get overwhelmed by demand) is more likely than a successful firmware upgrade to the SH3.

best

David

Posted on: 10 November 2017 by Gazza

Well Ofcom announced the compensation that will be paid if your broadband provider cannot fix your problem. At £8 a day, I think they would have sorted this problem by now, if the compensation rates had applied when this story first surfaced.

Posted on: 10 November 2017 by Johnell
David Hendon posted:

Presumably whether you can fix it in firmware will depend whether there is a) a way to change the way the chip prioritises its work by a firmware change and if so b) whether a firmware update for the router can itself update the firmware in the chip.

The fact it has taken so long and is not reported as fixed in any other cable modem with the same chip set suggests to me that the answer to both questions isn't yes and I also think a Superhub 4 (or maybe they will call it something else like "Superhub 3 gaming version" so as not to get overwhelmed by demand) is more likely than a successful firmware upgrade to the SH3.

best

David

I agree that it has taken an inordinate amount of time to release what Virgin themselves said will be a firmware fix so maybe it's just more BS to keep customers quiet.  To be honest I would rather have a new mk4 hub than a strangled mk3 anyway.

The question of compensation is interesting.  I first reported this to Virgin over 3 months ago and downgrading my service to alleviate the problem doesn't constitute a fix in my book so it's now around £800 and counting........I'm obviously not holding my breath.......
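
The arithmetic behind that figure, as a small Python sketch. It takes the £8 a day rate quoted earlier in the thread at face value and assumes it would accrue for every day since the fault was reported; the report date used here is a hypothetical one picked to match "over 3 months ago", and nothing in the thread confirms that the Ofcom scheme would apply to this kind of fault.

```python
from datetime import date

DAILY_RATE_GBP = 8  # the per-day figure quoted in this thread


def compensation_gbp(reported: date, today: date,
                     daily_rate: int = DAILY_RATE_GBP) -> int:
    """Compensation accrued if every day since the report counted."""
    days = (today - reported).days
    if days < 0:
        raise ValueError("report date is after today")
    return days * daily_rate


if __name__ == "__main__":
    reported = date(2017, 8, 2)   # hypothetical report date, for illustration
    today = date(2017, 11, 10)    # date of this post
    print((today - reported).days, "days ->", compensation_gbp(reported, today), "GBP")
```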

Posted on: 10 November 2017 by Blackmorec

Really useful thread. I recently set up Qobuz and was suffering regular drop-outs which I can now correlate to ping time spikes. I currently have a Superhub 3 which obviously isn’t super at all. It’s also no great shakes in the wireless dept. I’m wondering if I have to stick with my ISP for the modem or can I go for a superior non-ISP product with better all-round performance?

Posted on: 10 November 2017 by David Hendon

You have to use a VM-supplied modem because they won't authorise anything else to connect to their network. But you can certainly turn wifi off in it and use a wifi access point or switch the SH3 to modem only mode and use an external wireless router.

best

David