On Thu, Jul 7, 2011 at 5:16 PM, Florian Mickler florian@mickler.org wrote:
[adding Robert Hancock] see here for the whole thread: https://lkml.org/lkml/2011/7/7/150
2011/7/7 Daniel Mack zonque@gmail.com:
On Thu, Jul 7, 2011 at 2:33 PM, Oliver Neukum oliver@neukum.org wrote:
PS: Do you still see this if you enable 64bit DMA for EHCI?
The problem is that I personally don't see that issue at all. I even installed 4GB of RAM in my development machine last year to be able to reproduce this, but I can't, even when the memory allocator is under heavy load. The only people who see this effect are Pedro Ribeiro and William Light (both in Cc:), and both have been very helpful in trying patches and reporting back in detail. What instructions could we give these people to finally hunt this issue down?
Daniel
Has anyone affected already tried Robert's suggestions: http://lkml.org/lkml/2010/4/10/66
That is, checking whether it really is a 64-bit vs. 32-bit issue by booting with mem=4096M?
So it seems we can finally close this issue. I got myself a new laptop recently, and was actually happy to see that this machine showed the bug as well. So I started investigating what's going on, and after hours of debugging in the wrong area, I stumbled over a bug in my own code. The attached patch fixes the problem for me, and William (one of the very few people who saw it, too) also reported that it works for him.
However, it remains a puzzle to me what the actual root cause of this is, and I would very much like to know. Let me briefly explain what I'm seeing.
The driver operates in a mode in which it copies the layout of its inbound stream to the output stream, thus guaranteeing that the outbound data rate equals that of the input stream. For that, I walk the isochronous frames of each received urb and prepare an urb for sending that has the same iso packet structure in terms of lengths and offsets, skipping frames that are invalid (status != 0). So here was the problem: if any frame had a non-zero status field, the driver calculated the wrong offset and the playback stream had artefacts. So, a classic bug you might think, and the fix is trivial.
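To make the layout mirroring described above concrete, here is a minimal userspace sketch of one way the offset bookkeeping could look. It is not the attached patch and not the driver's actual code: struct iso_frame is a hypothetical stand-in for the kernel's per-frame iso descriptor, and mirror_iso_layout() is an invented helper name. The assumption it illustrates is that the outbound offset advances only for frames that are actually copied, so a skipped frame leaves no hole.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-frame isochronous descriptor. */
struct iso_frame {
	unsigned int offset;	/* start of the frame within the transfer buffer */
	unsigned int length;	/* number of bytes in the frame */
	int status;		/* 0 = valid, anything else = invalid */
};

/*
 * Copy the layout of the valid inbound frames to the outbound frames.
 * Invalid frames (status != 0) are skipped, and the running outbound
 * offset is advanced only for frames that are actually emitted.
 * Returns the number of outbound frames written; the total payload
 * length is stored in *total_len if that pointer is non-NULL.
 */
static size_t mirror_iso_layout(const struct iso_frame *in, size_t n_in,
				struct iso_frame *out, size_t out_cap,
				unsigned int *total_len)
{
	size_t n_out = 0;
	unsigned int offset = 0;
	size_t i;

	assert(in != NULL && out != NULL);

	for (i = 0; i < n_in; i++) {
		if (in[i].status != 0)
			continue;	/* invalid frame: ignore it entirely */
		if (n_out == out_cap)
			break;		/* no room left in the outbound array */

		out[n_out].offset = offset;
		out[n_out].length = in[i].length;
		out[n_out].status = 0;

		offset += in[i].length;
		n_out++;
	}

	if (total_len)
		*total_len = offset;
	return n_out;
}
```

With four inbound frames of lengths 8, 8, 4 and 8, where the second one has a non-zero status, this yields three outbound frames that are packed back to back, with no gap where the invalid frame was.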
However, what I really don't understand is why this wasn't observed earlier by any of the many users, and why it makes a difference which type of memory I use for this. Recall that my original patch, which allocates DMA coherent memory for the transfer buffers, also fixes the problem just fine. Or, to put it another way: when allocating "normal" memory using kmalloc(), the stream appears to have "holes" in the iso packet structure, while with DMA coherent memory, the layout is different. And all this only happens on fairly new 64bit-enabled machines.
Can anyone explain this?
Many thanks to everyone involved in this - and especially to William and Pedro, who spent a lot of time trying my patches and sending me debug information.
Daniel