Responding to the rest inline, but unfortunately the point is moot: after two days of running well in production, during which I believed it fixed, one of my boxes is once again starving UHCI DMA even with the patch. So I did not in fact fix it.
I've heard that the latest Intel architecture allows DMA to and from(?) the L3 cache.
FTR, the machines I'm testing on are Core Duo, Core2 Duo and first-gen Nehalem.
| The Host Controller fetches the next entry from the Frame List when
| the millisecond allotted to the current frame expires.
... so it is not actually possible to prefetch packets in a useful way.
That was my conclusion as well. I could find no hooks to do so. It also says:
|Host Controller fetches and decodes the Transfer Descriptor. Assuming a Host-to-Function transaction, the
|Host Controller delays committing to the USB Transaction until the FIFO fills to an appropriate “trigger point”.
|When this threshold is reached, the Host Controller can then begin issuing the Transaction Token.
Unfortunately, the URB abstraction gives us no reliable way to do it in the HCD, because other drivers don't necessarily pass a valid CPU-side buffer address when using URB_NO_TRANSFER_DMA_MAP.
But the USB API requires this (any HCD might use PIO or use its own double-buffering scheme).
Oh, it does? I hadn't been able to come to that conclusion from the docs I read. If so, then that's good.
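To check that I'm reading that requirement the way you mean it, here is a rough userspace sketch (not kernel code). The struct, the flag value, and both helper names are made up for illustration; only the field names mirror struct urb. The idea is that a driver handing over a pre-mapped buffer still fills in the CPU-side pointer, so an HCD could check for it before touching the payload.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in flag value for this mock only; do not rely on it matching the kernel. */
#define MOCK_URB_NO_TRANSFER_DMA_MAP 0x0004u

/* Cut-down stand-in for struct urb: just the fields under discussion. */
struct mock_urb {
    unsigned int transfer_flags;
    void *transfer_buffer;          /* CPU-side address of the payload */
    uint64_t transfer_dma;          /* bus address, pre-mapped by the driver */
    size_t transfer_buffer_length;
};

/* Hypothetical driver-side helper: hand over a pre-mapped buffer while
 * still supplying the CPU-side address alongside the DMA address. */
static void mock_fill_premapped(struct mock_urb *urb, void *cpu_addr,
                                uint64_t dma_addr, size_t len)
{
    urb->transfer_flags |= MOCK_URB_NO_TRANSFER_DMA_MAP;
    urb->transfer_buffer = cpu_addr;
    urb->transfer_dma = dma_addr;
    urb->transfer_buffer_length = len;
}

/* Hypothetical HCD-side check: is it safe to touch the payload from the
 * CPU (for PIO, double-buffering, or a prefetch-style read)? */
static bool mock_urb_cpu_buffer_usable(const struct mock_urb *urb)
{
    if (urb->transfer_buffer_length == 0)
        return true;                /* no payload, nothing to touch */
    return urb->transfer_buffer != NULL;
}
```

If every driver really does fill in both fields, the check is always true and the HCD can read the buffer from the CPU side; if some don't, this is the case I'd have to detect and skip.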
Anyway, going to keep trying.
Monty