[alsa-devel] HDA record fails with FIFO error
Hi Takashi,
I have encountered an HD audio issue, reported by a customer, on a Lenovo x100e system. Sometimes, but not always, when starting a recording stream through the HDA controller, the controller generates a large number of interrupts (~40 000 interrupts per second). After this has happened, jack sense (i.e. unsolicited events from the codec) stops working until the next system reboot.
SD0STS returns a FIFO error (0x28) in the interrupt handler. The interrupt service routine acknowledges this error but does nothing to counteract the root cause of the problem, so it appears again and again. Restarting the stream does not seem to help. Enabling or disabling MSI does not help either.
The issue occurs on 2.6.35 and 2.6.38-rc8+. I have not tried the latest kernel yet, but I think it's also there. A FIFO error indicates a FIFO overrun occurring while the RUN bit is set, but the driver simply acknowledges and clears the error. I wonder what the root cause and the right treatment are in this case. Any suggestions?
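For readers following along, the reported status value 0x28 can be broken down bit by bit. The sketch below is only an illustration: the three interrupt bits use the values defined in sound/pci/hda/hda_intel.c, SD_STS_FIFO_READY (0x20) is assumed from the HDA stream descriptor status layout, and sd_sts_decode() is a hypothetical helper, not a driver function.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stream descriptor status bits (SD_STS_FIFO_READY is an assumption here) */
#define SD_STS_FIFO_READY 0x20 /* FIFO ready */
#define SD_INT_DESC_ERR   0x10 /* descriptor error interrupt */
#define SD_INT_FIFO_ERR   0x08 /* FIFO error interrupt */
#define SD_INT_COMPLETE   0x04 /* completion interrupt */

/*
 * Hypothetical helper: write the names of the bits set in 'sts' into 'buf',
 * separated by '|'. 'len' is the size of 'buf' and must be at least 1.
 */
static void sd_sts_decode(uint8_t sts, char *buf, size_t len)
{
	static const struct {
		uint8_t bit;
		const char *name;
	} bits[] = {
		{ SD_STS_FIFO_READY, "FIFO_READY" },
		{ SD_INT_DESC_ERR,   "DESC_ERR"   },
		{ SD_INT_FIFO_ERR,   "FIFO_ERR"   },
		{ SD_INT_COMPLETE,   "COMPLETE"   },
	};
	size_t i;

	buf[0] = '\0';
	for (i = 0; i < sizeof(bits) / sizeof(bits[0]); i++) {
		if (!(sts & bits[i].bit))
			continue;
		/* add a separator before every name except the first */
		if (buf[0] != '\0')
			strncat(buf, "|", len - strlen(buf) - 1);
		strncat(buf, bits[i].name, len - strlen(buf) - 1);
	}
}
```

Under these assumptions, decoding 0x28 gives FIFO_READY|FIFO_ERR: the FIFO error bit is set, while neither the completion bit nor the descriptor error bit is.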
Thanks & Best regards, Andiry
At Thu, 7 Apr 2011 16:54:50 +0800, Xu, Andiry wrote:
> Hi Takashi,
> I have encountered an HD audio issue, reported by a customer, on a Lenovo x100e system. Sometimes, but not always, when starting a recording stream through the HDA controller, the controller generates a large number of interrupts (~40 000 interrupts per second). After this has happened, jack sense (i.e. unsolicited events from the codec) stops working until the next system reboot.
> SD0STS returns a FIFO error (0x28) in the interrupt handler. The interrupt service routine acknowledges this error but does nothing to counteract the root cause of the problem, so it appears again and again. Restarting the stream does not seem to help. Enabling or disabling MSI does not help either.
> The issue occurs on 2.6.35 and 2.6.38-rc8+. I have not tried the latest kernel yet, but I think it's also there. A FIFO error indicates a FIFO overrun occurring while the RUN bit is set, but the driver simply acknowledges and clears the error. I wonder what the root cause and the right treatment are in this case. Any suggestions?
FIFO_ERR is actually never handled, so basically we can ignore it. Does simply disabling it, as below, work around your problem?
Of course, a proper handling of FIFO error would be better, but this may need more code rewrites.
thanks,
Takashi
---
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c
index 70a9d32..a88baf4 100644
--- a/sound/pci/hda/hda_intel.c
+++ b/sound/pci/hda/hda_intel.c
@@ -285,7 +285,7 @@ enum { SDI0, SDI1, SDI2, SDI3, SDO0, SDO1, SDO2, SDO3 };
 #define SD_INT_DESC_ERR	0x10	/* descriptor error interrupt */
 #define SD_INT_FIFO_ERR	0x08	/* FIFO error interrupt */
 #define SD_INT_COMPLETE	0x04	/* completion interrupt */
-#define SD_INT_MASK		(SD_INT_DESC_ERR|SD_INT_FIFO_ERR|\
+#define SD_INT_MASK		(SD_INT_DESC_ERR|/*SD_INT_FIFO_ERR|*/ \
 				 SD_INT_COMPLETE)
 
 /* SD_STS */
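To make the effect of the patch above concrete, here is a small sketch of the mask arithmetic. The _OLD and _NEW suffixes and the sd_masked_status() helper are hypothetical names used only for side-by-side comparison; the bit values are the ones shown in the patch context.

```c
#include <stdint.h>

#define SD_INT_DESC_ERR 0x10 /* descriptor error interrupt */
#define SD_INT_FIFO_ERR 0x08 /* FIFO error interrupt */
#define SD_INT_COMPLETE 0x04 /* completion interrupt */

/* Mask before the patch: all three interrupt sources (0x1c) */
#define SD_INT_MASK_OLD (SD_INT_DESC_ERR | SD_INT_FIFO_ERR | SD_INT_COMPLETE)
/* Mask after the patch: FIFO error left out (0x14) */
#define SD_INT_MASK_NEW (SD_INT_DESC_ERR | SD_INT_COMPLETE)

/* Hypothetical helper: return the status bits that the given mask covers. */
static uint8_t sd_masked_status(uint8_t sts, uint8_t mask)
{
	return sts & mask;
}
```

With the reported status 0x28, the old mask still selects the FIFO error bit (0x08), whereas the new mask selects nothing, which is consistent with the interrupts going away once the patch is applied.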
On 2011-04-07 12:10, Takashi Iwai wrote:
> At Thu, 7 Apr 2011 16:54:50 +0800, Xu, Andiry wrote:
> > Hi Takashi,
> > I have encountered an HD audio issue, reported by a customer, on a Lenovo x100e system. Sometimes, but not always, when starting a recording stream through the HDA controller, the controller generates a large number of interrupts (~40 000 interrupts per second). After this has happened, jack sense (i.e. unsolicited events from the codec) stops working until the next system reboot.
This seems more reproducible now than it was a while ago; it now seems to happen more often than not.
> > SD0STS returns a FIFO error (0x28) in the interrupt handler. The interrupt service routine acknowledges this error but does nothing to counteract the root cause of the problem, so it appears again and again. Restarting the stream does not seem to help. Enabling or disabling MSI does not help either.
> > The issue occurs on 2.6.35 and 2.6.38-rc8+. I have not tried the latest kernel yet, but I think it's also there. A FIFO error indicates a FIFO overrun occurring while the RUN bit is set, but the driver simply acknowledges and clears the error. I wonder what the root cause and the right treatment are in this case. Any suggestions?
Brainstorming: 1) We could try adding udelays at random places. 2) There are a few workarounds for different chips already in hda_intel.c, maybe try applying them here as well.
Does that make sense to either of you?
> FIFO_ERR is actually never handled, so basically we can ignore it. Does simply disabling it, as below, work around your problem?
> Of course, a proper handling of FIFO error would be better, but this may need more code rewrites.
This does not fix the error in question: it removes the interrupts all right, but neither recording nor jack sense starts to work.
At Thu, 07 Apr 2011 16:30:00 +0200, David Henningsson wrote:
> On 2011-04-07 12:10, Takashi Iwai wrote:
> > At Thu, 7 Apr 2011 16:54:50 +0800, Xu, Andiry wrote:
> > > Hi Takashi,
> > > I have encountered an HD audio issue, reported by a customer, on a Lenovo x100e system. Sometimes, but not always, when starting a recording stream through the HDA controller, the controller generates a large number of interrupts (~40 000 interrupts per second). After this has happened, jack sense (i.e. unsolicited events from the codec) stops working until the next system reboot.
> This seems more reproducible now than it was a while ago; it now seems to happen more often than not.
> > > SD0STS returns a FIFO error (0x28) in the interrupt handler. The interrupt service routine acknowledges this error but does nothing to counteract the root cause of the problem, so it appears again and again. Restarting the stream does not seem to help. Enabling or disabling MSI does not help either.
> > > The issue occurs on 2.6.35 and 2.6.38-rc8+. I have not tried the latest kernel yet, but I think it's also there. A FIFO error indicates a FIFO overrun occurring while the RUN bit is set, but the driver simply acknowledges and clears the error. I wonder what the root cause and the right treatment are in this case. Any suggestions?
> Brainstorming:
> - We could try adding udelays at random places.
> - There are a few workarounds for different chips already in hda_intel.c; maybe try applying them here as well.
> Does that make sense to either of you?
Hm, I don't think this would make any difference. For the DMA engine, it doesn't matter whether there are delays in the code or not. It's set up to run free-wheeling. The driver just reads the current position and doesn't change anything other than the buffer contents. (And a FIFO xrun has nothing to do with the buffer contents.)
> > FIFO_ERR is actually never handled, so basically we can ignore it. Does simply disabling it, as below, work around your problem?
> > Of course, a proper handling of FIFO error would be better, but this may need more code rewrites.
> This does not fix the error in question: it removes the interrupts all right, but neither recording nor jack sense starts to work.
Well, the unanswered question is why this interrupt is generated. We don't know whether this interrupt is really correctly generated. It might be wrongly triggered by some condition. If so, masking and ignoring the false error is the right fix, I guess.
Takashi
On 2011-04-07 16:30, David Henningsson wrote:
> On 2011-04-07 12:10, Takashi Iwai wrote:
> > At Thu, 7 Apr 2011 16:54:50 +0800, Xu, Andiry wrote:
> > > Hi Takashi,
> > > I have encountered an HD audio issue, reported by a customer, on a Lenovo x100e system. Sometimes, but not always, when starting a recording stream through the HDA controller, the controller generates a large number of interrupts (~40 000 interrupts per second). After this has happened, jack sense (i.e. unsolicited events from the codec) stops working until the next system reboot.
> This seems more reproducible now than it was a while ago; it now seems to happen more often than not.
> > > SD0STS returns a FIFO error (0x28) in the interrupt handler. The interrupt service routine acknowledges this error but does nothing to counteract the root cause of the problem, so it appears again and again. Restarting the stream does not seem to help. Enabling or disabling MSI does not help either.
> > > The issue occurs on 2.6.35 and 2.6.38-rc8+. I have not tried the latest kernel yet, but I think it's also there. A FIFO error indicates a FIFO overrun occurring while the RUN bit is set, but the driver simply acknowledges and clears the error. I wonder what the root cause and the right treatment are in this case. Any suggestions?
> Brainstorming:
> - We could try adding udelays at random places.
> - There are a few workarounds for different chips already in hda_intel.c; maybe try applying them here as well.
> Does that make sense to either of you?
> > FIFO_ERR is actually never handled, so basically we can ignore it. Does simply disabling it, as below, work around your problem?
> > Of course, a proper handling of FIFO error would be better, but this may need more code rewrites.
> This does not fix the error in question: it removes the interrupts all right, but neither recording nor jack sense starts to work.
Hi Andiry,
I'm still trying to get a grip on what the error could be. Having re-enabled the FIFO interrupt and added some debug printk's for the first time the FIFO error happens, I notice that it happens after a while, where "a while" ranges from 100 ms to 4 s or so. CBL is 65536 and LPIB seems to be a reasonable value (a value below 65536, approximately corresponding to the time the stream has been running). BDL entries look correct.
Can it be something other than a chipset bug in this case? I'm trying to rule out every possible driver problem I can think of.
Btw - how can a recording FIFO overrun in the first place? I mean, if we have two BDLE buffers with 32768 bytes in each (assume bl_pos_adj=0 for this example), the FIFO should just write to the first, then the second, then the first again, and so on, without having overrun errors.
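The buffer layout in this example can be sketched as follows. This is only an illustration of the expected ring behaviour, assuming two BDL entries of 32768 bytes each (so CBL = 65536); bdle_for_lpib() is a hypothetical helper, not part of the driver.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: given the BDL entry sizes in bytes and a link
 * position (LPIB), return the index of the entry the position falls in and
 * store the byte offset within that entry in *offset.
 * Returns -1 if lpib is not below the cyclic buffer length (sum of sizes).
 */
static int bdle_for_lpib(const uint32_t *sizes, size_t n, uint32_t lpib,
			 uint32_t *offset)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (lpib < sizes[i]) {
			*offset = lpib;
			return (int)i;
		}
		/* position lies past this entry; move on to the next one */
		lpib -= sizes[i];
	}
	return -1; /* lpib >= CBL: not a valid position */
}
```

With sizes {32768, 32768}, positions 0 to 32767 map to the first entry and 32768 to 65535 to the second; a value of 65536 or more would be out of range, which matches the sanity check that LPIB stays below CBL.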
participants (3): David Henningsson; Takashi Iwai; Xu, Andiry