[alsa-devel] snd-hda-intel - wallclk patch
Hi,
I would like to ask HDA gurus to check this patch (I will include it to my tree once acked). The patch uses WALLCLK from HDA chips (marked as required in the HDA specification) to check for bogus - too early - interrups which confuses the upper PCM layer (sound skipping issues).
More details about patch testing on problematic hardware can be found at:
https://bugzilla.kernel.org/show_bug.cgi?id=15912
Thanks, Jaroslav
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index 236b4ca..714978f 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -174,7 +174,7 @@ MODULE_DESCRIPTION("Intel HDA driver"); #define ICH6_GSTS_FSTS (1 << 1) /* flush status */ #define ICH6_REG_INTCTL 0x20 #define ICH6_REG_INTSTS 0x24 -#define ICH6_REG_WALCLK 0x30 +#define ICH6_REG_WALLCLK 0x30 /* 24Mhz source */ #define ICH6_REG_SYNC 0x34 #define ICH6_REG_CORBLBASE 0x40 #define ICH6_REG_CORBUBASE 0x44 @@ -340,8 +340,8 @@ struct azx_dev { unsigned int period_bytes; /* size of the period in bytes */ unsigned int frags; /* number for period in the play buffer */ unsigned int fifo_size; /* FIFO size */ - unsigned long start_jiffies; /* start + minimum jiffies */ - unsigned long min_jiffies; /* minimum jiffies before position is valid */ + unsigned long start_wallclk; /* start + minimum wallclk */ + unsigned long period_wallclk; /* wallclk for period */
void __iomem *sd_addr; /* stream descriptor pointer */
@@ -361,7 +361,6 @@ struct azx_dev { unsigned int opened :1; unsigned int running :1; unsigned int irq_pending :1; - unsigned int start_flag: 1; /* stream full start flag */ /* * For VIA: * A flag to ensure DMA position is 0 @@ -1674,8 +1673,9 @@ static int azx_pcm_prepare(struct snd_pcm_substream *substream) return err; }
- azx_dev->min_jiffies = (runtime->period_size * HZ) / - (runtime->rate * 2); + /* wallclk has 24Mhz clock source */ + azx_dev->period_wallclk = (((runtime->period_size * 24000) / + runtime->rate) * 1000); azx_setup_controller(chip, azx_dev); if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK) azx_dev->fifo_size = azx_sd_readw(azx_dev, SD_FIFOSIZE) + 1; @@ -1729,14 +1729,15 @@ static int azx_pcm_trigger(struct snd_pcm_substream *substream, int cmd) if (s->pcm->card != substream->pcm->card) continue; azx_dev = get_azx_dev(s); - if (rstart) { - azx_dev->start_flag = 1; - azx_dev->start_jiffies = jiffies + azx_dev->min_jiffies; - } - if (start) + if (start) { + azx_dev->start_wallclk = azx_readl(chip, WALLCLK); + if (!rstart) + azx_dev->start_wallclk -= + azx_dev->period_wallclk; azx_stream_start(chip, azx_dev); - else + } else { azx_stream_stop(chip, azx_dev); + } azx_dev->running = start; } spin_unlock(&chip->reg_lock); @@ -1880,12 +1881,13 @@ static snd_pcm_uframes_t azx_pcm_pointer(struct snd_pcm_substream *substream) */ static int azx_position_ok(struct azx *chip, struct azx_dev *azx_dev) { + u32 wallclk; unsigned int pos;
- if (azx_dev->start_flag && - time_before_eq(jiffies, azx_dev->start_jiffies)) + wallclk = azx_readl(chip, WALLCLK); + if ((wallclk - azx_dev->start_wallclk) < + (azx_dev->period_wallclk * 2) / 3) return -1; /* bogus (too early) interrupt */ - azx_dev->start_flag = 0;
pos = azx_get_position(chip, azx_dev); if (chip->position_fix == POS_FIX_AUTO) { @@ -1899,13 +1901,12 @@ static int azx_position_ok(struct azx *chip, struct azx_dev *azx_dev) chip->position_fix = POS_FIX_POSBUF; }
- if (!bdl_pos_adj[chip->dev_index]) - return 1; /* no delayed ack */ if (WARN_ONCE(!azx_dev->period_bytes, "hda-intel: zero azx_dev->period_bytes")) return 0; /* this shouldn't happen! */ if (pos % azx_dev->period_bytes > azx_dev->period_bytes / 2) return 0; /* NG - it's below the period boundary */ + azx_dev->start_wallclk = wallclk; return 1; /* OK, it's fine */ }
@@ -1915,7 +1916,7 @@ static int azx_position_ok(struct azx *chip, struct azx_dev *azx_dev) static void azx_irq_pending_work(struct work_struct *work) { struct azx *chip = container_of(work, struct azx, irq_pending_work); - int i, pending; + int i, pending, ok;
if (!chip->irq_pending_warned) { printk(KERN_WARNING @@ -1934,11 +1935,14 @@ static void azx_irq_pending_work(struct work_struct *work) !azx_dev->substream || !azx_dev->running) continue; - if (azx_position_ok(chip, azx_dev)) { + ok = azx_position_ok(chip, azx_dev); + if (ok > 0) { azx_dev->irq_pending = 0; spin_unlock(&chip->reg_lock); snd_pcm_period_elapsed(azx_dev->substream); spin_lock(&chip->reg_lock); + } else if (ok < 0) { + pending = 0; /* too early */ } else pending++; }
----- Jaroslav Kysela perex@perex.cz Linux Kernel Sound Maintainer ALSA Project, Red Hat, Inc.
On Mon, 10 May 2010, Jaroslav Kysela wrote:
Hi,
I would like to ask HDA gurus to check this patch (I will include it to my tree once acked). The patch uses WALLCLK from HDA chips (marked as required in the HDA specification) to check for bogus - too early - interrups which confuses the upper PCM layer (sound skipping issues).
More details about patch testing on problematic hardware can be found at:
No objections received, so I pushed this code to my master/devel branches.
Jaroslav
----- Jaroslav Kysela perex@perex.cz Linux Kernel Sound Maintainer ALSA Project, Red Hat, Inc.
At Tue, 11 May 2010 10:28:50 +0200 (CEST), Jaroslav Kysela wrote:
On Mon, 10 May 2010, Jaroslav Kysela wrote:
Hi,
I would like to ask HDA gurus to check this patch (I will include it to my tree once acked). The patch uses WALLCLK from HDA chips (marked as required in the HDA specification) to check for bogus - too early - interrups which confuses the upper PCM layer (sound skipping issues).
More details about patch testing on problematic hardware can be found at:
No objections received, so I pushed this code to my master/devel branches.
Heh, that was quick, I had no time to review :)
The change looks good to me. Thanks for your fix (and rebasing).
One thing is that you removed bdl_pos_adj=0 check. This was there explicitly to avoid the delayed irq handling no matter whether the value is correct or not. I guess it's safer to take it back, or give any other way for user to avoid the delayed irq handling.
thanks,
Takashi
On Tue, 11 May 2010, Takashi Iwai wrote:
At Tue, 11 May 2010 10:28:50 +0200 (CEST), Jaroslav Kysela wrote:
On Mon, 10 May 2010, Jaroslav Kysela wrote:
Hi,
I would like to ask HDA gurus to check this patch (I will include it to my tree once acked). The patch uses WALLCLK from HDA chips (marked as required in the HDA specification) to check for bogus - too early - interrups which confuses the upper PCM layer (sound skipping issues).
More details about patch testing on problematic hardware can be found at:
No objections received, so I pushed this code to my master/devel branches.
Heh, that was quick, I had no time to review :)
The change looks good to me. Thanks for your fix (and rebasing).
One thing is that you removed bdl_pos_adj=0 check. This was there explicitly to avoid the delayed irq handling no matter whether the value is correct or not. I guess it's safer to take it back, or give any other way for user to avoid the delayed irq handling.
I see. But I need the pos % period_bytes check too. What about this change?
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index 0a6c55b..6674841 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -1910,9 +1910,11 @@ static int azx_position_ok(struct azx *chip, struct azx_dev *azx_dev)
if (WARN_ONCE(!azx_dev->period_bytes, "hda-intel: zero azx_dev->period_bytes")) - return 0; /* this shouldn't happen! */ + /* this shouldn't happen! */ + return bdl_pos_adj[chip->dev_index] ? 0 : -1; if (pos % azx_dev->period_bytes > azx_dev->period_bytes / 2) - return 0; /* NG - it's below the period boundary */ + /* NG - it's below the period boundary */ + return bdl_pos_adj[chip->dev_index] ? 0 : -1; azx_dev->start_wallclk = wallclk; return 1; /* OK, it's fine */ }
----- Jaroslav Kysela perex@perex.cz Linux Kernel Sound Maintainer ALSA Project, Red Hat, Inc.
At Tue, 11 May 2010 10:44:17 +0200 (CEST), Jaroslav Kysela wrote:
On Tue, 11 May 2010, Takashi Iwai wrote:
At Tue, 11 May 2010 10:28:50 +0200 (CEST), Jaroslav Kysela wrote:
On Mon, 10 May 2010, Jaroslav Kysela wrote:
Hi,
I would like to ask HDA gurus to check this patch (I will include it to my tree once acked). The patch uses WALLCLK from HDA chips (marked as required in the HDA specification) to check for bogus - too early - interrups which confuses the upper PCM layer (sound skipping issues).
More details about patch testing on problematic hardware can be found at:
No objections received, so I pushed this code to my master/devel branches.
Heh, that was quick, I had no time to review :)
The change looks good to me. Thanks for your fix (and rebasing).
One thing is that you removed bdl_pos_adj=0 check. This was there explicitly to avoid the delayed irq handling no matter whether the value is correct or not. I guess it's safer to take it back, or give any other way for user to avoid the delayed irq handling.
I see. But I need the pos % period_bytes check too. What about this change?
With bdl_pos_adj==0, it should return 1 (non-error), but otherwise this should be OK, I guess.
thanks,
Takashi
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index 0a6c55b..6674841 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -1910,9 +1910,11 @@ static int azx_position_ok(struct azx *chip, struct azx_dev *azx_dev)
if (WARN_ONCE(!azx_dev->period_bytes, "hda-intel: zero azx_dev->period_bytes"))
return 0; /* this shouldn't happen! */
/* this shouldn't happen! */
if (pos % azx_dev->period_bytes > azx_dev->period_bytes / 2)return bdl_pos_adj[chip->dev_index] ? 0 : -1;
return 0; /* NG - it's below the period boundary */
/* NG - it's below the period boundary */
azx_dev->start_wallclk = wallclk; return 1; /* OK, it's fine */ }return bdl_pos_adj[chip->dev_index] ? 0 : -1;
Jaroslav Kysela perex@perex.cz Linux Kernel Sound Maintainer ALSA Project, Red Hat, Inc.
On Tue, 11 May 2010, Takashi Iwai wrote:
At Tue, 11 May 2010 10:44:17 +0200 (CEST), Jaroslav Kysela wrote:
On Tue, 11 May 2010, Takashi Iwai wrote:
At Tue, 11 May 2010 10:28:50 +0200 (CEST), Jaroslav Kysela wrote:
On Mon, 10 May 2010, Jaroslav Kysela wrote:
Hi,
I would like to ask HDA gurus to check this patch (I will include it to my tree once acked). The patch uses WALLCLK from HDA chips (marked as required in the HDA specification) to check for bogus - too early - interrups which confuses the upper PCM layer (sound skipping issues).
More details about patch testing on problematic hardware can be found at:
No objections received, so I pushed this code to my master/devel branches.
Heh, that was quick, I had no time to review :)
The change looks good to me. Thanks for your fix (and rebasing).
One thing is that you removed bdl_pos_adj=0 check. This was there explicitly to avoid the delayed irq handling no matter whether the value is correct or not. I guess it's safer to take it back, or give any other way for user to avoid the delayed irq handling.
I see. But I need the pos % period_bytes check too. What about this change?
With bdl_pos_adj==0, it should return 1 (non-error), but otherwise this should be OK, I guess.
The problem with returning value 1 -> call snd_pcm_elapsed() - is that the upper layer will be most probably confused on broken hw. From logs in kernel bz#15912, the bad irq can be triggered at any time (I really don't know why INTSTS register is set).
The first wallclk check uses to check for first 2/3 of period and the pos % period_bytes check checks for second half of period. Writing this, I think that the extra check can be added to the second condition to do position related check for first next period only:
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index 0a6c55b..d4d532a 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -1890,9 +1890,8 @@ static int azx_position_ok(struct azx *chip, struct azx_dev *azx_dev) unsigned int pos; int stream;
- wallclk = azx_readl(chip, WALLCLK); - if ((wallclk - azx_dev->start_wallclk) < - (azx_dev->period_wallclk * 2) / 3) + wallclk = azx_readl(chip, WALLCLK) - azx_dev->start_wallclk; + if (wallclk < (azx_dev->period_wallclk * 2) / 3) return -1; /* bogus (too early) interrupt */
stream = azx_dev->substream->stream; @@ -1910,9 +1909,12 @@ static int azx_position_ok(struct azx *chip, struct azx_dev *azx_dev)
if (WARN_ONCE(!azx_dev->period_bytes, "hda-intel: zero azx_dev->period_bytes")) - return 0; /* this shouldn't happen! */ - if (pos % azx_dev->period_bytes > azx_dev->period_bytes / 2) - return 0; /* NG - it's below the period boundary */ + /* this shouldn't happen! */ + return bdl_pos_adj[chip->dev_index] ? 0 : -1; + if (wallclk <= azx_dev->period_wallclk && + pos % azx_dev->period_bytes > azx_dev->period_bytes / 2) + /* NG - it's below the first next period boundary */ + return bdl_pos_adj[chip->dev_index] ? 0 : -1; azx_dev->start_wallclk = wallclk; return 1; /* OK, it's fine */ }
I don't see any reason to call snd_pcm_elapsed() in bad time. It's better to return -1 (ignore) and wait for next interrupt.
Jaroslav
----- Jaroslav Kysela perex@perex.cz Linux Kernel Sound Maintainer ALSA Project, Red Hat, Inc.
At Tue, 11 May 2010 11:07:05 +0200 (CEST), Jaroslav Kysela wrote:
On Tue, 11 May 2010, Takashi Iwai wrote:
At Tue, 11 May 2010 10:44:17 +0200 (CEST), Jaroslav Kysela wrote:
On Tue, 11 May 2010, Takashi Iwai wrote:
At Tue, 11 May 2010 10:28:50 +0200 (CEST), Jaroslav Kysela wrote:
On Mon, 10 May 2010, Jaroslav Kysela wrote:
Hi,
I would like to ask HDA gurus to check this patch (I will include it to my tree once acked). The patch uses WALLCLK from HDA chips (marked as required in the HDA specification) to check for bogus - too early - interrups which confuses the upper PCM layer (sound skipping issues).
More details about patch testing on problematic hardware can be found at:
No objections received, so I pushed this code to my master/devel branches.
Heh, that was quick, I had no time to review :)
The change looks good to me. Thanks for your fix (and rebasing).
One thing is that you removed bdl_pos_adj=0 check. This was there explicitly to avoid the delayed irq handling no matter whether the value is correct or not. I guess it's safer to take it back, or give any other way for user to avoid the delayed irq handling.
I see. But I need the pos % period_bytes check too. What about this change?
With bdl_pos_adj==0, it should return 1 (non-error), but otherwise this should be OK, I guess.
The problem with returning value 1 -> call snd_pcm_elapsed() - is that the upper layer will be most probably confused on broken hw. From logs in kernel bz#15912, the bad irq can be triggered at any time (I really don't know why INTSTS register is set).
The first wallclk check uses to check for first 2/3 of period and the pos % period_bytes check checks for second half of period. Writing this, I think that the extra check can be added to the second condition to do position related check for first next period only:
Isn't a similar check done in PCM core side? Filtering or delaying ack is perhaps better done here, though...
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index 0a6c55b..d4d532a 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -1890,9 +1890,8 @@ static int azx_position_ok(struct azx *chip, struct azx_dev *azx_dev) unsigned int pos; int stream;
- wallclk = azx_readl(chip, WALLCLK);
- if ((wallclk - azx_dev->start_wallclk) <
(azx_dev->period_wallclk * 2) / 3)
wallclk = azx_readl(chip, WALLCLK) - azx_dev->start_wallclk;
if (wallclk < (azx_dev->period_wallclk * 2) / 3) return -1; /* bogus (too early) interrupt */
stream = azx_dev->substream->stream;
@@ -1910,9 +1909,12 @@ static int azx_position_ok(struct azx *chip, struct azx_dev *azx_dev)
if (WARN_ONCE(!azx_dev->period_bytes, "hda-intel: zero azx_dev->period_bytes"))
return 0; /* this shouldn't happen! */
- if (pos % azx_dev->period_bytes > azx_dev->period_bytes / 2)
return 0; /* NG - it's below the period boundary */
/* this shouldn't happen! */
return bdl_pos_adj[chip->dev_index] ? 0 : -1;
- if (wallclk <= azx_dev->period_wallclk &&
pos % azx_dev->period_bytes > azx_dev->period_bytes / 2)
/* NG - it's below the first next period boundary */
azx_dev->start_wallclk = wallclk; return 1; /* OK, it's fine */ }return bdl_pos_adj[chip->dev_index] ? 0 : -1;
I don't see any reason to call snd_pcm_elapsed() in bad time. It's better to return -1 (ignore) and wait for next interrupt.
Hm, judging from the current situation, a missing IRQ is harmless than the wrong position ack, indeed.
But delaying the period handling is a bit different from filtering bogus IRQs. I'm not entirely sure which is the cause in the bug above. If it's a bogus IRQ, could it be filtered out. If it's an IRQ at wrong time (too early), the delayed ack is needed. If it's the wrong position read, the delayed ack isn't needed but the position correction is needed.
So, I'm a bit afraid of adding too much checks that might give a reaction based on wrong information. We'd need to take a detailed look again at each cause...
BTW, now I looked back again, and think WARN_ONCE(!azx_dev->period_bytes) should be also returning -1 for normal cases, no? This is just a sanity check, and if this happens, the chance that you get the right IRQ at the next moment isn't high, since something got screwed up.
thanks,
Takashi
On Tue, 11 May 2010, Takashi Iwai wrote:
At Tue, 11 May 2010 11:07:05 +0200 (CEST), Jaroslav Kysela wrote:
On Tue, 11 May 2010, Takashi Iwai wrote:
At Tue, 11 May 2010 10:44:17 +0200 (CEST), Jaroslav Kysela wrote:
On Tue, 11 May 2010, Takashi Iwai wrote:
At Tue, 11 May 2010 10:28:50 +0200 (CEST), Jaroslav Kysela wrote:
On Mon, 10 May 2010, Jaroslav Kysela wrote:
> Hi, > > I would like to ask HDA gurus to check this patch (I will include it > to my tree once acked). The patch uses WALLCLK from HDA chips (marked as > required in the HDA specification) to check for bogus - too early - interrups > which confuses the upper PCM layer (sound skipping issues). > > More details about patch testing on problematic hardware can be found > at: > > https://bugzilla.kernel.org/show_bug.cgi?id=15912
No objections received, so I pushed this code to my master/devel branches.
Heh, that was quick, I had no time to review :)
The change looks good to me. Thanks for your fix (and rebasing).
One thing is that you removed bdl_pos_adj=0 check. This was there explicitly to avoid the delayed irq handling no matter whether the value is correct or not. I guess it's safer to take it back, or give any other way for user to avoid the delayed irq handling.
I see. But I need the pos % period_bytes check too. What about this change?
With bdl_pos_adj==0, it should return 1 (non-error), but otherwise this should be OK, I guess.
The problem with returning value 1 -> call snd_pcm_elapsed() - is that the upper layer will be most probably confused on broken hw. From logs in kernel bz#15912, the bad irq can be triggered at any time (I really don't know why INTSTS register is set).
The first wallclk check uses to check for first 2/3 of period and the pos % period_bytes check checks for second half of period. Writing this, I think that the extra check can be added to the second condition to do position related check for first next period only:
Isn't a similar check done in PCM core side? Filtering or delaying ack is perhaps better done here, though...
Unfortunately, the check in PCM core expects that at least 'period_time' is elapsed between two snd_pcm_elapsed() calls:
if (in_interrupt) { /* we know that one period was processed */ /* delta = "expected next hw_ptr" for in_interrupt != 0 */ delta = runtime->hw_ptr_interrupt + runtime->period_size;
In situation, when the driver calls the snd_pcm_elapsed() too early, the pointers will be updated wrongly (whole ring buffer size will be added to hw_ptr).
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index 0a6c55b..d4d532a 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -1890,9 +1890,8 @@ static int azx_position_ok(struct azx *chip, struct azx_dev *azx_dev) unsigned int pos; int stream;
- wallclk = azx_readl(chip, WALLCLK);
- if ((wallclk - azx_dev->start_wallclk) <
(azx_dev->period_wallclk * 2) / 3)
wallclk = azx_readl(chip, WALLCLK) - azx_dev->start_wallclk;
if (wallclk < (azx_dev->period_wallclk * 2) / 3) return -1; /* bogus (too early) interrupt */
stream = azx_dev->substream->stream;
@@ -1910,9 +1909,12 @@ static int azx_position_ok(struct azx *chip, struct azx_dev *azx_dev)
if (WARN_ONCE(!azx_dev->period_bytes, "hda-intel: zero azx_dev->period_bytes"))
return 0; /* this shouldn't happen! */
- if (pos % azx_dev->period_bytes > azx_dev->period_bytes / 2)
return 0; /* NG - it's below the period boundary */
/* this shouldn't happen! */
return bdl_pos_adj[chip->dev_index] ? 0 : -1;
- if (wallclk <= azx_dev->period_wallclk &&
pos % azx_dev->period_bytes > azx_dev->period_bytes / 2)
/* NG - it's below the first next period boundary */
azx_dev->start_wallclk = wallclk; return 1; /* OK, it's fine */ }return bdl_pos_adj[chip->dev_index] ? 0 : -1;
I don't see any reason to call snd_pcm_elapsed() in bad time. It's better to return -1 (ignore) and wait for next interrupt.
Hm, judging from the current situation, a missing IRQ is harmless than the wrong position ack, indeed.
But delaying the period handling is a bit different from filtering bogus IRQs. I'm not entirely sure which is the cause in the bug above. If it's a bogus IRQ, could it be filtered out. If it's an IRQ at wrong time (too early), the delayed ack is needed. If it's the wrong position read, the delayed ack isn't needed but the position correction is needed.
So, I'm a bit afraid of adding too much checks that might give a reaction based on wrong information. We'd need to take a detailed look again at each cause...
I'm also not so much happy to have this code in the driver, but it seems that HDA hardware does not respect the standard behaviour in some cases. Anyway, it's good to have the code robust.
BTW, now I looked back again, and think WARN_ONCE(!azx_dev->period_bytes) should be also returning -1 for normal cases, no? This is just a sanity check, and if this happens, the chance that you get the right IRQ at the next moment isn't high, since something got screwed up.
I agree. I put all suggested changes to my tree into "[ALSA] snd-hda-intel: Improve azx_position_ok()" commit.
Jaroslav
----- Jaroslav Kysela perex@perex.cz Linux Kernel Sound Maintainer ALSA Project, Red Hat, Inc.
At Tue, 11 May 2010 12:23:19 +0200 (CEST), Jaroslav Kysela wrote:
On Tue, 11 May 2010, Takashi Iwai wrote:
At Tue, 11 May 2010 11:07:05 +0200 (CEST), Jaroslav Kysela wrote:
On Tue, 11 May 2010, Takashi Iwai wrote:
At Tue, 11 May 2010 10:44:17 +0200 (CEST), Jaroslav Kysela wrote:
On Tue, 11 May 2010, Takashi Iwai wrote:
At Tue, 11 May 2010 10:28:50 +0200 (CEST), Jaroslav Kysela wrote: > > On Mon, 10 May 2010, Jaroslav Kysela wrote: > >> Hi, >> >> I would like to ask HDA gurus to check this patch (I will include it >> to my tree once acked). The patch uses WALLCLK from HDA chips (marked as >> required in the HDA specification) to check for bogus - too early - interrups >> which confuses the upper PCM layer (sound skipping issues). >> >> More details about patch testing on problematic hardware can be found >> at: >> >> https://bugzilla.kernel.org/show_bug.cgi?id=15912 > > No objections received, so I pushed this code to my master/devel branches.
Heh, that was quick, I had no time to review :)
The change looks good to me. Thanks for your fix (and rebasing).
One thing is that you removed bdl_pos_adj=0 check. This was there explicitly to avoid the delayed irq handling no matter whether the value is correct or not. I guess it's safer to take it back, or give any other way for user to avoid the delayed irq handling.
I see. But I need the pos % period_bytes check too. What about this change?
With bdl_pos_adj==0, it should return 1 (non-error), but otherwise this should be OK, I guess.
The problem with returning value 1 -> call snd_pcm_elapsed() - is that the upper layer will be most probably confused on broken hw. From logs in kernel bz#15912, the bad irq can be triggered at any time (I really don't know why INTSTS register is set).
The first wallclk check uses to check for first 2/3 of period and the pos % period_bytes check checks for second half of period. Writing this, I think that the extra check can be added to the second condition to do position related check for first next period only:
Isn't a similar check done in PCM core side? Filtering or delaying ack is perhaps better done here, though...
Unfortunately, the check in PCM core expects that at least 'period_time' is elapsed between two snd_pcm_elapsed() calls:
if (in_interrupt) { /* we know that one period was processed */ /* delta = "expected next hw_ptr" for in_interrupt != 0 */ delta = runtime->hw_ptr_interrupt + runtime->period_size;
In situation, when the driver calls the snd_pcm_elapsed() too early, the pointers will be updated wrongly (whole ring buffer size will be added to hw_ptr).
Yeah, that's bad.
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index 0a6c55b..d4d532a 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -1890,9 +1890,8 @@ static int azx_position_ok(struct azx *chip, struct azx_dev *azx_dev) unsigned int pos; int stream;
- wallclk = azx_readl(chip, WALLCLK);
- if ((wallclk - azx_dev->start_wallclk) <
(azx_dev->period_wallclk * 2) / 3)
wallclk = azx_readl(chip, WALLCLK) - azx_dev->start_wallclk;
if (wallclk < (azx_dev->period_wallclk * 2) / 3) return -1; /* bogus (too early) interrupt */
stream = azx_dev->substream->stream;
@@ -1910,9 +1909,12 @@ static int azx_position_ok(struct azx *chip, struct azx_dev *azx_dev)
if (WARN_ONCE(!azx_dev->period_bytes, "hda-intel: zero azx_dev->period_bytes"))
return 0; /* this shouldn't happen! */
- if (pos % azx_dev->period_bytes > azx_dev->period_bytes / 2)
return 0; /* NG - it's below the period boundary */
/* this shouldn't happen! */
return bdl_pos_adj[chip->dev_index] ? 0 : -1;
- if (wallclk <= azx_dev->period_wallclk &&
pos % azx_dev->period_bytes > azx_dev->period_bytes / 2)
/* NG - it's below the first next period boundary */
azx_dev->start_wallclk = wallclk; return 1; /* OK, it's fine */ }return bdl_pos_adj[chip->dev_index] ? 0 : -1;
I don't see any reason to call snd_pcm_elapsed() in bad time. It's better to return -1 (ignore) and wait for next interrupt.
Hm, judging from the current situation, a missing IRQ is harmless than the wrong position ack, indeed.
But delaying the period handling is a bit different from filtering bogus IRQs. I'm not entirely sure which is the cause in the bug above. If it's a bogus IRQ, could it be filtered out. If it's an IRQ at wrong time (too early), the delayed ack is needed. If it's the wrong position read, the delayed ack isn't needed but the position correction is needed.
So, I'm a bit afraid of adding too much checks that might give a reaction based on wrong information. We'd need to take a detailed look again at each cause...
I'm also not so much happy to have this code in the driver, but it seems that HDA hardware does not respect the standard behaviour in some cases. Anyway, it's good to have the code robust.
BTW, now I looked back again, and think WARN_ONCE(!azx_dev->period_bytes) should be also returning -1 for normal cases, no? This is just a sanity check, and if this happens, the chance that you get the right IRQ at the next moment isn't high, since something got screwed up.
I agree. I put all suggested changes to my tree into "[ALSA] snd-hda-intel: Improve azx_position_ok()" commit.
Thanks! I pulled now this, too.
Takashi
participants (2)
-
Jaroslav Kysela
-
Takashi Iwai