[alsa-devel] 2.6.30: xruns caused by "ALSA: pcm - Safer boundary checks"
Bug report: https://bugzilla.redhat.com/show_bug.cgi?id=498858
To reproduce, pause a video in mplayer and then hit play:
ALSA sound/core/pcm_lib.c:263: hda_codec: hw_ptr skipping! [Q] (pos=11263, delta=17400, period=1024, jdelta=21/362/0) ALSA sound/core/pcm_lib.c:337: hda_codec: hw_ptr skipping! (pos=11730, delta=1490, period=1024, jdelta=10/31) ALSA sound/core/pcm_lib.c:337: hda_codec: hw_ptr skipping! (pos=11734, delta=1494, period=1024, jdelta=10/31) ALSA sound/core/pcm_lib.c:337: hda_codec: hw_ptr skipping! (pos=11734, delta=1494, period=1024, jdelta=10/31) ALSA sound/core/pcm_lib.c:337: hda_codec: hw_ptr skipping! (pos=11735, delta=1495, period=1024, jdelta=10/31) ALSA sound/core/pcm_lib.c:263: hda_codec: hw_ptr skipping! [Q] (pos=12287, delta=2047, period=1024, jdelta=21/42/0) ALSA sound/core/pcm_lib.c:337: hda_codec: hw_ptr skipping! (pos=12449, delta=2209, period=1024, jdelta=4/46)
At Fri, 22 May 2009 16:27:09 -0400, Chuck Ebbert wrote:
Bug report: https://bugzilla.redhat.com/show_bug.cgi?id=498858
To reproduce, pause a video in mplayer and then hit play:
ALSA sound/core/pcm_lib.c:263: hda_codec: hw_ptr skipping! [Q] (pos=11263, delta=17400, period=1024, jdelta=21/362/0)
Does the patch fix the problem?
thanks,
Takashi
--- diff --git a/include/sound/pcm.h b/include/sound/pcm.h index c172968..7366910 100644 --- a/include/sound/pcm.h +++ b/include/sound/pcm.h @@ -266,6 +266,7 @@ struct snd_pcm_runtime { struct snd_pcm_substream *trigger_master; struct timespec trigger_tstamp; /* trigger timestamp */ int overrange; + unsigned int jiffies_set:1; /* hw_ptr_jiffies is set */ snd_pcm_uframes_t avail_max; snd_pcm_uframes_t hw_ptr_base; /* Position at buffer restart */ snd_pcm_uframes_t hw_ptr_interrupt; /* Position at interrupt time */ diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c index a2a792c..78edd6e 100644 --- a/sound/core/pcm_lib.c +++ b/sound/core/pcm_lib.c @@ -249,6 +249,8 @@ static int snd_pcm_update_hw_ptr_interrupt(struct snd_pcm_substream *substream) new_hw_ptr = hw_base + pos; } } + if (!runtime->jiffies_set) + goto no_jiffies_check; /* Skip the jiffies check for hardwares with BATCH flag. * Such hardware usually just increases the position at each IRQ, * thus it can't give any strange position. @@ -296,6 +298,7 @@ static int snd_pcm_update_hw_ptr_interrupt(struct snd_pcm_substream *substream) runtime->hw_ptr_base = hw_base; runtime->status->hw_ptr = new_hw_ptr; runtime->hw_ptr_jiffies = jiffies; + runtime->jiffies_set = 1; runtime->hw_ptr_interrupt = hw_ptr_interrupt;
return snd_pcm_update_hw_ptr_post(substream, runtime); @@ -336,7 +339,8 @@ int snd_pcm_update_hw_ptr(struct snd_pcm_substream *substream) hw_base = 0; new_hw_ptr = hw_base + pos; } - if (((delta * HZ) / runtime->rate) > jdelta + HZ/100) { + if (runtime->jiffies_set && + ((delta * HZ) / runtime->rate) > jdelta + HZ/100) { hw_ptr_error(substream, "hw_ptr skipping! " "(pos=%ld, delta=%ld, period=%ld, jdelta=%lu/%lu)\n", @@ -352,6 +356,7 @@ int snd_pcm_update_hw_ptr(struct snd_pcm_substream *substream) runtime->hw_ptr_base = hw_base; runtime->status->hw_ptr = new_hw_ptr; runtime->hw_ptr_jiffies = jiffies; + runtime->jiffies_set = 1;
return snd_pcm_update_hw_ptr_post(substream, runtime); } @@ -1478,7 +1483,6 @@ static int snd_pcm_lib_ioctl_reset(struct snd_pcm_substream *substream, runtime->status->hw_ptr %= runtime->buffer_size; else runtime->status->hw_ptr = 0; - runtime->hw_ptr_jiffies = jiffies; snd_pcm_stream_unlock_irqrestore(substream, flags); return 0; } diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index fc6f98e..5cf67a5 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -978,6 +978,7 @@ static void snd_pcm_post_pause(struct snd_pcm_substream *substream, int push) { struct snd_pcm_runtime *runtime = substream->runtime; snd_pcm_trigger_tstamp(substream); + runtime->jiffies_set = 0; if (push) { runtime->status->state = SNDRV_PCM_STATE_PAUSED; if (substream->timer) @@ -1041,6 +1042,7 @@ static void snd_pcm_post_suspend(struct snd_pcm_substream *substream, int state) &runtime->trigger_tstamp); runtime->status->suspended_state = runtime->status->state; runtime->status->state = SNDRV_PCM_STATE_SUSPENDED; + runtime->jiffies_set = 0; wake_up(&runtime->sleep); }
@@ -1234,6 +1236,7 @@ static int snd_pcm_do_reset(struct snd_pcm_substream *substream, int state) runtime->status->hw_ptr % runtime->period_size; runtime->silence_start = runtime->status->hw_ptr; runtime->silence_filled = 0; + runtime->jiffies_set = 0; return 0; }
On Fri, 22 May 2009, Takashi Iwai wrote:
At Fri, 22 May 2009 16:27:09 -0400, Chuck Ebbert wrote:
Bug report: https://bugzilla.redhat.com/show_bug.cgi?id=498858
To reproduce, pause a video in mplayer and then hit play:
ALSA sound/core/pcm_lib.c:263: hda_codec: hw_ptr skipping! [Q] (pos=11263, delta=17400, period=1024, jdelta=21/362/0)
Does the patch fix the problem?
This patch does not look as a proper fix. I'll look to this problem on Monday.
Jaroslav
----- Jaroslav Kysela perex@perex.cz Linux Kernel Sound Maintainer ALSA Project, Red Hat, Inc.
At Sat, 23 May 2009 14:31:57 +0200 (CEST), Jaroslav Kysela wrote:
On Fri, 22 May 2009, Takashi Iwai wrote:
At Fri, 22 May 2009 16:27:09 -0400, Chuck Ebbert wrote:
Bug report: https://bugzilla.redhat.com/show_bug.cgi?id=498858
To reproduce, pause a video in mplayer and then hit play:
ALSA sound/core/pcm_lib.c:263: hda_codec: hw_ptr skipping! [Q] (pos=11263, delta=17400, period=1024, jdelta=21/362/0)
Does the patch fix the problem?
This patch does not look as a proper fix.
Hm, at which point? It just skips the first unreliable jiffies check. At least, it's interesting whether it works around the problem.
thanks,
Takashi
On Sat, May 23, 2009 at 2:22 PM, Takashi Iwai tiwai@suse.de wrote:
At Sat, 23 May 2009 14:31:57 +0200 (CEST), Jaroslav Kysela wrote:
On Fri, 22 May 2009, Takashi Iwai wrote:
At Fri, 22 May 2009 16:27:09 -0400, Chuck Ebbert wrote:
Bug report: https://bugzilla.redhat.com/show_bug.cgi?id=498858
To reproduce, pause a video in mplayer and then hit play:
ALSA sound/core/pcm_lib.c:263: hda_codec: hw_ptr skipping! [Q] (pos=11263, delta=17400, period=1024, jdelta=21/362/0)
Does the patch fix the problem?
This patch does not look as a proper fix.
Hm, at which point? It just skips the first unreliable jiffies check. At least, it's interesting whether it works around the problem.
thanks,
Takashi _______________________________________________ Alsa-devel mailing list Alsa-devel@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
Takashi:
I have submitted https://bugtrack.alsa-project.org/alsa-bug/view.php?id=4523 in regards to this problem. I can recreated the xrun messages 100% of the time and the routine in question never recovers. With a workaround suggested by Jaroslav, the xruns go away, but the calling process ramps up to 100% CPU utilization and unless you are on a dual or quad core machine there is quite a lockup. I revert to the 1.0.19 pcmlib.c in the 1.0.20 driver and I see neither problem. Thanks for all of your help Jaroslav. If there is any way that I can help debug this problem, please let me know.
Best Regards,
At Sat, 23 May 2009 17:34:54 -0400, Robert Krakora wrote:
On Sat, May 23, 2009 at 2:22 PM, Takashi Iwai tiwai@suse.de wrote:
At Sat, 23 May 2009 14:31:57 +0200 (CEST), Jaroslav Kysela wrote:
On Fri, 22 May 2009, Takashi Iwai wrote:
At Fri, 22 May 2009 16:27:09 -0400, Chuck Ebbert wrote:
Bug report: https://bugzilla.redhat.com/show_bug.cgi?id=498858
To reproduce, pause a video in mplayer and then hit play:
ALSA sound/core/pcm_lib.c:263: hda_codec: hw_ptr skipping! [Q] (pos=11263, delta=17400, period=1024, jdelta=21/362/0)
Does the patch fix the problem?
This patch does not look as a proper fix.
Hm, at which point? It just skips the first unreliable jiffies check. At least, it's interesting whether it works around the problem.
thanks,
Takashi _______________________________________________ Alsa-devel mailing list Alsa-devel@alsa-project.org http://mailman.alsa-project.org/mailman/listinfo/alsa-devel
Takashi:
I have submitted https://bugtrack.alsa-project.org/alsa-bug/view.php?id=4523 in regards to this problem. I can recreated the xrun messages 100% of the time and the routine in question never recovers. With a workaround suggested by Jaroslav, the xruns go away, but the calling process ramps up to 100% CPU utilization and unless you are on a dual or quad core machine there is quite a lockup. I revert to the 1.0.19 pcmlib.c in the 1.0.20 driver and I see neither problem. Thanks for all of your help Jaroslav. If there is any way that I can help debug this problem, please let me know.
To be sure, did you try the latest alsa-driver snapshot (not 1.0.20)? ftp://ftp.kernel.org/pub/linux/kernel/people/tiwai/snapshot/alsa-driver-snapshot.tar.gz There are already many fixes since 1.0.20, so...
Your problem seems not quite as same as the problem stated in this thread. But, anyway testing my patch is helpful.
thanks,
Takashi
To be sure, did you try the latest alsa-driver snapshot (not 1.0.20)? ftp://ftp.kernel.org/pub/linux/kernel/people/tiwai/snapshot/alsa-driver-snapshot.tar.gz There are already many fixes since 1.0.20, so...
Your problem seems not quite as same as the problem stated in this thread. But, anyway testing my patch is helpful.
I am the original bug reporter to Fedora. Is there anything I can do to help troubleshoot this further?
At Sun, 24 May 2009 22:13:38 -0700, Nathan Grennan wrote:
To be sure, did you try the latest alsa-driver snapshot (not 1.0.20)? ftp://ftp.kernel.org/pub/linux/kernel/people/tiwai/snapshot/alsa-driver-snapshot.tar.gz There are already many fixes since 1.0.20, so...
Your problem seems not quite as same as the problem stated in this thread. But, anyway testing my patch is helpful.
I am the original bug reporter to Fedora. Is there anything I can do to help troubleshoot this further?
Testing the patch would be helpful :)
thanks,
Takashi
At Sun, 24 May 2009 23:56:40 -0700, Nathan Grennan wrote:
Takashi Iwai wrote:
Testing the patch would be helpful :)
Will do
Did you get any positive / negative result?
We are in the late rc stage, so a fix should be merged ASAP...
thanks,
Takashi
On Sat, 23 May 2009, Takashi Iwai wrote:
At Sat, 23 May 2009 14:31:57 +0200 (CEST), Jaroslav Kysela wrote:
On Fri, 22 May 2009, Takashi Iwai wrote:
At Fri, 22 May 2009 16:27:09 -0400, Chuck Ebbert wrote:
Bug report: https://bugzilla.redhat.com/show_bug.cgi?id=498858
To reproduce, pause a video in mplayer and then hit play:
ALSA sound/core/pcm_lib.c:263: hda_codec: hw_ptr skipping! [Q] (pos=11263, delta=17400, period=1024, jdelta=21/362/0)
Does the patch fix the problem?
This patch does not look as a proper fix.
Hm, at which point? It just skips the first unreliable jiffies check. At least, it's interesting whether it works around the problem.
I meant that it is not necessary to introduce a next variable. This patch is more elegant:
diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c index a2a792c..3eea98a 100644 --- a/sound/core/pcm_lib.c +++ b/sound/core/pcm_lib.c @@ -1478,7 +1478,6 @@ static int snd_pcm_lib_ioctl_reset(struct snd_pcm_substream *substream, runtime->status->hw_ptr %= runtime->buffer_size; else runtime->status->hw_ptr = 0; - runtime->hw_ptr_jiffies = jiffies; snd_pcm_stream_unlock_irqrestore(substream, flags); return 0; } diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index fc6f98e..7faec82 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -848,6 +848,7 @@ static void snd_pcm_post_start(struct snd_pcm_substream *substream, int state) { struct snd_pcm_runtime *runtime = substream->runtime; snd_pcm_trigger_tstamp(substream); + runtime->hw_ptr_jiffies = jiffies; runtime->status->state = state; if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK && runtime->silence_size > 0) @@ -961,6 +962,8 @@ static int snd_pcm_do_pause(struct snd_pcm_substream *substream, int push) { if (substream->runtime->trigger_master != substream) return 0; + /* skip one jiffies check */ + runtime->hw_ptr_jiffies = jiffies - HZ * 1000; return substream->ops->trigger(substream, push ? SNDRV_PCM_TRIGGER_PAUSE_PUSH : SNDRV_PCM_TRIGGER_PAUSE_RELEASE);
Jaroslav
----- Jaroslav Kysela perex@perex.cz Linux Kernel Sound Maintainer ALSA Project, Red Hat, Inc.
At Mon, 25 May 2009 12:19:19 +0200 (CEST), Jaroslav Kysela wrote:
On Sat, 23 May 2009, Takashi Iwai wrote:
At Sat, 23 May 2009 14:31:57 +0200 (CEST), Jaroslav Kysela wrote:
On Fri, 22 May 2009, Takashi Iwai wrote:
At Fri, 22 May 2009 16:27:09 -0400, Chuck Ebbert wrote:
Bug report: https://bugzilla.redhat.com/show_bug.cgi?id=498858
To reproduce, pause a video in mplayer and then hit play:
ALSA sound/core/pcm_lib.c:263: hda_codec: hw_ptr skipping! [Q] (pos=11263, delta=17400, period=1024, jdelta=21/362/0)
Does the patch fix the problem?
This patch does not look as a proper fix.
Hm, at which point? It just skips the first unreliable jiffies check. At least, it's interesting whether it works around the problem.
I meant that it is not necessary to introduce a next variable. This patch is more elegant:
diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c index a2a792c..3eea98a 100644 --- a/sound/core/pcm_lib.c +++ b/sound/core/pcm_lib.c @@ -1478,7 +1478,6 @@ static int snd_pcm_lib_ioctl_reset(struct snd_pcm_substream *substream, runtime->status->hw_ptr %= runtime->buffer_size; else runtime->status->hw_ptr = 0;
- runtime->hw_ptr_jiffies = jiffies; snd_pcm_stream_unlock_irqrestore(substream, flags); return 0; }
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index fc6f98e..7faec82 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -848,6 +848,7 @@ static void snd_pcm_post_start(struct snd_pcm_substream *substream, int state) { struct snd_pcm_runtime *runtime = substream->runtime; snd_pcm_trigger_tstamp(substream);
- runtime->hw_ptr_jiffies = jiffies; runtime->status->state = state; if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK && runtime->silence_size > 0)
@@ -961,6 +962,8 @@ static int snd_pcm_do_pause(struct snd_pcm_substream *substream, int push) { if (substream->runtime->trigger_master != substream) return 0;
- /* skip one jiffies check */
- runtime->hw_ptr_jiffies = jiffies - HZ * 1000;
This part is a bit tricky from readability POV, but I agree it's better to avoid a new field.
However, one thing I thought in my patch is that it's safer *not* to check the jiffies even for the first update after snd_pcm_start(). The potential problem we have now is that the first position check might be too early when the device has a large FIFO or so. By skipping the first jiffies check, this could be worked around. Usually the stream is stabilized and give periodic position updates once when it's started. But, after the first start, it's not always stable enough.
So, I think the second hunk in your patch could be also reset hw_ptr_jiffies to skip. (If so, we should define a macro to serve that to improve the readability, BTW.)
thanks,
Takashi
On Mon, 25 May 2009, Takashi Iwai wrote:
At Mon, 25 May 2009 12:19:19 +0200 (CEST), Jaroslav Kysela wrote:
On Sat, 23 May 2009, Takashi Iwai wrote:
At Sat, 23 May 2009 14:31:57 +0200 (CEST), Jaroslav Kysela wrote:
On Fri, 22 May 2009, Takashi Iwai wrote:
At Fri, 22 May 2009 16:27:09 -0400, Chuck Ebbert wrote:
Bug report: https://bugzilla.redhat.com/show_bug.cgi?id=498858
To reproduce, pause a video in mplayer and then hit play:
ALSA sound/core/pcm_lib.c:263: hda_codec: hw_ptr skipping! [Q] (pos=11263, delta=17400, period=1024, jdelta=21/362/0)
Does the patch fix the problem?
This patch does not look as a proper fix.
Hm, at which point? It just skips the first unreliable jiffies check. At least, it's interesting whether it works around the problem.
I meant that it is not necessary to introduce a next variable. This patch is more elegant:
diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c index a2a792c..3eea98a 100644 --- a/sound/core/pcm_lib.c +++ b/sound/core/pcm_lib.c @@ -1478,7 +1478,6 @@ static int snd_pcm_lib_ioctl_reset(struct snd_pcm_substream *substream, runtime->status->hw_ptr %= runtime->buffer_size; else runtime->status->hw_ptr = 0;
- runtime->hw_ptr_jiffies = jiffies; snd_pcm_stream_unlock_irqrestore(substream, flags); return 0; }
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index fc6f98e..7faec82 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -848,6 +848,7 @@ static void snd_pcm_post_start(struct snd_pcm_substream *substream, int state) { struct snd_pcm_runtime *runtime = substream->runtime; snd_pcm_trigger_tstamp(substream);
- runtime->hw_ptr_jiffies = jiffies; runtime->status->state = state; if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK && runtime->silence_size > 0)
@@ -961,6 +962,8 @@ static int snd_pcm_do_pause(struct snd_pcm_substream *substream, int push) { if (substream->runtime->trigger_master != substream) return 0;
- /* skip one jiffies check */
- runtime->hw_ptr_jiffies = jiffies - HZ * 1000;
This part is a bit tricky from readability POV, but I agree it's better to avoid a new field.
Perhaps a better comment might improve readability ;-)
However, one thing I thought in my patch is that it's safer *not* to check the jiffies even for the first update after snd_pcm_start(). The potential problem we have now is that the first position check might be too early when the device has a large FIFO or so. By skipping the first jiffies check, this could be worked around. Usually the stream is stabilized and give periodic position updates once when it's started. But, after the first start, it's not always stable enough.
I would use runtime->delay to update the check margin (hdelta -= runtime->delay) for such hardware instead this assumption. The runtime->delay can be updated in the lowlevel driver - pointer() callback. It will work for both fixed and variable FIFOs.
Jaroslav
----- Jaroslav Kysela perex@perex.cz Linux Kernel Sound Maintainer ALSA Project, Red Hat, Inc.
At Mon, 25 May 2009 13:24:46 +0200 (CEST), Jaroslav Kysela wrote:
On Mon, 25 May 2009, Takashi Iwai wrote:
At Mon, 25 May 2009 12:19:19 +0200 (CEST), Jaroslav Kysela wrote:
On Sat, 23 May 2009, Takashi Iwai wrote:
At Sat, 23 May 2009 14:31:57 +0200 (CEST), Jaroslav Kysela wrote:
On Fri, 22 May 2009, Takashi Iwai wrote:
At Fri, 22 May 2009 16:27:09 -0400, Chuck Ebbert wrote: > > Bug report: https://bugzilla.redhat.com/show_bug.cgi?id=498858 > > To reproduce, pause a video in mplayer and then hit play: > > ALSA sound/core/pcm_lib.c:263: hda_codec: hw_ptr skipping! [Q] (pos=11263, > delta=17400, period=1024, jdelta=21/362/0)
Does the patch fix the problem?
This patch does not look as a proper fix.
Hm, at which point? It just skips the first unreliable jiffies check. At least, it's interesting whether it works around the problem.
I meant that it is not necessary to introduce a next variable. This patch is more elegant:
diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c index a2a792c..3eea98a 100644 --- a/sound/core/pcm_lib.c +++ b/sound/core/pcm_lib.c @@ -1478,7 +1478,6 @@ static int snd_pcm_lib_ioctl_reset(struct snd_pcm_substream *substream, runtime->status->hw_ptr %= runtime->buffer_size; else runtime->status->hw_ptr = 0;
- runtime->hw_ptr_jiffies = jiffies; snd_pcm_stream_unlock_irqrestore(substream, flags); return 0; }
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index fc6f98e..7faec82 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -848,6 +848,7 @@ static void snd_pcm_post_start(struct snd_pcm_substream *substream, int state) { struct snd_pcm_runtime *runtime = substream->runtime; snd_pcm_trigger_tstamp(substream);
- runtime->hw_ptr_jiffies = jiffies; runtime->status->state = state; if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK && runtime->silence_size > 0)
@@ -961,6 +962,8 @@ static int snd_pcm_do_pause(struct snd_pcm_substream *substream, int push) { if (substream->runtime->trigger_master != substream) return 0;
- /* skip one jiffies check */
- runtime->hw_ptr_jiffies = jiffies - HZ * 1000;
This part is a bit tricky from readability POV, but I agree it's better to avoid a new field.
Perhaps a better comment might improve readability ;-)
Yep :)
However, one thing I thought in my patch is that it's safer *not* to check the jiffies even for the first update after snd_pcm_start(). The potential problem we have now is that the first position check might be too early when the device has a large FIFO or so. By skipping the first jiffies check, this could be worked around. Usually the stream is stabilized and give periodic position updates once when it's started. But, after the first start, it's not always stable enough.
I would use runtime->delay to update the check margin (hdelta -= runtime->delay) for such hardware instead this assumption. The runtime->delay can be updated in the lowlevel driver - pointer() callback. It will work for both fixed and variable FIFOs.
Right, but it's always hard to guess and set proper FIFO values to all possible devices, and it's too late for 2.6.30. As a working practice for 2.6.30, I'd vote for taking a safer option.
thanks,
Takashi
participants (5)
-
Chuck Ebbert
-
Jaroslav Kysela
-
Nathan Grennan
-
Robert Krakora
-
Takashi Iwai