Re: [alsa-devel] Linux 2.6.38 freeze because of sound/core/pcm_lib.c commit 59ff878ffb26bc0be812ca8295799164f413ae88
Christopher K. wrote:
Kernel 2.6.38 has been freezing ever since rc1 on my asus p7p55d. After booting up, the kernel seems to use a lot of cpu-time and after 20-30 seconds the system comes to a complete halt.
I just finished bisecting 2.6.37 - 2.6.38-rc1 and found this commit to be responsible:
59ff878ffb26bc0be812ca8295799164f413ae88
Please try this patch:
--- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -425,6 +425,8 @@ static int snd_pcm_hw_params(struct snd_ runtime->no_period_wakeup = (params->info & SNDRV_PCM_INFO_NO_PERIOD_WAKEUP) && (params->flags & SNDRV_PCM_HW_PARAMS_NO_PERIOD_WAKEUP); + runtime->hw_ptr_buffer_jiffies = + max(1uL, (runtime->buffer_size * HZ) / runtime->rate);
bits = snd_pcm_format_physical_width(runtime->format); runtime->sample_bits = bits; @@ -866,8 +868,6 @@ static void snd_pcm_post_start(struct sn struct snd_pcm_runtime *runtime = substream->runtime; snd_pcm_trigger_tstamp(substream); runtime->hw_ptr_jiffies = jiffies; - runtime->hw_ptr_buffer_jiffies = (runtime->buffer_size * HZ) / - runtime->rate; runtime->status->state = state; if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK && runtime->silence_size > 0)
On 31 March 2011 13:58, Clemens Ladisch clemens@ladisch.de wrote:
Please try this patch:
I'm afraid the patch you provided did not change anything for me. I applied it to 2.6.38.2, still freezes 15 - 20 seconds after startup (Could be around the time pulseaudio initializes). Is there any way to provide you with more debug information?
Onboard: VIA VT1828S
HDA Intel 0000:00:1b.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22 HDA Intel 0000:00:1b.0: irq 47 for MSI/MSI-X HDA Intel 0000:00:1b.0: setting latency timer to 64 nvidia: module license 'NVIDIA' taints kernel. hda-codec: no NID for mapping control Independent HP:0:0 HDA Intel 0000:01:00.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17 hda_intel: Disable MSI for Nvidia chipset HDA Intel 0000:01:00.1: setting latency timer to 64 ... hda_intel: Disable MSI for Nvidia chipset nvidia 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16 nvidia 0000:01:00.0: setting latency timer to 64 NVRM: loading NVIDIA UNIX x86_64 Kernel Module 270.30 Fri Feb 25 14:34:55 PST 2011
I use a Nvidia gtx 460, could it be related (as it features sound capabilities)? Using nouveau does not change anything, neither does 2.6.39-rc1.
On 03/31/11 10:09, Christopher K. wrote:
On 31 March 2011 13:58, Clemens Ladischclemens@ladisch.de wrote:
Please try this patch:
I'm afraid the patch you provided did not change anything for me. I applied it to 2.6.38.2, still freezes 15 - 20 seconds after startup (Could be around the time pulseaudio initializes). Is there any way to provide you with more debug information?
Onboard: VIA VT1828S
HDA Intel 0000:00:1b.0: PCI INT A -> GSI 22 (level, low) -> IRQ 22 HDA Intel 0000:00:1b.0: irq 47 for MSI/MSI-X HDA Intel 0000:00:1b.0: setting latency timer to 64 nvidia: module license 'NVIDIA' taints kernel. hda-codec: no NID for mapping control Independent HP:0:0 HDA Intel 0000:01:00.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17 hda_intel: Disable MSI for Nvidia chipset HDA Intel 0000:01:00.1: setting latency timer to 64 ... hda_intel: Disable MSI for Nvidia chipset nvidia 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16 nvidia 0000:01:00.0: setting latency timer to 64 NVRM: loading NVIDIA UNIX x86_64 Kernel Module 270.30 Fri Feb 25 14:34:55 PST 2011
I use a Nvidia gtx 460, could it be related (as it features sound capabilities)? Using nouveau does not change anything, neither does 2.6.39-rc1.
Clemens,
Just to let you know. I'm using a gtx460 also. I haven't had a chance to reboot into the newly patched kernel yet. It is interesting that we both have gtx460's. I have a machine with a gtx430 that doesn't seem to exhibit the problem.
cat /etc/modprobe.d/snd_hda_intel.conf
options snd-hda-intel probe_mask=-1,0xa
On 03/31/11 15:50, Kelly Anderson wrote:
Just to let you know. I'm using a gtx460 also. I haven't had a chance to reboot into the newly patched kernel yet. It is interesting that we both have gtx460's. I have a machine with a gtx430 that doesn't seem to exhibit the problem.
cat /etc/modprobe.d/snd_hda_intel.conf
options snd-hda-intel probe_mask=-1,0xa
OK, I debugged the problem and came up with a solution. The bottom line is that it's a problem with a signed to unsigned compare. First the fix:
--- ./sound/core/pcm_lib.c.orig 2011-03-27 12:37:20.000000000 -0600 +++ ./sound/core/pcm_lib.c 2011-03-31 19:01:35.392739127 -0600 @@ -379,17 +379,17 @@ static int snd_pcm_update_hw_ptr0(struct * Without regular period interrupts, we have to check * the elapsed time to detect xruns. */ - jdelta = jiffies - runtime->hw_ptr_jiffies; - if (jdelta < runtime->hw_ptr_buffer_jiffies / 2) + jdelta = (snd_pcm_sframes_t)(jiffies - runtime->hw_ptr_jiffies); + if (jdelta < (snd_pcm_sframes_t)runtime->hw_ptr_buffer_jiffies / 2) goto no_delta_check; hdelta = jdelta - delta * HZ / runtime->rate; - while (hdelta > runtime->hw_ptr_buffer_jiffies / 2 + 1) { + while (hdelta > (snd_pcm_sframes_t)runtime->hw_ptr_buffer_jiffies / 2 + 1) { delta += runtime->buffer_size; hw_base += runtime->buffer_size; if (hw_base >= runtime->boundary) hw_base = 0; new_hw_ptr = hw_base + pos; - hdelta -= runtime->hw_ptr_buffer_jiffies; + hdelta -= (snd_pcm_sframes_t)runtime->hw_ptr_buffer_jiffies; } goto no_delta_check; }
And here's the patch I used to debug the problem:
--- ./sound/core/pcm_lib.c.orig 2011-03-27 12:37:20.000000000 -0600 +++ ./sound/core/pcm_lib.c 2011-03-31 18:44:23.701411841 -0600 @@ -379,18 +379,31 @@ static int snd_pcm_update_hw_ptr0(struct * Without regular period interrupts, we have to check * the elapsed time to detect xruns. */ + int loopCount = 0; + BUG_ON(runtime->hw_ptr_jiffies < 0); + BUG_ON(runtime->hw_ptr_jiffies == 0); + BUG_ON(runtime->hw_ptr_jiffies > jiffies); jdelta = jiffies - runtime->hw_ptr_jiffies; if (jdelta < runtime->hw_ptr_buffer_jiffies / 2) goto no_delta_check; hdelta = jdelta - delta * HZ / runtime->rate; - while (hdelta > runtime->hw_ptr_buffer_jiffies / 2 + 1) { + printk(KERN_INFO "Before loop: jdelta=%ld, hdelta=%ld runtime->hw_ptr_buffer_jiffies=%lu\n" + , jdelta , hdelta , runtime->hw_ptr_buffer_jiffies); + while (hdelta > runtime->hw_ptr_buffer_jiffies / 2 + 1) { delta += runtime->buffer_size; hw_base += runtime->buffer_size; if (hw_base >= runtime->boundary) hw_base = 0; new_hw_ptr = hw_base + pos; - hdelta -= runtime->hw_ptr_buffer_jiffies; + hdelta -= runtime->hw_ptr_buffer_jiffies; + if (++loopCount >= 1024*1024*4) + { + printk(KERN_INFO "stopped the madness at 4 thousand\n"); + break; + } } + printk(KERN_INFO "After loop: jdelta=%ld, hdelta=%ld runtime->hw_ptr_buffer_jiffies=%lu\n" + , jdelta , hdelta , runtime->hw_ptr_buffer_jiffies); goto no_delta_check; }
The kernel output with the testing patch applied:
Mar 31 18:37:02 speedy kernel: Before loop: jdelta=352, hdelta=352 runtime->hw_ptr_buffer_jiffies=371 Mar 31 18:37:02 speedy kernel: stopped the madness at 4 million Mar 31 18:37:02 speedy kernel: After loop: jdelta=352, hdelta=-1556086432 runtime->hw_ptr_buffer_jiffies=371
On 03/31/11 15:50, Kelly Anderson wrote:
Just to let you know. I'm using a gtx460 also. I haven't had a chance to reboot into the newly patched kernel yet. It is interesting that we both have gtx460's. I have a machine with a gtx430 that doesn't seem to exhibit the problem.
cat /etc/modprobe.d/snd_hda_intel.conf
options snd-hda-intel probe_mask=-1,0xa
OK, I debugged the problem and came up with a solution. The bottom line is that it's a problem with a signed to unsigned compare. First the fix:
I can confirm that this patch fixes the issue. Thanks a lot!
Kelly Anderson wrote:
OK, I debugged the problem and came up with a solution. The bottom line is that it's a problem with a signed to unsigned compare.
Aaargh, I didn't suspect this.
First the fix:
--- ./sound/core/pcm_lib.c.orig 2011-03-27 12:37:20.000000000 -0600 +++ ./sound/core/pcm_lib.c 2011-03-31 19:01:35.392739127 -0600 @@ -379,17 +379,17 @@ static int snd_pcm_update_hw_ptr0(struct * Without regular period interrupts, we have to check * the elapsed time to detect xruns. */
jdelta = jiffies - runtime->hw_ptr_jiffies;
if (jdelta < runtime->hw_ptr_buffer_jiffies / 2)
jdelta = (snd_pcm_sframes_t)(jiffies - runtime->hw_ptr_jiffies);
if (jdelta < (snd_pcm_sframes_t)runtime->hw_ptr_buffer_jiffies / 2) goto no_delta_check; hdelta = jdelta - delta * HZ / runtime->rate;
while (hdelta > runtime->hw_ptr_buffer_jiffies / 2 + 1) {
while (hdelta > (snd_pcm_sframes_t)runtime->hw_ptr_buffer_jiffies / 2 + 1) { delta += runtime->buffer_size; hw_base += runtime->buffer_size; if (hw_base >= runtime->boundary) hw_base = 0; new_hw_ptr = hw_base + pos;
hdelta -= runtime->hw_ptr_buffer_jiffies;
hdelta -= (snd_pcm_sframes_t)runtime->hw_ptr_buffer_jiffies; } goto no_delta_check; }
Most of these casts have no effect; only the loop condition is important. Please test the patch below.
Kelly, to apply this patch I need a Signed-off-by tag from you (see Documentation/SubmittingPatches).
--8<---------------------------------------------------------------->8-- ALSA: pcm: fix infinite loop in snd_pcm_update_hw_ptr0()
When period interrupts are disabled, snd_pcm_update_hw_ptr0() compares the current time against the time estimated for the current hardware pointer to detect xruns. The somewhat fuzzy threshold in the while loop makes it possible that hdelta becomes negative; the comparison being done with unsigned types then makes the loop go through the entire 2^63 negative range, and, depending on the value, never reaching an unsigned value that is small enough to stop the loop. Doing this with interrupts disabled results in the machine locking up.
To prevent this, ensure that the loop condition uses signed types for both operands so that the comparison is correctly done.
Many thanks to Kelly Anderson for debugging this.
Reported-by: Nix nix@esperi.org.uk Reported-by: "Christopher K." c.krooss@googlemail.com Reported-by: Kelly Anderson kelly@silka.with-linux.com [cl: remove unneeded casts; use a temp variable] Signed-off-by: Clemens Ladisch clemens@ladisch.de
--- a/sound/core/pcm_lib.c +++ b/sound/core/pcm_lib.c @@ -375,6 +375,7 @@ static int snd_pcm_update_hw_ptr0(struct snd_pcm_substream *substream, }
if (runtime->no_period_wakeup) { + snd_pcm_sframes_t xrun_threshold; /* * Without regular period interrupts, we have to check * the elapsed time to detect xruns. @@ -383,7 +384,8 @@ static int snd_pcm_update_hw_ptr0(struct snd_pcm_substream *substream, if (jdelta < runtime->hw_ptr_buffer_jiffies / 2) goto no_delta_check; hdelta = jdelta - delta * HZ / runtime->rate; + xrun_threshold = runtime->hw_ptr_buffer_jiffies / 2 + 1; - while (hdelta > runtime->hw_ptr_buffer_jiffies / 2 + 1) { + while (hdelta > xrun_threshold) { delta += runtime->buffer_size; hw_base += runtime->buffer_size; if (hw_base >= runtime->boundary)
On 04/01/11 01:47, Clemens Ladisch wrote:
Kelly, to apply this patch I need a Signed-off-by tag from you (see Documentation/SubmittingPatches).
It works fine Clemens.
--8<---------------------------------------------------------------->8--
ALSA: pcm: fix infinite loop in snd_pcm_update_hw_ptr0()
When period interrupts are disabled, snd_pcm_update_hw_ptr0() compares the current time against the time estimated for the current hardware pointer to detect xruns. The somewhat fuzzy threshold in the while loop makes it possible that hdelta becomes negative; the comparison being done with unsigned types then makes the loop go through the entire 263 negative range, and, depending on the value, never reaching an unsigned value that is small enough to stop the loop. Doing this with interrupts disabled results in the machine locking up.
To prevent this, ensure that the loop condition uses signed types for both operands so that the comparison is correctly done.
Many thanks to Kelly Anderson for debugging this.
Reported-by: Nix nix@esperi.org.uk Reported-by: "Christopher K." c.krooss@googlemail.com Reported-by: Kelly Anderson kelly@silka.with-linux.com [cl: remove unneeded casts; use a temp variable] Signed-off-by: Clemens Ladisch clemens@ladisch.de Signed-off-by: Kelly Anderson kelly@silka.with-linux.com
--- a/sound/core/pcm_lib.c +++ b/sound/core/pcm_lib.c @@ -375,6 +375,7 @@ static int snd_pcm_update_hw_ptr0(struct snd_pcm_substream *substream, }
if (runtime->no_period_wakeup) { + snd_pcm_sframes_t xrun_threshold; /* * Without regular period interrupts, we have to check * the elapsed time to detect xruns. @@ -383,7 +384,8 @@ static int snd_pcm_update_hw_ptr0(struct snd_pcm_substream *substream, if (jdelta < runtime->hw_ptr_buffer_jiffies / 2) goto no_delta_check; hdelta = jdelta - delta * HZ / runtime->rate; + xrun_threshold = runtime->hw_ptr_buffer_jiffies / 2 + 1; - while (hdelta > runtime->hw_ptr_buffer_jiffies / 2 + 1) { + while (hdelta > xrun_threshold) { delta += runtime->buffer_size; hw_base += runtime->buffer_size; if (hw_base >= runtime->boundary) --
participants (3)
-
Christopher K.
-
Clemens Ladisch
-
Kelly Anderson