Re: [GIT PULL] sound updates for 5.14-rc1
On Fri, Jul 2, 2021 at 9:37 PM Linus Torvalds torvalds@linux-foundation.org wrote:
But I thought I'd report this as a likely candidate.
Confirmed. The watchdog hang bisects right down to commit 9ce650a75a3b ("ALSA: usb-audio: Reduce latency at playback start").
And reverting it on top of my tree also fixes the hang, so it's not some bisection fluke.
I have no idea what is actually wrong with that commit, but it most definitely is the problem, and I have reverted it in my tree so that I can continue merging stuff tomorrow.
Linus
Hi,
On Fri, Jul 02, 2021 at 10:19:46PM -0700, Linus Torvalds wrote:
On Fri, Jul 2, 2021 at 9:37 PM Linus Torvalds torvalds@linux-foundation.org wrote:
But I thought I'd report this as a likely candidate.
Confirmed. The watchdog hang bisects right down to commit 9ce650a75a3b ("ALSA: usb-audio: Reduce latency at playback start").
And reverting it on top of my tree also fixes the hang, so it's not some bisection fluke.
I have no idea what is actually wrong with that commit, but it most definitely is the problem, and I have reverted it in my tree so that I can continue merging stuff tomorrow.
The cause seems to be the attempt to lock PCM substream recursively introduced by the issued commit.
Would I ask you to test with below patch? I apologize that the patch is still untested in my side since at present I have no preparation to debug USB stuffs instantly (I'm just a maintainer for ALSA firewire stack...), so I'm glad if getting your cooperation for the issue.
======== 8< --------
From f7ab449f10152635ad7083aa73d80e3fb1adabb4 Mon Sep 17 00:00:00 2001
From: Takashi Sakamoto o-takashi@sakamocchi.jp Date: Sat, 3 Jul 2021 15:23:25 +0900 Subject: [PATCH] ALSA: usb-audio: fix recursive lock of PCM substream when starting playback PCM substream
A commit 9ce650a75a3b ("ALSA: usb-audio: Reduce latency at playback start") unfortunately introduced the call of snd_pcm_period_elapsed() under acquired lock of PCM substream. This causes recursive lock and results in dead-lock.
->ioctl(2) (sound/core/pcm_native.c) ->snd_pcm_stream_lock_irqsave() <- ... ->struct snd_pcm_ops.trigger() (sound/usb/pcm.c) = snd_usb_substream_playback_trigger() ->start_endpoints() (sound/usb/endpoint.c) ->snd_usb_endpoint_start() ->prepare_outbound_urb() ->struct snd_usb_endpoint.prepare_data_urb() (sound/usb/pcm.c) = prepare_playback_urb() (sound/core/pcm_lib.c) ->snd_pcm_period_elapsed() (sound/core/pcm_native.c) ->snd_pcm_stream_lock_irqsave() <-
This commit fixes the issue to use newly added function; snd_pcm_period_elapsed_under_stream_lock() with condition to check running context.
Reported-by: Hector Martin marcan@marcan.st Fixes: 9ce650a75a3b ("ALSA: usb-audio: Reduce latency at playback start") Signed-off-by: Takashi Sakamoto o-takashi@sakamocchi.jp --- sound/usb/pcm.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c index c66831ee15f9..235070f0236a 100644 --- a/sound/usb/pcm.c +++ b/sound/usb/pcm.c @@ -1395,8 +1395,16 @@ static void prepare_playback_urb(struct snd_usb_substream *subs,
spin_unlock_irqrestore(&subs->lock, flags); urb->transfer_buffer_length = bytes; - if (period_elapsed) - snd_pcm_period_elapsed(subs->pcm_substream); + if (period_elapsed) { + // The callback of struct snd_pcm_ops.trigger with SNDRV_PCM_TRIGGER_START command + // can reach here, under acquired lock of PCM substream. To avoid dead-lock, check + // current context and call corresponding function. + if (in_softirq()) { + snd_pcm_period_elapsed(subs->pcm_substream); + } else { + snd_pcm_period_elapsed_under_stream_lock(subs->pcm_substream); + } + } }
/*
On Sat, 03 Jul 2021 08:38:48 +0200, Takashi Sakamoto wrote:
Hi,
On Fri, Jul 02, 2021 at 10:19:46PM -0700, Linus Torvalds wrote:
On Fri, Jul 2, 2021 at 9:37 PM Linus Torvalds torvalds@linux-foundation.org wrote:
But I thought I'd report this as a likely candidate.
Confirmed. The watchdog hang bisects right down to commit 9ce650a75a3b ("ALSA: usb-audio: Reduce latency at playback start").
And reverting it on top of my tree also fixes the hang, so it's not some bisection fluke.
I have no idea what is actually wrong with that commit, but it most definitely is the problem, and I have reverted it in my tree so that I can continue merging stuff tomorrow.
The cause seems to be the attempt to lock PCM substream recursively introduced by the issued commit.
Would I ask you to test with below patch? I apologize that the patch is still untested in my side since at present I have no preparation to debug USB stuffs instantly (I'm just a maintainer for ALSA firewire stack...), so I'm glad if getting your cooperation for the issue.
That's no ideal workaround because it'll call snd_pcm_period_elapsed() before the stream actually gets started. That said, it's not only about the lock but also about the state change, too.
Below is another possible fix. This moves conditionally the snd_pcm_period_elapsed() call to the complete callback, so that it'll be processed in a different context.
Unfortunately I can't test much right now in my side as I'm traveling (until the next Tuesday). So, Linus, Hector, please let me know if this works. Once when it's confirmed to work, I'll prepare the new PR including the fix later in today.
thanks,
Takashi
-- 8< -- From: Takashi Iwai tiwai@suse.de Subject: [PATCH] ALSA: usb-audio: Fix possible deadlock at playback start
The recent change for the PCM playback trigger caused an unexpected deadlock due to the period-elapsed handling at prepare_playback_urb(). This hasn't been a problem until now because the stream got started before the trigger call, but now this callback is called at the trigger, too, hence the problem surfaced.
As a workaround, this patch introduces a flag for delaying the snd_pcm_period_elapsed() call to the retire_playback_urb(), which is set when the hwptr reaches to the period boundary already at the PCM playback start time.
Fixes: 9ce650a75a3b ("ALSA: usb-audio: Reduce latency at playback start") Reported-by: Hector Martin marcan@marcan.st Reported-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Takashi Iwai tiwai@suse.de --- sound/usb/card.h | 1 + sound/usb/pcm.c | 10 ++++++++++ 2 files changed, 11 insertions(+)
diff --git a/sound/usb/card.h b/sound/usb/card.h index 5577a776561b..f309a5fafc1d 100644 --- a/sound/usb/card.h +++ b/sound/usb/card.h @@ -158,6 +158,7 @@ struct snd_usb_substream { unsigned int stream_offset_adj; /* Bytes to drop from beginning of stream (for non-compliant devices) */
unsigned int running: 1; /* running status */ + unsigned int period_elapsed_pending; /* issue at retire callback */
unsigned int buffer_bytes; /* buffer size in bytes */ unsigned int inflight_bytes; /* in-flight data bytes on buffer (for playback) */ diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c index c66831ee15f9..903f5d7e33e3 100644 --- a/sound/usb/pcm.c +++ b/sound/usb/pcm.c @@ -611,6 +611,7 @@ static int snd_usb_pcm_prepare(struct snd_pcm_substream *substream) subs->hwptr_done = 0; subs->transfer_done = 0; subs->last_frame_number = 0; + subs->period_elapsed_pending = 0; runtime->delay = 0;
unlock: @@ -1393,6 +1394,10 @@ static void prepare_playback_urb(struct snd_usb_substream *subs, subs->trigger_tstamp_pending_update = false; }
+ if (period_elapsed && !subs->running) { + subs->period_elapsed_pending = 1; + period_elapsed = 0; + } spin_unlock_irqrestore(&subs->lock, flags); urb->transfer_buffer_length = bytes; if (period_elapsed) @@ -1408,6 +1413,7 @@ static void retire_playback_urb(struct snd_usb_substream *subs, { unsigned long flags; struct snd_urb_ctx *ctx = urb->context; + bool period_elapsed;
spin_lock_irqsave(&subs->lock, flags); if (ctx->queued) { @@ -1418,7 +1424,11 @@ static void retire_playback_urb(struct snd_usb_substream *subs, }
subs->last_frame_number = usb_get_current_frame_number(subs->dev); + period_elapsed = subs->period_elapsed_pending; + subs->period_elapsed_pending = 0; spin_unlock_irqrestore(&subs->lock, flags); + if (period_elapsed) + snd_pcm_period_elapsed(subs->pcm_substream); }
static int snd_usb_substream_playback_trigger(struct snd_pcm_substream *substream,
On 03/07/2021 16.56, Takashi Iwai wrote:
Unfortunately I can't test much right now in my side as I'm traveling (until the next Tuesday). So, Linus, Hector, please let me know if this works. Once when it's confirmed to work, I'll prepare the new PR including the fix later in today.
Works for me on top of the for-next branch that was previously deadlocking. I can't get it to crash any more.
Tested-by: Hector Martin marcan@marcan.st
On Sat, 03 Jul 2021 14:06:36 +0200, Hector Martin wrote:
On 03/07/2021 16.56, Takashi Iwai wrote:
Unfortunately I can't test much right now in my side as I'm traveling (until the next Tuesday). So, Linus, Hector, please let me know if this works. Once when it's confirmed to work, I'll prepare the new PR including the fix later in today.
Works for me on top of the for-next branch that was previously deadlocking. I can't get it to crash any more.
Tested-by: Hector Martin marcan@marcan.st
Great, thanks for quick testing!
Takashi
On Sat, 03 Jul 2021 20:34:44 +0200, Takashi Iwai wrote:
On Sat, 03 Jul 2021 14:06:36 +0200, Hector Martin wrote:
On 03/07/2021 16.56, Takashi Iwai wrote:
Unfortunately I can't test much right now in my side as I'm traveling (until the next Tuesday). So, Linus, Hector, please let me know if this works. Once when it's confirmed to work, I'll prepare the new PR including the fix later in today.
Works for me on top of the for-next branch that was previously deadlocking. I can't get it to crash any more.
Tested-by: Hector Martin marcan@marcan.st
Great, thanks for quick testing!
And now I saw that Linus reverted the commit already in his tree, so no hurry for pushing the fix up for now.
I'll try to check the fix more closely in my side in the next week. Hopefully everything will be sorted out and the new feature is enabled again in the next PR.
thanks,
Takashi
participants (4)
-
Hector Martin
-
Linus Torvalds
-
Takashi Iwai
-
Takashi Sakamoto