Re: [alsa-devel] [PATCH] USB-sound: prevent kernel panic on disconnect
Alan Stern wrote:
On Fri, 11 Feb 2011, Sarah Sharp wrote:
I suspect it might be the audio driver, as someone on the alsa mailing list suggested the USB sound driver can't handle when one isochronous buffer in an URB has an error, but the URB status is 0.
I don't think that can happen. The URB status won't be 0 unless all the individual buffers have 0 status.
In any case, the driver would then just copy garbage samples out of the buffer; this wouldn't affect the driver's data structures.
From: Pierre-Louis Bossart bossart.nospam@gmail.com
Note that this is only a work-around, it does not address the root cause of this inconsistency between urbs and PCM states. The dmesg below shows two calls to snd_urb_complete, the substream is NULL and the state is either running or stopped. This doesn't make any sense. ... ALSA urb.c:492: frame 0 active: -84 ALSA urb.c:197: cannot submit urb (err = -19) ALSA urb.c:186: NULL substream (subs->running 1) <- How is this possible? ALSA urb.c:186: NULL substream (subs->running 0)
It's most likely a matter of the device being disconnected but the device file still being open.
subs->pcm_substream == NULL happens only when the device file _has_ been closed.
Perhaps not everything gets cleaned up the way it should when that happens.
There seems to be a race between snd_usb_pcm_close() and snd_usb_stream_disconnect(). I think a mutex taken by both functions should fix this; and all functions that check 'shutdown' probably need serializing.
Regards, Clemens
At Mon, 14 Feb 2011 09:54:01 +0100, Clemens Ladisch wrote:
Alan Stern wrote:
On Fri, 11 Feb 2011, Sarah Sharp wrote:
I suspect it might be the audio driver, as someone on the alsa mailing list suggested the USB sound driver can't handle when one isochronous buffer in an URB has an error, but the URB status is 0.
I don't think that can happen. The URB status won't be 0 unless all the individual buffers have 0 status.
In any case, the driver would then just copy garbage samples out of the buffer; this wouldn't affect the driver's data structures.
From: Pierre-Louis Bossart bossart.nospam@gmail.com
Note that this is only a work-around, it does not address the root cause of this inconsistency between urbs and PCM states. The dmesg below shows two calls to snd_urb_complete, the substream is NULL and the state is either running or stopped. This doesn't make any sense. ... ALSA urb.c:492: frame 0 active: -84 ALSA urb.c:197: cannot submit urb (err = -19) ALSA urb.c:186: NULL substream (subs->running 1) <- How is this possible? ALSA urb.c:186: NULL substream (subs->running 0)
It's most likely a matter of the device being disconnected but the device file still being open.
subs->pcm_substream == NULL happens only when the device file _has_ been closed.
More exactly, it's cleared the PCM close callback, so during it's being closed :) Looks like a race as you mentioned below, indeed.
Perhaps not everything gets cleaned up the way it should when that happens.
There seems to be a race between snd_usb_pcm_close() and snd_usb_stream_disconnect(). I think a mutex taken by both functions should fix this; and all functions that check 'shutdown' probably need serializing.
Or, rather make sure that snd_usb_release_substreams() is finished before PCM close; actually it should be called in hw_free callback, but currently it's skipped by the shutdown flag check. I guess this check isn't right.
The serialization of shutdown check is a good point, but this won't hit actual bugs, I suppose.
thanks,
Takashi
On Mon, Feb 14, 2011 at 06:39:00PM +0100, Takashi Iwai wrote:
At Mon, 14 Feb 2011 09:54:01 +0100, Clemens Ladisch wrote:
Alan Stern wrote:
On Fri, 11 Feb 2011, Sarah Sharp wrote:
I suspect it might be the audio driver, as someone on the alsa mailing list suggested the USB sound driver can't handle when one isochronous buffer in an URB has an error, but the URB status is 0.
I don't think that can happen. The URB status won't be 0 unless all the individual buffers have 0 status.
In any case, the driver would then just copy garbage samples out of the buffer; this wouldn't affect the driver's data structures.
From: Pierre-Louis Bossart bossart.nospam@gmail.com
Note that this is only a work-around, it does not address the root cause of this inconsistency between urbs and PCM states. The dmesg below shows two calls to snd_urb_complete, the substream is NULL and the state is either running or stopped. This doesn't make any sense. ... ALSA urb.c:492: frame 0 active: -84 ALSA urb.c:197: cannot submit urb (err = -19) ALSA urb.c:186: NULL substream (subs->running 1) <- How is this possible? ALSA urb.c:186: NULL substream (subs->running 0)
It's most likely a matter of the device being disconnected but the device file still being open.
subs->pcm_substream == NULL happens only when the device file _has_ been closed.
More exactly, it's cleared the PCM close callback, so during it's being closed :) Looks like a race as you mentioned below, indeed.
Perhaps not everything gets cleaned up the way it should when that happens.
There seems to be a race between snd_usb_pcm_close() and snd_usb_stream_disconnect(). I think a mutex taken by both functions should fix this; and all functions that check 'shutdown' probably need serializing.
Or, rather make sure that snd_usb_release_substreams() is finished before PCM close; actually it should be called in hw_free callback, but currently it's skipped by the shutdown flag check. I guess this check isn't right.
The serialization of shutdown check is a good point, but this won't hit actual bugs, I suppose.
Ok. I don't have the audio expertise to make a patch; can you or Clemens make one for Pierre to try out?
Sarah Sharp
Sarah Sharp wrote:
On Mon, Feb 14, 2011 at 06:39:00PM +0100, Takashi Iwai wrote:
Clemens Ladisch wrote:
There seems to be a race between snd_usb_pcm_close() and snd_usb_stream_disconnect(). I think a mutex taken by both functions should fix this; and all functions that check 'shutdown' probably need serializing.
Or, rather make sure that snd_usb_release_substreams() is finished before PCM close; actually it should be called in hw_free callback, but currently it's skipped by the shutdown flag check. I guess this check isn't right.
The serialization of shutdown check is a good point, but this won't hit actual bugs, I suppose.
Ok. I don't have the audio expertise to make a patch; can you or Clemens make one for Pierre to try out?
I'll try to make a patch until tomorrow.
Regards, Clemens
At Wed, 16 Feb 2011 15:54:22 -0800, Sarah Sharp wrote:
On Mon, Feb 14, 2011 at 06:39:00PM +0100, Takashi Iwai wrote:
At Mon, 14 Feb 2011 09:54:01 +0100, Clemens Ladisch wrote:
Alan Stern wrote:
On Fri, 11 Feb 2011, Sarah Sharp wrote:
I suspect it might be the audio driver, as someone on the alsa mailing list suggested the USB sound driver can't handle when one isochronous buffer in an URB has an error, but the URB status is 0.
I don't think that can happen. The URB status won't be 0 unless all the individual buffers have 0 status.
In any case, the driver would then just copy garbage samples out of the buffer; this wouldn't affect the driver's data structures.
From: Pierre-Louis Bossart bossart.nospam@gmail.com
Note that this is only a work-around, it does not address the root cause of this inconsistency between urbs and PCM states. The dmesg below shows two calls to snd_urb_complete, the substream is NULL and the state is either running or stopped. This doesn't make any sense. ... ALSA urb.c:492: frame 0 active: -84 ALSA urb.c:197: cannot submit urb (err = -19) ALSA urb.c:186: NULL substream (subs->running 1) <- How is this possible? ALSA urb.c:186: NULL substream (subs->running 0)
It's most likely a matter of the device being disconnected but the device file still being open.
subs->pcm_substream == NULL happens only when the device file _has_ been closed.
More exactly, it's cleared the PCM close callback, so during it's being closed :) Looks like a race as you mentioned below, indeed.
Perhaps not everything gets cleaned up the way it should when that happens.
There seems to be a race between snd_usb_pcm_close() and snd_usb_stream_disconnect(). I think a mutex taken by both functions should fix this; and all functions that check 'shutdown' probably need serializing.
Or, rather make sure that snd_usb_release_substreams() is finished before PCM close; actually it should be called in hw_free callback, but currently it's skipped by the shutdown flag check. I guess this check isn't right.
The serialization of shutdown check is a good point, but this won't hit actual bugs, I suppose.
Ok. I don't have the audio expertise to make a patch; can you or Clemens make one for Pierre to try out?
Well, a simple one like the follow is good to check at first. If this works, it's indeed a race between disconnect callback and PCM close (hw_free is always called before close).
We can refine it in a better form later once after it's confirmed :)
thanks,
Takashi
--- diff --git a/sound/usb/card.c b/sound/usb/card.c index 800f7cb..c0f8270 100644 --- a/sound/usb/card.c +++ b/sound/usb/card.c @@ -323,6 +323,7 @@ static int snd_usb_audio_create(struct usb_device *dev, int idx, return -ENOMEM; }
+ mutex_init(&chip->shutdown_mutex); chip->index = idx; chip->dev = dev; chip->card = card; @@ -531,6 +532,7 @@ static void snd_usb_audio_disconnect(struct usb_device *dev, void *ptr) chip = ptr; card = chip->card; mutex_lock(®ister_mutex); + mutex_lock(&chip->shutdown_mutex); chip->shutdown = 1; chip->num_interfaces--; if (chip->num_interfaces <= 0) { @@ -548,9 +550,11 @@ static void snd_usb_audio_disconnect(struct usb_device *dev, void *ptr) snd_usb_mixer_disconnect(p); } usb_chip[chip->index] = NULL; + mutex_unlock(&chip->shutdown_mutex); mutex_unlock(®ister_mutex); snd_card_free_when_closed(card); } else { + mutex_unlock(&chip->shutdown_mutex); mutex_unlock(®ister_mutex); } } diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c index 4132522..ca4de49 100644 --- a/sound/usb/pcm.c +++ b/sound/usb/pcm.c @@ -385,8 +385,10 @@ static int snd_usb_hw_free(struct snd_pcm_substream *substream) subs->cur_audiofmt = NULL; subs->cur_rate = 0; subs->period_bytes = 0; + mutex_lock(&subs->stream->chip->shutdown_mutex); if (!subs->stream->chip->shutdown) snd_usb_release_substream_urbs(subs, 0); + mutex_unlock(&subs->stream->chip->shutdown_mutex); return snd_pcm_lib_free_vmalloc_buffer(substream); }
diff --git a/sound/usb/usbaudio.h b/sound/usb/usbaudio.h index db3eb21..6e66fff 100644 --- a/sound/usb/usbaudio.h +++ b/sound/usb/usbaudio.h @@ -36,6 +36,7 @@ struct snd_usb_audio { struct snd_card *card; u32 usb_id; int shutdown; + struct mutex shutdown_mutex; unsigned int txfr_quirk:1; /* Subframe boundaries on transfers */ int num_interfaces; int num_suspended_intf;
From: Takashi Iwai tiwai@suse.de
When a USB audio device is disconnected, snd_usb_audio_disconnect() kills all audio URBs. At the same time, the application, after being notified of the disconnection, might close the device, in which case ALSA calls the .hw_free callback, which should free the URBs too.
Commit de1b8b93a0ba prevented snd_usb_hw_free() from freeing the URBs to avoid a hang that resulted from this race, but this introduced another race because the URB callbacks could now be executed after snd_usb_hw_free() has returned, and try to access already freed data.
Fix the first race by introducing a mutex to serialize the disconnect callback and all PCM callbacks that manage URBs (hw_free and hw_params).
Reported-by: Pierre-Louis Bossart pierre-louis.bossart@intel.com [CL: also serialize hw_params callback] Signed-off-by: Clemens Ladisch clemens@ladisch.de
diff --git a/sound/usb/card.c b/sound/usb/card.c index 800f7cb..c0f8270 100644 --- a/sound/usb/card.c +++ b/sound/usb/card.c @@ -323,6 +323,7 @@ static int snd_usb_audio_create(struct usb_device *dev, int idx, return -ENOMEM; }
+ mutex_init(&chip->shutdown_mutex); chip->index = idx; chip->dev = dev; chip->card = card; @@ -531,6 +532,7 @@ static void snd_usb_audio_disconnect(struct usb_device *dev, void *ptr) chip = ptr; card = chip->card; mutex_lock(®ister_mutex); + mutex_lock(&chip->shutdown_mutex); chip->shutdown = 1; chip->num_interfaces--; if (chip->num_interfaces <= 0) { @@ -548,9 +550,11 @@ static void snd_usb_audio_disconnect(struct usb_device *dev, void *ptr) snd_usb_mixer_disconnect(p); } usb_chip[chip->index] = NULL; + mutex_unlock(&chip->shutdown_mutex); mutex_unlock(®ister_mutex); snd_card_free_when_closed(card); } else { + mutex_unlock(&chip->shutdown_mutex); mutex_unlock(®ister_mutex); } } diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c index 4132522..e3f6805 100644 --- a/sound/usb/pcm.c +++ b/sound/usb/pcm.c @@ -361,6 +361,7 @@ static int snd_usb_hw_params(struct snd_pcm_substream *substream, }
if (changed) { + mutex_lock(&subs->stream->chip->shutdown_mutex); /* format changed */ snd_usb_release_substream_urbs(subs, 0); /* influenced: period_bytes, channels, rate, format, */ @@ -368,6 +369,7 @@ static int snd_usb_hw_params(struct snd_pcm_substream *substream, params_rate(hw_params), snd_pcm_format_physical_width(params_format(hw_params)) * params_channels(hw_params)); + mutex_unlock(&subs->stream->chip->shutdown_mutex); }
return ret; @@ -385,8 +387,9 @@ static int snd_usb_hw_free(struct snd_pcm_substream *substream) subs->cur_audiofmt = NULL; subs->cur_rate = 0; subs->period_bytes = 0; - if (!subs->stream->chip->shutdown) - snd_usb_release_substream_urbs(subs, 0); + mutex_lock(&subs->stream->chip->shutdown_mutex); + snd_usb_release_substream_urbs(subs, 0); + mutex_unlock(&subs->stream->chip->shutdown_mutex); return snd_pcm_lib_free_vmalloc_buffer(substream); }
diff --git a/sound/usb/usbaudio.h b/sound/usb/usbaudio.h index db3eb21..6e66fff 100644 --- a/sound/usb/usbaudio.h +++ b/sound/usb/usbaudio.h @@ -36,6 +36,7 @@ struct snd_usb_audio { struct snd_card *card; u32 usb_id; int shutdown; + struct mutex shutdown_mutex; unsigned int txfr_quirk:1; /* Subframe boundaries on transfers */ int num_interfaces; int num_suspended_intf;
At Tue, 22 Feb 2011 10:21:18 +0100, Clemens Ladisch wrote:
From: Takashi Iwai tiwai@suse.de
When a USB audio device is disconnected, snd_usb_audio_disconnect() kills all audio URBs. At the same time, the application, after being notified of the disconnection, might close the device, in which case ALSA calls the .hw_free callback, which should free the URBs too.
Commit de1b8b93a0ba prevented snd_usb_hw_free() from freeing the URBs to avoid a hang that resulted from this race, but this introduced another race because the URB callbacks could now be executed after snd_usb_hw_free() has returned, and try to access already freed data.
Fix the first race by introducing a mutex to serialize the disconnect callback and all PCM callbacks that manage URBs (hw_free and hw_params).
Reported-by: Pierre-Louis Bossart pierre-louis.bossart@intel.com [CL: also serialize hw_params callback] Signed-off-by: Clemens Ladisch clemens@ladisch.de
Ah, I forgot to put my sign-off since I was waiting for any test result... Has anyone tried it?
thanks,
Takashi
diff --git a/sound/usb/card.c b/sound/usb/card.c index 800f7cb..c0f8270 100644 --- a/sound/usb/card.c +++ b/sound/usb/card.c @@ -323,6 +323,7 @@ static int snd_usb_audio_create(struct usb_device *dev, int idx, return -ENOMEM; }
- mutex_init(&chip->shutdown_mutex); chip->index = idx; chip->dev = dev; chip->card = card;
@@ -531,6 +532,7 @@ static void snd_usb_audio_disconnect(struct usb_device *dev, void *ptr) chip = ptr; card = chip->card; mutex_lock(®ister_mutex);
- mutex_lock(&chip->shutdown_mutex); chip->shutdown = 1; chip->num_interfaces--; if (chip->num_interfaces <= 0) {
@@ -548,9 +550,11 @@ static void snd_usb_audio_disconnect(struct usb_device *dev, void *ptr) snd_usb_mixer_disconnect(p); } usb_chip[chip->index] = NULL;
mutex_unlock(®ister_mutex); snd_card_free_when_closed(card); } else {mutex_unlock(&chip->shutdown_mutex);
mutex_unlock(®ister_mutex); }mutex_unlock(&chip->shutdown_mutex);
} diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c index 4132522..e3f6805 100644 --- a/sound/usb/pcm.c +++ b/sound/usb/pcm.c @@ -361,6 +361,7 @@ static int snd_usb_hw_params(struct snd_pcm_substream *substream, }
if (changed) {
/* format changed */ snd_usb_release_substream_urbs(subs, 0); /* influenced: period_bytes, channels, rate, format, */mutex_lock(&subs->stream->chip->shutdown_mutex);
@@ -368,6 +369,7 @@ static int snd_usb_hw_params(struct snd_pcm_substream *substream, params_rate(hw_params), snd_pcm_format_physical_width(params_format(hw_params)) * params_channels(hw_params));
mutex_unlock(&subs->stream->chip->shutdown_mutex);
}
return ret;
@@ -385,8 +387,9 @@ static int snd_usb_hw_free(struct snd_pcm_substream *substream) subs->cur_audiofmt = NULL; subs->cur_rate = 0; subs->period_bytes = 0;
- if (!subs->stream->chip->shutdown)
snd_usb_release_substream_urbs(subs, 0);
- mutex_lock(&subs->stream->chip->shutdown_mutex);
- snd_usb_release_substream_urbs(subs, 0);
- mutex_unlock(&subs->stream->chip->shutdown_mutex); return snd_pcm_lib_free_vmalloc_buffer(substream);
}
diff --git a/sound/usb/usbaudio.h b/sound/usb/usbaudio.h index db3eb21..6e66fff 100644 --- a/sound/usb/usbaudio.h +++ b/sound/usb/usbaudio.h @@ -36,6 +36,7 @@ struct snd_usb_audio { struct snd_card *card; u32 usb_id; int shutdown;
- struct mutex shutdown_mutex; unsigned int txfr_quirk:1; /* Subframe boundaries on transfers */ int num_interfaces; int num_suspended_intf;
When a USB audio device is disconnected, snd_usb_audio_disconnect() kills all audio URBs. At the same time, the application, after being notified of the disconnection, might close the device, in which case ALSA calls the .hw_free callback, which should free the URBs too.
Commit de1b8b93a0ba prevented snd_usb_hw_free() from freeing the URBs to avoid a hang that resulted from this race, but this introduced another race because the URB callbacks could now be executed after snd_usb_hw_free() has returned, and try to access already freed data.
Fix the first race by introducing a mutex to serialize the disconnect callback and all PCM callbacks that manage URBs (hw_free and hw_params).
Reported-by: Pierre-Louis Bossart pierre-louis.bossart@intel.com [CL: also serialize hw_params callback] Signed-off-by: Clemens Ladisch clemens@ladisch.de
Ah, I forgot to put my sign-off since I was waiting for any test result... Has anyone tried it?
Doesn't work for me. same oops as before. Either I made a mistake when merging this patch manually (there were some rejects), or the problem is still there. -Pierre
Ah, I forgot to put my sign-off since I was waiting for any test result... Has anyone tried it?
Doesn't work for me. same oops as before. Either I made a mistake when merging this patch manually (there were some rejects), or the problem is still there.
Added some additional traces, the dereference takes place before the mutex is unlocked.
usb 3-1: USB disconnect, address 5 ALSA card.c:556: locking USB shutdown mutex ALSA urb.c:492: frame 0 active: -84 ALSA urb.c:197: cannot submit urb (err = -19) ALSA urb.c:492: frame 0 active: -84 ALSA urb.c:197: cannot submit urb (err = -19) ALSA urb.c:492: frame 0 active: -84 ALSA urb.c:197: cannot submit urb (err = -19) ALSA urb.c:492: frame 0 active: -84 ALSA urb.c:197: cannot submit urb (err = -19) ALSA urb.c:186: NULL substream (subs->running 0) ALSA urb.c:186: NULL substream (subs->running 0) ALSA card.c:575: UNlocking USB shutdown mutex
At Tue, 22 Feb 2011 16:19:20 -0600, pl bossart wrote:
Ah, I forgot to put my sign-off since I was waiting for any test result... Has anyone tried it?
Doesn't work for me. same oops as before. Either I made a mistake when merging this patch manually (there were some rejects), or the problem is still there.
Added some additional traces, the dereference takes place before the mutex is unlocked.
usb 3-1: USB disconnect, address 5 ALSA card.c:556: locking USB shutdown mutex ALSA urb.c:492: frame 0 active: -84 ALSA urb.c:197: cannot submit urb (err = -19) ALSA urb.c:492: frame 0 active: -84 ALSA urb.c:197: cannot submit urb (err = -19) ALSA urb.c:492: frame 0 active: -84 ALSA urb.c:197: cannot submit urb (err = -19) ALSA urb.c:492: frame 0 active: -84 ALSA urb.c:197: cannot submit urb (err = -19) ALSA urb.c:186: NULL substream (subs->running 0) ALSA urb.c:186: NULL substream (subs->running 0) ALSA card.c:575: UNlocking USB shutdown mutex
Hmm. Is snd_usb_release_substream_urbs() is called properly before these errors? Is force=1 set?
Also, in your log, it shows subs->running = 0. So, there is no NULL dereference. But, it's still strange since all active URLs must have been killed beforehand in deactivate_urbs() and the succeeding call of wait_clear_urbs()...
Takashi
Ah, I forgot to put my sign-off since I was waiting for any test result... Has anyone tried it?
Tested-by: Pierre-Louis Bossartpierre-louis.bossart@intel.com
Discard my other messages, the patch seems to work fine when I don't skip any steps... I need more coffee and stop multitasking... I spent 15mn pluging and unpluging my headset without any issues. Thanks Takashi and Clemens for the fix -Pierre
At Tue, 22 Feb 2011 17:10:05 -0600, pl bossart wrote:
Ah, I forgot to put my sign-off since I was waiting for any test result... Has anyone tried it?
Tested-by: Pierre-Louis Bossartpierre-louis.bossart@intel.com
Discard my other messages, the patch seems to work fine when I don't skip any steps... I need more coffee and stop multitasking... I spent 15mn pluging and unpluging my headset without any issues. Thanks Takashi and Clemens for the fix
OK, I committed the patch to sound git tree now. Thanks for testing!
Takashi
participants (4)
-
Clemens Ladisch
-
pl bossart
-
Sarah Sharp
-
Takashi Iwai