[alsa-devel] [PATCH 0/3 v1] usb-misc fix
This patch set contains the following patches
Andreas Pape (1): ALSA: usb-audio: more tolerant packetsize
Daniel Girnus (1): ALSA: usb-audio: avoid setting of sample rate multiple times on bus
Mark Craske (1): ALSA: usb-audio: fix race in snd_usb_endpoint_stop
sound/usb/endpoint.c | 24 ++++++++++++++++-------- sound/usb/pcm.c | 21 +++++++++++---------- 2 files changed, 27 insertions(+), 18 deletions(-)
From: Andreas Pape apape@de.adit-jv.com
since commit 57e6dae1087bbaa6b33d3dd8a8e90b63888939a3 the expected packetsize is always limited to nominal + 25%. It was discovered, that some devices have a much higher jitter in used packetsizes than 25% which would result in BABBLE condition and dropping of packets. A better solution is so assume the jitter to be the nominal packetsize: -one nearly empty packet followed by a almost double sized one.
Signed-off-by: Andreas Pape apape@de.adit-jv.com Signed-off-by: Jiada Wang jiada_wang@mentor.com --- sound/usb/endpoint.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c index c470251..2f592dd 100644 --- a/sound/usb/endpoint.c +++ b/sound/usb/endpoint.c @@ -632,8 +632,8 @@ static int data_ep_set_params(struct snd_usb_endpoint *ep, ep->stride = frame_bits >> 3; ep->silence_value = pcm_format == SNDRV_PCM_FORMAT_U8 ? 0x80 : 0;
- /* assume max. frequency is 25% higher than nominal */ - ep->freqmax = ep->freqn + (ep->freqn >> 2); + /* assume max. frequency is double than nominal */ + ep->freqmax = ep->freqn * 2; /* Round up freqmax to nearest integer in order to calculate maximum * packet size, which must represent a whole number of frames. * This is accomplished by adding 0x0.ffff before converting the
On Wed, 30 Nov 2016 08:59:21 +0100, Jiada Wang wrote:
From: Andreas Pape apape@de.adit-jv.com
since commit 57e6dae1087bbaa6b33d3dd8a8e90b63888939a3 the expected packetsize is always limited to
Please use a form with 12 chars SHA ID plus the commit subject, e.g. 1234567890ab ("blah blah...")
nominal + 25%. It was discovered, that some devices have a much higher jitter in used packetsizes than 25% which would result in BABBLE condition and dropping of packets. A better solution is so assume the jitter to be the nominal packetsize: -one nearly empty packet followed by a almost double sized one.
The increase of the max frequency is supposedly OK. A remaining question is whether this should be included in stable kernel. It fixes in one side, but it's quite untested in another side. Maybe we queue this for 4.10, and later on notify to stable maintainer once when it's confirmed to work and be harmless.
thanks,
Takashi
Signed-off-by: Andreas Pape apape@de.adit-jv.com Signed-off-by: Jiada Wang jiada_wang@mentor.com
sound/usb/endpoint.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c index c470251..2f592dd 100644 --- a/sound/usb/endpoint.c +++ b/sound/usb/endpoint.c @@ -632,8 +632,8 @@ static int data_ep_set_params(struct snd_usb_endpoint *ep, ep->stride = frame_bits >> 3; ep->silence_value = pcm_format == SNDRV_PCM_FORMAT_U8 ? 0x80 : 0;
- /* assume max. frequency is 25% higher than nominal */
- ep->freqmax = ep->freqn + (ep->freqn >> 2);
- /* assume max. frequency is double than nominal */
- ep->freqmax = ep->freqn * 2; /* Round up freqmax to nearest integer in order to calculate maximum
- packet size, which must represent a whole number of frames.
- This is accomplished by adding 0x0.ffff before converting the
-- 2.9.3
Hello Takashi
On 11/30/2016 05:54 PM, Takashi Iwai wrote:
On Wed, 30 Nov 2016 08:59:21 +0100, Jiada Wang wrote:
From: Andreas Pape apape@de.adit-jv.com
since commit 57e6dae1087bbaa6b33d3dd8a8e90b63888939a3 the expected packetsize is always limited to
Please use a form with 12 chars SHA ID plus the commit subject, e.g. 1234567890ab ("blah blah...")
I will update changelog as you have suggested in v2.
nominal + 25%. It was discovered, that some devices have a much higher jitter in used packetsizes than 25% which would result in BABBLE condition and dropping of packets. A better solution is so assume the jitter to be the nominal packetsize: -one nearly empty packet followed by a almost double sized one.
The increase of the max frequency is supposedly OK. A remaining question is whether this should be included in stable kernel. It fixes in one side, but it's quite untested in another side. Maybe we queue this for 4.10, and later on notify to stable maintainer once when it's confirmed to work and be harmless.
this makes sense to me
Thanks, Jiada
thanks,
Takashi
Signed-off-by: Andreas Pape apape@de.adit-jv.com Signed-off-by: Jiada Wang jiada_wang@mentor.com
sound/usb/endpoint.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c index c470251..2f592dd 100644 --- a/sound/usb/endpoint.c +++ b/sound/usb/endpoint.c @@ -632,8 +632,8 @@ static int data_ep_set_params(struct snd_usb_endpoint *ep, ep->stride = frame_bits >> 3; ep->silence_value = pcm_format == SNDRV_PCM_FORMAT_U8 ? 0x80 : 0;
- /* assume max. frequency is 25% higher than nominal */
- ep->freqmax = ep->freqn + (ep->freqn >> 2);
- /* assume max. frequency is double than nominal */
- ep->freqmax = ep->freqn * 2; /* Round up freqmax to nearest integer in order to calculate maximum
- packet size, which must represent a whole number of frames.
- This is accomplished by adding 0x0.ffff before converting the
-- 2.9.3
Jiada Wang wrote:
since commit 57e6dae1087bbaa6b33d3dd8a8e90b63888939a3 the expected packetsize is always limited to nominal + 25%. It was discovered, that some devices
Which devices?
have a much higher jitter in used packetsizes than 25%
How high? (Please note that the USB specification restricts the jitter to at most one frame in consecutive packets.)
which would result in BABBLE condition and dropping of packets. A better solution is so assume the jitter to be the nominal packetsize
This solution is better for this one particular device, but how does it affect normal devices, or the Scarlett 2i4 on EHCI affected?
Regards, Clemens
On Thu, 01 Dec 2016 08:41:17 +0100, Clemens Ladisch wrote:
Jiada Wang wrote:
since commit 57e6dae1087bbaa6b33d3dd8a8e90b63888939a3 the expected packetsize is always limited to nominal + 25%. It was discovered, that some devices
Which devices?
have a much higher jitter in used packetsizes than 25%
How high? (Please note that the USB specification restricts the jitter to at most one frame in consecutive packets.)
which would result in BABBLE condition and dropping of packets. A better solution is so assume the jitter to be the nominal packetsize
This solution is better for this one particular device, but how does it affect normal devices, or the Scarlett 2i4 on EHCI affected?
Actually, which value does this affected device in ep->maxpacksize? In the commit mentioned above, we changed the logic to take +25% frequency as the basis, and it my *reduce* if ep->maxpacksize is lower than that.
OTOH, if ep->maxpacksize is sane, we can rely on it rather than the implicit +25% frequency. That said, maybe we can check ep->maxpacksize whether it fits within the expected range, then adapt it, or take +25% freq as fallback?
thanks,
Takashi
Takashi Iwai wrote:
Clemens Ladisch wrote:
Jiada Wang wrote:
since commit 57e6dae1087bbaa6b33d3dd8a8e90b63888939a3 the expected packetsize is always limited to nominal + 25%. It was discovered, that some devices
Which devices?
have a much higher jitter in used packetsizes than 25%
How high? (Please note that the USB specification restricts the jitter to at most one frame in consecutive packets.)
which would result in BABBLE condition and dropping of packets. A better solution is so assume the jitter to be the nominal packetsize
This solution is better for this one particular device, but how does it affect normal devices, or the Scarlett 2i4 on EHCI affected?
Actually, which value does this affected device in ep->maxpacksize? In the commit mentioned above, we changed the logic to take +25% frequency as the basis, and it my *reduce* if ep->maxpacksize is lower than that.
OTOH, if ep->maxpacksize is sane, we can rely on it rather than the implicit +25% frequency. That said, maybe we can check ep->maxpacksize whether it fits within the expected range, then adapt it, or take +25% freq as fallback?
You are describing how the current code behaves. The +25% limit _is_ what the code takes as the expected range.
I'm wondering if that unknown device just declares a wrong interval value.
Regards, Clemens
On Thu, 01 Dec 2016 12:16:47 +0100, Clemens Ladisch wrote:
Takashi Iwai wrote:
Clemens Ladisch wrote:
Jiada Wang wrote:
since commit 57e6dae1087bbaa6b33d3dd8a8e90b63888939a3 the expected packetsize is always limited to nominal + 25%. It was discovered, that some devices
Which devices?
have a much higher jitter in used packetsizes than 25%
How high? (Please note that the USB specification restricts the jitter to at most one frame in consecutive packets.)
which would result in BABBLE condition and dropping of packets. A better solution is so assume the jitter to be the nominal packetsize
This solution is better for this one particular device, but how does it affect normal devices, or the Scarlett 2i4 on EHCI affected?
Actually, which value does this affected device in ep->maxpacksize? In the commit mentioned above, we changed the logic to take +25% frequency as the basis, and it my *reduce* if ep->maxpacksize is lower than that.
OTOH, if ep->maxpacksize is sane, we can rely on it rather than the implicit +25% frequency. That said, maybe we can check ep->maxpacksize whether it fits within the expected range, then adapt it, or take +25% freq as fallback?
You are describing how the current code behaves. The +25% limit _is_ what the code takes as the expected range.
Well, the question is what is the "sane" range. +25% doesn't fit for some devices. If maxpacksize fits without +100% as this patch suggests, can we rely on it instead?
Takashi
I'm wondering if that unknown device just declares a wrong interval value.
Regards, Clemens
Takashi Iwai wrote:
Clemens Ladisch wrote:
Takashi Iwai wrote:
[...] In the commit mentioned above, we changed the logic to take +25% frequency as the basis, and it my *reduce* if ep->maxpacksize is lower than that.
OTOH, if ep->maxpacksize is sane, we can rely on it rather than the implicit +25% frequency. That said, maybe we can check ep->maxpacksize whether it fits within the expected range, then adapt it, or take +25% freq as fallback?
You are describing how the current code behaves. The +25% limit _is_ what the code takes as the expected range.
Well, the question is what is the "sane" range. +25% doesn't fit for some devices.
The USB audio specification _requires_ that there is as little jitter as possible.
It's no surprise that some device violates the specification. But we don't know what the actual error is; whether we could adjust the packet size for this particular device only, or increase the limit for all devices, or use a completely different workaround.
If maxpacksize fits without +100% as this patch suggests, can we rely on it instead?
The packet size affect the following computations, like the number of packets per URB. I don't know how bad the effects would be.
Regards, Clemens
Hello Clemens
On 11/30/2016 11:41 PM, Clemens Ladisch wrote:
Jiada Wang wrote:
since commit 57e6dae1087bbaa6b33d3dd8a8e90b63888939a3 the expected packetsize is always limited to nominal + 25%. It was discovered, that some devices
Which devices?
It was a LG nexus
have a much higher jitter in used packetsizes than 25%
How high? (Please note that the USB specification restricts the jitter to at most one frame in consecutive packets.)
the nominal packet size was somewhere around 176bytes +25% would result in max expected packets to be ~220bytes We observed some packets exceeding this size (256byte) which caused the babble and dropping of that packets.
Thanks, Jiada
which would result in BABBLE condition and dropping of packets. A better solution is so assume the jitter to be the nominal packetsize
This solution is better for this one particular device, but how does it affect normal devices, or the Scarlett 2i4 on EHCI affected?
Regards, Clemens
Jiada Wang wrote:
On 11/30/2016 11:41 PM, Clemens Ladisch wrote:
Jiada Wang wrote:
since commit 57e6dae1087bbaa6b33d3dd8a8e90b63888939a3 the expected packetsize is always limited to nominal + 25%. It was discovered, that some devices
Which devices?
It was a LG nexus
So it was the Android audio accessory mode.
have a much higher jitter in used packetsizes than 25%
How high?
the nominal packet size was somewhere around 176bytes +25% would result in max expected packets to be ~220bytes We observed some packets exceeding this size (256byte)
256 bytes per USB frame would correspond to 64 kHz, instead of the nominal 44.1 kHz.
The audio accessory sample format is fixed, and that mode is no longer developed, so increasing the limit to +50% would be sufficient to work around this problem.
I don't know if this is a bug in Google's generic AOA code, or if LG did some changes; I have not heard any other report so far ...
Regards, Clemens
Hello Clemens
On 12/01/2016 04:15 AM, Clemens Ladisch wrote:
Jiada Wang wrote:
On 11/30/2016 11:41 PM, Clemens Ladisch wrote:
Jiada Wang wrote:
since commit 57e6dae1087bbaa6b33d3dd8a8e90b63888939a3 the expected packetsize is always limited to nominal + 25%. It was discovered, that some devices
Which devices?
It was a LG nexus
So it was the Android audio accessory mode.
have a much higher jitter in used packetsizes than 25%
How high?
the nominal packet size was somewhere around 176bytes +25% would result in max expected packets to be ~220bytes We observed some packets exceeding this size (256byte)
256 bytes per USB frame would correspond to 64 kHz, instead of the nominal 44.1 kHz.
The audio accessory sample format is fixed, and that mode is no longer developed, so increasing the limit to +50% would be sufficient to work around this problem.
I don't know if this is a bug in Google's generic AOA code, or if LG did some changes; I have not heard any other report so far ...
We also reproduced the same issue with following android devices, * Samsung S4, Android 4.2.2, model GT-I9505 * Sony Xperia Z, Android 4.2.2, model number CC6603 * LGE Nexus 4, Android 4.2.2 so it most likely not a LG specific issue.
I agree to increase to +50% would be sufficient to avoid this issue.
Thanks, Jiada
Regards, Clemens
From: Daniel Girnus dgirnus@de.adit-jv.com
ALSA usually calls the prepare function twice before starting the playback: 1. On hw_params call from userland and 2. internally when starting the stream. Some device are not able to manage this and they will stop playback if the sample rate will be configured several times over USB protocol.
Signed-off-by: Jens Lorenz jlorenz@de.adit-jv.com Signed-off-by: Jiada Wang jiada_wang@mentor.com --- sound/usb/pcm.c | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c index 44d178e..a522c9a 100644 --- a/sound/usb/pcm.c +++ b/sound/usb/pcm.c @@ -806,17 +806,18 @@ static int snd_usb_pcm_prepare(struct snd_pcm_substream *substream) if (ret < 0) goto unlock;
- iface = usb_ifnum_to_if(subs->dev, subs->cur_audiofmt->iface); - alts = &iface->altsetting[subs->cur_audiofmt->altset_idx]; - ret = snd_usb_init_sample_rate(subs->stream->chip, - subs->cur_audiofmt->iface, - alts, - subs->cur_audiofmt, - subs->cur_rate); - if (ret < 0) - goto unlock; - if (subs->need_setup_ep) { + + iface = usb_ifnum_to_if(subs->dev, subs->cur_audiofmt->iface); + alts = &iface->altsetting[subs->cur_audiofmt->altset_idx]; + ret = snd_usb_init_sample_rate(subs->stream->chip, + subs->cur_audiofmt->iface, + alts, + subs->cur_audiofmt, + subs->cur_rate); + if (ret < 0) + goto unlock; + ret = configure_endpoint(subs); if (ret < 0) goto unlock;
On Wed, 30 Nov 2016 08:59:22 +0100, Jiada Wang wrote:
From: Daniel Girnus dgirnus@de.adit-jv.com
ALSA usually calls the prepare function twice before starting the playback:
- On hw_params call from userland and
- internally when starting the stream.
Some device are not able to manage this and they will stop playback if the sample rate will be configured several times over USB protocol.
Signed-off-by: Jens Lorenz jlorenz@de.adit-jv.com Signed-off-by: Jiada Wang jiada_wang@mentor.com
The sign-off from Daniel seems missing?
The code change looks OK, but it'd be nice to mention in the changelog that, after this patch, snd_usb_init_sample_rate() is still called properly whenever the parameter is changed since ep->need_setup_ep is set in snd_hsb_hw_params().
thanks,
Takashi
sound/usb/pcm.c | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c index 44d178e..a522c9a 100644 --- a/sound/usb/pcm.c +++ b/sound/usb/pcm.c @@ -806,17 +806,18 @@ static int snd_usb_pcm_prepare(struct snd_pcm_substream *substream) if (ret < 0) goto unlock;
- iface = usb_ifnum_to_if(subs->dev, subs->cur_audiofmt->iface);
- alts = &iface->altsetting[subs->cur_audiofmt->altset_idx];
- ret = snd_usb_init_sample_rate(subs->stream->chip,
subs->cur_audiofmt->iface,
alts,
subs->cur_audiofmt,
subs->cur_rate);
- if (ret < 0)
goto unlock;
- if (subs->need_setup_ep) {
iface = usb_ifnum_to_if(subs->dev, subs->cur_audiofmt->iface);
alts = &iface->altsetting[subs->cur_audiofmt->altset_idx];
ret = snd_usb_init_sample_rate(subs->stream->chip,
subs->cur_audiofmt->iface,
alts,
subs->cur_audiofmt,
subs->cur_rate);
if (ret < 0)
goto unlock;
- ret = configure_endpoint(subs); if (ret < 0) goto unlock;
-- 2.9.3
Hi Takashi
On 11/30/2016 05:51 PM, Takashi Iwai wrote:
On Wed, 30 Nov 2016 08:59:22 +0100, Jiada Wang wrote:
From: Daniel Girnus dgirnus@de.adit-jv.com
ALSA usually calls the prepare function twice before starting the playback:
- On hw_params call from userland and
- internally when starting the stream.
Some device are not able to manage this and they will stop playback if the sample rate will be configured several times over USB protocol.
Signed-off-by: Jens Lorenz jlorenz@de.adit-jv.com Signed-off-by: Jiada Wang jiada_wang@mentor.com
The sign-off from Daniel seems missing?
The code change looks OK, but it'd be nice to mention in the changelog that, after this patch, snd_usb_init_sample_rate() is still called properly whenever the parameter is changed since ep->need_setup_ep is set in snd_hsb_hw_params().
I will add missing sign-off and related information in changelog in v2
Thanks, Jiada
thanks,
Takashi
sound/usb/pcm.c | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c index 44d178e..a522c9a 100644 --- a/sound/usb/pcm.c +++ b/sound/usb/pcm.c @@ -806,17 +806,18 @@ static int snd_usb_pcm_prepare(struct snd_pcm_substream *substream) if (ret < 0) goto unlock;
- iface = usb_ifnum_to_if(subs->dev, subs->cur_audiofmt->iface);
- alts = &iface->altsetting[subs->cur_audiofmt->altset_idx];
- ret = snd_usb_init_sample_rate(subs->stream->chip,
subs->cur_audiofmt->iface,
alts,
subs->cur_audiofmt,
subs->cur_rate);
- if (ret < 0)
goto unlock;
- if (subs->need_setup_ep) {
iface = usb_ifnum_to_if(subs->dev, subs->cur_audiofmt->iface);
alts = &iface->altsetting[subs->cur_audiofmt->altset_idx];
ret = snd_usb_init_sample_rate(subs->stream->chip,
subs->cur_audiofmt->iface,
alts,
subs->cur_audiofmt,
subs->cur_rate);
if (ret < 0)
goto unlock;
- ret = configure_endpoint(subs); if (ret < 0) goto unlock;
-- 2.9.3
Hi Jiada,
I don't oppose this patch. Nevertheless, your description is not necessarily correct.
On Nov 30 2016 16:59, Jiada Wang wrote:
From: Daniel Girnus dgirnus@de.adit-jv.com
ALSA usually calls the prepare function twice before starting the playback:
- On hw_params call from userland and
- internally when starting the stream.
ALSA PCM core in kernel land doesn't perform like this.
In alsa-lib, 'snd_pcm_hw_params()' calls 'snd_pcm_hw_params_internal()' and 'snd_pcm_prepare()' sequentially. http://git.alsa-project.org/?p=alsa-lib.git;a=blob;f=src/pcm/pcm.c;h=cd87bc7...
In system call level (e.g. see by strace(1)), this looks like two ioctl(2)s with 'SNDRV_PCM_IOCTL_HW_PARAMS' and 'SNDRV_PCM_IOCTL_PREPARE'.
Well, when applications are written to execute 'snd_pcm_hw_params()' and 'snd_pcm_hw_prepare()' sequentially, additional ioctl(2) with 'SNDRV_PCM_IOCTL_PREPARE' appears. PulseAudio is this kind of application. I indicated the useless in 2014, but it still remains: https://lists.freedesktop.org/archives/pulseaudio-discuss/2014-January/01977...
You have the misunderstanding due to a nature of alsa-lib and tendency of major applications, from my point of view.
Some device are not able to manage this and they will stop playback if the sample rate will be configured several times over USB protocol.
Signed-off-by: Jens Lorenz jlorenz@de.adit-jv.com Signed-off-by: Jiada Wang jiada_wang@mentor.com
sound/usb/pcm.c | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c index 44d178e..a522c9a 100644 --- a/sound/usb/pcm.c +++ b/sound/usb/pcm.c @@ -806,17 +806,18 @@ static int snd_usb_pcm_prepare(struct snd_pcm_substream *substream) if (ret < 0) goto unlock;
- iface = usb_ifnum_to_if(subs->dev, subs->cur_audiofmt->iface);
- alts = &iface->altsetting[subs->cur_audiofmt->altset_idx];
- ret = snd_usb_init_sample_rate(subs->stream->chip,
subs->cur_audiofmt->iface,
alts,
subs->cur_audiofmt,
subs->cur_rate);
- if (ret < 0)
goto unlock;
- if (subs->need_setup_ep) {
iface = usb_ifnum_to_if(subs->dev, subs->cur_audiofmt->iface);
alts = &iface->altsetting[subs->cur_audiofmt->altset_idx];
ret = snd_usb_init_sample_rate(subs->stream->chip,
subs->cur_audiofmt->iface,
alts,
subs->cur_audiofmt,
subs->cur_rate);
if (ret < 0)
goto unlock;
- ret = configure_endpoint(subs); if (ret < 0) goto unlock;
Regards
Takashi Sakamoto
On Nov 30 2016 19:45, Takashi Sakamoto wrote:
Hi Jiada,
I don't oppose this patch. Nevertheless, your description is not necessarily correct.
On Nov 30 2016 16:59, Jiada Wang wrote:
From: Daniel Girnus dgirnus@de.adit-jv.com
ALSA usually calls the prepare function twice before starting the playback:
- On hw_params call from userland and
- internally when starting the stream.
ALSA PCM core in kernel land doesn't perform like this.
In alsa-lib, 'snd_pcm_hw_params()' calls 'snd_pcm_hw_params_internal()' and 'snd_pcm_prepare()' sequentially. http://git.alsa-project.org/?p=alsa-lib.git;a=blob;f=src/pcm/pcm.c;h=cd87bc7...
In system call level (e.g. see by strace(1)), this looks like two ioctl(2)s with 'SNDRV_PCM_IOCTL_HW_PARAMS' and 'SNDRV_PCM_IOCTL_PREPARE'.
Well, when applications are written to execute 'snd_pcm_hw_params()' and 'snd_pcm_hw_prepare()' sequentially, additional ioctl(2) with 'SNDRV_PCM_IOCTL_PREPARE' appears. PulseAudio is this kind of application. I indicated the useless in 2014, but it still remains: https://lists.freedesktop.org/archives/pulseaudio-discuss/2014-January/01977...
You have the misunderstanding due to a nature of alsa-lib and tendency of major applications, from my point of view.
So here you should mention that current USB Audio device class driver somewhat ignores state machine of ALSA PCM runtime.
In ALSA PCM core, state of the runtime is described in 'struct snd_pcm_runtime.status.state' with macros of 'SNDRV_PCM_STATE_XXX'. Applications are allowed to handle the runtime according to the state.
In your issue, the driver is programmed ignoring a case that double calls of snd_pcm_prepare(), in short, ioctl(PREPARE) is called in 'PREPARED' state. This is not only an issue for snd-usb-audio, but also for snd-usb-hiface. http://mailman.alsa-project.org/pipermail/alsa-devel/2016-November/115174.ht...
For these issue, I have no patch proposals because I have few test devices, sorry.
Regards
Takashi Sakamoto
Hi Sakamoto
On 11/30/2016 02:45 AM, Takashi Sakamoto wrote:
Hi Jiada,
I don't oppose this patch. Nevertheless, your description is not necessarily correct.
On Nov 30 2016 16:59, Jiada Wang wrote:
From: Daniel Girnus dgirnus@de.adit-jv.com
ALSA usually calls the prepare function twice before starting the playback:
- On hw_params call from userland and
- internally when starting the stream.
ALSA PCM core in kernel land doesn't perform like this.
In alsa-lib, 'snd_pcm_hw_params()' calls 'snd_pcm_hw_params_internal()' and 'snd_pcm_prepare()' sequentially. http://git.alsa-project.org/?p=alsa-lib.git;a=blob;f=src/pcm/pcm.c;h=cd87bc7...
In system call level (e.g. see by strace(1)), this looks like two ioctl(2)s with 'SNDRV_PCM_IOCTL_HW_PARAMS' and 'SNDRV_PCM_IOCTL_PREPARE'.
Well, when applications are written to execute 'snd_pcm_hw_params()' and 'snd_pcm_hw_prepare()' sequentially, additional ioctl(2) with 'SNDRV_PCM_IOCTL_PREPARE' appears. PulseAudio is this kind of application. I indicated the useless in 2014, but it still remains: https://lists.freedesktop.org/archives/pulseaudio-discuss/2014-January/01977...
You have the misunderstanding due to a nature of alsa-lib and tendency of major applications, from my point of view.
Thanks for your indication, so because some of userland applications call 'snd_pcm_hw_params()' and 'snd_pcm_hw_prepare()' sequentially, means the second 'SNDRV_PCM_IOCTL_PREPARE' be called in 'SNDRV_PCM_STATE_PREPARED' state, some devices are unable to manage this and stop working.
I will update Changelog in v2 Patchset.
Thanks, Jiada
Some device are not able to manage this and they will stop playback if the sample rate will be configured several times over USB protocol.
Signed-off-by: Jens Lorenz jlorenz@de.adit-jv.com Signed-off-by: Jiada Wang jiada_wang@mentor.com
sound/usb/pcm.c | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c index 44d178e..a522c9a 100644 --- a/sound/usb/pcm.c +++ b/sound/usb/pcm.c @@ -806,17 +806,18 @@ static int snd_usb_pcm_prepare(struct snd_pcm_substream *substream) if (ret < 0) goto unlock;
- iface = usb_ifnum_to_if(subs->dev, subs->cur_audiofmt->iface);
- alts = &iface->altsetting[subs->cur_audiofmt->altset_idx];
- ret = snd_usb_init_sample_rate(subs->stream->chip,
subs->cur_audiofmt->iface,
alts,
subs->cur_audiofmt,
subs->cur_rate);
- if (ret < 0)
goto unlock;
- if (subs->need_setup_ep) {
iface = usb_ifnum_to_if(subs->dev, subs->cur_audiofmt->iface);
alts = &iface->altsetting[subs->cur_audiofmt->altset_idx];
ret = snd_usb_init_sample_rate(subs->stream->chip,
subs->cur_audiofmt->iface,
alts,
subs->cur_audiofmt,
subs->cur_rate);
if (ret < 0)
goto unlock;
ret = configure_endpoint(subs); if (ret < 0) goto unlock;
Regards
Takashi Sakamoto
On Dec 5 2016 16:32, Jiada Wang wrote:
Hi Sakamoto
On 11/30/2016 02:45 AM, Takashi Sakamoto wrote:
Hi Jiada,
I don't oppose this patch. Nevertheless, your description is not necessarily correct.
On Nov 30 2016 16:59, Jiada Wang wrote:
From: Daniel Girnus dgirnus@de.adit-jv.com
ALSA usually calls the prepare function twice before starting the playback:
- On hw_params call from userland and
- internally when starting the stream.
ALSA PCM core in kernel land doesn't perform like this.
In alsa-lib, 'snd_pcm_hw_params()' calls 'snd_pcm_hw_params_internal()' and 'snd_pcm_prepare()' sequentially. http://git.alsa-project.org/?p=alsa-lib.git;a=blob;f=src/pcm/pcm.c;h=cd87bc7...
In system call level (e.g. see by strace(1)), this looks like two ioctl(2)s with 'SNDRV_PCM_IOCTL_HW_PARAMS' and 'SNDRV_PCM_IOCTL_PREPARE'.
Well, when applications are written to execute 'snd_pcm_hw_params()' and 'snd_pcm_hw_prepare()' sequentially, additional ioctl(2) with 'SNDRV_PCM_IOCTL_PREPARE' appears. PulseAudio is this kind of application. I indicated the useless in 2014, but it still remains: https://lists.freedesktop.org/archives/pulseaudio-discuss/2014-January/01977...
You have the misunderstanding due to a nature of alsa-lib and tendency of major applications, from my point of view.
Thanks for your indication, so because some of userland applications call 'snd_pcm_hw_params()' and 'snd_pcm_hw_prepare()' sequentially, means the second 'SNDRV_PCM_IOCTL_PREPARE' be called in 'SNDRV_PCM_STATE_PREPARED' state,
Exactly. Furthermore, ALSA PCM core has no code to call .prepare() in contexts unrelated to SNDRV_PCM_IOCTL_PREPARE.
some devices are unable to manage this and stop working. I will update Changelog in v2 Patchset.
Regards
Takashi Sakamoto
From: Mark Craske Mark_Craske@mentor.com
Kernel crash seen once:
Unable to handle kernel NULL pointer dereference at virtual address 00000008 pgd = a1d7c000 [00000008] *pgd=31c93831, *pte=00000000, *ppte=00000000 Internal error: Oops: 17 [#1] PREEMPT SMP ARM CPU: 0 PID: 250 Comm: dbus-daemon Not tainted 3.14.51-03479-gf50bdf4 #1 task: a3ae61c0 ti: a08c8000 task.ti: a08c8000 PC is at retire_capture_urb+0x10/0x1f4 [snd_usb_audio] LR is at snd_complete_urb+0x140/0x1f0 [snd_usb_audio] pc : [<7f0eb22c>] lr : [<7f0e57fc>] psr: 200e0193 sp : a08c9c98 ip : a08c9ce8 fp : a08c9ce4 r10: 0000000a r9 : 00000102 r8 : 94cb3000 r7 : 94cb3000 r6 : 94d0f000 r5 : 94d0e8e8 r4 : 94d0e000 r3 : 7f0eb21c r2 : 00000000 r1 : 94cb3000 r0 : 00000000 Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user Control: 10c5387d Table: 31d7c04a DAC: 00000015 Process dbus-daemon (pid: 250, stack limit = 0xa08c8238) Stack: (0xa08c9c98 to 0xa08ca000) ... Backtrace: [<7f0eb21c>] (retire_capture_urb [snd_usb_audio]) from [<7f0e57fc>] (snd_complete_urb+0x140/0x1f0 [snd_usb_audio]) [<7f0e56bc>] (snd_complete_urb [snd_usb_audio]) from [<80371118>] (__usb_hcd_giveback_urb+0x78/0xf4) [<803710a0>] (__usb_hcd_giveback_urb) from [<80371514>] (usb_giveback_urb_bh+0x8c/0xc0) [<80371488>] (usb_giveback_urb_bh) from [<80028e3c>] (tasklet_hi_action+0xc4/0x148) [<80028d78>] (tasklet_hi_action) from [<80028358>] (__do_softirq+0x190/0x380) [<800281c8>] (__do_softirq) from [<80028858>] (irq_exit+0x8c/0xfc) [<800287cc>] (irq_exit) from [<8000ea88>] (handle_IRQ+0x8c/0xc8) [<8000e9fc>] (handle_IRQ) from [<800085e8>] (gic_handle_irq+0xbc/0xf8) [<8000852c>] (gic_handle_irq) from [<80509044>] (__irq_svc+0x44/0x78) [<80508820>] (_raw_spin_unlock_irq) from [<8004b880>] (finish_task_switch+0x5c/0x100) [<8004b824>] (finish_task_switch) from [<805052f0>] (__schedule+0x48c/0x6d8) [<80504e64>] (__schedule) from [<805055d4>] (schedule+0x98/0x9c) [<8050553c>] (schedule) from [<800116c8>] (do_work_pending+0x30/0xd0) [<80011698>] (do_work_pending) from [<8000e160>] (work_pending+0xc/0x20) Code: e1a0c00d e92ddff0 e24cb004 e24dd024 (e5902008) Kernel panic - not syncing: Fatal exception in interrupt
There is a race between retire_capture_urb() & stop_endpoints() which is setting ep->data_subs to NULL. The underlying cause is in snd_usb_endpoint_stop(), which sets ep->data_subs = NULL; ep->sync_slave = NULL; ep->retire_data_urb = NULL; ep->prepare_data_urb = NULL;
An improvement, suggested by Andreas Pape (ADIT) is to read parameters into local variables. This should solve race between stop and retire where it is legal to store the pointers to local variable as they are not freed in stop path but just set to NULL. However, additional races may still happen in close+hw_free against retire, as there pointers may be freed in addition. For example, while in retire_capture_urb() runtime->dma_area might be freed/nulled.
Signed-off-by: Mark Craske Mark_Craske@mentor.com Signed-off-by: Jiada Wang jiada_wang@mentor.com --- sound/usb/endpoint.c | 20 ++++++++++++++------ 1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c index 2f592dd..853cb79 100644 --- a/sound/usb/endpoint.c +++ b/sound/usb/endpoint.c @@ -162,25 +162,33 @@ int snd_usb_endpoint_next_packet_size(struct snd_usb_endpoint *ep) static void retire_outbound_urb(struct snd_usb_endpoint *ep, struct snd_urb_ctx *urb_ctx) { - if (ep->retire_data_urb) - ep->retire_data_urb(ep->data_subs, urb_ctx->urb); + struct snd_usb_substream *subs = ep->data_subs; + void (*retire)(struct snd_usb_substream *subs, struct urb *urb) + = ep->retire_data_urb; + + if (subs && retire) + retire(subs, urb_ctx->urb); }
static void retire_inbound_urb(struct snd_usb_endpoint *ep, struct snd_urb_ctx *urb_ctx) { struct urb *urb = urb_ctx->urb; + struct snd_usb_endpoint *slave = ep->sync_slave; + struct snd_usb_substream *subs = ep->data_subs; + void (*retire)(struct snd_usb_substream *subs, struct urb *urb) + = ep->retire_data_urb;
if (unlikely(ep->skip_packets > 0)) { ep->skip_packets--; return; }
- if (ep->sync_slave) - snd_usb_handle_sync_urb(ep->sync_slave, ep, urb); + if (slave) + snd_usb_handle_sync_urb(slave, ep, urb);
- if (ep->retire_data_urb) - ep->retire_data_urb(ep->data_subs, urb); + if (subs && retire) + retire(subs, urb); }
static void prepare_silent_urb(struct snd_usb_endpoint *ep,
On Wed, 30 Nov 2016 08:59:23 +0100, Jiada Wang wrote:
From: Mark Craske Mark_Craske@mentor.com
Kernel crash seen once:
Unable to handle kernel NULL pointer dereference at virtual address 00000008 pgd = a1d7c000 [00000008] *pgd=31c93831, *pte=00000000, *ppte=00000000 Internal error: Oops: 17 [#1] PREEMPT SMP ARM CPU: 0 PID: 250 Comm: dbus-daemon Not tainted 3.14.51-03479-gf50bdf4 #1 task: a3ae61c0 ti: a08c8000 task.ti: a08c8000 PC is at retire_capture_urb+0x10/0x1f4 [snd_usb_audio] LR is at snd_complete_urb+0x140/0x1f0 [snd_usb_audio] pc : [<7f0eb22c>] lr : [<7f0e57fc>] psr: 200e0193 sp : a08c9c98 ip : a08c9ce8 fp : a08c9ce4 r10: 0000000a r9 : 00000102 r8 : 94cb3000 r7 : 94cb3000 r6 : 94d0f000 r5 : 94d0e8e8 r4 : 94d0e000 r3 : 7f0eb21c r2 : 00000000 r1 : 94cb3000 r0 : 00000000 Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user Control: 10c5387d Table: 31d7c04a DAC: 00000015 Process dbus-daemon (pid: 250, stack limit = 0xa08c8238) Stack: (0xa08c9c98 to 0xa08ca000) ... Backtrace: [<7f0eb21c>] (retire_capture_urb [snd_usb_audio]) from [<7f0e57fc>] (snd_complete_urb+0x140/0x1f0 [snd_usb_audio]) [<7f0e56bc>] (snd_complete_urb [snd_usb_audio]) from [<80371118>] (__usb_hcd_giveback_urb+0x78/0xf4) [<803710a0>] (__usb_hcd_giveback_urb) from [<80371514>] (usb_giveback_urb_bh+0x8c/0xc0) [<80371488>] (usb_giveback_urb_bh) from [<80028e3c>] (tasklet_hi_action+0xc4/0x148) [<80028d78>] (tasklet_hi_action) from [<80028358>] (__do_softirq+0x190/0x380) [<800281c8>] (__do_softirq) from [<80028858>] (irq_exit+0x8c/0xfc) [<800287cc>] (irq_exit) from [<8000ea88>] (handle_IRQ+0x8c/0xc8) [<8000e9fc>] (handle_IRQ) from [<800085e8>] (gic_handle_irq+0xbc/0xf8) [<8000852c>] (gic_handle_irq) from [<80509044>] (__irq_svc+0x44/0x78) [<80508820>] (_raw_spin_unlock_irq) from [<8004b880>] (finish_task_switch+0x5c/0x100) [<8004b824>] (finish_task_switch) from [<805052f0>] (__schedule+0x48c/0x6d8) [<80504e64>] (__schedule) from [<805055d4>] (schedule+0x98/0x9c) [<8050553c>] (schedule) from [<800116c8>] (do_work_pending+0x30/0xd0) [<80011698>] (do_work_pending) from [<8000e160>] (work_pending+0xc/0x20) Code: e1a0c00d e92ddff0 e24cb004 e24dd024 (e5902008) Kernel panic - not syncing: Fatal exception in interrupt
There is a race between retire_capture_urb() & stop_endpoints() which is setting ep->data_subs to NULL. The underlying cause is in snd_usb_endpoint_stop(), which sets ep->data_subs = NULL; ep->sync_slave = NULL; ep->retire_data_urb = NULL; ep->prepare_data_urb = NULL;
An improvement, suggested by Andreas Pape (ADIT) is to read parameters into local variables. This should solve race between stop and retire where it is legal to store the pointers to local variable as they are not freed in stop path but just set to NULL.
Well, it's tricky. A saner way would be to clear the stuff really after all users are gone.
An untested patch is below. Let me know if it really works.
However, additional races may still happen in close+hw_free against retire, as there pointers may be freed in addition. For example, while in retire_capture_urb() runtime->dma_area might be freed/nulled.
This shouldn't be a problem, as stop_endpoints() waits until all active URBS get killed.
thanks,
Takashi
--- diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c index c470251cea4b..f3fce9abece9 100644 --- a/sound/usb/endpoint.c +++ b/sound/usb/endpoint.c @@ -384,6 +384,9 @@ static void snd_complete_urb(struct urb *urb) if (unlikely(atomic_read(&ep->chip->shutdown))) goto exit_clear;
+ if (unlikely(!test_bit(EP_FLAG_RUNNING, &ep->flags))) + goto exit_clear; + if (usb_pipeout(ep->pipe)) { retire_outbound_urb(ep, ctx); /* can be stopped during retire callback */ @@ -534,6 +537,11 @@ static int wait_clear_urbs(struct snd_usb_endpoint *ep) alive, ep->ep_num); clear_bit(EP_FLAG_STOPPING, &ep->flags);
+ ep->data_subs = NULL; + ep->sync_slave = NULL; + ep->retire_data_urb = NULL; + ep->prepare_data_urb = NULL; + return 0; }
@@ -1006,10 +1014,6 @@ void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep)
if (--ep->use_count == 0) { deactivate_urbs(ep, false); - ep->data_subs = NULL; - ep->sync_slave = NULL; - ep->retire_data_urb = NULL; - ep->prepare_data_urb = NULL; set_bit(EP_FLAG_STOPPING, &ep->flags); } }
Hi, Takashi
On 11/30/2016 01:00 AM, Takashi Iwai wrote:
On Wed, 30 Nov 2016 08:59:23 +0100, Jiada Wang wrote:
From: Mark CraskeMark_Craske@mentor.com
Kernel crash seen once:
Unable to handle kernel NULL pointer dereference at virtual address 00000008 pgd = a1d7c000 [00000008] *pgd=31c93831, *pte=00000000, *ppte=00000000 Internal error: Oops: 17 [#1] PREEMPT SMP ARM CPU: 0 PID: 250 Comm: dbus-daemon Not tainted 3.14.51-03479-gf50bdf4 #1 task: a3ae61c0 ti: a08c8000 task.ti: a08c8000 PC is at retire_capture_urb+0x10/0x1f4 [snd_usb_audio] LR is at snd_complete_urb+0x140/0x1f0 [snd_usb_audio] pc : [<7f0eb22c>] lr : [<7f0e57fc>] psr: 200e0193 sp : a08c9c98 ip : a08c9ce8 fp : a08c9ce4 r10: 0000000a r9 : 00000102 r8 : 94cb3000 r7 : 94cb3000 r6 : 94d0f000 r5 : 94d0e8e8 r4 : 94d0e000 r3 : 7f0eb21c r2 : 00000000 r1 : 94cb3000 r0 : 00000000 Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user Control: 10c5387d Table: 31d7c04a DAC: 00000015 Process dbus-daemon (pid: 250, stack limit = 0xa08c8238) Stack: (0xa08c9c98 to 0xa08ca000) ... Backtrace: [<7f0eb21c>] (retire_capture_urb [snd_usb_audio]) from [<7f0e57fc>] (snd_complete_urb+0x140/0x1f0 [snd_usb_audio]) [<7f0e56bc>] (snd_complete_urb [snd_usb_audio]) from [<80371118>] (__usb_hcd_giveback_urb+0x78/0xf4) [<803710a0>] (__usb_hcd_giveback_urb) from [<80371514>] (usb_giveback_urb_bh+0x8c/0xc0) [<80371488>] (usb_giveback_urb_bh) from [<80028e3c>] (tasklet_hi_action+0xc4/0x148) [<80028d78>] (tasklet_hi_action) from [<80028358>] (__do_softirq+0x190/0x380) [<800281c8>] (__do_softirq) from [<80028858>] (irq_exit+0x8c/0xfc) [<800287cc>] (irq_exit) from [<8000ea88>] (handle_IRQ+0x8c/0xc8) [<8000e9fc>] (handle_IRQ) from [<800085e8>] (gic_handle_irq+0xbc/0xf8) [<8000852c>] (gic_handle_irq) from [<80509044>] (__irq_svc+0x44/0x78) [<80508820>] (_raw_spin_unlock_irq) from [<8004b880>] (finish_task_switch+0x5c/0x100) [<8004b824>] (finish_task_switch) from [<805052f0>] (__schedule+0x48c/0x6d8) [<80504e64>] (__schedule) from [<805055d4>] (schedule+0x98/0x9c) [<8050553c>] (schedule) from [<800116c8>] (do_work_pending+0x30/0xd0) [<80011698>] (do_work_pending) from [<8000e160>] (work_pending+0xc/0x20) Code: e1a0c00d e92ddff0 e24cb004 e24dd024 (e5902008) Kernel panic - not syncing: Fatal exception in interrupt
There is a race between retire_capture_urb()& stop_endpoints() which is setting ep->data_subs to NULL. The underlying cause is in snd_usb_endpoint_stop(), which sets ep->data_subs = NULL; ep->sync_slave = NULL; ep->retire_data_urb = NULL; ep->prepare_data_urb = NULL;
An improvement, suggested by Andreas Pape (ADIT) is to read parameters into local variables. This should solve race between stop and retire where it is legal to store the pointers to local variable as they are not freed in stop path but just set to NULL.
Well, it's tricky. A saner way would be to clear the stuff really after all users are gone.
An untested patch is below. Let me know if it really works.
Thanks for your proposal, I am afraid, we only found the race issue once during our automation test, so I can't test for your patch, but from what I can see, your patch makes sense to me.
The only difference when apply with your patch is, now in case stop_endpoints() is called from TRIGGER_STOP, these stuff won't get cleared, but I think this isn't an issue, is it?
Thanks, Jiada
However, additional races may still happen in close+hw_free against retire, as there pointers may be freed in addition. For example, while in retire_capture_urb() runtime->dma_area might be freed/nulled.
This shouldn't be a problem, as stop_endpoints() waits until all active URBS get killed.
thanks,
Takashi
diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c index c470251cea4b..f3fce9abece9 100644 --- a/sound/usb/endpoint.c +++ b/sound/usb/endpoint.c @@ -384,6 +384,9 @@ static void snd_complete_urb(struct urb *urb) if (unlikely(atomic_read(&ep->chip->shutdown))) goto exit_clear;
- if (unlikely(!test_bit(EP_FLAG_RUNNING,&ep->flags)))
goto exit_clear;
- if (usb_pipeout(ep->pipe)) { retire_outbound_urb(ep, ctx); /* can be stopped during retire callback */
@@ -534,6 +537,11 @@ static int wait_clear_urbs(struct snd_usb_endpoint *ep) alive, ep->ep_num); clear_bit(EP_FLAG_STOPPING,&ep->flags);
- ep->data_subs = NULL;
- ep->sync_slave = NULL;
- ep->retire_data_urb = NULL;
- ep->prepare_data_urb = NULL;
- return 0; }
@@ -1006,10 +1014,6 @@ void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep)
if (--ep->use_count == 0) { deactivate_urbs(ep, false);
ep->data_subs = NULL;
ep->sync_slave = NULL;
ep->retire_data_urb = NULL;
set_bit(EP_FLAG_STOPPING,&ep->flags); } }ep->prepare_data_urb = NULL;
On Mon, 05 Dec 2016 11:10:59 +0100, Jiada Wang wrote:
Hi, Takashi
On 11/30/2016 01:00 AM, Takashi Iwai wrote:
On Wed, 30 Nov 2016 08:59:23 +0100, Jiada Wang wrote:
From: Mark CraskeMark_Craske@mentor.com
Kernel crash seen once:
Unable to handle kernel NULL pointer dereference at virtual address 00000008 pgd = a1d7c000 [00000008] *pgd=31c93831, *pte=00000000, *ppte=00000000 Internal error: Oops: 17 [#1] PREEMPT SMP ARM CPU: 0 PID: 250 Comm: dbus-daemon Not tainted 3.14.51-03479-gf50bdf4 #1 task: a3ae61c0 ti: a08c8000 task.ti: a08c8000 PC is at retire_capture_urb+0x10/0x1f4 [snd_usb_audio] LR is at snd_complete_urb+0x140/0x1f0 [snd_usb_audio] pc : [<7f0eb22c>] lr : [<7f0e57fc>] psr: 200e0193 sp : a08c9c98 ip : a08c9ce8 fp : a08c9ce4 r10: 0000000a r9 : 00000102 r8 : 94cb3000 r7 : 94cb3000 r6 : 94d0f000 r5 : 94d0e8e8 r4 : 94d0e000 r3 : 7f0eb21c r2 : 00000000 r1 : 94cb3000 r0 : 00000000 Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user Control: 10c5387d Table: 31d7c04a DAC: 00000015 Process dbus-daemon (pid: 250, stack limit = 0xa08c8238) Stack: (0xa08c9c98 to 0xa08ca000) ... Backtrace: [<7f0eb21c>] (retire_capture_urb [snd_usb_audio]) from [<7f0e57fc>] (snd_complete_urb+0x140/0x1f0 [snd_usb_audio]) [<7f0e56bc>] (snd_complete_urb [snd_usb_audio]) from [<80371118>] (__usb_hcd_giveback_urb+0x78/0xf4) [<803710a0>] (__usb_hcd_giveback_urb) from [<80371514>] (usb_giveback_urb_bh+0x8c/0xc0) [<80371488>] (usb_giveback_urb_bh) from [<80028e3c>] (tasklet_hi_action+0xc4/0x148) [<80028d78>] (tasklet_hi_action) from [<80028358>] (__do_softirq+0x190/0x380) [<800281c8>] (__do_softirq) from [<80028858>] (irq_exit+0x8c/0xfc) [<800287cc>] (irq_exit) from [<8000ea88>] (handle_IRQ+0x8c/0xc8) [<8000e9fc>] (handle_IRQ) from [<800085e8>] (gic_handle_irq+0xbc/0xf8) [<8000852c>] (gic_handle_irq) from [<80509044>] (__irq_svc+0x44/0x78) [<80508820>] (_raw_spin_unlock_irq) from [<8004b880>] (finish_task_switch+0x5c/0x100) [<8004b824>] (finish_task_switch) from [<805052f0>] (__schedule+0x48c/0x6d8) [<80504e64>] (__schedule) from [<805055d4>] (schedule+0x98/0x9c) [<8050553c>] (schedule) from [<800116c8>] (do_work_pending+0x30/0xd0) [<80011698>] (do_work_pending) from [<8000e160>] (work_pending+0xc/0x20) Code: e1a0c00d e92ddff0 e24cb004 e24dd024 (e5902008) Kernel panic - not syncing: Fatal exception in interrupt
There is a race between retire_capture_urb()& stop_endpoints() which is setting ep->data_subs to NULL. The underlying cause is in snd_usb_endpoint_stop(), which sets ep->data_subs = NULL; ep->sync_slave = NULL; ep->retire_data_urb = NULL; ep->prepare_data_urb = NULL;
An improvement, suggested by Andreas Pape (ADIT) is to read parameters into local variables. This should solve race between stop and retire where it is legal to store the pointers to local variable as they are not freed in stop path but just set to NULL.
Well, it's tricky. A saner way would be to clear the stuff really after all users are gone.
An untested patch is below. Let me know if it really works.
Thanks for your proposal, I am afraid, we only found the race issue once during our automation test, so I can't test for your patch, but from what I can see, your patch makes sense to me.
The only difference when apply with your patch is, now in case stop_endpoints() is called from TRIGGER_STOP, these stuff won't get cleared, but I think this isn't an issue, is it?
Right. After the trigger-stop, the PCM state goes to SETUP, so it can't be played from that point immediately, thus the race won't happen.
FWIW, below is the patch I'm going to queue.
thanks,
Takashi
-- 8< -- From: Takashi Iwai tiwai@suse.de Subject: [PATCH] ALSA: usb-audio: Fix race at stopping the stream
We've got a kernel crash report showing like:
Unable to handle kernel NULL pointer dereference at virtual address 00000008 pgd = a1d7c000 [00000008] *pgd=31c93831, *pte=00000000, *ppte=00000000 Internal error: Oops: 17 [#1] PREEMPT SMP ARM CPU: 0 PID: 250 Comm: dbus-daemon Not tainted 3.14.51-03479-gf50bdf4 #1 task: a3ae61c0 ti: a08c8000 task.ti: a08c8000 PC is at retire_capture_urb+0x10/0x1f4 [snd_usb_audio] LR is at snd_complete_urb+0x140/0x1f0 [snd_usb_audio] pc : [<7f0eb22c>] lr : [<7f0e57fc>] psr: 200e0193 sp : a08c9c98 ip : a08c9ce8 fp : a08c9ce4 r10: 0000000a r9 : 00000102 r8 : 94cb3000 r7 : 94cb3000 r6 : 94d0f000 r5 : 94d0e8e8 r4 : 94d0e000 r3 : 7f0eb21c r2 : 00000000 r1 : 94cb3000 r0 : 00000000 Flags: nzCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment user Control: 10c5387d Table: 31d7c04a DAC: 00000015 Process dbus-daemon (pid: 250, stack limit = 0xa08c8238) Stack: (0xa08c9c98 to 0xa08ca000) ... Backtrace: [<7f0eb21c>] (retire_capture_urb [snd_usb_audio]) from [<7f0e57fc>] (snd_complete_urb+0x140/0x1f0 [snd_usb_audio]) [<7f0e56bc>] (snd_complete_urb [snd_usb_audio]) from [<80371118>] (__usb_hcd_giveback_urb+0x78/0xf4) [<803710a0>] (__usb_hcd_giveback_urb) from [<80371514>] (usb_giveback_urb_bh+0x8c/0xc0) [<80371488>] (usb_giveback_urb_bh) from [<80028e3c>] (tasklet_hi_action+0xc4/0x148) [<80028d78>] (tasklet_hi_action) from [<80028358>] (__do_softirq+0x190/0x380) [<800281c8>] (__do_softirq) from [<80028858>] (irq_exit+0x8c/0xfc) [<800287cc>] (irq_exit) from [<8000ea88>] (handle_IRQ+0x8c/0xc8) [<8000e9fc>] (handle_IRQ) from [<800085e8>] (gic_handle_irq+0xbc/0xf8) [<8000852c>] (gic_handle_irq) from [<80509044>] (__irq_svc+0x44/0x78) [<80508820>] (_raw_spin_unlock_irq) from [<8004b880>] (finish_task_switch+0x5c/0x100) [<8004b824>] (finish_task_switch) from [<805052f0>] (__schedule+0x48c/0x6d8) [<80504e64>] (__schedule) from [<805055d4>] (schedule+0x98/0x9c) [<8050553c>] (schedule) from [<800116c8>] (do_work_pending+0x30/0xd0) [<80011698>] (do_work_pending) from [<8000e160>] (work_pending+0xc/0x20) Code: e1a0c00d e92ddff0 e24cb004 e24dd024 (e5902008) Kernel panic - not syncing: Fatal exception in interrupt
There is a race between retire_capture_urb() and stop_endpoints(). The latter is called at stopping the stream and it sets some endpoint fields to NULL. But its call is asynchronous, thus the pending complete callback might get called after these NULL clears, and it leads the NULL dereference like the above.
The fix is to move the NULL clearance after the synchronization, i.e. wait_clear_urbs(). This is called at prepare and hw_free callbacks, so it's assured to be called before the restart of the stream or the release of the stream.
Also, while we're at it, put the EP_FLAG_RUNNING flag check at the beginning of snd_complete_urb() to skip the pending complete after the stream is stopped.
Fixes: b2eb950de2f0 ("ALSA: usb-audio: stop both data and sync...") Reported-by: Jiada Wang jiada_wang@mentor.com Reported-by: Mark Craske Mark_Craske@mentor.com Cc: stable@vger.kernel.org Signed-off-by: Takashi Iwai tiwai@suse.de --- sound/usb/endpoint.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c index c470251cea4b..f3fce9abece9 100644 --- a/sound/usb/endpoint.c +++ b/sound/usb/endpoint.c @@ -384,6 +384,9 @@ static void snd_complete_urb(struct urb *urb) if (unlikely(atomic_read(&ep->chip->shutdown))) goto exit_clear;
+ if (unlikely(!test_bit(EP_FLAG_RUNNING, &ep->flags))) + goto exit_clear; + if (usb_pipeout(ep->pipe)) { retire_outbound_urb(ep, ctx); /* can be stopped during retire callback */ @@ -534,6 +537,11 @@ static int wait_clear_urbs(struct snd_usb_endpoint *ep) alive, ep->ep_num); clear_bit(EP_FLAG_STOPPING, &ep->flags);
+ ep->data_subs = NULL; + ep->sync_slave = NULL; + ep->retire_data_urb = NULL; + ep->prepare_data_urb = NULL; + return 0; }
@@ -1006,10 +1014,6 @@ void snd_usb_endpoint_stop(struct snd_usb_endpoint *ep)
if (--ep->use_count == 0) { deactivate_urbs(ep, false); - ep->data_subs = NULL; - ep->sync_slave = NULL; - ep->retire_data_urb = NULL; - ep->prepare_data_urb = NULL; set_bit(EP_FLAG_STOPPING, &ep->flags); } }
participants (4)
-
Clemens Ladisch
-
Jiada Wang
-
Takashi Iwai
-
Takashi Sakamoto