On Tue, 22 Nov 2022 15:19:13 +0100, Takashi Iwai wrote:
On Tue, 22 Nov 2022 12:16:47 +0100, Carl Hetherington wrote:
Hi Takashi,
Thank you for getting back to me!
On Tue, 22 Nov 2022, Takashi Iwai wrote:
[snip]
Now, snd_usb_endpoint_start() is called on 0x2 and that is fine. Next, snd_usb_endpoint_start() is called on 0x82 and that fails because its state is still STOPPING.
At this point things seem broken.
Does anyone have a hint about where in this sequence things are going wrong, and maybe even why?
The problem is that it's treating XRUNs on the both streams individually. It's correct to recover only the PCM stream when an XRUN is reported to the PCM stream. However, for an XRUN on the capture stream that serves as a sync source, it should stop and recover not only the capture PCM stream but also the playback stream as a sync sink as well.
Below is a possible test fix (totally untested!). This may give XRUNs twice eventually, which is a bit confusing, but it aligns with the actual hardware behavior, at least.
[snip fix]
Makes sense, thank you! Sadly, the fix doesn't seem to work because (I think) the xruns I'm seeing come via a different path (not though notify_xrun()). Mine arrive via this trace:
__snd_pcm_xrun snd_pcm_update_state snd_pcm_update_hw_ptr usb_hcd_giveback_urb snd_pcm_period_elapsed_under_stream_lock snd_pcm_period_elapsed retire_capture_urb snd_complete_urb
I'll see if can apply a similar fix to this case, though to my naive eyes it looks a little trickier as the xrun is found in the snd_pcm code rather than the USB code. Any suggestions most welcome!
OK, then it's a bit different problem, and not so trivial to fix in the kernel side alone, I'm afraid. Basically it's a race between start and stop of two streams. The key point is that, for stopping a (USB) stream, a sync-stop operation is needed, and this can't be performed at the PCM trigger itself (which is an atomic operation). So, the kernel trigger may at most return an error there.
I assume that it's from snd_usb_endpoint_start() and it returning -EPIPE error. If so, we may change the PCM core code to set the PCM state again XRUN in such an error case, so that application may repeat the standard recovery process. Something like below.
Also, it might be slightly better if we swap the starting order of two streams: sync at first, then data. A race can still happen, though.
Takashi
-- 8< -- --- a/sound/usb/pcm.c +++ b/sound/usb/pcm.c @@ -241,19 +241,19 @@ static int start_endpoints(struct snd_usb_substream *subs) if (!subs->data_endpoint) return -EINVAL;
- if (!test_and_set_bit(SUBSTREAM_FLAG_DATA_EP_STARTED, &subs->flags)) { - err = snd_usb_endpoint_start(subs->data_endpoint); + if (subs->sync_endpoint && + !test_and_set_bit(SUBSTREAM_FLAG_SYNC_EP_STARTED, &subs->flags)) { + err = snd_usb_endpoint_start(subs->sync_endpoint); if (err < 0) { - clear_bit(SUBSTREAM_FLAG_DATA_EP_STARTED, &subs->flags); + clear_bit(SUBSTREAM_FLAG_SYNC_EP_STARTED, &subs->flags); goto error; } }
- if (subs->sync_endpoint && - !test_and_set_bit(SUBSTREAM_FLAG_SYNC_EP_STARTED, &subs->flags)) { - err = snd_usb_endpoint_start(subs->sync_endpoint); + if (!test_and_set_bit(SUBSTREAM_FLAG_DATA_EP_STARTED, &subs->flags)) { + err = snd_usb_endpoint_start(subs->data_endpoint); if (err < 0) { - clear_bit(SUBSTREAM_FLAG_SYNC_EP_STARTED, &subs->flags); + clear_bit(SUBSTREAM_FLAG_DATA_EP_STARTED, &subs->flags); goto error; } }