Query about xrun on usb/pcm

Tue Nov 22 15:22:50 CET 2022

On Tue, 22 Nov 2022 15:19:13 +0100,
Takashi Iwai wrote:
> 
> On Tue, 22 Nov 2022 12:16:47 +0100,
> Carl Hetherington wrote:
> > 
> > Hi Takashi,
> > 
> > Thank you for getting back to me!
> > 
> > On Tue, 22 Nov 2022, Takashi Iwai wrote:
> > 
> > [snip]
> > 
> > > > Now, snd_usb_endpoint_start() is called on 0x2 and that is fine.  Next,
> > > > snd_usb_endpoint_start() is called on 0x82 and that fails because its
> > > > state is still STOPPING.
> > > >
> > > > At this point things seem broken.
> > > >
> > > > Does anyone have a hint about where in this sequence things are going
> > > > wrong, and maybe even why?
> > >
> > > The problem is that it's treating XRUNs on the both streams
> > > individually.  It's correct to recover only the PCM stream when an
> > > XRUN is reported to the PCM stream.  However, for an XRUN on the
> > > capture stream that serves as a sync source, it should stop and
> > > recover not only the capture PCM stream but also the playback stream
> > > as a sync sink as well.
> > >
> > > Below is a possible test fix (totally untested!).
> > > This may give XRUNs twice eventually, which is a bit confusing, but it
> > > aligns with the actual hardware behavior, at least.
> > 
> > [snip fix]
> > 
> > Makes sense, thank you!  Sadly, the fix doesn't seem to work because (I
> > think) the xruns I'm seeing come via a different path (not though
> > notify_xrun()).  Mine arrive via this trace:
> > 
> > __snd_pcm_xrun
> > snd_pcm_update_state
> > snd_pcm_update_hw_ptr
> > usb_hcd_giveback_urb
> > snd_pcm_period_elapsed_under_stream_lock
> > snd_pcm_period_elapsed
> > retire_capture_urb
> > snd_complete_urb
> > 
> > I'll see if can apply a similar fix to this case, though to my naive
> > eyes it looks a little trickier as the xrun is found in the snd_pcm
> > code rather than the USB code.  Any suggestions most welcome!
> 
> OK, then it's a bit different problem, and not so trivial to fix in
> the kernel side alone, I'm afraid.  Basically it's a race between
> start and stop of two streams.  The key point is that, for stopping a
> (USB) stream, a sync-stop operation is needed, and this can't be
> performed at the PCM trigger itself (which is an atomic operation).
> So, the kernel trigger may at most return an error there.
> 
> I assume that it's from snd_usb_endpoint_start() and it returning
> -EPIPE error.  If so, we may change the PCM core code to set the PCM
> state again XRUN in such an error case, so that application may repeat
> the standard recovery process.  Something like below.

Also, it might be slightly better if we swap the starting order of two
streams: sync at first, then data.  A race can still happen, though.


Takashi

-- 8< --

--- a/sound/usb/pcm.c
+++ b/sound/usb/pcm.c
@@ -241,19 +241,19 @@ static int start_endpoints(struct snd_usb_substream *subs)
 	if (!subs->data_endpoint)
 		return -EINVAL;
 
-	if (!test_and_set_bit(SUBSTREAM_FLAG_DATA_EP_STARTED, &subs->flags)) {
-		err = snd_usb_endpoint_start(subs->data_endpoint);
+	if (subs->sync_endpoint &&
+	    !test_and_set_bit(SUBSTREAM_FLAG_SYNC_EP_STARTED, &subs->flags)) {
+		err = snd_usb_endpoint_start(subs->sync_endpoint);
 		if (err < 0) {
-			clear_bit(SUBSTREAM_FLAG_DATA_EP_STARTED, &subs->flags);
+			clear_bit(SUBSTREAM_FLAG_SYNC_EP_STARTED, &subs->flags);
 			goto error;
 		}
 	}
 
-	if (subs->sync_endpoint &&
-	    !test_and_set_bit(SUBSTREAM_FLAG_SYNC_EP_STARTED, &subs->flags)) {
-		err = snd_usb_endpoint_start(subs->sync_endpoint);
+	if (!test_and_set_bit(SUBSTREAM_FLAG_DATA_EP_STARTED, &subs->flags)) {
+		err = snd_usb_endpoint_start(subs->data_endpoint);
 		if (err < 0) {
-			clear_bit(SUBSTREAM_FLAG_SYNC_EP_STARTED, &subs->flags);
+			clear_bit(SUBSTREAM_FLAG_DATA_EP_STARTED, &subs->flags);
 			goto error;
 		}
 	}