[PATCH v2] sound: rawmidi: Add framing mode

Sat Apr 10 13:41:38 CEST 2021

On 2021-04-06 14:01, Takashi Iwai wrote:
> On Mon, 05 Apr 2021 14:13:27 +0200,
> David Henningsson wrote:
>>
>> On 2021-03-31 09:40, Takashi Iwai wrote:
>>> On Tue, 30 Mar 2021 21:35:11 +0200,
>>> David Henningsson wrote:
>>>> Well, I believe that rawmidi provides less jitter than seq is not a
>>>> theoretical problem but a known fact (see e g [1]), so I haven't tried
>>>> to "prove" it myself. And I cannot read your mind well enough to know
>>>> what you would consider a sufficient proof - are you expecting to see
>>>> differences on a default or RT kernel, on a Threadripper or a
>>>> Beaglebone, idle system or while running RT torture tests? Etc.
>>> There is certainly difference, and it might be interesting to see the
>>> dependency on the hardware or on the configuration.  But, again, my
>>> primary question is: have you measured how *your patch* really
>>> provides the improvement?  If yes, please show the numbers in the
>>> patch description.
>> As you requested, I have now performed such testing.
>>
>> Results:
>>
>> Seq - idle: 5.0 ms
>>
>> Seq - hackbench: 1.3 s (yes, above one second)
>>
>> Raw + framing - idle: 2.8 ms
>>
>> Raw + framing - hackbench: 2.8 ms
>>
>> Setup / test description:
>>
>> I had an external midi sequencer connected through USB. The system
>> under test was a Celeron N3150 with internal graphics. The sequencer
>> was set to generate note on/note off commands exactly 10 times per
>> second.
>>
>> For the seq tests I used "arecordmidi" and analyzed the delta values
>> of resulting midi file. For the raw + framing tests I used a home-made
>> application to write a midi file. The monotonic clock option was used
>> to rule out differences between monotonic and monotonic_raw. The
>> result shown above is the maximum amount of delta value, converted to
>> milliseconds, minus the expected 100 ms between notes. Each test was
>> run for a minute or two.
>>
>> For the "idle" test, the machine was idle (running a normal desktop),
>> and for the "hackbench" test, "chrt -r 10 hackbench" was run a few
>> times in parallel with the midi recording application (which was run
>> with "chrt -r 15").
>>
>> I also tried a few other stress tests but hackbench was the one that
>> stood out as totally destroying the timestamps of seq midi. (E g,
>> running "rt-migrate-test" in parallel with "arecordmidi" gave a max
>> jitter value of 13 ms.)
>>
>> Conclusion:
>>
>> I still believe the proposed raw + framing mode is a valuable
>> improvement in the normal/idle case, but even more so because it is
>> more stable in stressed conditions. Do you agree?
> Thanks for the tests.  Yes, that's an interesting and convincing
> result.
>         
> Could you do a couple of favors in addition?

Okay, now done. Enjoy :-)

>
> 1) Check the other workqueue
>
> It's interesting to see whether the hiprio system workqueue may give a
> better latency.  A oneliner patch is like below.
>
> -- 8< --
> --- a/sound/core/rawmidi.c
> +++ b/sound/core/rawmidi.c
> @@ -1028,7 +1028,7 @@ int snd_rawmidi_receive(struct snd_rawmidi_substream *substream,
>   	}
>   	if (result > 0) {
>   		if (runtime->event)
> -			schedule_work(&runtime->event_work);
> +			queue_work(system_highpri_wq, &runtime->event_work);
>   		else if (__snd_rawmidi_ready(runtime))
>   			wake_up(&runtime->sleep);
>   	}
> -- 8< --

Result: idle: 5.0 ms

hackbench > 1 s

I e, same as original.

>
> Also, system_unbound_wq can be another interesting test case instead
> of system_highpri_wq.
>
> 2) Direct sequencer event process
>
> If a chance of workqueue doesn't give significant improvement, we
> might need to check the direct invocation of the sequencer
> dispatcher.  A totally untested patch is like below.
>
> -- 8< --
> --- a/sound/core/rawmidi.c
> +++ b/sound/core/rawmidi.c
> @@ -979,6 +979,7 @@ int snd_rawmidi_receive(struct snd_rawmidi_substream *substream,
>   	unsigned long flags;
>   	int result = 0, count1;
>   	struct snd_rawmidi_runtime *runtime = substream->runtime;
> +	bool call_event = false;
>   
>   	if (!substream->opened)
>   		return -EBADFD;
> @@ -1028,11 +1029,13 @@ int snd_rawmidi_receive(struct snd_rawmidi_substream *substream,
>   	}
>   	if (result > 0) {
>   		if (runtime->event)
> -			schedule_work(&runtime->event_work);
> +			call_event = true;
>   		else if (__snd_rawmidi_ready(runtime))
>   			wake_up(&runtime->sleep);
>   	}
>   	spin_unlock_irqrestore(&runtime->lock, flags);
> +	if (call_event)
> +		runtime->event(runtime->substream);
>   	return result;
>   }
>   EXPORT_SYMBOL(snd_rawmidi_receive);
>
> -- 8< --

Result:

Idle: 3.0 ms

Hackbench still > 1s.

The reason that this is 3.0 and not 2.8 is probably during to some 
rounding to whole ms somewhere in either seq or arecordmidi - I'd say 
this is likely the same 2.8 ms as we see from the rawmidi+framing test.

> In theory, this should bring to the same level of latency as the
> rawmidi timestamping.  Of course, this doesn't mean we can go straight
> to this way, but it's some material for consideration.

I don't know why the hackbench test is not improved here. But you seem 
to have changed seq from tasklet to workqueue in 2011 (commit 
b3c705aa9e9), presumably for some relevant reason, like a need to sleep 
in the seq code...?

// David