On Wed, Dec 14, 2022 at 06:15:22PM +0000, Mark Brown wrote:
> On Wed, Dec 14, 2022 at 09:40:02AM -0700, Shuah Khan wrote:
> > On 12/14/22 06:03, Nícolas F. R. A. Prado wrote:
> > > The default timeout for kselftests is 45 seconds, but that isn't enough time to run pcm-test when there are many PCMs on the device, nor for mixer-test when slower control buses and fancier CODECs are present.
> > > As data points, running pcm-test on mt8192-asurada-spherion takes about 1m15s, and mixer-test on rk3399-gru-kevin takes about 2m.
> > > Set the timeout to 4 minutes to allow both pcm-test and mixer-test to run to completion with some slack.
> > What I have in mind is that the default run can be limited in scope: run it on a few controllers, and mention in the report that a full test can be run as needed.
> > There are a couple of examples of default vs. full test runs - the cpu and memory hot-plug tests.
> For pcm-test it's probably more sensible to refactor things to run multiple PCMs (or at least cards, though that's less relevant in an embedded context) in parallel rather than cut down the test coverage, which is already very limited as things stand. There is some risk of false positives from crosstalk between the PCMs, but it's probably worth it.
> With mixer-test, if it's actually taking a long time to run, that is generally just identifying that the driver could use some work; implementing runtime power management and a register cache will probably resolve most issues.
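As a rough sketch of what running per-PCM tests in parallel could look like at the shell level (the `test_pcm` helper and the PCM names below are hypothetical placeholders, not the actual pcm-test code):

```shell
#!/bin/sh
# Hypothetical sketch: run one test job per PCM in the background and
# collect the overall result, instead of testing each PCM serially.

test_pcm() {
	# stand-in for the real per-PCM test; here it just succeeds
	echo "testing $1"
}

fail=0
pids=""
for pcm in hw:0,0 hw:0,1 hw:1,0; do	# placeholder PCM names
	test_pcm "$pcm" &
	pids="$pids $!"
done
for pid in $pids; do
	wait "$pid" || fail=1		# any failing job fails the run
done
echo "fail=$fail"
```

The wall-clock time then tracks the slowest PCM rather than the sum of all of them, which is where the crosstalk risk mentioned above comes from.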
Hi Shuah and Mark,
sorry for the delay, but I'd still like to move forward with this.
Shuah, I've checked the tests you mentioned that have limited scope by default, and we could do the same for the alsa kselftest, but I'm not sure I understand how this solves the problem here. The fact is that the current timeout is too short for a full run of the alsa kselftest on some machines, so we need to increase the timeout in any case, regardless of whether there is a limited-scope run by default.

Mark implemented the parallelization he mentioned in the meantime, but it doesn't help on all hardware. The only other option I see is reducing the time each PCM is tested for (currently 4 seconds), but I assume that number was chosen for a reason.
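For reference, the change being discussed is just the per-test-directory `settings` file that the kselftest runner reads; something along these lines (path and value as proposed in the patch):

```
# tools/testing/selftests/alsa/settings
timeout=240
```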
I'd also like to better understand why we have an arbitrary (45 seconds) default timeout. If there are users who want to limit the test duration to an arbitrary time even if that means not getting full test coverage, shouldn't such arbitrary time be supplied by the users themselves?
And is there any guidance on what the acceptable exceptions to having a longer timeout are? There seem to be many kselftests which override the default timeout with a longer one, and some even disable it altogether.
I can see the value of having a timeout as a worst-case bound on how long the test takes to run, to avoid hanging indefinitely, which is what the tests with an overridden timeout setting do. For the alsa kselftests that bound is hardware-dependent (the number of kcontrols for mixer-test; the number of PCMs and working configurations for pcm-test), so we'd either need to guess a timeout value high enough that all known hardware fits, or allow the timeout to be set dynamically at runtime.
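To illustrate the dynamic option, a runner could in principle scale the timeout with the hardware it finds, e.g. from the PCM list in `/proc/asound/pcm` (one PCM per line); the base and per-PCM numbers below are made up for illustration:

```shell
#!/bin/sh
# Hypothetical sketch: derive the timeout from the number of PCMs instead
# of hard-coding one value for all machines.

base=45		# current kselftest default, in seconds
per_pcm=10	# assumed worst-case budget per PCM (made-up number)

# /proc/asound/pcm lists one PCM per line; treat a missing file as zero PCMs
if [ -r /proc/asound/pcm ]; then
	pcms=$(wc -l < /proc/asound/pcm)
else
	pcms=0
fi

timeout=$((base + pcms * per_pcm))
echo "timeout=$timeout"
```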
Thanks,
Nícolas