At Tue, 06 Nov 2007 19:10:28 +0300, Stas Sergeev wrote:
Hi.
Takashi Iwai wrote:
In principle, using rate plugin with two periods doesn't work well in case that the sample rates aren't aligned. It's a design issue. You shouldn't use two periods except for hw. Period.
I understand that and I even tried a hack to increase the buffer_size and stop_threshold by the period_size in a rate plugin, although without a success. However, there is one exception: it will work provided the app is allowed to fill both fragments entirely. Without that patch, it simply can't. With the patch - the write returns earlier, the app writes again, the rate plugin gets the period filled, and commits it, before the first one expired. Isn't it exactly the right and reliable fix then?
I don't understand your description well. Could you give a simple test code to prove the bug?
Here, note that 940 != 1024 * 44.1 / 48.0 exactly. This rounding causes the drift of wake-up time at each period and the delay is accumulated.
I understand, but the app can start writing _third_ period if write returns earlier. And that write will fill up the reminder of the second one, no matter how it accumulated.
In the case above, it's woken up *LATER* than expected. write/poll doesn't return earlier. Wel, write would return ealier but it doesn't return the size of the period size but must be less than that.
So, even applying your patch, the XRUN problem may occur at some time as long as you use two periods. It can't be fixed without the fundamental change of the irq / poll handling routines in the ALSA driver.
I am more inclinced to think there is a race - except for the underruns, I've also seen a lockups in poll(). Do you think the race I described in the previous posting doesn't exist?
I don't think it's a race. It appears to be just the matter of configuration mismatch (or misconception) to me.
Whether that hack really does any good thing is questionable, indeed.
Actually, I've found out that it plaques many other bugs. For example, mpg123 sets avail_min=1, which is silly but should work. With the hack, the avail_min just gets increased. Without the hack - snd_pcm_write_areas() spins in a loop trying to commit the samples one-by-one (because the poll returns immediately), eating 100% of CPU. But that's an unrelated bug, which was just hidden, and I think there are more...
Possibly. Changing avail_min isn't a good idea, I believe, too.
First, it skips the avail_min adjustment if the app fills the period size. Thus only for apps that fills arbitrary amount of data via snd_pcm_writei() triggers this hack.
I don't think so.
It is. It has the condition "avail_min > 0", where avail_min is computed from appl_ptr % period_size. Thus, if you fill always the period_size and calls poll, this hack is always skipped.
Because of the rounding errors in a rate plugin, when the app thinks it writes the entire fragment, it actually does not. Any app is affected!
The app must not think that the whole fragment can be written at a single call. That's the bug of the app!
Second, avail_min is checked usually in irq handler and thus its resolution is also in period size. It means avail_min + 1 is equivalent with avail_min + period_size.
Yes, so don't increase it then. :)
So what we can do better? As a temporary solution, we can get rid of the problematic part, or, at least, add the check whether avail_min comes over stop_threshold. I'm not sure whether any big impact by removing the hack there. Maybe not. But, I feel it's a barren discussion. It's really a design problem. Sigh.
I think allowing an app to start writing the third fragment, filling actually the second one, is a very good fix. What problems can you see with it?
The problem is that two periods with the current rate plugin cannot work. Suppose the case of client period size 940, slave period size 1024. The wake up happens at each slave period size, i.e. slave pointer at 1024, 2048, 3072, ... It corresponds to (almost) client pointer at 940, 1881, 2822. At each wakeup point, app is notified that one period gets empty (and one another is being played). Now, you see the wake up is delayed more and more from the expected point, 940, 1880, 2820, ... Hence, at some time, the delay will reach to the period size. The app is notiried that *one* period is empty but at this point *both* periods are empty, that is, buffer underrun.
So, without another wake-up source, the problem cannot be solved.
Takashi