Re: [alsa-devel] ALSA calling pcm_pointer excessively?

11 May 2012


On Fri, May 11, 2012 at 03:31:08PM +0200, Takashi Iwai wrote:
...
At Thu, 10 May 2012 19:05:56 +0200,
Clemens Ladisch wrote:
...
Russell King - ARM Linux wrote:
...
I think what's happening is that snd_pcm_lib_write1() is looping, and
each time it updates the hardware position, it finds that it can
transfer 8 or 16 bytes to the buffer.  Once it's transferred that,
it re-updates the hardware position which has now advanced by another
8 or 16 bytes.  Repeat, and you find that snd_pcm_lib_write1() spends
a lot of time inefficiently copying the buffer.
It seems that it will only sleep if the hardware pointer stops making
progress.
I'd guess that most existing hardware is fast enough to transfer
samples and has a big enough granularity (typically 32 byte bursts
for PCI) that the pointer doesn't change in consecutive loop
iterations.
This (untested) patch tries to avoid too much busy looping.
Hmm...  I still can't follow why such a busy loop happens when
avail_min > 1.  If avail_min = 1, a busy loop can't be avoided.
But if avail_min is set (typically equal to period_size),
runtime->twake is either the remaining size or the period size, and
wait_for_avail() should wait until that much is available.
It's exactly as I explained.  Let's take avail_min = 4096, as it is in my
case.
In snd_pcm_lib_write1(), we loop - here's the simplified version:
    while (size > 0) {
        if (runtime->status->state == SNDRV_PCM_STATE_RUNNING)
            snd_pcm_update_hw_ptr(substream);
        avail = snd_pcm_playback_avail(runtime);
        if (!avail) {
            /* Sleep until avail >= avail_min */
        }
        /* Transfer up to min(avail, size) bytes to buffer */
    }
The problem is this.  On entry to the loop, say, we calculate that we
can transfer 4096 bytes into the buffer, and we do that.  While we're
transferring those bytes, the hardware DMA position has advanced.
So, the second time around the loop, avail will be non-zero - let's
say that the DMA position has advanced 32 bytes.  So avail will be
32 bytes, and we transfer 32 bytes.  Again, while transferring those
bytes, the DMA position has advanced, but not as far.  Let's say 8
bytes.
So, we repeat the loop, and again, we find avail is non-zero.  We
skip the sleep and instead drop through to transfer 8 bytes.
Meanwhile, the DMA position has advanced another 8 bytes.
Repeat, endlessly, wasting lots of CPU cycles transferring very small
sets of sample data to the buffer.
That happens because the requirement for the above code to sleep is
that there has been _no_ advancement of the DMA position while copying
data to the buffer.  Unfortunately, that's not always a realistic
expectation, especially if you have a high data rate.
At no point in the above scenario does avail_min come into it, until
we're lucky enough that we copy data into the buffer without the DMA
position changing.  At that point, we're then allowed to sleep.  The
decision to sleep is based purely upon there being _zero_ bytes of
available buffer space.
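
To make the scenario above concrete, here is a rough user-space toy model
of the loop.  It is not the kernel code: struct sim_params, struct
sim_result, simulate_write() and all the numbers are made up for
illustration.  It assumes the DMA consumes a fixed number of bytes per
loop iteration plus one byte for every copy_ratio bytes copied, and it
compares sleeping only when avail is zero against sleeping whenever avail
is below avail_min.

```c
#include <stddef.h>

/* Hypothetical model parameters; none of these are measured values. */
struct sim_params {
    size_t buffer_size;   /* ring buffer size in bytes */
    size_t avail_min;     /* wakeup threshold in bytes */
    size_t copy_ratio;    /* bytes copied per byte the DMA consumes meanwhile */
    size_t loop_overhead; /* bytes the DMA consumes per loop iteration */
};

struct sim_result {
    unsigned long transfers; /* number of copies into the buffer */
    unsigned long sleeps;    /* number of times the writer slept */
};

/*
 * Simulate writing `size` bytes.  If sleep_below_avail_min is zero, the
 * writer sleeps only when avail == 0 (the behaviour described above);
 * otherwise it sleeps whenever avail is below min(avail_min, size).
 */
struct sim_result simulate_write(const struct sim_params *p, size_t size,
                                 int sleep_below_avail_min)
{
    struct sim_result r = { 0, 0 };
    size_t avail = p->avail_min; /* assume avail_min bytes are free on entry */

    while (size > 0) {
        size_t threshold = p->avail_min < size ? p->avail_min : size;
        int must_sleep = sleep_below_avail_min ? (avail < threshold)
                                               : (avail == 0);
        size_t n, drained;

        if (must_sleep) {
            /* Model the sleep: the DMA frees space up to the threshold. */
            avail = threshold;
            r.sleeps++;
        }

        /* Transfer up to min(avail, size) bytes to the buffer. */
        n = avail < size ? avail : size;
        size -= n;
        r.transfers++;

        /* While we were copying, the DMA position advanced. */
        drained = p->loop_overhead + n / p->copy_ratio;
        avail = avail - n + drained;
        if (avail > p->buffer_size)
            avail = p->buffer_size;
    }
    return r;
}
```

With, say, a 16384-byte buffer, avail_min = 4096, copy_ratio = 128 and
loop_overhead = 8, writing 65536 bytes under the "sleep only at zero" rule
never sleeps in this model and degenerates into thousands of 8-byte
transfers, while the avail_min rule finishes in 16 transfers of 4096 bytes.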