[alsa-devel] appl_ptr and DMA overrun at end of stream

Mon May 11 19:09:43 CEST 2009

On Mon, May 11, 2009 at 12:43 PM, Takashi Iwai <tiwai at suse.de> wrote:
> At Mon, 11 May 2009 12:28:48 -0400,
> Jon Smirl wrote:
>>
>> On Mon, May 11, 2009 at 11:58 AM, Takashi Iwai <tiwai at suse.de> wrote:
>> > At Mon, 11 May 2009 11:50:22 -0400,
>> > Jon Smirl wrote:
>> >>
>> >> On Mon, May 11, 2009 at 11:40 AM, Takashi Iwai <tiwai at suse.de> wrote:
>> >> > At Mon, 11 May 2009 11:11:55 -0400,
>> >> > Jon Smirl wrote:
>> >> >>
>> >> >> On Mon, May 11, 2009 at 9:45 AM, Jaroslav Kysela <perex at perex.cz> wrote:
>> >> >> > On Mon, 11 May 2009, Jon Smirl wrote:
>> >> >> >
>> >> >> >>> Right.  This is the value to check in your case.
>> >> >> >>
>> >> >> >> What do you think about redesigning the ALSA DMA interface to support
>> >> >> >> detection of over and under run? Leaving the DMA engine in a loop and
>> >> >> >> not coordinating with ALSA as to where the valid data is does not seem
>> >> >> >> to be a safe way of exchanging data. That interface may be a source of
>> >> >> >> the problems pulseaudio is encountering.
>> >> >> >>
>> >> >> >> A simple solution would be for snd_pcm_period_elapsed() to return
>> >> >> >> physical address of the last valid sample. That would let me avoid
>> >> >> >> playing with  s->runtime->control->appl_ptr. You could provide the
>> >> >> >> same data in the pointer() function.
>> >> >> >
>> >> >> > A simpler solution is to check the stream state in the low level driver.
>> >> >> > If it's in DRAINING state, then end of stream is signaled from the
>> >> >> > application and driver might not queue next buffer. We may also add another
>> >> >> > callback (or use ioctl callback) to pass this stream state change to the
>> >> >> > lowlevel driver immediately, so the driver might react more quickly on this
>> >> >> > situation.
>> >> >> >
>> >> >>
>> >> >> Quickness is the wrong way to think about this problem. ALSA knows exactly
>> >> >> when it has placed valid data into the buffer.
>> >> >
>> >> > Not really.  When the mmap mode is used, the update isn't always
>> >> > notified to the driver and the transfer can be completely
>> >> > asynchronous.
>> >>
>> >> This seems to me to be a broken design. ALSA is being put into the
>> >> position of guessing when the application has supplied new data.
>> >> Shouldn't the app be required to make a commit() call after filling in
>> >> the data? Without commit it is impossible to detect over/underrun.
>> >
>> > The commit updates the mmapped control data (so that it works even
>> > without the context switch) if the architecture supports it.  In other
>> > cases, a commit issues an explicit sync ioctl.
>> >
>> > Actually it should be possible to disable the mmap-control mode
>> > explicitly, but right now it's not done from the driver side but only
>> > checks the architecture.
>>
>> Shared memory is another solution that doesn't involve context switches.
>
> It's mmap when you do between kernel <-> user spaces :)
>
>> The app can update its valid pointer in shared memory.
>> My IRQ will call snd_pcm_period_elapsed().
>> snd_pcm_period_elapsed() can find the updated valid pointer,
>>   convert it to a physical address and leave it in a shared structure.
>> When snd_pcm_period_elapsed() returns, my IRQ can get to
>>   the pointer and submit the necessary buffers.
>>
>> What's missing is an official way of accessing
>> s->runtime->control->appl_ptr from the low level driver.
>
> The appl_ptr itself can be accessed at any time from the driver,
> so there is no need for an "official" accessor to that.
>
>>  We're
>> implementing a ring buffer. In a ring buffer I have to know where both
>> pointers are in order to detect over/under run.
>
> Well, when you call snd_pcm_period_elapsed(), the PCM core actually
> checks the buffer XRUN there.

Checking for over/under run in software is not reliable since the DMA
hardware runs asynchronously with the CPU. There will always be
variable latencies between when the CPU detects the condition and when
it can control the DMA hardware.  The only reliable way to do this is
to program the DMA hardware to do it itself. AFAIK all modern DMA
hardware can be programmed to do this if the right information is made
available. Programming the DMA hardware to do this is a 100% reliable
solution and not subject to random latency problems.
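
To make "the right information" concrete, here is a minimal sketch of
the arithmetic involved, assuming free-running 32-bit frame counters.
The struct and function names are hypothetical and are not ALSA APIs;
the real values live in the PCM runtime. The result is the number of
frames a DMA engine could be told to transfer before stopping on its
own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a playback ring; not the ALSA runtime structures. */
struct ring_state {
    uint32_t appl_ptr;    /* frames written by the application, free-running */
    uint32_t hw_ptr;      /* frames consumed by the DMA engine, free-running */
    uint32_t buffer_size; /* ring capacity in frames */
};

/*
 * Frames written but not yet played. The unsigned subtraction stays
 * correct across counter wrap; a negative result means the DMA engine
 * has run past the last valid sample. (The cast to int32_t assumes the
 * usual two's-complement conversion.)
 */
static int32_t frames_queued(const struct ring_state *r)
{
    return (int32_t)(r->appl_ptr - r->hw_ptr);
}

/* True when the hardware has consumed more than the application wrote. */
static bool ring_underrun(const struct ring_state *r)
{
    return frames_queued(r) < 0;
}

/*
 * Upper bound on the frames the DMA engine may still transfer before
 * it has to stop: never negative, never more than one full ring.
 */
static uint32_t dma_stop_limit(const struct ring_state *r)
{
    int32_t queued = frames_queued(r);

    if (queued < 0)
        return 0;
    if ((uint32_t)queued > r->buffer_size)
        return r->buffer_size;
    return (uint32_t)queued;
}
```

A driver would program this limit into the hardware instead of
checking the pointers from the interrupt handler after the fact.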

Also, if I had more detailed buffer information, I could set the
hardware to only play a partial last period and not require that the
period be padded with silence.
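
As a rough illustration of the partial last period, assuming the
driver knows how many valid frames remain queued: the helper below is
hypothetical (not an ALSA function) and simply picks the length of the
next transfer, so the final one covers only the valid frames.

```c
#include <stdint.h>

/*
 * Length in frames of the next DMA transfer. A full period is sent
 * while enough valid data is queued; at end of stream only the
 * remaining frames are sent, so no silence padding is needed.
 */
static uint32_t next_transfer_frames(uint32_t queued, uint32_t period_size)
{
    return queued < period_size ? queued : period_size;
}

/* Same length in bytes, given the size of one frame in bytes. */
static uint32_t next_transfer_bytes(uint32_t queued, uint32_t period_size,
                                    uint32_t frame_bytes)
{
    return next_transfer_frames(queued, period_size) * frame_bytes;
}
```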

mpc5200 has full scatter/gather capability. You could even send me
random length buffers scattered anywhere in memory. But not all DMA
hardware can deal with non-contiguous buffers.
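
For hardware that needs contiguous buffers, a region of the ring that
wraps past the end has to be split into two transfers. A minimal
sketch of that split, with hypothetical names and offsets counted in
frames from the start of the ring:

```c
#include <stddef.h>
#include <stdint.h>

/* One contiguous piece of the ring, in frames. Hypothetical type. */
struct dma_segment {
    uint32_t offset; /* start position within the ring */
    uint32_t length; /* number of frames in this piece */
};

/*
 * Split `length` frames starting at ring position `start` into at
 * most two contiguous segments. Returns the number of segments
 * written to seg[] (0, 1 or 2).
 */
static size_t ring_segments(uint32_t start, uint32_t length,
                            uint32_t buffer_size, struct dma_segment seg[2])
{
    uint32_t first;

    if (length == 0 || buffer_size == 0)
        return 0;
    if (length > buffer_size)
        length = buffer_size; /* cannot queue more than one full ring */

    start %= buffer_size;
    first = buffer_size - start; /* frames available before the wrap */

    if (length <= first) {
        seg[0].offset = start;
        seg[0].length = length;
        return 1;
    }

    seg[0].offset = start;
    seg[0].length = first;
    seg[1].offset = 0;
    seg[1].length = length - first;
    return 2;
}
```

With scatter/gather both segments can go into one descriptor chain;
without it they would be submitted as two separate transfers.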

>
>> I also don't
>> understand why this is specific to my hardware, every DMA
>> implementation should need these two pointers.
>
> These two pointers *are* available.  That's why your first suggestion,
> checking appl_ptr, did work.  That was basically right.
>
> Yet, there is another question whether we need a better way for
> the buffer transfer on a queue-style device.  For such a device, the
> async transfer with mmap is somehow troublesome.
>
>
> Takashi
>

-- 
Jon Smirl
jonsmirl at gmail.com