[alsa-devel] DMA over run on playback

Wed May 13 19:29:03 CEST 2009

On Wed, May 13, 2009 at 12:09 PM, Takashi Iwai <tiwai at suse.de> wrote:
> At Wed, 13 May 2009 09:38:50 -0400,
> Jon Smirl wrote:
>>
>> On Wed, May 13, 2009 at 9:25 AM, Jaroslav Kysela <perex at perex.cz> wrote:
>> > On Wed, 13 May 2009, Jon Smirl wrote:
>> >
>> >> There's a long thread over on the pulse list about glitch free
>> >> playback. The glitches they are encountering are caused by CPU
>> >> scheduling latency.  They are trying to fix this by setting HZ up to
>> >> 1000 and constantly polling the audio DMA queue to keep it 99% full.
>> >>
>> >> This doesn't seem like the right solution to me. It is fixing the
>> >> symptom not the cause. The cause is 200-300ms scheduling latency. The
>> >> source of that needs to be tracked down and fixed in the kernel.  But
>> >> we have to live with the latencies until they are fixed.
>> >>
>> >> The strategy of checking the queue at 1000Hz works but it is very
>> >> inefficient. The underlying problem is that the buffer ALSA is using
>> >> is too small on systems with 300ms latency.  The buffer is just big
>> >> enough to cover 300ms so they rapidly check and fill it at 1000Hz to
>> >> ensure that it is full if the 300ms latency strikes.
>> >
>> > ??? The ring buffer size is not limited if hw allows that.
>> >
>> >> On my hardware with period interrupts ALSA is only checking the buffer
>> >> at 8Hz. Since I'm checking appl_ptr I know when DMA over runs the
>> >> buffers. This allows me to insert silence and I could indicate this
>> >
>> > Inserting silence might be wrong, if you broke stream timing. The elapsed()
>> > callback should be called at exact timing (and position should be updated,
>> > too).
>> >
>> >> condition to ALSA if there was a mechanism for doing so. ALSA could
>> >> use this over run knowledge to measure scheduling latency and adjust
>> >> the buffering.
>> >>
>> >> But the DMA interface between ALSA and the driver has been fixed at
>> >> stream creation time. There's no way to dynamically alter it (like
>> >> window size changes in TCP/IP).  With networking you get a list of
>> >> buffers to send. As you send these buffers you mark them sent. The
>> >> core is free to hand you buffers straight from user space or do copies
>> >> and use internal ring buffers. The network driver just gets a list of
>> >> physical addresses to send. This buffer bookkeeping could occur in
>> >> snd_pcm_period_elapsed().
>> >>
>> >> A dynamic chaining mechanism allows you to alter the buffering
>> >> mid-stream. If the driver indicates a DMA over run error this tells
>> >> ALSA that it needs to insert another buffer. After a while these
>> >> errors will stop and ALSA will have measured worst case CPU scheduling
>> >> latency. From then on it will know the exact size of buffering needed
>> >> for the kernel it is running on and it can use this knowledge at
>> >> stream creation time. Now filling the buffer at 8Hz or lower will work
>> >> and you don't have to spend the power associated with 1000Hz timer
>> >> interrupts.
>> >
>> > Nothing prevents the application from allocating a big ring buffer and
>> > writing samples only as necessary. The application is the producer and controller in this
>> > case. The midlevel layer can hardly do something if samples are not
>> > available. The situation will be more or less bad.
>>
>> Who is going to dynamically measure the scheduling latency of the
>> kernel and compute the correct buffer size for the low level driver?
>> You can't expect every app to do that.
>>
>> > The whole problem is that standard Linux kernel is not realtime, but audio
>> > is realtime task.
>>
>> By that definition networking is a real time task too, but it's not.
>>
>> Playing MP3s is not a real-time task. The buffering system between the
>> app and ALSA's DMA system is not properly communicating feedback and
>> that's what is causing the problem.  Networking has a correct feedback
>> loop and doesn't get into trouble. ALSA's buffering system isn't
>> flexible enough to hide these big scheduling latencies without losing
>> data.
>
> It's not about flexibility.  The current audio system itself is
> flexible enough to solve the problem you mentioned.  But you just need
> to do everything by yourself.  A car with manual gears is as flexible
> as a car with automatic gears from the performance POV, but a driver
> needs more work to run it smoothly.
>
> Also, the automation isn't the best thing.  For example, think about
> automatically resizing the buffer and restarting the stream: do you really
> want this for the system like JACK?  No...

The automation I proposed would only kick in on a kernel with poor
latencies. You could write a message into the log saying that buffer
sizes were increased due to latency problems.

This would also be a clear message to anyone running Jack that their
kernel was not performing adequately. On a good, low latency kernel
these mechanisms would never trigger.
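
To make the idea concrete, here is a rough user-space simulation of the
policy, not kernel code and not any existing ALSA interface. The class name
AdaptiveBuffer, its method on_wakeup(), and the 125ms period (8Hz) with two
starting periods are all assumptions picked for illustration. It only models
"grow by one period and log a message whenever an over run is seen".

```python
import logging

log = logging.getLogger("adaptive_buffer")


class AdaptiveBuffer:
    """Toy model of a buffer that grows when an over run is reported.

    Hypothetical sketch: none of these names exist in ALSA.
    """

    def __init__(self, period_ms=125, periods=2, max_periods=16):
        self.period_ms = period_ms      # 125ms per period == 8Hz refills
        self.periods = periods          # periods currently chained
        self.max_periods = max_periods  # upper bound so growth cannot run away
        self.overruns = 0

    @property
    def buffer_ms(self):
        # Total audio time the chained periods can hold.
        return self.period_ms * self.periods

    def on_wakeup(self, latency_ms):
        """Report how late the refill ran; return True if DMA over ran."""
        if latency_ms < self.buffer_ms:
            return False                # good kernel: nothing triggers
        self.overruns += 1
        if self.periods < self.max_periods:
            self.periods += 1           # chain one more period mid-stream
            log.warning(
                "buffer increased to %d ms due to %d ms scheduling latency",
                self.buffer_ms, latency_ms)
        return True


def simulate(latencies_ms):
    """Feed a list of observed wakeup latencies through the model."""
    buf = AdaptiveBuffer()
    for latency in latencies_ms:
        buf.on_wakeup(latency)
    return buf
```

With latencies of 10, 20, 300, 300, 15, 300 ms the model over runs once,
grows from 250ms to 375ms and then stays quiet. With latencies that never
reach 250ms it never grows and never logs anything, which is the Jack case.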

I'd much rather see a mechanism like this in the middle layer rather
than relying on every app to get it right. Once this gets into the apps
it will be there forever; if it is in the kernel it can be
disabled/removed on a kernel that is known to not have problems.

It's also not obvious to me if there is any way for an app to measure
the size of buffer needed. All of the needed info is easily available
in the kernel drivers.
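
Once a worst case latency has been measured, turning it into a buffer size
is simple arithmetic. A small sketch, assuming 44.1kHz, 2 channel, 16-bit
audio and the 125ms (8Hz) period mentioned earlier; the function names and
the sample format are illustrative assumptions, not measured values from
any driver.

```python
def buffer_bytes(latency_ms, rate_hz=44100, channels=2, bytes_per_sample=2):
    """Bytes of audio needed to cover latency_ms of scheduling delay."""
    # Round the frame count up so the buffer never falls short.
    frames = -(-latency_ms * rate_hz // 1000)
    return frames * channels * bytes_per_sample


def periods_needed(latency_ms, period_ms=125):
    """Whole periods needed to cover latency_ms, rounded up."""
    return -(-latency_ms // period_ms)
```

Under those assumptions a 300ms latency needs buffer_bytes(300) == 52920
bytes, or periods_needed(300) == 3 periods of 125ms each.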

>
> However, obviously, there is a big missing piece here - some
> automation for an easy stream handling, including the automatic buffer
> optimization for lazy jobs like MP3 player.  IMO, this isn't
> necessarily in the kernel driver at all.  It's rather far better
> implemented in the user space.
>
> A question is whether this should be in alsa-lib or not.  Maybe yes,
> if it's really needed...
>
>
> Takashi
>

-- 
Jon Smirl
jonsmirl at gmail.com