[alsa-devel] On non-rewindability of resamplers

Alexander E. Patrakov patrakov at gmail.com
Mon May 12 06:52:38 CEST 2014


12.05.2014 09:11, Raymond Yau wrote:
> https://bugs.launchpad.net/ubuntu/+source/pulseaudio/+bug/1188425
>
> I: [pulseaudio] alsa-sink.c: Successfully opened device a52:0.
> I: [pulseaudio] alsa-sink.c: Selected mapping 'Digital Surround 5.1
> (IEC958/AC3)' (iec958-ac3-surround-51).
> I: [pulseaudio] alsa-sink.c: Cannot enable timer-based scheduling,
> falling back to sound IRQ scheduling.
> I: [pulseaudio] alsa-sink.c: Successfully enabled mmap() mode.
>
> It seems that only the hw device reports whether it supports disabling
> period wake-ups.
>
> ioplug does not support disabling period wake-ups, so the A52 plugin would
> need to provide a parameter to disable the period wake-up of the slave.

No, or maybe "not yet". PulseAudio will not try to enable timer-based 
scheduling on a52 anyway, because of the following source lines.

http://cgit.freedesktop.org/pulseaudio/pulseaudio/tree/src/modules/alsa/alsa-util.c#n245
http://cgit.freedesktop.org/pulseaudio/pulseaudio/tree/src/modules/alsa/alsa-util.c#n1393

> do you mean pulseaudio can disable period wakeup of the hda-intel
> through extplug ?

Yes. That's a difference between ioplug and extplug. But I don't really 
care about disabling period interrupts.

>
> D: [alsa-sink] alsa-util.c: PCM state is RUNNING
> I: [alsa-sink] alsa-sink.c: Starting playback.
> I: [alsa-sink] (alsa-lib)pcm_hw.c: SNDRV_PCM_IOCTL_START failed (-77)
>
> Does the SNDRV_PCM_IOCTL_START failure mean that the PCM state is no longer
> RUNNING?

Note that this comes from pcm_hw.c. As PulseAudio does not use the hw: 
device in this particular use case, I have to conclude that it comes 
through the a52 or ioplug code. I am not really familiar with this code.

>
>  >
>  > Instead of what you are proposing above, I wrote a loop that
> repeatedly calls snd_pcm_rewindable() 7000000 times and prints the
> result if it differs from the previous one. With snd-hda-intel (PCH), hw
> plugin, stereo, S16_LE, 48 kHz, 6 periods, and a period size of 1024, I
> get this:
>  >
>  > Rewindable: 6119, loop iteration: 0
>  > Rewindable: 5119, loop iteration: 5389434
>
> The method can be improved.
>
> Instead of waking up every half period to check the value of
> snd_pcm_rewindable():
>
> 1) Set the timer to wake up at intervals of 1/16 of the period time. If the
> value does not change, the device does not provide an accuracy of 1/16 of
> the period time, and you can find out whether it supports 1/8 when the next
> wakeup occurs at 1/8 of the period time, and so on until you have 16 values
> for the first period.
>
> 2) If the value of snd_pcm_rewindable() changes at every 1/16 period time
> interval, set the timer to wake up at 1/256 of the period time during the
> second period.

No need to do this. I have already drawn enough conclusions.
Unfortunately, I forgot to attach the new test program (intentionally
modified to produce an underrun), so I am attaching it now.

The output here is:

$ ./a.out
Hardware PCM card 2 'HDA Intel PCH' device 0 subdevice 0
Its setup is:
   stream       : PLAYBACK
   access       : RW_INTERLEAVED
   format       : S16_LE
   subformat    : STD
   channels     : 4
   rate         : 48000
   exact rate   : 48000 (48000/1)
   msbits       : 16
   buffer_size  : 4096
   period_size  : 1024
   period_time  : 21333
   tstamp_mode  : NONE
   period_step  : 1
   avail_min    : 1024
   period_event : 0
   start_threshold  : 1024
   stop_threshold   : 4096
   silence_threshold: 0
   silence_size : 0
   boundary     : 4611686018427387904
   appl_ptr     : 0
   hw_ptr       : 0
Playing silence
===================
Hardware PCM card 2 'HDA Intel PCH' device 0 subdevice 0
Its setup is:
   stream       : PLAYBACK
   access       : RW_INTERLEAVED
   format       : S16_LE
   subformat    : STD
   channels     : 4
   rate         : 48000
   exact rate   : 48000 (48000/1)
   msbits       : 16
   buffer_size  : 4096
   period_size  : 1024
   period_time  : 21333
   tstamp_mode  : NONE
   period_step  : 1
   avail_min    : 1024
   period_event : 0
   start_threshold  : 1024
   stop_threshold   : 4096
   silence_threshold: 0
   silence_size : 0
   boundary     : 4611686018427387904
   appl_ptr     : 4096
   hw_ptr       : 0
Rewindable: 4096, loop iteration: 0
===================
Hardware PCM card 2 'HDA Intel PCH' device 0 subdevice 0
Its setup is:
   stream       : PLAYBACK
   access       : RW_INTERLEAVED
   format       : S16_LE
   subformat    : STD
   channels     : 4
   rate         : 48000
   exact rate   : 48000 (48000/1)
   msbits       : 16
   buffer_size  : 4096
   period_size  : 1024
   period_time  : 21333
   tstamp_mode  : NONE
   period_step  : 1
   avail_min    : 1024
   period_event : 0
   start_threshold  : 1024
   stop_threshold   : 4096
   silence_threshold: 0
   silence_size : 0
   boundary     : 4611686018427387904
   appl_ptr     : 4096
   hw_ptr       : 1048
Rewindable: 3048, loop iteration: 1288389
===================
Hardware PCM card 2 'HDA Intel PCH' device 0 subdevice 0
Its setup is:
   stream       : PLAYBACK
   access       : RW_INTERLEAVED
   format       : S16_LE
   subformat    : STD
   channels     : 4
   rate         : 48000
   exact rate   : 48000 (48000/1)
   msbits       : 16
   buffer_size  : 4096
   period_size  : 1024
   period_time  : 21333
   tstamp_mode  : NONE
   period_step  : 1
   avail_min    : 1024
   period_event : 0
   start_threshold  : 1024
   stop_threshold   : 4096
   silence_threshold: 0
   silence_size : 0
   boundary     : 4611686018427387904
   appl_ptr     : 4096
   hw_ptr       : 2049
Rewindable: 2047, loop iteration: 3010739
===================
Hardware PCM card 2 'HDA Intel PCH' device 0 subdevice 0
Its setup is:
   stream       : PLAYBACK
   access       : RW_INTERLEAVED
   format       : S16_LE
   subformat    : STD
   channels     : 4
   rate         : 48000
   exact rate   : 48000 (48000/1)
   msbits       : 16
   buffer_size  : 4096
   period_size  : 1024
   period_time  : 21333
   tstamp_mode  : NONE
   period_step  : 1
   avail_min    : 1024
   period_event : 0
   start_threshold  : 1024
   stop_threshold   : 4096
   silence_threshold: 0
   silence_size : 0
   boundary     : 4611686018427387904
   appl_ptr     : 4096
   hw_ptr       : 3092
Rewindable: 1004, loop iteration: 5251015
===================
Hardware PCM card 2 'HDA Intel PCH' device 0 subdevice 0
Its setup is:
   stream       : PLAYBACK
   access       : RW_INTERLEAVED
   format       : S16_LE
   subformat    : STD
   channels     : 4
   rate         : 48000
   exact rate   : 48000 (48000/1)
   msbits       : 16
   buffer_size  : 4096
   period_size  : 1024
   period_time  : 21333
   tstamp_mode  : NONE
   period_step  : 1
   avail_min    : 1024
   period_event : 0
   start_threshold  : 1024
   stop_threshold   : 4096
   silence_threshold: 0
   silence_size : 0
   boundary     : 4611686018427387904
   appl_ptr     : 4096
   hw_ptr       : 4136
Rewindable: -40, loop iteration: 7807909
This means Too many levels of symbolic links
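
For reference, the last two lines of the output can be reproduced by hand.
The sketch below is a simplified model, not alsa-lib code: it assumes that
for playback the rewindable value of the hw plugin is just appl_ptr minus
hw_ptr (wrap-around at the boundary is ignored), and that the test program
treats a negative result as -errno and feeds it to strerror(). The function
names are made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Simplified model (assumption): frames still queued for playback,
 * ignoring wrap-around at the boundary. */
static long model_hw_avail(long appl_ptr, long hw_ptr)
{
    assert(appl_ptr >= 0 && hw_ptr >= 0);
    return appl_ptr - hw_ptr;
}

/* Reads the value the way a caller following the usual "negative means
 * -errno" convention would. Returns NULL for a plain frame count. */
static const char *model_describe(long rewindable)
{
    return rewindable < 0 ? strerror((int)-rewindable) : NULL;
}
```

With the pointers from the last dump, model_hw_avail(4096, 4136) is -40, and
errno 40 on Linux is ELOOP, which is where the unrelated "symbolic links"
message comes from.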


>
>  >
>  > So snd_pcm_rewindable() can return weird values that are updated
> every period size or so. As such, I wouldn't believe its return value
> out of the box even for hw devices. At loop iteration 5389433, the CPU
> chewed enough time for almost one period, but snd_pcm_rewindable() said
> that almost 6 periods are rewindable. Probably a missing sync_ptr()
> somewhere, or a documentation bug.
>  >
>  > With snd_pcm_avail() inserted (which does synchronize the position)
> before each call to snd_pcm_rewindable(), I get:
>  >
>  > Rewindable: 6119, loop iteration: 0
>  > Rewindable: 6112, loop iteration: 2
>  > Rewindable: 6104, loop iteration: 42
>  > Rewindable: 6096, loop iteration: 76
>  > Rewindable: 6088, loop iteration: 125
>  > Rewindable: 6080, loop iteration: 173
>  > Rewindable: 6072, loop iteration: 222
>  > Rewindable: 6064, loop iteration: 270
>  >
>  > (and an underrun in the end).
>  >
>  > With 4 channels:
>  >
>  > Rewindable: 6112, loop iteration: 0
>  > Rewindable: 6108, loop iteration: 2
>  > Rewindable: 6104, loop iteration: 14
>  > Rewindable: 6100, loop iteration: 36
>  > Rewindable: 6096, loop iteration: 58
>  > Rewindable: 6092, loop iteration: 63
>  >
>  > With 8 channels:
>  >
>  > Rewindable: 6104, loop iteration: 0
>  > Rewindable: 6098, loop iteration: 1
>  > Rewindable: 6096, loop iteration: 2
>  > Rewindable: 6094, loop iteration: 9
>  > Rewindable: 6092, loop iteration: 24
>  > Rewindable: 6090, loop iteration: 32
>  > Rewindable: 6088, loop iteration: 41
>  >
>  > So on my snd-hda-intel, the granularity of the pointer is 32 bytes.
>  >
>  > For Haswell HDMI (on another snd-hda-intel), stereo, S16_LE:
>  >
>  > Rewindable: 6128, loop iteration: 0
>  > Rewindable: 6112, loop iteration: 129
>  > Rewindable: 6096, loop iteration: 339
>  > Rewindable: 6080, loop iteration: 551
>  > Rewindable: 6064, loop iteration: 753
>  > Rewindable: 6048, loop iteration: 966
>  > Rewindable: 6032, loop iteration: 1180
>  >
>  > so the resulting granularity is 64 bytes.
>  >
>  > An unfortunate observation is that, without snd_pcm_avail(), even on
> hw just after an underrun snd_pcm_rewindable() can return negative
> numbers such as -16 or -25 that lead to nonsense error codes (EBUSY or
> ENOTTY).
>  >
>
> pcm_rewind2.c uses the period size instead of the buffer size as
> start_threshold, so the PCM is already started before you fill the buffer,
> and the PCM can be stopped by an underrun if your program does not use the
> boundary as stop_threshold.
>
> This affects your timing if your test program behaves like pcm_rewind2.c.

I agree with the above. As long as it serves as a testcase for a bug, it 
is good.

>
> static snd_pcm_sframes_t snd_pcm_hw_rewindable(snd_pcm_t *pcm)
> {
>          return snd_pcm_mmap_hw_avail(pcm);
> }
>
> If this function is to return a safe value, do you mean that it must hw_sync
> the pointer, should return zero if snd_pcm_mmap_hw_avail is negative, and
> should check the PCM state to return a negative error code?

Answering by parts.

Must hw_sync the pointer - yes.

Should return zero if snd_pcm_mmap_hw_avail is negative - not sure, for 
two reasons. First, I am not sure if snd_pcm_mmap_hw_avail is indeed 
allowed to return negative values due to yet-undetected xruns. Second, 
negative snd_pcm_mmap_hw_avail means an xrun, so I am not sure whether 0 
is a valid return code here.

Should check the pcm state to return negative error code - yes at least 
for non-xrun states such as SND_PCM_STATE_SUSPENDED or 
SND_PCM_STATE_DISCONNECTED, and I am not sure whether to return 0 or 
-EPIPE on a known xrun.

Also I am not sure about interaction with a very large stop_threshold 
(i.e. settings that ignore underruns), and the above (except hw_sync) is 
for playback only. We need a separate discussion about capture, but I am 
not yet ready to start it.
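
To make the answers above concrete, here is a minimal sketch of the logic,
written against a toy model rather than alsa-lib itself. Everything named
model_* is hypothetical. Returning -EPIPE on a known xrun is an arbitrary
pick for illustration (returning 0 is the other option left open above), the
error codes for the suspended and disconnected states are assumptions, and
the model assumes that stop_threshold equals the buffer size, as in the dump
earlier in this message. Playback only.

```c
#include <assert.h>
#include <errno.h>

enum model_state { MODEL_RUNNING, MODEL_XRUN, MODEL_SUSPENDED, MODEL_DISCONNECTED };

struct model_pcm {
    enum model_state state;
    long appl_ptr;       /* application pointer, in frames */
    long hw_ptr;         /* last hardware pointer the library has seen */
    long hw_ptr_actual;  /* where the hardware really is */
};

/* Stand-in for the hw_sync step: refresh the cached hardware pointer and
 * notice an underrun that has not been detected yet. */
static void model_hw_sync(struct model_pcm *pcm)
{
    pcm->hw_ptr = pcm->hw_ptr_actual;
    if (pcm->state == MODEL_RUNNING && pcm->hw_ptr >= pcm->appl_ptr)
        pcm->state = MODEL_XRUN;
}

static long model_rewindable(struct model_pcm *pcm)
{
    assert(pcm);
    model_hw_sync(pcm);              /* "must hw_sync the pointer" */
    switch (pcm->state) {            /* "should check the pcm state" */
    case MODEL_SUSPENDED:    return -ESTRPIPE;  /* assumed error code */
    case MODEL_DISCONNECTED: return -ENODEV;    /* assumed error code */
    case MODEL_XRUN:         return -EPIPE;     /* open question: or 0 */
    default:                 break;
    }
    /* Positive here: the sync above has already caught any underrun. */
    return pcm->appl_ptr - pcm->hw_ptr;
}
```

In this model, a stale hw_ptr of 3092 with the hardware actually at 4136 and
appl_ptr at 4096 yields -EPIPE instead of a bogus 1004 or -40.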

>
>  >
>  >>
>  >>
> http://www.alsa-project.org/~tiwai/writing-an-alsa-driver/ch05s07.html#pcm-interface-interrupt-handler-boundary
>  >>
>  >> High frequency timer interrupts
>  >>
>  >> This happens when the hardware doesn't generate interrupts at the
> period boundary but issues timer interrupts at a fixed timer rate (e.g.
> es1968 or ymfpci drivers).
>
> Both es1968 and ymfpci use an AC97 codec. There is an external clock source
> (oscillator) that provides the timing to both the sound chips (AC97
> controller) and the AC97 codec, to sync the transfer of audio over the AC97
> link at 48000 Hz.
>
> It depends on whether the chip can count the clock ticks to provide a
> timer interrupt.
>
>  >>
>  >> I am also confused about whether ymfpci really uses timer interrupts.
>  >
>  >
>  > Well, that's easy. According to your own words, the card sends an
> interrupt every 256 samples and has no real notion of the user-defined
> period size. From ALSA viewpoint, this 256-sample interrupt is just a
> timer (but not a timer that is managed through functions that have
> "timer" in the name).
>
> Unlike other hardware-mixing sound cards designed for playing games,
> the multiple voices of ymfpci are designed for playing MIDI, where the notes
> of a sound usually start at the same tempo.
>
> Some subdevices can have an unpredictable delay if they are not the
> subdevice that starts the hardware.
>
> It depends on whether the hardware provides registers that let the driver
> start each subdevice independently when it receives
> SNDRV_PCM_TRIGGER_START in the pcm_trigger callback.

OK

>  >
>  >>  > hw_ptr granularity is defined only by period_bytes_min (and
>  >> additional constraints if any).
>  >
>  >
>  > Well, this disagrees with my experiments. For S16_LE stereo,
> snd_pcm_hw_params_get_period_size_min() says 32 samples for both PCH and
> HDMI, while the measured granularity is different (8 and 16 samples).
>
> Should you use period_bytes_min instead of period_size_min?
>
> 128 bytes / (8 x 2)  = 8 samples for 8 channels
>
> For 6-channel playback, the period does not fit the PCIe payload size of
> 128 bytes exactly.

Will retest later today.
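
Until then, the unit conversion at stake can at least be written out. The
sketch below converts between frames and bytes for a given channel count and
sample width. The helper names are made up, and S16_LE is taken to occupy
2 bytes per sample. Note that the thread says "samples" where this sketch
says "frames".

```c
#include <assert.h>
#include <stdbool.h>

/* Size of one frame (one sample for every channel) in bytes. */
static unsigned frame_bytes(unsigned channels, unsigned bytes_per_sample)
{
    assert(channels > 0 && bytes_per_sample > 0);
    return channels * bytes_per_sample;
}

/* Pointer step observed in frames, expressed in bytes. */
static unsigned frames_to_bytes(unsigned frames, unsigned channels,
                                unsigned bytes_per_sample)
{
    return frames * frame_bytes(channels, bytes_per_sample);
}

/* Whole frames that fit into a block of the given size (truncating). */
static unsigned bytes_to_frames(unsigned bytes, unsigned channels,
                                unsigned bytes_per_sample)
{
    return bytes / frame_bytes(channels, bytes_per_sample);
}

/* True if a block of the given size holds a whole number of frames. */
static bool holds_whole_frames(unsigned bytes, unsigned channels,
                               unsigned bytes_per_sample)
{
    return bytes % frame_bytes(channels, bytes_per_sample) == 0;
}
```

With the steps visible in the quoted measurements (8, 4 and 2 frames for 2, 4
and 8 channels on PCH, 16 frames for stereo on HDMI), frames_to_bytes() gives
32, 32, 32 and 64 bytes. bytes_to_frames(128, 8, 2) gives the 8 from the
quoted calculation, and 128 bytes is not a whole number of 6-channel S16_LE
frames.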

>
>
>  >
>  >>  >
>  >>  > PulseAudio has the following consideration here: if the card cannot
>  >> report the position accurately, we need to disable the timestamp-based
>  >> scheduling, as this breaks module-combine-sink (or any successor of it),
>  >> because it relies on very accurate estimations of the actual sample rate
>  >> ratio between two non-identical cards.
>  >>  >
>  >>
>  >> https://bugs.freedesktop.org/show_bug.cgi?id=47899
>  >
>  >
>  > This is something to investigate, I am not ready to provide any
> useful comment. Although in comment #2 bluetooth is mentioned, and this
> is indeed an example where even somewhat accurate timing information is
> not available.
>
>  >>
>
>  >> if you want to hear sound from two snd-hda-intel at the same time using
>  >> combined sink, you may need driver provide the output delay in hda codec
>  >>
>  >> 7.3.4.5 Audio Function Group Capabilities
>  >>
>  >> Output Delay is a four bit value representing the number of samples
>  >> between when the sample is received from the Link and when it appears as
>  >> an analog signal at the pin. This may be a “typical” value. If this is
>  >> 0, the widgets along the critical path should be queried, and each
>  >> individual widget must report its individual delay.
>  >>
>  >> Figure 85. Audio Function Group Capabilities Response Format
>  >>
>  >> 7.3.4.6 Audio Widget Capabilities
>  >>
>  >> Delay indicates the number of sample delays through the widget. This may
>  >> be 0 if the delay value in the Audio Function Parameters is supplied to
>  >> represent the entire path.
>  >>
>  >>
> http://git.kernel.org/cgit/linux/kernel/git/tiwai/hda-emu.git/tree/codecs
>  >>
>  >> some hda codecs report delay in audio output/input widgets and the
>  >> ranges of delay vary from 3 to 13 samples, hda_proc.c did not show
>  >> output/input delay in the audio function group
>  >
>
> Did snd_hda_param_read(codec, codec->afg,  AC_PAR_AUDIO_FG_CAP) return
> any values for your hda codecs ?
>
> what is critical path ?

How do I test this? Could you please post some userspace test code or a 
kernel patch, together with the instructions?

> Some drivers can enable or disable loopback mixing, and the audio passes
> through fewer widgets when loopback mixing is disabled.
>
> Some IDT codecs have a 5-band equalizer in the path of Port D (not in
> the pin complex widget or mixer widget, but set up using a vendor-specific
> verb to the audio function group or a vendor-specific widget) but not in the
> path of Port A (headphone).
>
>
>  >
>  > Interesting, implementable for someone with the skills in this area,
> but probably not relevant for the above freedesktop bug. What you are
> talking about is just a constant offset in the snd_pcm_delay() return
> values. That's bad, but I guess not bad enough for PulseAudio to
> stutter. What PulseAudio doesn't tolerate is jitter.
>
> The two HDA controllers of the reporter do not use the same buffer size
> (buffer time).
>
> Does timer-based scheduling wake up 20 ms before the buffer is
> empty? The timers eventually wake up at different times if the buffer times
> of the two HDA controllers are not the same.
>
> Does this mean that PulseAudio still keeps the audio data until the
> PulseAudio client closes the stream?

Not ready to answer this yet.

>
>  >
>  >> Other PulseAudio modules seem not to support rewind (e.g. jack,
>  >> tunnel, Bluetooth, ...).
>  >>
>  >> http://git.alsa-project.org/?p=alsa-plugins.git;a=tree
>  >>
>  >> Other ALSA plugins (e.g. Jack, oss, ...) seem not to support rewind.
>  >
>  >
>  > Jack is interesting here: it is the only ioplug-based plugin which
> sets mmap_rw = 1. As such, ALSA treats it as something that has mmapped
> buffer with the same semantics as an ordinary hardware sound card, and
> performs rewinds using this buffer. There is also a "hardware" position
> callback. The actual transfer of samples from that buffer to JACK is
> performed in a separate realtime thread which is implicitly created in
> jack_activate(). The position is updated every JACK period.
>  >
>  > The whole construction should support rewinds, with the
> non-rewindable remainder being one JACK period (which may be different
> from one ALSA period). If the JACK period is 256 samples, this plugin
> should behave very much like one voice of ymfpci.
>
> https://github.com/jackaudio/jack2/blob/master/linux/alsa/alsa_driver.c
>
> The jackd server does not use snd_pcm_rewind. It supports non-interleaved
> sound cards and sound cards with 10 or more channels (e.g. ice1712, hdsp
> and hammerfall, ...), i.e. more than two playback ports.
>
> The jack client has no info about how many periods or channels are used by
> the jackd server.
>
> http://jackaudio.org/routing_alsa
>
> The jack client only specifies how many channels and which playback ports
> to use.
>
> You can route the stereo output to the grey jack if the jackd server
> uses 8-channel playback, or mix the stereo output to the right channel.
>
> http://cgit.freedesktop.org/pulseaudio/pulseaudio/tree/src/modules/jack/module-jack-sink.c
>
> module-jack-sink.c seems to use fixed latency.
>
> Does this mean that PulseAudio only rewinds sinks that support dynamic
> latency, i.e. it won't rewind a sink that uses fixed latency?

No. PulseAudio can, in theory, rewind fixed-latency sinks, but it will 
never usefully rewind this one (i.e. will truncate all rewind requests 
to 0), because it never sets max_rewind, and thus max_rewind gets 
defaulted to 0:

http://cgit.freedesktop.org/pulseaudio/pulseaudio/tree/src/pulsecore/sink.c#n337
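
The truncation can be illustrated with a toy model. This is not PulseAudio
code: the struct and both functions are hypothetical stand-ins, and the only
behavior taken from the explanation above is that a rewind request is capped
at max_rewind, which stays 0 unless the sink sets it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the part of a sink that matters here. */
struct model_sink {
    size_t max_rewind;   /* bytes the sink is able to rewind */
};

/* A new sink starts with max_rewind defaulted to 0. */
static void model_sink_init(struct model_sink *s)
{
    assert(s);
    s->max_rewind = 0;
}

/* Returns the number of bytes that will actually be rewound. */
static size_t model_request_rewind(const struct model_sink *s, size_t nbytes)
{
    assert(s);
    return nbytes < s->max_rewind ? nbytes : s->max_rewind;
}
```

A sink that never touches max_rewind after model_sink_init() turns every
request into a rewind of 0 bytes, while a sink that sets it to, say, 8192
grants requests up to that amount.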

PulseAudio here uses a different rendering strategy from the ALSA sink. 
For the ALSA sink, PulseAudio renders aggressively as much as possible 
and then rewinds if necessary. For the JACK sink, PulseAudio renders 
only the minimum required portion of data and only when strictly 
necessary (when JACK has asked for it).

Note that, in PulseAudio, sink inputs also have buffers (in the form of 
memblockq, that's where pa_sink_render reads from), and client rewinds 
can be done using these buffers even if the sink itself is not rewindable.

-- 
Alexander E. Patrakov
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pcm_rewindable.c
Type: text/x-csrc
Size: 2808 bytes
Desc: not available
URL: <http://mailman.alsa-project.org/pipermail/alsa-devel/attachments/20140512/e897772b/attachment-0001.bin>

