[alsa-devel] twl4030 latency update
Dear Peter, I was investigating the high playback latency of the TWL4030 and stumbled on an old thread started by Edgar, http://mailman.alsa-project.org/pipermail/alsa-devel/2011-October/045173.htm... where I read that this is related to the McBSP2 buffer length. Recent kernels seem to have the same behavior (I have a Debian BeagleBoard-xM with 3.13.3-armv7-x10). Did you manage to find a fix for this problem? Would it be possible?
Regards
Leonardo
Hi Leonardo,
On 03/20/2014 01:13 PM, Leonardo Gabrielli wrote:
Dear Peter, I was investigating on TWL4030 high playback latency and stumbled in an old thread started by Edgar http://mailman.alsa-project.org/pipermail/alsa-devel/2011-October/045173.htm... where I read this is related to McBSP2 buffer length Recent kernels seems to have the same behavior (I have a debian beagleboardxM with 3.13.3-armv7-x10) Did you manage to get a fix to this problem? Would it be possible?
The 'misusing/configuring the McBSP, and sDMA' approach did not work :( However, the mcbsp code has gone through quite a bit of change since then concerning the McBSP FIFO/sDMA configuration.
If we have FIFO the sDMA is always in packet mode. The default is to transfer one sample with sDMA per DMA request. You can switch the McBSP to 'threshold' mode and set the maximum FIFO threshold you want to use. The code will figure out the optimal FIFO/burst size based on the period size and the max threshold you have set. This is done via a sysfs file under the mcbsp, the file is dma_op_mode if I recall correctly. Playing with the max tx/rx threshold you might be able to get better latency.
Dear Peter, thanks, I'm not sure I understand all the details but after a fast find in my beagleboard /sys I found
    ./sys/devices/68000000.ocp/49026000.mcbsp/dma_op_mode
    ./sys/devices/68000000.ocp/49022000.mcbsp/dma_op_mode
    ./sys/devices/68000000.ocp/49024000.mcbsp/dma_op_mode
    ./sys/devices/68000000.ocp/48096000.mcbsp/dma_op_mode
    ./sys/devices/68000000.ocp/48074000.mcbsp/dma_op_mode
All of these are already set as threshold:
    cat sys/devices/68000000.ocp/49022000.mcbsp/dma_op_mode
    [element] threshold
Probably I found the FIFOs to be shortened in order to reduce latency: all of the thresholds are 112 besides one of the devices which has:
    cat sys/devices/68000000.ocp/49022000.mcbsp/max_rx_thres
    1264
for both tx and rx. Maybe that's the ALSA playback samples queue.
In fact I find:
    cat /proc/device-tree/ocp/mcbsp@49022000/ti,hwmods
    mcbsp2mcbsp2_sidetone
So it's the McBSP2 which you mentioned.
Now, I'm not sure how to change the threshold, I guess I have to patch some kernel module and rebuild?
On 20/03/2014 14:35, Peter Ujfalusi wrote:
Hi Leonardo,
On 03/20/2014 01:13 PM, Leonardo Gabrielli wrote:
Dear Peter, I was investigating on TWL4030 high playback latency and stumbled in an old thread started by Edgar http://mailman.alsa-project.org/pipermail/alsa-devel/2011-October/045173.htm... where I read this is related to McBSP2 buffer length Recent kernels seems to have the same behavior (I have a debian beagleboardxM with 3.13.3-armv7-x10) Did you manage to get a fix to this problem? Would it be possible?
The 'misusing/configuring the McBSP, and sDMA' did not worked :( However the mcbsp code went through quite a bit of change since than concerning the McBSP FIFO/sDMA configuration.
If we have FIFO the sDMA is always in packet mode. The default is to transfer one sample with sDMA per DMA request. You can switch the McBSP to 'threshold' mode and set the maximum FIFO threshold you want to use. The code will figure out the optimal FIFO/burst size based on the period size and the max threshold you have set. This is done via a sysfs file under the mcbsp, the file is dma_op_mode if I recall correctly. Playing with the max tx/rx threshold you might be able to get better latency.
On 03/20/2014 04:31 PM, Leonardo Gabrielli wrote:
Dear Peter, thanks, I'm not sure I understand all the details but after a fast find in my beagleboard /sys I found
./sys/devices/68000000.ocp/49026000.mcbsp/dma_op_mode ./sys/devices/68000000.ocp/49022000.mcbsp/dma_op_mode ./sys/devices/68000000.ocp/49024000.mcbsp/dma_op_mode ./sys/devices/68000000.ocp/48096000.mcbsp/dma_op_mode ./sys/devices/68000000.ocp/48074000.mcbsp/dma_op_mode
All of these are already set as threshold: cat sys/devices/68000000.ocp/49022000.mcbsp/dma_op_mode [element] threshold
It is in 'element' mode, the [] shows the selected mode. You can change it:
    echo threshold > /sys/devices/68000000.ocp/49022000.mcbsp/dma_op_mode
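The bracket convention can also be read from a script. A minimal Python sketch, assuming the file content has the form shown above (space-separated mode names with the active one in square brackets); `selected_mode` is a hypothetical helper name used only for illustration, not part of the driver or of any tool:

```python
def selected_mode(text):
    """Return the mode name wrapped in [] in a dma_op_mode-style string."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError("no bracketed mode found in %r" % text)


# Content Leonardo saw on the BeagleBoard-xM
print(selected_mode("[element] threshold"))  # element
```

On the board the string would come from reading /sys/devices/68000000.ocp/49022000.mcbsp/dma_op_mode.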
Probably I found the FIFOs to be shortened in order to reduce latency: all of the thresholds are 112 besides one of the devices which has: cat sys/devices/68000000.ocp/49022000.mcbsp/max_rx_thres 1264 for both tx and rx. Maybe that's the ALSA playback samples queue.
This is for the threshold mode. With this value you can set the maximum slots you want to use in the McBSP FIFO.
To change it, for example:
    echo 320 > /sys/devices/68000000.ocp/49022000.mcbsp/max_tx_thres
    echo 320 > /sys/devices/68000000.ocp/49022000.mcbsp/max_rx_thres
In the end, this value lets you limit the sDMA burst sizes and the McBSP FIFO level. The threshold means: generate a DMA request when threshold number of slots are free in the FIFO (playback) or when threshold amount of data is available in the FIFO (capture).
In fact I find: cat /proc/device-tree/ocp/mcbsp@49022000/ti,hwmods mcbsp2mcbsp2_sidetone
So it's the McBSP2 which you mentioned.
Now, I'm not sure how to change the threshold, I guess I have to patch some kernel module and rebuild?
No, you do not need to recompile anything. See my previous comment.
Thanks Peter, I've been able today to test following your suggestions. Unfortunately I didn't get any improvement on latency, but reducing sDMA FIFO threshold improved on audio integrity (with some period+samplerate combinations I have corrupted audio, maybe scrambled or empty frames).
TESTS:
1- with FIFO threshold 1264 for McBSP2, running
    jackd -P62 -dalsa -dhw:0 -r $SRATE -p $PERIOD -n2 -s -S -i2 -o2
with the following combinations of $SRATE - $PERIOD:
    22050 - 512 - AU=ko lat=/
    22050 - 256 - AU=ok lat=54ms
    22050 - 128 - AU=ko lat=/
    22050 - 64 - AU=ok lat=34ms
    32000 - 64 - AU=ok lat=23ms
    44100 - 64 - AU=ok lat=17ms
2- with FIFO threshold 320 for McBSP2, the rest as above:
    22050 - 512 - AU=ko lat=/
    22050 - 256 - AU=ok lat=54ms
    22050 - 128 - AU=ok lat=38ms
    22050 - 64 - AU=ok lat=40-60ms (changing for each invocation of jackd)
    32000 - 64 - AU=ok lat=23ms
    44100 - 64 - AU=ok lat=17ms
Outcome: maybe I got it wrong, I thought this would reduce the number of periods allocated by jack (they didn't change between the two tests) hence reduce latency. The CPU is not overwhelmed even in the 64sample tests (good).
Also: after a reboot the threshold and dma_op_mode get back to their defaults. Can I make it stable or do I need an upstart job to echo the proper values into the sysfs each time?
Cheers and thanks
Leonardo
On 21/03/2014 08:08, Peter Ujfalusi wrote:
On 03/20/2014 04:31 PM, Leonardo Gabrielli wrote:
Dear Peter, thanks, I'm not sure I understand all the details but after a fast find in my beagleboard /sys I found
./sys/devices/68000000.ocp/49026000.mcbsp/dma_op_mode ./sys/devices/68000000.ocp/49022000.mcbsp/dma_op_mode ./sys/devices/68000000.ocp/49024000.mcbsp/dma_op_mode ./sys/devices/68000000.ocp/48096000.mcbsp/dma_op_mode ./sys/devices/68000000.ocp/48074000.mcbsp/dma_op_mode
All of these are already set as threshold: cat sys/devices/68000000.ocp/49022000.mcbsp/dma_op_mode [element] threshold
it is in 'element' mode, the [] shows the selected mode. You can change it: echo threshold > /sys/devices/68000000.ocp/49022000.mcbsp/dma_op_mode
Probably I found the FIFOs to be shortened in order to reduce latency: all of the thresholds are 112 besides one of the devices which has: cat sys/devices/68000000.ocp/49022000.mcbsp/max_rx_thres 1264 for both tx and rx. Maybe that's the ALSA playback samples queue.
This is for the threshold mode. With this value you can set the maximum slots you want to use in the McBSP FIFO.
To change it:
    echo 320 > /sys/devices/68000000.ocp/49022000.mcbsp/max_tx_thres
    echo 320 > /sys/devices/68000000.ocp/49022000.mcbsp/max_rx_thres
for example.
At the end with this value you can limit the sDMA burst sizes and the McBSP FIFO level. The threshold means: generate DMA request if threshold number of slots are free in the FIFO (playback) or when threshold amount of data available in the FIFO (capture).
In fact I find: cat /proc/device-tree/ocp/mcbsp@49022000/ti,hwmods mcbsp2mcbsp2_sidetone
So it's the McBSP2 which you mentioned.
Now, I'm not sure how to change the threshold, I guess I have to patch some kernel module and rebuild?
No, you do not need to recompile anything. See my previous comment.
Hi Leonardo,
On 03/25/2014 08:50 PM, Leonardo Gabrielli wrote:
Thanks Peter, I've been able today to test following your suggestions. Unfortunately I didn't get any improvement on latency, but reducing sDMA FIFO threshold improved on audio integrity (with some period+samplerate combinations I have corrupted audio, maybe scrambled or empty frames).
Can you elaborate on the corrupted/scrambled audio? I just don't see how it can happen. Can you get the /proc/asound/card0/pcm0p/sub0/hw_params when you have the audio quality issue?
I can not run jackd on the board anymore (with linux-next at least): FATAL: cannot locate cpu MHz in /proc/cpuinfo
but with aplay -v --period-size=512 or 64 2ch-left-since-22050.wav seems to be fine for me (assuming AU=ko means that it is the corrupted one)
TESTS: 1- with FIFO threshold 1264 for McBSP2 running jackd -P62 -dalsa -dhw:0 -r $SRATE -p $PERIOD -n2 -s -S -i2 -o2 with the following combinations of $SRATE - $PERIOD: 22050 - 512 - AU=ko lat=/ 22050 - 256 - AU=ok lat=54ms 22050 - 128 - AU=ko lat=/ 22050 - 64 - AU=ok lat=34ms 32000 - 64 - AU=ok lat=23ms 44100 - 64 - AU=ok lat=17ms
2- with FIFO threshold 320 for McBSP2 the rest as above 22050 - 512 - AU=ko lat=/ 22050 - 256 - AU=ok lat=54ms 22050 - 128 - AU=ok lat=38ms 22050 - 64 - AU=ok lat=40-60ms (changing for each invocation of jackd) 32000 - 64 - AU=ok lat=23ms 44100 - 64 - AU=ok lat=17ms
Outcome: maybe I got it wrong, I thought this would reduce the number of periods allocated by jack (they didn't change between the two tests) hence reduce latency.
The McBSP2 FIFO will always be there. There is nothing that can be done about that. The size on McBSP2 is 1280 words -> 640 stereo samples, i.e. ~29ms at 22050, 14.5ms at 44100.
If you are staying in element mode, this means it is guaranteed that the sample at the DMA pointer will go out on the I2S line after about the mentioned times. This is the delay caused by the FIFO itself. Where the rest is coming from I'm not really sure.
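The FIFO delay figures can be reproduced with a few lines of arithmetic. This is only a sketch of the calculation described above (FIFO words divided by channel count, divided by sample rate); the function name `fifo_delay_ms` is made up for illustration:

```python
FIFO_WORDS = 1280  # McBSP2 FIFO size quoted above


def fifo_delay_ms(rate_hz, channels=2, fifo_words=FIFO_WORDS):
    """Time a full FIFO takes to drain at the given sample rate."""
    frames = fifo_words / channels  # 1280 words -> 640 stereo frames
    return 1000.0 * frames / rate_hz


for rate in (22050, 32000, 44100):
    print("%d Hz: %.1f ms" % (rate, fifo_delay_ms(rate)))
# 22050 Hz: 29.0 ms
# 32000 Hz: 20.0 ms
# 44100 Hz: 14.5 ms
```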
Now if you are in threshold mode this changes a bit, but the FIFO will be there still. At start the FIFO is going to be filled up with threshold long bursts. From there you will have DMA burst with about threshold length every time the FIFO has that amount of free slots in it. In case of tx_threshold 1264 (632 sample) and 512 period size:
0. The actual threshold level will be 512 samples.
1. copy of 512 samples (1 period to FIFO) ~128 (640 - 512) free slot left in the FIFO
2. nothing happens until the FIFO level drops to 127 (we will have free space for 512 samples).
3. next 512 sample burst to FIFO.
4. the FIFO will be full or close to full
5. goto 2
When the period size is bigger than the desired threshold (set via sysfs) then the code will figure out the best configuration for the actual threshold/DMA burst.
The same principle applies to element mode, where the DMA bursts are set to one sample. This means that at start you will have ~640 quick DMA bursts to fill the FIFO up, and after that the next burst comes once per sample period (1/rate).
You see, the FIFO is there to add delay in both cases; however, in threshold mode you are not going to stress the system with constant DMA activity, you only have bursts to fill the FIFO up.
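The difference in DMA activity between the two modes can be illustrated with a toy model. It is only a sketch of the behaviour described above, not the driver's logic: it assumes a 640-frame FIFO, one frame leaving the FIFO per sample tick, and a burst being issued as soon as one burst's worth of slots is free. The name `count_bursts` is hypothetical.

```python
def count_bursts(fifo_frames, burst_frames, frames_played):
    """Return (bursts for the initial fill, bursts during playback)."""
    level = 0
    fill_bursts = 0
    # Initial fill: keep bursting while a whole burst still fits.
    while fifo_frames - level >= burst_frames:
        level += burst_frames
        fill_bursts += 1
    play_bursts = 0
    for _ in range(frames_played):
        level -= 1  # one stereo frame goes out on the I2S line
        if fifo_frames - level >= burst_frames:
            level += burst_frames
            play_bursts += 1
    return fill_bursts, play_bursts


one_second = 22050  # frames played in one second at 22050 Hz
print(count_bursts(640, 512, one_second))  # threshold mode: (1, 43)
print(count_bursts(640, 1, one_second))    # element mode: (640, 22050)
```

In this model the FIFO holds audio in both cases; only the number of DMA requests changes.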
The CPU is not overwhelmed even in the 64sample tests (good).
Also: after a reboot the threshold and dma_op_mode get back to their defaults. Can I make it stable or do I need an upstart job to echo the proper values into the sysfs each time?
You need to change these after every boot, yes. The default is element mode.
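A boot-time helper could be as small as the following Python sketch. It assumes the McBSP2 sysfs path and the value 320 used earlier in this thread; the function name `apply_mcbsp_settings` is made up, and the function only writes the same three files as the echo commands.

```python
import os

MCBSP2 = "/sys/devices/68000000.ocp/49022000.mcbsp"  # path from this thread


def apply_mcbsp_settings(base=MCBSP2, mode="threshold", max_thres=320):
    """Write the DMA mode and both max thresholds, as the echo commands do."""
    settings = (
        ("dma_op_mode", mode),
        ("max_tx_thres", str(max_thres)),
        ("max_rx_thres", str(max_thres)),
    )
    for name, value in settings:
        with open(os.path.join(base, name), "w") as f:
            f.write(value + "\n")
```

Calling apply_mcbsp_settings() as root from whatever boot mechanism the distribution provides (an upstart job, as Leonardo suggests, would do) should restore the values after each reboot.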
On 03/26/2014 10:26 AM, Peter Ujfalusi wrote:
Hi Leonardo,
On 03/25/2014 08:50 PM, Leonardo Gabrielli wrote:
Thanks Peter, I've been able today to test following your suggestions. Unfortunately I didn't get any improvement on latency, but reducing sDMA FIFO threshold improved on audio integrity (with some period+samplerate combinations I have corrupted audio, maybe scrambled or empty frames).
Can you elaborate on the corrupted/scrambled audio? I just don't see how it can happen. Can you get the /proc/asound/card0/pcm0p/sub0/hw_params when you have the audio quality issue?
I can not run jackd on the board anymore (with linux-next at least): FATAL: cannot locate cpu MHz in /proc/cpuinfo
but with aplay -v --period-size=512 or 64 2ch-left-since-22050.wav seams to be fine for me (assuming AU=ko means that it is the corrupted one)
Yeah, this is not correct: it means 512 or 64 periods, not the period sizes. I would need the hw_params for the actual playback to be able to reproduce the issue.
On 26/03/2014 09:26, Peter Ujfalusi wrote:
Can you elaborate on the corrupted/scrambled audio? I just don't see how it can happen. Can you get the /proc/asound/card0/pcm0p/sub0/hw_params when you have the audio quality issue?
Hello, Here you are:
    cat /proc/asound/card0/pcm0p/sub0/hw_params
    access: MMAP_INTERLEAVED
    format: S16_LE
    subformat: STD
    channels: 2
    rate: 22050 (22050/1)
    period_size: 512
    buffer_size: 1024
And this is jack output:
jackd -P62 -t2000 -dalsa -dhw:0 -r22050 -p512 -n2 -s -S -i2 -o2 &
    jackd 0.124.1
    Copyright 2001-2009 Paul Davis, Stephane Letz, Jack O'Quinn, Torben Hohn and others.
    jackd comes with ABSOLUTELY NO WARRANTY
    This is free software, and you are welcome to redistribute it
    under certain conditions; see the file COPYING for details
    JACK compiled with System V SHM support.
    loading driver ..
    apparent rate = 22050
    creating alsa driver ... hw:0|hw:0|512|2|22050|2|2|nomon|swmeter|soft-mode|16bit
    configuring for 22050Hz, period = 512 frames (23.2 ms), buffer = 2 periods
    ALSA: final selected sample format for capture: 16bit little-endian
    ALSA: use 2 periods for capture
    ALSA: final selected sample format for playback: 16bit little-endian
    ALSA: use 2 periods for playback
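For reference, the period duration jackd prints follows directly from the period size and the sample rate. A short Python sketch of that arithmetic (the helper names are made up for illustration):

```python
def period_ms(frames, rate_hz):
    """Duration of one period in milliseconds."""
    return 1000.0 * frames / rate_hz


def buffer_ms(frames, periods, rate_hz):
    """Duration of the whole buffer (periods * period size)."""
    return periods * period_ms(frames, rate_hz)


print("%.1f ms" % period_ms(512, 22050))     # 23.2 ms, as printed by jackd
print("%.1f ms" % buffer_ms(512, 2, 22050))  # 46.4 ms for 2 periods
```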
I can send you something to make it clearer. I recorded a 10Hz sine wave with jaaa. The wave is totally scrambled (probably buffers are not read in order). But it may well be an issue with jack.
When the output sounds correct (256 period) hw_params is:
    cat /proc/asound/card0/pcm0p/sub0/hw_params
    access: MMAP_INTERLEAVED
    format: S16_LE
    subformat: STD
    channels: 2
    rate: 22050 (22050/1)
    period_size: 256
    buffer_size: 768
I can not run jackd on the board anymore (with linux-next at least): FATAL: cannot locate cpu MHz in /proc/cpuinfo
Yes, there's been a recent fix to that (you can checkout the latest jackd from git repos, see this thread: http://lists.jackaudio.org/private.cgi/jack-devel-jackaudio.org/2014-March/0...
Or maybe you can just start jackd specifying a different clock with the -c switch, i.e.
    jackd -P62 -t2000 -c s -dalsa -dhw:0 -r22050 -p512 -n2 -s -S -i2 -o2 &
On 03/26/2014 11:35 AM, Leonardo Gabrielli wrote:
On 26/03/2014 09:26, Peter Ujfalusi wrote:
Can you elaborate on the corrupted/scrambled audio? I just don't see how it can happen. Can you get the /proc/asound/card0/pcm0p/sub0/hw_params when you have the audio quality issue?
Hello, Here you are:
cat /proc/asound/card0/pcm0p/sub0/hw_params access: MMAP_INTERLEAVED format: S16_LE subformat: STD channels: 2 rate: 22050 (22050/1) period_size: 512 buffer_size: 1024
And this is jack output:
jackd -P62 -t2000 -dalsa -dhw:0 -r22050 -p512 -n2 -s -S -i2 -o2 &
arecord -r 22050 -f S16_LE --period-size=512 --buffer-size=1024 -v | aplay -r 22050 -f S16_LE --period-size=512 --buffer-size=1024 -v
and no issue on the headphone from Beagle.
I can send you saomething to get it clearer. I recorded a 10Hz sine wave with jaaa. The wave is totally scrambled (probably buffers are not read in order). But it may well be an issue with jack.
When the output sounds correct (256 period) hw_param is: cat /proc/asound/card0/pcm0p/sub0/hw_params access: MMAP_INTERLEAVED format: S16_LE subformat: STD channels: 2 rate: 22050 (22050/1) period_size: 256 buffer_size: 768
arecord -r 22050 -f S16_LE -v --period-size=256 --buffer-size=768 | aplay -r 22050 -f S16_LE --period-size=256 --buffer-size=768 -v
again, audio is clear with this one as well
I can not run jackd on the board anymore (with linux-next at least): FATAL: cannot locate cpu MHz in /proc/cpuinfo
Yes, there's been a recent fix to that (you can checkout the latest jackd from git repos, see this thread: http://lists.jackaudio.org/private.cgi/jack-devel-jackaudio.org/2014-March/0...
I also found this, but lazy to update my jack...
Or maybe you can just start jackd specifying a different clock with the -c switch i.e. jackd -P62 -t2000 -c s -dalsa -dhw:0 -r22050 -p512 -n2 -s -S -i2 -o2 &
This does not work.
On 26/03/2014 09:26, Peter Ujfalusi wrote:
The McBSP2 FIFO will be always there. There's nothing can be done on that. The size on McBSP2 is 1280 words -> 640 stereo samples, ie ~29ms with 22050, 14.5ms with 44100.
If you are staying in element mode this means that it is granted that the sample at the DMA pointer will out on the i2s line about the mentioned times. This is the delay caused by the FIFO itself. From where the rest is coming I'm not really sure.
BTW: I forgot to mention: the latency listed in my previous email is input+output (i.e. I record pulses from the beagleboard input jack and the delayed version to the beagleboard output jack). The twl4030 analog and digital loopback features have been of course disabled, in order to get the total latency due from A/D to D/A.
So just to confirm I understood the McBSP mechanism well: even though I can transfer samples to/from DMA in bursts of <threshold> length, each sample will always "travel along" the whole FIFO buffer length (as if in a delay line), and thus it will always have a 640-sample delay?
Would it be possible to work around this, e.g. by putting 4-channel audio frames instead of stereo frames in the FIFO (with 2 channels unused), in order to fill up the FIFO more quickly and have less latency? Or is that pure craziness?
Cheers and thank you
On 03/26/2014 11:45 AM, Leonardo Gabrielli wrote:
On 26/03/2014 09:26, Peter Ujfalusi wrote:
The McBSP2 FIFO will be always there. There's nothing can be done on that. The size on McBSP2 is 1280 words -> 640 stereo samples, ie ~29ms with 22050, 14.5ms with 44100.
If you are staying in element mode this means that it is granted that the sample at the DMA pointer will out on the i2s line about the mentioned times. This is the delay caused by the FIFO itself. From where the rest is coming I'm not really sure.
BTW: I forgot to mention: the latency listed in my previous email is input+output (i.e. I record pulses from the beagleboard input jack and the delayed version to the beagleboard output jack). The twl4030 analog and digital loopback features have been of course disabled, in order to get the total latency due from A/D to D/A.
This means that the McBSP latency in the worst case is 1280 + the selected rx threshold, in words (so /2 in case of stereo). If you lower the rx threshold you decrease the latency on the capture side. On the playback side there is nothing that can be done.
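As a worked example of this worst-case figure, here is a Python sketch using the two max_rx_thres values tried in this thread. It treats the sysfs maximum as the threshold actually in use, which is an assumption: the driver derives the real threshold from the period size, so these numbers are upper bounds. The name `worst_case_ms` is made up.

```python
FIFO_WORDS = 1280  # McBSP2 FIFO size in words


def worst_case_ms(rx_threshold_words, rate_hz, channels=2):
    """(FIFO + rx threshold) in words, converted to frames and then to ms."""
    frames = (FIFO_WORDS + rx_threshold_words) / channels
    return 1000.0 * frames / rate_hz


for thres in (1264, 320):
    print("rx threshold %4d words: %.1f ms at 44100 Hz"
          % (thres, worst_case_ms(thres, 44100)))
# rx threshold 1264 words: 28.8 ms at 44100 Hz
# rx threshold  320 words: 18.1 ms at 44100 Hz
```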
So just to get confirm I understood the McBSP mechanism well: even though I can transfer to/from DMA samples in bursts of <threshold> length, each sample will always "travel along" the whole FIFO buffer length, (as if in a delay line) and thus they will always have 640samples delay?
On the playback side this is pretty much true. On capture side the threshold means that DMA will read from FIFO when threshold amount is available in it.
Would it be possible to workaround this, e.g. by putting 4-channel audio frames instead of stereo frames in the FIFO (with 2 channels unused), in order to fill up the FIFO more quickly and have less latency? Or is it pure craze?
From the FIFO McBSP takes data word by word. If you play stereo, you need to have stereo data in the FIFO. You can not skip two words with McBSP.
The thing I tried for playback, which did not work AFAIR: in general the idea was to configure DMA to send threshold/channel to every request while configuring the McBSP threshold register to be 1280 - threshold. In case of threshold 80 (40 stereo samples) it would play out:
    transfer 40 samples to FIFO per DMA request
    assert the DMA request when we have space for 1260 (630 samples). The number is just a guess, keeping 10 samples in FIFO sounds safe enough
This would keep the FIFO fill between 10 and 50 samples. But this does not work, I think McBSP is counting the received words also and deasserts the DMA request based on this count and not the FIFO level.
Another thing which would be even more complicated is to play with the McBSP threshold at runtime. With the same 40 samples: DMA is to transfer 40 samples per DMA request.
start
1. McBSP threshold to 80
2. in dma interrupt callback McBSP threshold to 1260
3. in McBSP warning interrupt (that we will be reaching the threshold soon) back to 80
4. goto 2
If we could do the step between 3 and 4 within one sample time this might work but as soon as you are late the thing will fail.
I know this is working in realtime systems like in DSPs and non linux systems...
Peter, thank you for your suggestions. For the moment I will stick with the following settings, which provide satisfactory latency and CPU usage:
-r 44100, -p64, with sDMA threshold=320 as per your suggestions, and CPU governor set to performance (1GHz). This way I can have 16ms in-out latency with 15% CPU and no glitches or XRUNs (9% in jack, and 6% is the connection of the system:{capture|playback} ports).
    uname -a
    Linux debian-BB3 3.13.3-armv7-x10 #1 SMP Sat Feb 15 01:03:40 UTC 2014 armv7l GNU/Linux
In case I will need a lower latency I will try the USB soundcard suggested by Edgar.
Leonardo
On 26/03/2014 13:51, Peter Ujfalusi wrote:
I know this is working in realtime systems like in DSPs and non linux systems...