Hi Péter,
On Fri, Aug 12, 2016 at 7:53 PM, Peter Ujfalusi <peter.ujfalusi@ti.com> wrote:
Akram,
first of all: I'm able to reproduce the RX underflow :o
I am glad to hear this, the lack of other people observing this on the mailing list was making me think we had specific issues to our branch+hardware.
On 08/12/16 04:14, Akram Hameed wrote:
- McBSP2 is also configured the same between old and new kernels, though I
note for some reason the ID pulled by dev_dbg looks spurious, though I guess
unrelated to my issue.
Configuring McBSP255 phys_base: 0x49022000
**** McBSP255 regs ****
DRR2: 0xf88a
DRR1: 0x0000
DXR2: 0x0000
DXR1: 0x0000
SPCR2: 0x0230
SPCR1: 0x0031
RCR2: 0x8041
RCR1: 0x0040
XCR2: 0x8041
XCR1: 0x0040
SRGR2: 0x001f
SRGR1: 0x0f00
PCR0: 0x000f
We must drop the DRR1/2 (and probably DXR1/2) dump... By reading the DRR
register with the debug code we introduce underflow. The data must only be
moved by the DMA from the DATA registers. If we have a low threshold it is
more likely to cause a channel swap, and it is for sure going to introduce
missing data - the data is going to the kernel log instead of the receive
buffer.
For what it is worth, I had to enable that register dump with dynamic_debug/control, so I guess it is not being used in normal
operation.
Yeah, it is only dumped on start, but still it is not a good thing as it
can
introduce initial channel swap.
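As a rough illustration of why one stray read of the data register can swap channels, here is a toy model in Python. It is only a sketch under stated assumptions: the FIFO is modelled as a simple queue of interleaved L/R samples, the DMA always pops two samples per frame, and the function and parameter names (capture, stray_read_at) are made up for this example and are not part of the driver.

```python
from collections import deque

def capture(frames, stray_read_at=None):
    """Toy model: drain an interleaved L/R FIFO two samples at a time.

    stray_read_at is the index of the DMA frame before which one extra,
    non-DMA read (for example a debug register dump) pops a sample.
    """
    fifo = deque()
    for n in range(frames):
        fifo.append(("L", n))
        fifo.append(("R", n))

    received = []
    frame = 0
    while len(fifo) >= 2:
        if frame == stray_read_at:
            fifo.popleft()          # sample lost to the stray reader
            if len(fifo) < 2:
                break
        first = fifo.popleft()      # DMA believes this is the left sample
        second = fifo.popleft()     # DMA believes this is the right sample
        received.append((first, second))
        frame += 1
    return received

clean = capture(4)
swapped = capture(4, stray_read_at=0)
```

In this model every frame after the stray read starts with an R sample, so the swap persists until the stream is restarted, and one sample is missing from the receive buffer.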
I cannot recall the history, but between 3.0 and 3.18 we might have moved
to dmaengine based PCM for OMAPs. But I still have no recollection of
experiencing a capture side channel swap.
I have confirmed that the PCM was indeed moved to dmaengine around 3.7. I
would hazard a guess that my problem was not present prior to that, but I
have no evidence. The hardware manufacturers we use have a kernel patched
for their specific use of OMAP3530/DM3730 at version 3.5, which I can try
in order to verify whether capture had issues at that time.
I have looked back at the history and nothing stands out. We are adding SRC/DST_PACKED in sDMA, but if I remove that, it makes no difference.
I think SRC packing will not be enabled for sDMA in this case, where constant addressing is used for the McBSP DRR? See p. 988 of spruf98p. Anyway, I agree that it does not explain the issue at all.
Even if I enable real element mode, I can still see the underflows.
a) Swap definitely is happening due to Overflow, but so far I only
observe an
overflow when using dma_op_mode THRESHOLD. When using dma_op_mode
THRESHOLD,
the swap does not seem to coincide with an underflow, so I am at a
loss to
explain further.
How does your application work? What are the parameters ALSA is configured
for capture/playback (cat /proc/asound/card0/pcm0p/sub0/hw_params; cat /proc/asound/card0/pcm0c/sub0/hw_params) - number of periods, etc?
Our application accesses ALSA via a C++ wrapper (RtAudio: https://www.music.mcgill.ca/~gary/rtaudio/). We capture audio
only
(48kHz, 2ch), and do not play back during these capture sessions.
Capture is
done in a background thread and double buffering is used at the
application
level so the 'audio ready' callback does not block longer than strictly necessary to service incoming data. As mentioned in a previous email,
this
approach has worked flawlessly before in kernel 3.0.
ALSA buffer size is not particularly large, perhaps it might help for
me to
increase the number of ALSA buffers? Here are the hw_params:
cat /proc/asound/card0/pcm0c/sub0/hw_params
access: RW_INTERLEAVED
format: S16_LE
subformat: STD
channels: 2
rate: 48000 (48000/1)
period_size: 480
buffer_size: 960
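For reference, the timing implied by these hw_params can be worked out with a few lines of Python. This is just arithmetic on the values shown above; the variable names are mine.

```python
rate_hz = 48000        # rate
channels = 2           # channels
bytes_per_sample = 2   # S16_LE
period_size = 480      # frames per period
buffer_size = 960      # frames per buffer

period_ms = 1000 * period_size / rate_hz                    # time per period
buffer_ms = 1000 * buffer_size / rate_hz                    # time to go round the buffer
periods = buffer_size // period_size                        # periods per buffer
period_bytes = period_size * channels * bytes_per_sample    # bytes moved per period

print(period_ms, buffer_ms, periods, period_bytes)
```

So the capture runs with 10 ms periods and only two periods in a 20 ms buffer, which leaves the application at most one period of slack before the DMA wraps around.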
I found no correlation between the ALSA settings and the frequency of the McBSP underruns.
And of course, playback is not operating:
cat /proc/asound/card0/pcm0p/sub0/hw_params
closed
b) In dma_op_mode ELEMENT, I get Underflow reported almost every second in
this newer kernel (again, using 10ms period size). Channel swap only seems
to occur every few hours, however, and I do not believe I have observed the
overflow at all in ELEMENT mode. So, the swap occurs due to underflow, maybe?
Underflow in McBSP or Underflow by ALSA?
Underflow in McBSP RX tells that DMA is trying to read data when the
FIFO is
empty. This can only happen if something else is reading data also
from the
McBSP since we configure the McBSP and sDMA in sync.
The underflow I observe is a McBSP one. I am not receiving any from
ALSA. Here
is some dmesg output (dma_op_mode == element, alsa hw_params same as
above):
[ 831.852416] omap-mcbsp 49022000.mcbsp: RX Buffer Underflow!
[ 835.450286] omap-mcbsp 49022000.mcbsp: RX Buffer Underflow!
[ 837.221862] omap-mcbsp 49022000.mcbsp: RX Buffer Underflow!
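To compare underflow rates between element and threshold mode, the gaps between these messages can be pulled out of a saved dmesg log. The following is a minimal Python sketch, assuming the log lines have exactly the format shown above; the log text is copied from this email and the variable names are my own.

```python
import re

log = """\
[ 831.852416] omap-mcbsp 49022000.mcbsp: RX Buffer Underflow!
[ 835.450286] omap-mcbsp 49022000.mcbsp: RX Buffer Underflow!
[ 837.221862] omap-mcbsp 49022000.mcbsp: RX Buffer Underflow!
"""

# Capture the kernel timestamp of every RX underflow line.
pattern = re.compile(r"^\[\s*(\d+\.\d+)\]\s.*RX Buffer Underflow!", re.M)
stamps = [float(t) for t in pattern.findall(log)]

# Seconds between consecutive underflow reports.
gaps = [round(b - a, 6) for a, b in zip(stamps, stamps[1:])]
print(gaps)
```

Running the same script over a longer capture in each dma_op_mode would give a like-for-like number instead of an impression from scrolling dmesg.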
I enabled these underflow messages in omap_mcbsp_config in sound/soc/omap/mcbsp.c for curiosity's sake, but as you can see...they
occur
quite often. I am not sure what else might be accessing the DRR: it
almost
looks like the 'McBSP2.MCBSPLP_RQSTATUS_REG[3] RRDY' interrupt might be
fired
for DMA transfer too often to generate this message? The spruf98p
manual says
on page 2979: 'happens only if the MPU/IVA2.2 subsystem or sDMA
controller
does not respect the DMA length, does not wait for DMA request, or does
not
check the buffer status before reading data.'
I am not sure how to fix it, but using threshold mode for dma_op_mode definitely gives far less frequent underflows.
I can think of two causes:
1. sDMA for some reason issues an extra read; there might be an erratum for
this? I can try to check the sDMA errata.
2. McBSP issues an extra DMA request out of the blue and that forces sDMA to
read? I have not heard of anything like that, but I can check the errata for
McBSP. It could be that the threshold handling is having some issues in McBSP?
I did not see any errata specific to this issue, but perhaps I misinterpret the errata document here: http://www.ti.com/lit/er/sprz278f/sprz278f.pdf
Overflow in McBSP happens when the DMA fails to read the data out of the
McBSP FIFO in time (the FIFO is full and McBSP discards the incoming data).
ALSA underflow is different: it happens when the application fails to
process the period in time and the DMA starts to rewrite the buffer. This
happens when you have one thread to capture/process/play. If the
process/play step takes more time than it takes to go round the buffer (the
buffer time), we have an ALSA underflow. You can try to increase the number
of periods or separate the capture and process/play jobs into separate threads.
My plan from now is to get my JTAG up and running and try and make
the change
"If you are working with 16-bit stereo data a nice solution is to
configure
the McBSP for a single 32-bit element instead of 2 16-bit elements. "
The McBSP driver does not have support for this ATM. There is a side effect
of this AFAIK: channels will be swapped, but I have not tested it. Using a
32-bit element instead of 2x16-bit certainly helps when you do not have a
FIFO, since it gives more time for the DMA. Also, in case of
underrun/overrun you will be losing both channels' data, so the swap would
not happen.
This sounds like a fine solution for me in the interim: a permanent
channel
swap can be dealt with after data is received from ALSA.
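To make the expected swap concrete, here is a small Python sketch of what a single 32-bit element could look like to an application that still reads the buffer as interleaved S16_LE stereo. It rests on two assumptions that have not been verified on hardware: the first slot on the wire (left) lands in the upper 16 bits of the element, and the element is stored in memory as a little-endian 32-bit word. The helper names are hypothetical.

```python
import struct

def as_32bit_element(left, right):
    """Assumption: the first slot on the wire (left) fills the upper 16 bits."""
    return ((left & 0xFFFF) << 16) | (right & 0xFFFF)

def store_little_endian(words):
    """Store each 32-bit element in memory as a little-endian word."""
    return b"".join(struct.pack("<I", w) for w in words)

def read_s16_le_frames(buf):
    """Read the buffer the way an S16_LE stereo application would."""
    return [struct.unpack_from("<hh", buf, off) for off in range(0, len(buf), 4)]

frames = [(100, -100), (200, -200)]          # (left, right) as sent
buf = store_little_endian(as_32bit_element(l, r) for l, r in frames)

seen = read_s16_le_frames(buf)               # application sees (right, left)
fixed = [(b, a) for a, b in seen]            # undo the permanent swap
print(seen, fixed)
```

Under these assumptions the swap is constant for the whole stream, so exchanging the two samples of each frame after reading from ALSA restores the original order.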
I tried this (not too hard) but could not make it work for playback so
capture
could be broken as well with this hack.
You mean by this you tested DSP_A format for playback? I did not get around to trying for capture yet, but if it is broken for playback, that is not a good sign.
A couple of interesting details (I'm running my tests on top of linux-next):
In element mode the underruns come at a steady pace, but if I run dmesg over
the serial console (which also uses DMA nowadays) the audio underruns
increase. In threshold mode we have fewer underruns, but again dmesg on
serial will generate lots of underruns :o
Out of curiosity I have checked how things are when McBSP is master on the bus. I see no underruns at all. Even if I do the dmesg on serial. This is something I don't really understand atm. For reference I have attached my local patch on top of linux-next for the omap-twl4030 machine driver to create the PCM for McBSP master
configuration.
If I:
arecord -Dplughw:0,0 -f dat --period-size=480 --buffer-size=960 > /dev/null
I see underruns
but:
arecord -Dplughw:0,1 -f dat --period-size=480 --buffer-size=960 > /dev/null
I see no underruns
:0,0 is the McBSP slave, :0,1 is when McBSP is master.
Can you look at the patch and see if you could get the McBSP master mode working on your setup and test it?
I will see about implementing the patch on top of the kernel I am using (3.18) today. Thanks for providing it! Who knows, maybe by treating McBSP as master the problem of channel swapping is resolved as well as the underflow.
-- Péter