Hei Péter,
Thanks for the quick reply!
On Thu, Aug 11, 2016 at 7:14 PM, Peter Ujfalusi <peter.ujfalusi@ti.com> wrote:
Hi Akram,
On 08/11/16 04:07, Akram Hameed wrote:
Hei Peter,
Sorry to bother you directly, but I got no response to my questions on the alsa-devel list and since you seem to be a maintainer of the omap mcbsp and pcm code, I thought you would:
Sorry, it looks like I have missed your mail in alsa-devel. Adding it to CC and also looping in Jarkko.
a) Want to be aware of my issues
b) Might help point me in a direction to fix things.
In short: I am capturing a stereo audio stream, 16-bit, 48kHz using a Gumstix Overo that is quite similar to the older BeagleBoards (we are using two variants - one is OMAP3530 based and the other, DM3730). Periodically, left channel audio will move to right channel and vice versa over time.
Previously my company was using a branch from kernel 3.0 (made by Steve Sakoman) and our audio capture was fine - no swapping, ever. Due to instabilities with the old mmc interface (unpredictable failures using newer UHS-3 cards), we have been forced to migrate to a newer kernel, and here I encounter the channel swap issue.
To me, it seems the problem is like what is described here:
http://processors.wiki.ti.com/index.php/McBSP_Channel_Swapping
Which I presume draws on a discussion you had years ago with one Ying:
http://mailman.alsa-project.org/pipermail/alsa-devel/2011-February/036895.html
After spending quite some time reading and investigating the various registers, I come to the following conclusions:
- Audio codec (TWL) appears to be configured the same way between working (3.0) and non-working (3.18) kernels, except that voice mode is also active and some gain settings are different (different defaults, I suppose). I have a comparison of these register maps if you want to see.
- McBSP2 is also configured the same between old and new kernels, though I note for some reason the ID pulled by dev_dbg looks spurious, though I guess unrelated to my issue.
Configuring McBSP255 phys_base: 0x49022000
**** McBSP255 regs ****
DRR2: 0xf88a
DRR1: 0x0000
DXR2: 0x0000
DXR1: 0x0000
SPCR2: 0x0230
SPCR1: 0x0031
RCR2: 0x8041
RCR1: 0x0040
XCR2: 0x8041
XCR1: 0x0040
SRGR2: 0x001f
SRGR1: 0x0f00
PCR0: 0x000f
We must drop the DRR1/2 (and probably DXR1/2) dump... By reading the DRR register with the debug code we introduce underflow. The data must be moved from the DATA registers only by the DMA. If we have a low threshold it is more likely to cause a channel swap, and it is for sure going to introduce missing data - the data is going to the kernel log instead of the receive buffer.
For what it is worth, I had to enable that register dump with dynamic_debug/control, so I guess it is not being used in normal operation.
- My channel swapping occurs when using dma_op_mode element OR threshold mode in kernel 3.18+ (I have only tested on 3.18 and 4.4, for the record). Setting a threshold of 960 with period size 10ms reduces the frequency of the swap.
I did notice that DMA CCR is different between kernel 3 and 3.18 when operating in ELEMENT mode.
Version 3 kernel:
DMA4:Chan5 @ 48056200:CCR=/dev/mem opened. Memory mapped at address 0x4013b000. Read at address 0x48056200 (0x4013b200): 0x01084482
Version 3.18 kernel:
DMA4:Chan5 @ 48056200:CCR=/dev/mem opened. Memory mapped at address 0xb6fa1000. Read at address 0x48056200 (0xb6fa1200): 0x010C44A2 <- McBSP2 RX request.
Interestingly, it appears the new kernel (3.18+) is using packet mode even when 'element' is selected. If I choose dma_op_mode THRESHOLD in the older kernel 3.0, I see the same 0x010C44A2 in the CCR as I do in 3.18+.
Yes, I have made a change so that in element mode we still use packet mode to transfer one sample per DMA request. Real element mode will be used only in case of mono audio. If the audio is 2+ channels we are using packet mode to transfer the sample. This helped to avoid swaps as we send/receive one sample at a time from/to McBSP.
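The two CCR readings quoted above can be compared bit by bit. This is only a sketch: the XOR is plain arithmetic, but the bit names (FS at bit 5, BS at bit 18) and the mapping of their combinations to sync modes are an assumption from my reading of the OMAP3 sDMA register layout and should be checked against the TRM.

```python
# Compare the two DMA4 CCR values from the devmem readings in this thread.
# ASSUMPTION: bit 5 = FS and bit 18 = BS, and FS/BS together select the
# synchronization mode. Verify against the TRM before relying on this.
FS_BIT = 5
BS_BIT = 18

SYNC_MODES = {
    (False, False): "element",
    (True, False): "frame",
    (False, True): "block",
    (True, True): "packet",
}


def changed_bits(a, b):
    """Return the bit positions that differ between two 32-bit values."""
    diff = a ^ b
    return [i for i in range(32) if (diff >> i) & 1]


def sync_mode(ccr):
    """Name the sync mode implied by the FS and BS bits (assumed layout)."""
    fs = bool((ccr >> FS_BIT) & 1)
    bs = bool((ccr >> BS_BIT) & 1)
    return SYNC_MODES[(fs, bs)]


if __name__ == "__main__":
    ccr_3_0 = 0x01084482   # kernel 3.0, dma_op_mode element
    ccr_3_18 = 0x010C44A2  # kernel 3.18, dma_op_mode element
    print(changed_bits(ccr_3_0, ccr_3_18))
    print(sync_mode(ccr_3_0), sync_mode(ccr_3_18))
```

Under that assumed layout, the only difference between the two readings is the pair of bits that switches element sync to packet sync, which matches the explanation above.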
I have not observed playback channel swapping. I am unsure why.
I enabled underflow and overflow reporting in the McBSP IRQ handler (kernel 3.18+ since older 3 kernel did not have such a handler and I have not patched one in yet), and found several interesting things.
I can not recall the history, but between 3.0 and 3.18 we might have moved to dmaengine based PCM for OMAPs. But I still have no recollection of experiencing a capture-side channel swap.
I have found that the pcm was indeed moved to dmaengine around 3.7. I hazard a guess my problem would not be present prior to that, but I have no evidence. The hardware manufacturers we use have a kernel patched for their specific use of OMAP3530/DM3730 at version 3.5 that I can try, to verify whether capture had issues at that time.
a) Swap definitely is happening due to Overflow, but so far I only observe an overflow when using dma_op_mode THRESHOLD. When using dma_op_mode THRESHOLD, the swap does not seem to coincide with an underflow, so I am at a loss to explain further.
How does your application work? What are the parameters ALSA is configured with for capture/playback (cat /proc/asound/card0/pcm0p/sub0/hw_params; cat /proc/asound/card0/pcm0c/sub0/hw_params) - number of periods, etc?
Our application accesses ALSA via a C++ wrapper (RtAudio: https://www.music.mcgill.ca/~gary/rtaudio/). We capture audio only (48kHz, 2ch), and do not play back during these capture sessions. Capture is done in a background thread and double buffering is used at the application level so the 'audio ready' callback does not block longer than strictly necessary to service incoming data. As mentioned in a previous email, this approach has worked flawlessly before in kernel 3.0.
ALSA buffer size is not particularly large; perhaps it might help for me to increase the number of ALSA buffers? Here are the hw_params:
cat /proc/asound/card0/pcm0c/sub0/hw_params
access: RW_INTERLEAVED
format: S16_LE
subformat: STD
channels: 2
rate: 48000 (48000/1)
period_size: 480
buffer_size: 960
And of course, playback is not operating:
cat /proc/asound/card0/pcm0p/sub0/hw_params
closed
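For reference, the capture hw_params above work out to two periods of 10 ms each, so the application has one period of slack before the DMA wraps around. A small sketch of that arithmetic (the parser and the `timing` helper are illustrative names, not part of ALSA):

```python
# Derive period/buffer timing from the hw_params text quoted in this thread.
HW_PARAMS = """\
access: RW_INTERLEAVED
format: S16_LE
subformat: STD
channels: 2
rate: 48000 (48000/1)
period_size: 480
buffer_size: 960
"""


def timing(text):
    """Return period count and durations (ms) from hw_params-style text."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    rate = int(fields["rate"].split()[0])   # "48000 (48000/1)" -> 48000
    period = int(fields["period_size"])     # frames per period
    buffer = int(fields["buffer_size"])     # frames per buffer
    return {
        "periods": buffer // period,
        "period_ms": 1000 * period / rate,
        "buffer_ms": 1000 * buffer / rate,
    }


if __name__ == "__main__":
    print(timing(HW_PARAMS))
```

With only two periods, any capture callback that takes longer than about 10 ms risks an ALSA-level overrun, which is why raising the period count is worth trying.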
b) In dma_op_mode ELEMENT, I get Underflow reported almost every second in this newer kernel (again, using 10ms period size). Channel swap only seems to occur every few hours, however, and I do not believe I have observed the overflow at all in ELEMENT mode. So, the swap occurs due to underflow, maybe?
Underflow in McBSP or Underflow by ALSA?
Underflow in McBSP RX means that the DMA is trying to read data when the FIFO is empty. This can only happen if something else is also reading data from the McBSP, since we configure the McBSP and sDMA in sync.
The underflow I observe is a McBSP one. I am not receiving any from ALSA. Here is some dmesg output (dma_op_mode == element, alsa hw_params same as above):
[ 831.852416] omap-mcbsp 49022000.mcbsp: RX Buffer Underflow!
[ 835.450286] omap-mcbsp 49022000.mcbsp: RX Buffer Underflow!
[ 837.221862] omap-mcbsp 49022000.mcbsp: RX Buffer Underflow!
I enabled these underflow messages in omap_mcbsp_config in sound/soc/omap/mcbsp.c for curiosity's sake, but as you can see...they occur quite often. I am not sure what else might be accessing the DRR: it almost looks like the 'McBSP2.MCBSPLP_RQSTATUS_REG[3] RRDY' interrupt might be fired for DMA transfer too often to generate this message? The spruf98p manual says on page 2979: 'happens only if the MPU/IVA2.2 subsystem or sDMA controller does not respect the DMA length, does not wait for DMA request, or does not check the buffer status before reading data.'
I am not sure how to fix it, but using threshold mode in dma_op_mode definitely gives me far less frequent underflow.
Overflow in McBSP happens when the DMA fails to read the data out of the McBSP FIFO in time (the FIFO is full and McBSP discards the incoming data).
ALSA underflow is different: it happens when the application fails to process the period in time and the DMA starts to overwrite the buffer. This happens when you have one thread to capture/process/play. If the process/play takes more time than it takes to go round the buffer (buffer time), we have ALSA underflow. You can try to increase the number of periods or separate the capture and process/play jobs into separate threads.
My plan from now is to get my JTAG up and running and try and make the change "If you are working with 16-bit stereo data a nice solution is to configure the McBSP for a single 32-bit element instead of 2 16-bit elements."
The McBSP driver does not have support for this ATM. There is a side effect of this AFAIK: channels will be swapped, but I have not tested it. Using a 32bit element instead of 2x16bit certainly helps when you do not have a FIFO, as it gives more time for the DMA. Also, in case of underrun/overrun you will be losing both channels' data, so the swap would not happen.
This sounds like a fine solution for me in the interim: a permanent channel swap can be dealt with after data is received from ALSA.
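If the 32-bit element approach does leave left and right permanently exchanged (untested, as noted above), undoing it on the application side is cheap. A minimal sketch, assuming interleaved S16_LE stereo as in the hw_params earlier in this thread; `swap_stereo_s16` is an illustrative name, not an ALSA or RtAudio function:

```python
import struct


def swap_stereo_s16(buf):
    """Exchange left/right in interleaved 16-bit stereo (4 bytes per frame)."""
    if len(buf) % 4:
        raise ValueError("buffer must hold whole 4-byte stereo frames")
    out = bytearray(len(buf))
    # Each frame is [L_lo, L_hi, R_lo, R_hi]; move R into L's slot and back.
    out[0::4] = buf[2::4]
    out[1::4] = buf[3::4]
    out[2::4] = buf[0::4]
    out[3::4] = buf[1::4]
    return bytes(out)


if __name__ == "__main__":
    # Two frames: (L=1, R=-2) and (L=3, R=-4), little-endian signed 16-bit.
    frames = struct.pack("<4h", 1, -2, 3, -4)
    print(struct.unpack("<4h", swap_stereo_s16(frames)))
```

Because the exchange is fixed rather than intermittent, it can be applied unconditionally to every captured buffer.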
If you have any helpful suggestions, I would really appreciate them. I am operating under the assumption I can leave the TWL codec settings as they are and just 'lie' to McBSP about the data framing. Namely, specify single phase frame with 1 word of 32 bits, and set DMA packet size to a 32bit word also. Since 16 bit mode operates by default in dual phase, I am not sure if the signalling will work correctly, but I guess I will find out.
Yeah, this is also going to be a problem. As twl4030 is using I2S when in stereo mode, with fake mono 32bit you can not use I2S signals. But since McBSP is looking for the start condition, configuring it for DSP_A mode will probably start the reception/playback at the correct time.
I will look into the DSP_A mode as suggested.
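To make the framing discussion concrete, the RCR1/RCR2 values from the register dump earlier in this thread can be decoded into phases and word lengths. This is a sketch only: the field positions and the word-length encoding are an assumption based on my reading of the McBSP register definitions, and the "32-bit single phase" value at the end is hypothetical, not something the current driver programs.

```python
# Decode McBSP receive control registers into frame layout.
# ASSUMED field layout: RCR2 bit 15 = RPHASE (1 = dual phase),
# bits 14:8 = frame length - 1, bits 7:5 = word length code,
# RCR2 bits 1:0 = data delay. Verify against the TRM.
WORD_BITS = {0: 8, 1: 12, 2: 16, 3: 20, 4: 24, 5: 32}


def decode_rcr(rcr1, rcr2):
    """Return the receive frame layout implied by RCR1/RCR2 (assumed layout)."""
    return {
        "dual_phase": bool((rcr2 >> 15) & 1),
        "phase1_words": ((rcr1 >> 8) & 0x7F) + 1,
        "phase1_bits": WORD_BITS[(rcr1 >> 5) & 0x7],
        "phase2_words": ((rcr2 >> 8) & 0x7F) + 1,
        "phase2_bits": WORD_BITS[(rcr2 >> 5) & 0x7],
        "data_delay": rcr2 & 0x3,
    }


if __name__ == "__main__":
    # Values from the register dump: two phases of one 16-bit word each.
    print(decode_rcr(0x0040, 0x8041))
    # HYPOTHETICAL single-phase, one 32-bit word setting, for comparison only.
    print(decode_rcr(0x00A0, 0x0001))
```

Under those assumptions the dumped values describe the dual-phase, 16-bit-per-phase I2S framing, and the hypothetical pair describes the single 32-bit word that the proposed change would need.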
Again, my apologies for contacting you directly. Hopefully the information I have provided can help diagnose the real problem in the kernel.
No problem. Sorry that I missed your mail in alsa-devel.
For me, I must find a solution in the near-term so I will try to figure out some hacks as described in the Wiki article.
The wiki page was written for DaVinci devices AFAIK, where they don't have a FIFO for their McBSP.
-- Péter
Akram