Re: [alsa-devel] [PATCH 0/5] ASoC: fsl_ssi: Fixing various channel slips and bad samples insertions

13 Jan 2016

On Wed, Jan 13, 2016 at 12:20 PM, Caleb Crome caleb@crome.org wrote:
...
On Wed, Jan 13, 2016 at 6:45 AM, arnaud.mouiche@invoxia.com
arnaud.mouiche@invoxia.com wrote:
...
Le 12/01/2016 00:44, Caleb Crome a écrit :
...
On Sat, Jan 9, 2016 at 3:02 AM, arnaud.mouiche@invoxia.com
arnaud.mouiche@invoxia.com wrote:
...
Hello Caleb
Le 09/01/2016 01:47, Caleb Crome a écrit :
...
[...]
Hello Arnaud,
    I have finally gotten to test your patches, and I'm still having
trouble with channel slips.
I applied your v2 patch set, along with your changes for using a dummy
codec.
The full changes are here:
https://github.com/ccrome/linux-caleb-dev/tree/v4.4-rc8-armv7-x3
This ignores most of my previous patches, and uses your code to bring
up the SSI (without a codec) on a wandboard.
I am using SSI3, and doing a hardware loopback between TX and RX.
Here's what I run:
./atest -r 16000 -c 8 -p 2048   -D default play
which plays continuously.
and in another shell:
./atest -r 16000 -c 8 -p 2048  -D default -d 10 capture
which captures for 10 seconds.
The first time I run the capture command, it succeeds, no problem.
...
dbg: dev: 'default'
dbg: default: capture_start
dbg: start a 10 seconds duration timer
warn: First valid frame
warn:   3400 3401 3402 3403 3404 3405 3406 3407
dbg: end of tests
total number of sequence errors: 0
global tests exit status: OK
But the second and all subsequent captures, it fails with channel slips:
...
dbg: dev: 'default'
dbg: default: capture_start
dbg: start a 10 seconds duration timer
err: invalid frame after 0 null frames
err:   d2a1 d2a2 d2a3 d2a4 d2a5 d2a6 d2a7 d2c0
err:   d2c1 d2c2 d2c3 d2c4 d2c5 d2c6 d2c7 78e0
dbg: end of tests
total number of sequence errors: 430080
global tests exit status: OK
Can you use -I option to get a little more log of the error
$ ./atest -r 16000 -c 8 -p 2048  -D default -d 10 -I 10 capture
Just to know if the wrong frames comes from the previous record session,
meaning the RX fifo was not cleared correctly,
as if the CCSR_SSI_SOR_RX_CLR bit is not working as we might expect.
(ie. d2a1 .. d2c7 may come from the previous recording, and 78e0 .. may
be
the start of the new recording session)
That does appear to be what's happening.  I read the SFCSR before and
after you update the SOR register (SOR_RX_CLR), and in both cases the
rx buffer is full on the second enablement.
I can't seem to empty that buffer, either by using the SOR_RX_CLR or
by reading the SRX0 register. This is another one of those cases where
there you cannot modify the register (i.e. fifo or RX0) when RE is
disabled.
So, instead of clearing on enable, I'm clearing on shutdown and on
enable.  I attempted using the SOR_RX_CLR, but it doesn't seem to work
on the MX6.  Rather, I clear the fifo by reading all the words in a
loop, just before RE is disabled.
so the imx6.sl SSI behaves differently from the imx6.solo SSI ... Or some
things have changes between 4.3 and 4.4.
May be SOR_RX_CLR and/or RX0 registers can't be read on 'solo' because de
the SRCR.RFEN0 [or1] are not enabled yet.
In fact, clearing the fifos at the end may still introduce a race since the
RX path is still enabled when your read from fifo, and
Samples may still be received in the meantime...
You are correct about the race condition.  It seems like a catch 22:
you can't empty the fifo if SCR.RE is disabled, and if the SCR.RE is
enabled, new samples might be received at any time.  in fact, when
disabling RE, it does keep receiving until the end of the frame.
...
Also, I see that the CCSR_SSI_SOR register is not in the regmap
fsl_ssi_volatile_reg() list (line 140). May be there is an implication of
some caching mechanism ?
Ah!  Yes.  When I add the SOR to the volatile list, then writing the
SOR does clear the FIFO.  That's one for the patch for sure :-)
Now at least RX get's cleared on RX start.
However, now I've got another issue:
FYI I'm running at 16 channels/16-bits/frame, so 256 bits/frame, with
the codec as master.  (codec 0 anyway).
I'm running atest play in one shell, and atest capture in another shell.
At low sample rates, it works quite reliably (a few hundred simulated
xruns with no problem), but TX slips at higher sample rates.
I'm running
    shell a # ./atest -r SAMPLERATE -c 16 -p 1024   -D default play
    shell b # ./atest -r SAMPLERATE -c 16 -p 1024   -D default -d 100
capture  -x200,100
8kHz:  No problems after 100 seconds (~300 capture xruns)
16kHz:  No problems
32kHz: tx slips after a few RX xruns (12) and never recovers.
48kHz:  tx starts properly (usually), but slips immediately upon RX start.
FYI, I'm watching TX on a scope, and can verify that it is in fact the
TX slipping.   I caught it once, and found that it was a duplicated
sample in the TX stream, which seems to say that the DMA is not
keeping up maybe?  Perhaps tweaking with watermarks will help that, or
perhaps dual-fifo will help.  Will try a couple things and get back to
you.
In the meantime, any idea how to check to see what speed the SSI
peripheral clock is running?  I just want to make sure it's going fast
enough.
here's a log of the failure at 32kHz:
<first verified on the scope that play is running without slips>
... 20 or 30 valid xruns...
ALSA lib pcm.c:7843:(snd_pcm_recover) overrun occurred
warn: First valid frame
warn:   49a0 49a1 49a2 49a3 49a4 49a5 49a6 49a7 49a8 49a9 49aa 49ab
49ac 49ad 49ae 49af
warn: default: force capture xrun
warn: default: CT_W4_XRUN_END
warn: default: capture read failed: Broken pipe
ALSA lib pcm.c:7843:(snd_pcm_recover) overrun occurred
warn: First valid frame
warn:   3a40 3a41 3a42 3a43 3a44 3a45 3a46 3a47 3a48 3a49 3a4a 3a4b
3a4c 3a4d 3a4e 3a4f
warn: default: force capture xrun
warn: default: CT_W4_XRUN_END
warn: default: capture read failed: Broken pipe
ALSA lib pcm.c:7843:(snd_pcm_recover) overrun occurred
warn: First valid frame
warn:   3240 3241 3242 3243 3244 3245 3246 3247 3248 3249 324a 324b
324c 324d 324e 324f
warn: first invalid frame while expecting frame 0x0617
warn:   c2e0 c2e1 c2e2 c2e3 c2e4 c2e5 c2e6 c2e6 c2e7 c2e8 c2e9 c2ea
c2eb c2ec c2ed c2ee
warn:   c2ef c300 c301 c302 c303 c304 c305 c306 c307 c308 c309 c30a
c30b c30c c30d c30e
warn:   c30f c320 c321 c322 c323 c324 c325 c326 c327 c328 c329 c32a
c32b c32c c32d c32e
warn:   c32f c340 c341 c342 c343 c344 c345 c346 c347 c348 c349 c34a
c34b c34c c34d c34e
err:   c34f c360 c361 c362 c363 c364 c365 c366 c367 c368 c369 c36a
c36b c36c c36d c36e
err:   c36f c380 c381 c382 c383 c384 c385 c386 c387 c388 c389 c38a
c38b c38c c38d c38e
err:   c38f c3a0 c3a1 c3a2 c3a3 c3a4 c3a5 c3a6 c3a7 c3a8 c3a9 c3aa
c3ab c3ac c3ad c3ae
err:   c3af c3c0 c3c1 c3c2 c3c3 c3c4 c3c5 c3c6 c3c7 c3c8 c3c9 c3ca
c3cb c3cc c3cd c3ce
err:   c3cf c3e0 c3e1 c3e2 c3e3 c3e4 c3e5 c3e6 c3e7 c3e8 c3e9 c3ea
c3eb c3ec c3ed c3ee
err:   c3ef c400 c401 c402 c403 c404 c405 c406 c407 c408 c409 c40a
c40b c40c c40d c40e
err:   c40f c420 c421 c422 c423 c424 c425 c426 c427 c428 c429 c42a
c42b c42c c42d c42e
warn: default: force capture xrun
warn: default: CT_W4_XRUN_END
As you can see, this is caused by duplicated c2e6, which I guess is a
DMA underrun?
SUCCESS!  So far...
With your patches (including the latest SOR register update), plus
setting the watermark & DMA MAXBURST to 8, I don't get any more errors
at 48kHz ...  yet.
Even in single fifo mode, 48kHz, 16-bits+16channels seems to work now.
I'll keep you updated on if this really solves all the issues.
Here's the last patch for updating the watermark.
commit b634014b831b9527df319b404ac50e54a3790742
Author: Caleb Crome caleb@crome.org
Date:   Wed Jan 13 13:12:37 2016 -0800
ASoC: fsl_ssi: Increase watermark and maxburst to allow SSI to work
    without slips at high data rates.
The DMA cannot keep up with the SSI consumpation with the watermark
    set to be fifo_depth-2 when running at 48kHz/16-bits/16-channels
    (12288 words/second).  By increasing the watermark to 8, the DMA can
    keep up with the SSI.
Signed-off-by: Caleb Crome caleb@crome.org

diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
index 5cfc540..026df79 100644
--- a/sound/soc/fsl/fsl_ssi.c
+++ b/sound/soc/fsl/fsl_ssi.c
@@ -221,6 +221,12 @@ struct fsl_ssi_soc_data {
  * @dbg_stats: Debugging statistics
  *
  * @soc: SoC specific data
+ *
+ * @fifo_watermark: the FIFO watermark setting.  Notifies DMA when
+ *             there are @fifo_watermark or fewer words in TX fifo or
+ *             @fifo_watermark or more empty words in RX fifo.
+ * @dma_maxburst: max number of words to transfer in one go.  So far,
+ *             this is always the same as fifo_watermark.
  */
 struct fsl_ssi_private {
     struct regmap *regs;
@@ -259,6 +265,9 @@ struct fsl_ssi_private {
const struct fsl_ssi_soc_data *soc;
     struct device *dev;
+
+    u32 fifo_watermark;
+    u32 dma_maxburst;
 };
/*
@@ -1037,21 +1046,7 @@ static int _fsl_ssi_set_dai_fmt(struct device *dev,
     regmap_write(regs, CCSR_SSI_SRCR, srcr);
     regmap_write(regs, CCSR_SSI_SCR, scr);
-    /*
-     * Set the watermark for transmit FIFI 0 and receive FIFO 0. We don't
-     * use FIFO 1. We program the transmit water to signal a DMA transfer
-     * if there are only two (or fewer) elements left in the FIFO. Two
-     * elements equals one frame (left channel, right channel). This value,
-     * however, depends on the depth of the transmit buffer.
-     *
-     * We set the watermark on the same level as the DMA burstsize.  For
-     * fiq it is probably better to use the biggest possible watermark
-     * size.
-     */
-    if (ssi_private->use_dma)
-        wm = ssi_private->fifo_depth - 2;
-    else
-        wm = ssi_private->fifo_depth;
+    wm = ssi_private->watermark;
regmap_write(regs, CCSR_SSI_SFCSR,
             CCSR_SSI_SFCSR_TFWM0(wm) | CCSR_SSI_SFCSR_RFWM0(wm) |
@@ -1359,12 +1354,8 @@ static int fsl_ssi_imx_probe(struct
platform_device *pdev,
         dev_dbg(&pdev->dev, "could not get baud clock: %ld\n",
              PTR_ERR(ssi_private->baudclk));
-    /*
-     * We have burstsize be "fifo_depth - 2" to match the SSI
-     * watermark setting in fsl_ssi_startup().
-     */
-    ssi_private->dma_params_tx.maxburst = ssi_private->fifo_depth - 2;
-    ssi_private->dma_params_rx.maxburst = ssi_private->fifo_depth - 2;
+    ssi_private->dma_params_tx.maxburst = ssi_private->dma_maxburst;
+    ssi_private->dma_params_rx.maxburst = ssi_private->dma_maxburst;
     ssi_private->dma_params_tx.addr = ssi_private->ssi_phys + CCSR_SSI_STX0;
     ssi_private->dma_params_rx.addr = ssi_private->ssi_phys + CCSR_SSI_SRX0;
@@ -1518,6 +1509,42 @@ static int fsl_ssi_probe(struct platform_device *pdev)
                 /* Older 8610 DTs didn't have the fifo-depth property */
         ssi_private->fifo_depth = 8;
+    /*
+     * Set the watermark for transmit FIFI 0 and receive FIFO 0. We don't
+     * use FIFO 1. We program the transmit water to signal a DMA transfer
+     * if there are only two (or fewer) elements left in the FIFO. Two
+     * elements equals one frame (left channel, right channel). This value,
+     * however, depends on the depth of the transmit buffer.
+     *
+     * We set the watermark on the same level as the DMA burstsize.  For
+     * fiq it is probably better to use the biggest possible watermark
+     * size.
+     */
+    switch (ssi_private->fifo_depth) {
+    case 15:
+        /*
+         * 2 samples is not enough when running at high data
+         * rates (like 48kHz @ 16 bits/channel, 16 channels)
+         * 8 seems to split things evenly and leave enough time
+         * for the DMA to fill the FIFO before it's over/under
+         * run.
+         */
+        ssi_private->fifo_watermark = 8;
+        ssi_private->dma_maxburst = 8;
+    case 8:
+    default:
+        /*
+         * maintain old behavior for older chips.
+         * Keeping it the same because I don't have an older
+         * board to test with.
+         * I suspect this could be changed to be something to
+         * leave some more space in the fifo.
+         */
+        ssi_private->fifo_watermark = ssi_private->fifo_depth - 2;
+        ssi_private->dma_maxburst = ssi_private->fifo_depth - 2;
+        break;
+    }
+
     dev_set_drvdata(&pdev->dev, ssi_private);
if (ssi_private->soc->imx) {