[alsa-devel] [PATCH 0/6] Fixups to MPC5200 ASoC drivers
Hi everyone,
Jon, as we talked about earlier today, I've spent some time working on the MPC5200 AC97 driver to try and get rid of some of the nagging bugs. The two big changes that I ended up making were to remove the tracking of appl_ptr (and all the race conditions that went with it), and to fix the handling of AC97 slot enables/disables so that a stream can be stopped and started again without going through the hw_params() step.
I know that you had the appl_ptr tracking in to try and solve the problem of audible noise at the end of playback. After discussing the data flow model on #alsa-soc this morning, I learned that the driver is really supposed to be free-running, and the ALSA layer is supposed to be able to keep up. It is also responsible to ensure that buffer is filled with silence at the end of playback to eliminate noise before the trigger can properly turn everything off. Only a couple of other drivers use appl_ptr, so I'm pretty sure this driver shouldn't be doing it either.
Instead, from a recommendation this morning, I added a 'fudge' factor to the value reported by the .pointer() callback of 1/4 the period size. At first I thought this helped, but after more testing I find that the driver seems to work correctly with aplay both with and without the fudge factor applied.
However, I was able to reproduce the noise problem when using aplay if I have DEBUG #defined at the top of the mpc5200_dma.c file with debug console logs being spit out the serial port. In that situation, the STOP trigger calls printk(), and a stale sample can be heard at the end of playback. However, I believe this is a bug with the serial console driver (in that it disables IRQs for a long time) causing unbounded latencies, so the trigger doesn't get processed fast enough. It doesn't help that aplay doesn't flush the buffers with silence before triggering STOP. Other programs (like gstreamer) do seem to flush out stale data before stopping the stream. I'm sure someone will correct me if I'm making some bad assumptions here.
So, please test. Let me know if these work or not. I've don't know if the last patch (Add fudge factor...) is needed or not.
Cheers, g.
---
Grant Likely (6): ASoC/mpc5200: Add fudge factor to value reported by .pointer() ASoC/mpc5200: fix enable/disable of AC97 slots ASoC/mpc5200: add to_psc_dma_stream() helper ASoC/mpc5200: Improve printk debug output for trigger ASoC/mpc5200: get rid of the appl_ptr tracking nonsense ASoC/mpc5200: Track DMA position by period number instead of bytes
sound/soc/fsl/mpc5200_dma.c | 98 ++++++++++---------------------------- sound/soc/fsl/mpc5200_dma.h | 24 ++++++--- sound/soc/fsl/mpc5200_psc_ac97.c | 39 ++++++++------- 3 files changed, 63 insertions(+), 98 deletions(-)
All DMA blocks are lined up to period boundaries, but the DMA handling code tracks bytes instead. This patch reworks the code to track the period index into the DMA buffer instead of the physical address pointer. Doing so makes the code simpler and easier to understand.
Signed-off-by: Grant Likely grant.likely@secretlab.ca ---
sound/soc/fsl/mpc5200_dma.c | 28 +++++++++------------------- sound/soc/fsl/mpc5200_dma.h | 8 ++------ 2 files changed, 11 insertions(+), 25 deletions(-)
diff --git a/sound/soc/fsl/mpc5200_dma.c b/sound/soc/fsl/mpc5200_dma.c index 6096d22..986d3c8 100644 --- a/sound/soc/fsl/mpc5200_dma.c +++ b/sound/soc/fsl/mpc5200_dma.c @@ -58,13 +58,11 @@ static void psc_dma_bcom_enqueue_next_buffer(struct psc_dma_stream *s) /* Prepare and enqueue the next buffer descriptor */ bd = bcom_prepare_next_buffer(s->bcom_task); bd->status = s->period_bytes; - bd->data[0] = s->period_next_pt; + bd->data[0] = s->runtime->dma_addr + (s->period_next * s->period_bytes); bcom_submit_next_buffer(s->bcom_task, NULL);
/* Update for next period */ - s->period_next_pt += s->period_bytes; - if (s->period_next_pt >= s->period_end) - s->period_next_pt = s->period_start; + s->period_next = (s->period_next + 1) % s->runtime->periods; }
static void psc_dma_bcom_enqueue_tx(struct psc_dma_stream *s) @@ -79,7 +77,7 @@ static void psc_dma_bcom_enqueue_tx(struct psc_dma_stream *s) if (bcom_queue_full(s->bcom_task)) return;
- s->appl_ptr += s->period_size; + s->appl_ptr += s->runtime->period_size;
psc_dma_bcom_enqueue_next_buffer(s); } @@ -91,7 +89,7 @@ static void psc_dma_bcom_enqueue_tx(struct psc_dma_stream *s) if (bcom_queue_full(s->bcom_task)) return;
- s->appl_ptr += s->period_size; + s->appl_ptr += s->runtime->period_size;
psc_dma_bcom_enqueue_next_buffer(s); } @@ -108,9 +106,7 @@ static irqreturn_t psc_dma_bcom_irq_tx(int irq, void *_psc_dma_stream) while (bcom_buffer_done(s->bcom_task)) { bcom_retrieve_buffer(s->bcom_task, NULL, NULL);
- s->period_current_pt += s->period_bytes; - if (s->period_current_pt >= s->period_end) - s->period_current_pt = s->period_start; + s->period_current = (s->period_current+1) % s->runtime->periods; } psc_dma_bcom_enqueue_tx(s); spin_unlock(&s->psc_dma->lock); @@ -133,9 +129,7 @@ static irqreturn_t psc_dma_bcom_irq_rx(int irq, void *_psc_dma_stream) while (bcom_buffer_done(s->bcom_task)) { bcom_retrieve_buffer(s->bcom_task, NULL, NULL);
- s->period_current_pt += s->period_bytes; - if (s->period_current_pt >= s->period_end) - s->period_current_pt = s->period_start; + s->period_current = (s->period_current+1) % s->runtime->periods;
psc_dma_bcom_enqueue_next_buffer(s); } @@ -185,12 +179,8 @@ static int psc_dma_trigger(struct snd_pcm_substream *substream, int cmd) case SNDRV_PCM_TRIGGER_START: s->period_bytes = frames_to_bytes(runtime, runtime->period_size); - s->period_start = virt_to_phys(runtime->dma_area); - s->period_end = s->period_start + - (s->period_bytes * runtime->periods); - s->period_next_pt = s->period_start; - s->period_current_pt = s->period_start; - s->period_size = runtime->period_size; + s->period_next = 0; + s->period_current = 0; s->active = 1;
/* track appl_ptr so that we have a better chance of detecting @@ -343,7 +333,7 @@ psc_dma_pointer(struct snd_pcm_substream *substream) else s = &psc_dma->playback;
- count = s->period_current_pt - s->period_start; + count = s->period_current * s->period_bytes;
return bytes_to_frames(substream->runtime, count); } diff --git a/sound/soc/fsl/mpc5200_dma.h b/sound/soc/fsl/mpc5200_dma.h index 8d396bb..529f7a0 100644 --- a/sound/soc/fsl/mpc5200_dma.h +++ b/sound/soc/fsl/mpc5200_dma.h @@ -13,7 +13,6 @@ * @psc_dma: pointer back to parent psc_dma data structure * @bcom_task: bestcomm task structure * @irq: irq number for bestcomm task - * @period_start: physical address of start of DMA region * @period_end: physical address of end of DMA region * @period_next_pt: physical address of next DMA buffer to enqueue * @period_bytes: size of DMA period in bytes @@ -27,12 +26,9 @@ struct psc_dma_stream { struct bcom_task *bcom_task; int irq; struct snd_pcm_substream *stream; - dma_addr_t period_start; - dma_addr_t period_end; - dma_addr_t period_next_pt; - dma_addr_t period_current_pt; + int period_next; + int period_current; int period_bytes; - int period_size; };
/**
On Sat, 2009-11-07 at 01:33 -0700, Grant Likely wrote:
All DMA blocks are lined up to period boundaries, but the DMA handling code tracks bytes instead. This patch reworks the code to track the period index into the DMA buffer instead of the physical address pointer. Doing so makes the code simpler and easier to understand.
Signed-off-by: Grant Likely grant.likely@secretlab.ca
Very minor coding style thing below otherwise all get my Ack.
Acked-by: Liam Girdwood lrg@slimlogic.co.uk
sound/soc/fsl/mpc5200_dma.c | 28 +++++++++------------------- sound/soc/fsl/mpc5200_dma.h | 8 ++------ 2 files changed, 11 insertions(+), 25 deletions(-)
diff --git a/sound/soc/fsl/mpc5200_dma.c b/sound/soc/fsl/mpc5200_dma.c index 6096d22..986d3c8 100644 --- a/sound/soc/fsl/mpc5200_dma.c +++ b/sound/soc/fsl/mpc5200_dma.c @@ -58,13 +58,11 @@ static void psc_dma_bcom_enqueue_next_buffer(struct psc_dma_stream *s) /* Prepare and enqueue the next buffer descriptor */ bd = bcom_prepare_next_buffer(s->bcom_task); bd->status = s->period_bytes;
- bd->data[0] = s->period_next_pt;
bd->data[0] = s->runtime->dma_addr + (s->period_next * s->period_bytes); bcom_submit_next_buffer(s->bcom_task, NULL);
/* Update for next period */
- s->period_next_pt += s->period_bytes;
- if (s->period_next_pt >= s->period_end)
s->period_next_pt = s->period_start;
- s->period_next = (s->period_next + 1) % s->runtime->periods;
}
static void psc_dma_bcom_enqueue_tx(struct psc_dma_stream *s) @@ -79,7 +77,7 @@ static void psc_dma_bcom_enqueue_tx(struct psc_dma_stream *s) if (bcom_queue_full(s->bcom_task)) return;
s->appl_ptr += s->period_size;
s->appl_ptr += s->runtime->period_size; psc_dma_bcom_enqueue_next_buffer(s);
}
@@ -91,7 +89,7 @@ static void psc_dma_bcom_enqueue_tx(struct psc_dma_stream *s) if (bcom_queue_full(s->bcom_task)) return;
s->appl_ptr += s->period_size;
s->appl_ptr += s->runtime->period_size;
psc_dma_bcom_enqueue_next_buffer(s); }
@@ -108,9 +106,7 @@ static irqreturn_t psc_dma_bcom_irq_tx(int irq, void *_psc_dma_stream) while (bcom_buffer_done(s->bcom_task)) { bcom_retrieve_buffer(s->bcom_task, NULL, NULL);
s->period_current_pt += s->period_bytes;
if (s->period_current_pt >= s->period_end)
s->period_current_pt = s->period_start;
s->period_current = (s->period_current+1) % s->runtime->periods;
I prefer a space around operators.
s->period_current = (s->period_current + 1) % s->runtime->periods;
Liam
On Sat, Nov 7, 2009 at 3:35 AM, Liam Girdwood lrg@slimlogic.co.uk wrote:
On Sat, 2009-11-07 at 01:33 -0700, Grant Likely wrote:
All DMA blocks are lined up to period boundaries, but the DMA handling code tracks bytes instead. This patch reworks the code to track the period index into the DMA buffer instead of the physical address pointer. Doing so makes the code simpler and easier to understand.
Signed-off-by: Grant Likely grant.likely@secretlab.ca
Very minor coding style thing below otherwise all get my Ack.
Acked-by: Liam Girdwood lrg@slimlogic.co.uk
Thanks Liam.
- s->period_current_pt += s->period_bytes;
- if (s->period_current_pt >= s->period_end)
- s->period_current_pt = s->period_start;
- s->period_current = (s->period_current+1) % s->runtime->periods;
I prefer a space around operators.
s->period_current = (s->period_current + 1) % s->runtime->periods;
So do I, but this kept the line length down below 80 chars. Avoiding the line spillage this way looks nicer than the alternatives.
g.
Sound drivers PCM DMA is supposed to free-run until told to stop by the trigger callback. The current code tries to track appl_ptr, to avoid stale buffer data getting played out at the end of the data stream. Unfortunately it also results in race conditions which can cause the audio to stall.
Signed-off-by: Grant Likely grant.likely@secretlab.ca ---
sound/soc/fsl/mpc5200_dma.c | 52 +++++++------------------------------------ sound/soc/fsl/mpc5200_dma.h | 2 -- 2 files changed, 8 insertions(+), 46 deletions(-)
diff --git a/sound/soc/fsl/mpc5200_dma.c b/sound/soc/fsl/mpc5200_dma.c index 986d3c8..4e47586 100644 --- a/sound/soc/fsl/mpc5200_dma.c +++ b/sound/soc/fsl/mpc5200_dma.c @@ -65,36 +65,6 @@ static void psc_dma_bcom_enqueue_next_buffer(struct psc_dma_stream *s) s->period_next = (s->period_next + 1) % s->runtime->periods; }
-static void psc_dma_bcom_enqueue_tx(struct psc_dma_stream *s) -{ - if (s->appl_ptr > s->runtime->control->appl_ptr) { - /* - * In this case s->runtime->control->appl_ptr has wrapped around. - * Play the data to the end of the boundary, then wrap our own - * appl_ptr back around. - */ - while (s->appl_ptr < s->runtime->boundary) { - if (bcom_queue_full(s->bcom_task)) - return; - - s->appl_ptr += s->runtime->period_size; - - psc_dma_bcom_enqueue_next_buffer(s); - } - s->appl_ptr -= s->runtime->boundary; - } - - while (s->appl_ptr < s->runtime->control->appl_ptr) { - - if (bcom_queue_full(s->bcom_task)) - return; - - s->appl_ptr += s->runtime->period_size; - - psc_dma_bcom_enqueue_next_buffer(s); - } -} - /* Bestcomm DMA irq handler */ static irqreturn_t psc_dma_bcom_irq_tx(int irq, void *_psc_dma_stream) { @@ -107,8 +77,9 @@ static irqreturn_t psc_dma_bcom_irq_tx(int irq, void *_psc_dma_stream) bcom_retrieve_buffer(s->bcom_task, NULL, NULL);
s->period_current = (s->period_current+1) % s->runtime->periods; + + psc_dma_bcom_enqueue_next_buffer(s); } - psc_dma_bcom_enqueue_tx(s); spin_unlock(&s->psc_dma->lock);
/* If the stream is active, then also inform the PCM middle layer @@ -182,28 +153,21 @@ static int psc_dma_trigger(struct snd_pcm_substream *substream, int cmd) s->period_next = 0; s->period_current = 0; s->active = 1; - - /* track appl_ptr so that we have a better chance of detecting - * end of stream and not over running it. - */ s->runtime = runtime; - s->appl_ptr = s->runtime->control->appl_ptr - - (runtime->period_size * runtime->periods);
/* Fill up the bestcomm bd queue and enable DMA. * This will begin filling the PSC's fifo. */ spin_lock_irqsave(&psc_dma->lock, flags);
- if (substream->pstr->stream == SNDRV_PCM_STREAM_CAPTURE) { + if (substream->pstr->stream == SNDRV_PCM_STREAM_CAPTURE) bcom_gen_bd_rx_reset(s->bcom_task); - for (i = 0; i < runtime->periods; i++) - if (!bcom_queue_full(s->bcom_task)) - psc_dma_bcom_enqueue_next_buffer(s); - } else { + else bcom_gen_bd_tx_reset(s->bcom_task); - psc_dma_bcom_enqueue_tx(s); - } + + for (i = 0; i < runtime->periods; i++) + if (!bcom_queue_full(s->bcom_task)) + psc_dma_bcom_enqueue_next_buffer(s);
bcom_enable(s->bcom_task); spin_unlock_irqrestore(&psc_dma->lock, flags); diff --git a/sound/soc/fsl/mpc5200_dma.h b/sound/soc/fsl/mpc5200_dma.h index 529f7a0..d9c741b 100644 --- a/sound/soc/fsl/mpc5200_dma.h +++ b/sound/soc/fsl/mpc5200_dma.h @@ -19,8 +19,6 @@ */ struct psc_dma_stream { struct snd_pcm_runtime *runtime; - snd_pcm_uframes_t appl_ptr; - int active; struct psc_dma *psc_dma; struct bcom_task *bcom_task;
On Sat, Nov 7, 2009 at 3:34 AM, Grant Likely grant.likely@secretlab.ca wrote:
Sound drivers PCM DMA is supposed to free-run until told to stop by the trigger callback. The current code tries to track appl_ptr, to avoid stale buffer data getting played out at the end of the data stream. Unfortunately it also results in race conditions which can cause the audio to stall.
I leave in an hour and I will be off net for a week so I can't look at these.
The problem at end of stream works like this:
last buffer containing music plays this buffer has been padded with zero to make a full sample interrupt occurs at end of buffer --- at this point the next chained buffer starts playing, it is full of junk --- this chaining happens in hardware Alsa processes the callback and sends stop stream --- oops, too late, buffer full of noise has already played several samples --- these samples of noise are clearly audible. --- they are usually a fragment from earlier in the song.
Using aplay with short clips like the action sounds for pidgin, etc makes these noise bursts obviously visible.
To fix this you need a mechanism to determine where the valid data in the buffering system ends and noise starts. Once you know the extent of the valid data we can keep the DMA channel programmed to not play invalid data. You can play off the end of valid data two ways; under run when ALSA doesn't supply data fast enough and end of buffer.
ALSA does not provide information on where valid data ends in easily consumable form so I've been trying to reconstruct it from appl_ptr. A much cleaner solution would be for ALSA to provide a field that indicates the last valid address in the ring buffer system. Then in the driver's buffer complete callback I could get that value and reprogram the DMA engine not to run off the end of valid data. As each buffer completes I would reread the value and update the DMA stop address. You also need the last valid address field when DMA is first started.
Signed-off-by: Grant Likely grant.likely@secretlab.ca
sound/soc/fsl/mpc5200_dma.c | 52 +++++++------------------------------------ sound/soc/fsl/mpc5200_dma.h | 2 -- 2 files changed, 8 insertions(+), 46 deletions(-)
diff --git a/sound/soc/fsl/mpc5200_dma.c b/sound/soc/fsl/mpc5200_dma.c index 986d3c8..4e47586 100644 --- a/sound/soc/fsl/mpc5200_dma.c +++ b/sound/soc/fsl/mpc5200_dma.c @@ -65,36 +65,6 @@ static void psc_dma_bcom_enqueue_next_buffer(struct psc_dma_stream *s) s->period_next = (s->period_next + 1) % s->runtime->periods; }
-static void psc_dma_bcom_enqueue_tx(struct psc_dma_stream *s) -{
- if (s->appl_ptr > s->runtime->control->appl_ptr) {
- /*
- * In this case s->runtime->control->appl_ptr has wrapped around.
- * Play the data to the end of the boundary, then wrap our own
- * appl_ptr back around.
- */
- while (s->appl_ptr < s->runtime->boundary) {
- if (bcom_queue_full(s->bcom_task))
- return;
- s->appl_ptr += s->runtime->period_size;
- psc_dma_bcom_enqueue_next_buffer(s);
- }
- s->appl_ptr -= s->runtime->boundary;
- }
- while (s->appl_ptr < s->runtime->control->appl_ptr) {
- if (bcom_queue_full(s->bcom_task))
- return;
- s->appl_ptr += s->runtime->period_size;
- psc_dma_bcom_enqueue_next_buffer(s);
- }
-}
/* Bestcomm DMA irq handler */ static irqreturn_t psc_dma_bcom_irq_tx(int irq, void *_psc_dma_stream) { @@ -107,8 +77,9 @@ static irqreturn_t psc_dma_bcom_irq_tx(int irq, void *_psc_dma_stream) bcom_retrieve_buffer(s->bcom_task, NULL, NULL);
s->period_current = (s->period_current+1) % s->runtime->periods;
- psc_dma_bcom_enqueue_next_buffer(s);
}
- psc_dma_bcom_enqueue_tx(s);
spin_unlock(&s->psc_dma->lock);
/* If the stream is active, then also inform the PCM middle layer @@ -182,28 +153,21 @@ static int psc_dma_trigger(struct snd_pcm_substream *substream, int cmd) s->period_next = 0; s->period_current = 0; s->active = 1;
- /* track appl_ptr so that we have a better chance of detecting
- * end of stream and not over running it.
- */
s->runtime = runtime;
- s->appl_ptr = s->runtime->control->appl_ptr -
- (runtime->period_size * runtime->periods);
/* Fill up the bestcomm bd queue and enable DMA. * This will begin filling the PSC's fifo. */ spin_lock_irqsave(&psc_dma->lock, flags);
- if (substream->pstr->stream == SNDRV_PCM_STREAM_CAPTURE) {
- if (substream->pstr->stream == SNDRV_PCM_STREAM_CAPTURE)
bcom_gen_bd_rx_reset(s->bcom_task);
- for (i = 0; i < runtime->periods; i++)
- if (!bcom_queue_full(s->bcom_task))
- psc_dma_bcom_enqueue_next_buffer(s);
- } else {
- else
bcom_gen_bd_tx_reset(s->bcom_task);
- psc_dma_bcom_enqueue_tx(s);
- }
- for (i = 0; i < runtime->periods; i++)
- if (!bcom_queue_full(s->bcom_task))
- psc_dma_bcom_enqueue_next_buffer(s);
bcom_enable(s->bcom_task); spin_unlock_irqrestore(&psc_dma->lock, flags); diff --git a/sound/soc/fsl/mpc5200_dma.h b/sound/soc/fsl/mpc5200_dma.h index 529f7a0..d9c741b 100644 --- a/sound/soc/fsl/mpc5200_dma.h +++ b/sound/soc/fsl/mpc5200_dma.h @@ -19,8 +19,6 @@ */ struct psc_dma_stream { struct snd_pcm_runtime *runtime;
- snd_pcm_uframes_t appl_ptr;
int active; struct psc_dma *psc_dma; struct bcom_task *bcom_task;
On Sat, Nov 7, 2009 at 7:51 AM, Jon Smirl jonsmirl@gmail.com wrote:
On Sat, Nov 7, 2009 at 3:34 AM, Grant Likely grant.likely@secretlab.ca wrote:
Sound drivers PCM DMA is supposed to free-run until told to stop by the trigger callback. The current code tries to track appl_ptr, to avoid stale buffer data getting played out at the end of the data stream. Unfortunately it also results in race conditions which can cause the audio to stall.
I leave in an hour and I will be off net for a week so I can't look at these.
There is a surefire way to fix this but I have resisted doing it because it is fixing a symptom not a cause.
Simply have the driver zero out the buffer in the completion interrupt before handing it back to ALSA. Then if ALSA lets us play invalid data the invalid data will be silence. I implemented this and it works every time.
Downside is a big memset() in an IRQ handler.
The problem at end of stream works like this:
last buffer containing music plays this buffer has been padded with zero to make a full sample interrupt occurs at end of buffer --- at this point the next chained buffer starts playing, it is full of junk --- this chaining happens in hardware Alsa processes the callback and sends stop stream --- oops, too late, buffer full of noise has already played several samples --- these samples of noise are clearly audible. --- they are usually a fragment from earlier in the song.
Using aplay with short clips like the action sounds for pidgin, etc makes these noise bursts obviously visible.
To fix this you need a mechanism to determine where the valid data in the buffering system ends and noise starts. Once you know the extent of the valid data we can keep the DMA channel programmed to not play invalid data. You can play off the end of valid data two ways; under run when ALSA doesn't supply data fast enough and end of buffer.
ALSA does not provide information on where valid data ends in easily consumable form so I've been trying to reconstruct it from appl_ptr. A much cleaner solution would be for ALSA to provide a field that indicates the last valid address in the ring buffer system. Then in the driver's buffer complete callback I could get that value and reprogram the DMA engine not to run off the end of valid data. As each buffer completes I would reread the value and update the DMA stop address. You also need the last valid address field when DMA is first started.
Signed-off-by: Grant Likely grant.likely@secretlab.ca
sound/soc/fsl/mpc5200_dma.c | 52 +++++++------------------------------------ sound/soc/fsl/mpc5200_dma.h | 2 -- 2 files changed, 8 insertions(+), 46 deletions(-)
diff --git a/sound/soc/fsl/mpc5200_dma.c b/sound/soc/fsl/mpc5200_dma.c index 986d3c8..4e47586 100644 --- a/sound/soc/fsl/mpc5200_dma.c +++ b/sound/soc/fsl/mpc5200_dma.c @@ -65,36 +65,6 @@ static void psc_dma_bcom_enqueue_next_buffer(struct psc_dma_stream *s) s->period_next = (s->period_next + 1) % s->runtime->periods; }
-static void psc_dma_bcom_enqueue_tx(struct psc_dma_stream *s) -{
- if (s->appl_ptr > s->runtime->control->appl_ptr) {
- /*
- * In this case s->runtime->control->appl_ptr has wrapped around.
- * Play the data to the end of the boundary, then wrap our own
- * appl_ptr back around.
- */
- while (s->appl_ptr < s->runtime->boundary) {
- if (bcom_queue_full(s->bcom_task))
- return;
- s->appl_ptr += s->runtime->period_size;
- psc_dma_bcom_enqueue_next_buffer(s);
- }
- s->appl_ptr -= s->runtime->boundary;
- }
- while (s->appl_ptr < s->runtime->control->appl_ptr) {
- if (bcom_queue_full(s->bcom_task))
- return;
- s->appl_ptr += s->runtime->period_size;
- psc_dma_bcom_enqueue_next_buffer(s);
- }
-}
/* Bestcomm DMA irq handler */ static irqreturn_t psc_dma_bcom_irq_tx(int irq, void *_psc_dma_stream) { @@ -107,8 +77,9 @@ static irqreturn_t psc_dma_bcom_irq_tx(int irq, void *_psc_dma_stream) bcom_retrieve_buffer(s->bcom_task, NULL, NULL);
s->period_current = (s->period_current+1) % s->runtime->periods;
- psc_dma_bcom_enqueue_next_buffer(s);
}
- psc_dma_bcom_enqueue_tx(s);
spin_unlock(&s->psc_dma->lock);
/* If the stream is active, then also inform the PCM middle layer @@ -182,28 +153,21 @@ static int psc_dma_trigger(struct snd_pcm_substream *substream, int cmd) s->period_next = 0; s->period_current = 0; s->active = 1;
- /* track appl_ptr so that we have a better chance of detecting
- * end of stream and not over running it.
- */
s->runtime = runtime;
- s->appl_ptr = s->runtime->control->appl_ptr -
- (runtime->period_size * runtime->periods);
/* Fill up the bestcomm bd queue and enable DMA. * This will begin filling the PSC's fifo. */ spin_lock_irqsave(&psc_dma->lock, flags);
- if (substream->pstr->stream == SNDRV_PCM_STREAM_CAPTURE) {
- if (substream->pstr->stream == SNDRV_PCM_STREAM_CAPTURE)
bcom_gen_bd_rx_reset(s->bcom_task);
- for (i = 0; i < runtime->periods; i++)
- if (!bcom_queue_full(s->bcom_task))
- psc_dma_bcom_enqueue_next_buffer(s);
- } else {
- else
bcom_gen_bd_tx_reset(s->bcom_task);
- psc_dma_bcom_enqueue_tx(s);
- }
- for (i = 0; i < runtime->periods; i++)
- if (!bcom_queue_full(s->bcom_task))
- psc_dma_bcom_enqueue_next_buffer(s);
bcom_enable(s->bcom_task); spin_unlock_irqrestore(&psc_dma->lock, flags); diff --git a/sound/soc/fsl/mpc5200_dma.h b/sound/soc/fsl/mpc5200_dma.h index 529f7a0..d9c741b 100644 --- a/sound/soc/fsl/mpc5200_dma.h +++ b/sound/soc/fsl/mpc5200_dma.h @@ -19,8 +19,6 @@ */ struct psc_dma_stream { struct snd_pcm_runtime *runtime;
- snd_pcm_uframes_t appl_ptr;
int active; struct psc_dma *psc_dma; struct bcom_task *bcom_task;
-- Jon Smirl jonsmirl@gmail.com
On Sat, Nov 7, 2009 at 6:04 AM, Jon Smirl jonsmirl@gmail.com wrote:
On Sat, Nov 7, 2009 at 7:51 AM, Jon Smirl jonsmirl@gmail.com wrote:
On Sat, Nov 7, 2009 at 3:34 AM, Grant Likely grant.likely@secretlab.ca wrote:
Sound drivers PCM DMA is supposed to free-run until told to stop by the trigger callback. The current code tries to track appl_ptr, to avoid stale buffer data getting played out at the end of the data stream. Unfortunately it also results in race conditions which can cause the audio to stall.
I leave in an hour and I will be off net for a week so I can't look at these.
There is a surefire way to fix this but I have resisted doing it because it is fixing a symptom not a cause.
Simply have the driver zero out the buffer in the completion interrupt before handing it back to ALSA. Then if ALSA lets us play invalid data the invalid data will be silence. I implemented this and it works every time.
Downside is a big memset() in an IRQ handler.
... and then the driver may as well manually copy the audio data from the buffer into the PSC FIFO. No win here.
The other option (which I think is how ALSA is designed) is for userspace to insert silence at the end of playback data so that the stop trigger lands in a safe place.
g.
On Sat, Nov 7, 2009 at 5:51 AM, Jon Smirl jonsmirl@gmail.com wrote:
On Sat, Nov 7, 2009 at 3:34 AM, Grant Likely grant.likely@secretlab.ca wrote:
Sound drivers PCM DMA is supposed to free-run until told to stop by the trigger callback. The current code tries to track appl_ptr, to avoid stale buffer data getting played out at the end of the data stream. Unfortunately it also results in race conditions which can cause the audio to stall.
I leave in an hour and I will be off net for a week so I can't look at these.
Okay, no problem. I can be patient.
The problem at end of stream works like this:
last buffer containing music plays this buffer has been padded with zero to make a full sample interrupt occurs at end of buffer --- at this point the next chained buffer starts playing, it is full of junk --- this chaining happens in hardware Alsa processes the callback and sends stop stream --- oops, too late, buffer full of noise has already played several samples --- these samples of noise are clearly audible. --- they are usually a fragment from earlier in the song.
I'm not yet convinced that this sequence is correct. Well, I mean, I'm not convinced about the buffer only being filled to top up the current period. My understanding of ALSA is that the application is supposed to make sure there is enough silence in the buffer to handle the lag between notification that the last period with valid data has been played out and the stop trigger.
Using aplay with short clips like the action sounds for pidgin, etc makes these noise bursts obviously visible.
Yup, I've got a bunch of clips that I can reproduce the problem with, and I can reproduce it reliably using aplay. However, the problem I'm seeing seems to be related to a dev_dbg() call in the trigger stop path. When KERN_DEBUG messages get sent to the console, and the console is one of the PSC ports, then I get the replayed sample artifact at the end. However, if I 'tail -f /var/log/kern.log', then I still get to see the debug output, but the audible artifact doesn't occur. That says to me that the real problem is an unbounded latency caused by another part of the kernel (the tty console in this case).
It seems to me that aplay doesn't silence out very many buffers past the end of the sample, but I won't know for sure until I instrument it to see what data is present in the buffers. I'll do that next week.
To fix this you need a mechanism to determine where the valid data in the buffering system ends and noise starts. Once you know the extent of the valid data we can keep the DMA channel programmed to not play invalid data. You can play off the end of valid data two ways; under run when ALSA doesn't supply data fast enough and end of buffer.
Underrun is a realtime failure. ALSA handles it by triggering STOP and START of the stream AFAIKT. Just about all ALSA drivers using DMA will end up replaying buffers if the kernel cannot keep up with hardware.
End of buffer seems to be the responsibility of userspace, but I need to investigate this more.
My experiments to this point seems to suggest that when you hear the artifacts it is due to both the end of buffer condition, and a realtime failure in executing the stop trigger.
ALSA does not provide information on where valid data ends in easily consumable form so I've been trying to reconstruct it from appl_ptr. A much cleaner solution would be for ALSA to provide a field that indicates the last valid address in the ring buffer system. Then in the driver's buffer complete callback I could get that value and reprogram the DMA engine not to run off the end of valid data. As each buffer completes I would reread the value and update the DMA stop address. You also need the last valid address field when DMA is first started.
... assuming that audio needs to stop exactly at the end of valid data. But if the last few periods are silence, then this assumption isn't true.
g.
On Sat, Nov 07, 2009 at 11:51:16AM -0700, Grant Likely wrote:
On Sat, Nov 7, 2009 at 5:51 AM, Jon Smirl jonsmirl@gmail.com wrote:
current period. My understanding of ALSA is that the application is supposed to make sure there is enough silence in the buffer to handle the lag between notification that the last period with valid data has been played out and the stop trigger.
This is certainly the most robust approach for applications. For a large proportion of hardware it won't matter too much since they're able to shut down the audio very quickly but that can't be entirely relied upon, especially at higher rates on slower machines.
occur. That says to me that the real problem is an unbounded latency caused by another part of the kernel (the tty console in this case).
That's certainly not going to help anything here - if a delay is introduced in telling the hardware to shut down the DMA then that increases the chance for the DMA controller to start pushing valid audio data from the buffer to the audio interface.
A much cleaner solution would be for ALSA to provide a field that indicates the last valid address in the ring buffer system. Then in the driver's buffer complete callback I could get that value and reprogram the DMA engine not to run off the end of valid data. As each buffer completes I would reread the value and update the DMA stop address. You also need the last valid address field when DMA is first started.
... assuming that audio needs to stop exactly at the end of valid data. But if the last few periods are silence, then this assumption isn't true.
Indeed, it makes the whole thing much more reliable.
Providing a final valid data point to the driver would possibly even make things worse since if it were used then you'd have the equivalent race where the application has initialised some data but not yet managed to update the driver to tell it it's being handed over; if the driver just carries on running through the data there's a reasonable chance nobody will notice that case.
On Sat, Nov 7, 2009 at 3:12 PM, Mark Brown broonie@opensource.wolfsonmicro.com wrote:
On Sat, Nov 07, 2009 at 11:51:16AM -0700, Grant Likely wrote:
On Sat, Nov 7, 2009 at 5:51 AM, Jon Smirl jonsmirl@gmail.com wrote:
current period. My understanding of ALSA is that the application is supposed to make sure there is enough silence in the buffer to handle the lag between notification that the last period with valid data has been played out and the stop trigger.
This is certainly the most robust approach for applications. For a large proportion of hardware it won't matter too much since they're able to shut down the audio very quickly but that can't be entirely relied upon, especially at higher rates on slower machines.
I can comment but no access to test equipment...
Playing invalid data always happens in the current ALSA model. The only question is does enough of it play to be audible.
on the mpc5200 batched driver... At the end of song ALSA only pads out the last period with silence. When this buffer is fully enqueued the DMA hardware generates an interrupt. The DMA hardware also begins enqueuing the next period.
Now we start a race. Can ALSA come back from that interrupt and stop the playing of the enqueued data before it makes it out of the FIFO? An mpc5200 is slow enough that the CPU never makes it back in time. This isn't a latency problem, it never makes it back in time. Latency issues just make it play more invalid data.
There are two solutions: 1) tell me where the end of the valid data is. That allows me to program the hardware to not enqueue the invalid data. 2) For batched hardware, pad an extra period with silence after the end of the stream. (that what zeroing the buffer before handing it back to ALSA
I believe this race is present in all ALSA drivers. It's just a lot harder to hit on different hardware. For example to hit it on Intel HDA which is non-batched hardware, the song would need to end right at the end of a period. Then the interrupt latency would need to be bad enough that some invalid data got played. But x86 CPUs are very fast so it is rare for the interrupt latency to be bad enough that the stream doesn't get stopped in time.
occur. That says to me that the real problem is an unbounded latency caused by another part of the kernel (the tty console in this case).
That's certainly not going to help anything here - if a delay is introduced in telling the hardware to shut down the DMA then that increases the chance for the DMA controller to start pushing valid audio data from the buffer to the audio interface.
A much cleaner solution would be for ALSA to provide a field that indicates the last valid address in the ring buffer system. Then in the driver's buffer complete callback I could get that value and reprogram the DMA engine not to run off the end of valid data. As each buffer completes I would reread the value and update the DMA stop address. You also need the last valid address field when DMA is first started.
... assuming that audio needs to stop exactly at the end of valid data. But if the last few periods are silence, then this assumption isn't true.
Indeed, it makes the whole thing much more reliable.
Providing a final valid data point to the driver would possibly even make things worse since if it were used then you'd have the equivalent race where the application has initialized some data but not yet managed to update the driver to tell it it's being handed over; if the driver just carries on running through the data there's a reasonable chance nobody will notice that case.
That's an under run condition.
ALSA would need to track how many periods the driver has queue. If ALSA has received enough interrupts to know that the driver is playing the last period, it would not be safe to just tack the data onto the end. You would also need to call into the driver with a 'append' call. That's because it isn't possible for ALSA to deterministically know if the last period has finished playing or not. In the 'append' call implementation the driver would determine if the DMA hardware was still running and restart it if needed.
On Wed, Nov 11, 2009 at 11:38:06AM -0500, Jon Smirl wrote:
There are two solutions:
- tell me where the end of the valid data is. That allows me to
program the hardware to not enqueue the invalid data. 2) For batched hardware, pad an extra period with silence after the end of the stream. (that what zeroing the buffer before handing it back to ALSA
You've also got the option of lying about where the hardware is in some form in order to give you more headroom.
I'm not sure what you mean by "batched hardware" here.
I believe this race is present in all ALSA drivers. It's just a lot harder to hit on different hardware. For example to hit it on Intel HDA which is non-batched hardware, the song would need to end right at the end of a period. Then the interrupt latency would need to be bad enough that some invalid data got played. But x86 CPUs are very fast so it is rare for the interrupt latency to be bad enough that the stream doesn't get stopped in time.
The potential is there for this to happen on any hardware, yes. On the other hand, it's not been a pressing issue elswhere - including on things like older ARM systems which aren't exactly noted for their snappy performance. It really does sound like there's something special going on with these systems that's at least somewhat unique to them.
Providing a final valid data point to the driver would possibly even make things worse since if it were used then you'd have the equivalent race where the application has initialized some data but not yet managed to update the driver to tell it it's being handed over; if the driver
That's an under run condition.
Yes, of course - the issue is that this approach encourages them, making the system less robust if things are on the edge. The mpc5200 seems to be not just on the edge but comfortably beyond it for some reason.
On Wed, Nov 11, 2009 at 11:37 AM, Mark Brown broonie@opensource.wolfsonmicro.com wrote:
On Wed, Nov 11, 2009 at 11:38:06AM -0500, Jon Smirl wrote:
Providing a final valid data point to the driver would possibly even make things worse since if it were used then you'd have the equivalent race where the application has initialized some data but not yet managed to update the driver to tell it it's being handed over; if the driver
That's an under run condition.
Yes, of course - the issue is that this approach encourages them, making the system less robust if things are on the edge. The mpc5200 seems to be not just on the edge but comfortably beyond it for some reason.
I can't reproduce the issue at all as long at the dev_dbg() statement in the trigger stop path is disabled. With it enabled, I hear the problem every time. The 5200 may not be a speedy beast, but it is plenty fast enough to shut down the audio stream before stale data starts getting played out.
g.
On 11 Nov 2009, at 19:24, Grant Likely grant.likely@secretlab.ca wrote:
On Wed, Nov 11, 2009 at 11:37 AM, Mark Brown broonie@opensource.wolfsonmicro.com wrote:
On Wed, Nov 11, 2009 at 11:38:06AM -0500, Jon Smirl wrote:
Providing a final valid data point to the driver would possibly even make things worse since if it were used then you'd have the equivalent race where the application has initialized some data but not yet managed to update the driver to tell it it's being handed over; if the driver
That's an under run condition.
Yes, of course - the issue is that this approach encourages them, making the system less robust if things are on the edge. The mpc5200 seems to be not just on the edge but comfortably beyond it for some reason.
I can't reproduce the issue at all as long at the dev_dbg() statement in the trigger stop path is disabled. With it enabled, I hear the problem every time. The 5200 may not be a speedy beast, but it is plenty fast enough to shut down the audio stream before stale data starts getting played out.
Yes, that does sound entirely consistent with the problem being a latency issue if you're sending the dev_dbg() output to the serial port. I'd be surprised if it were anything else to be honest.
On Wed, Nov 11, 2009 at 2:24 PM, Grant Likely grant.likely@secretlab.ca wrote:
On Wed, Nov 11, 2009 at 11:37 AM, Mark Brown broonie@opensource.wolfsonmicro.com wrote:
On Wed, Nov 11, 2009 at 11:38:06AM -0500, Jon Smirl wrote:
Providing a final valid data point to the driver would possibly even make things worse since if it were used then you'd have the equivalent race where the application has initialized some data but not yet managed to update the driver to tell it it's being handed over; if the driver
That's an under run condition.
Yes, of course - the issue is that this approach encourages them, making the system less robust if things are on the edge. The mpc5200 seems to be not just on the edge but comfortably beyond it for some reason.
I can't reproduce the issue at all as long at the dev_dbg() statement in the trigger stop path is disabled. With it enabled, I hear the problem every time. The 5200 may not be a speedy beast, but it is plenty fast enough to shut down the audio stream before stale data starts getting played out.
"fast enough" - you just said it is a race. I've been saying it is a race too.
There are two options: 1) Eliminate the race by developing a system to deterministically flag the end of valid data. 2) Fudge everything around making it almost impossible to lose the race, but the race is still there.
The dev_dbg() aggravates the race until it is obviously visible every time. A deterministic solution would not be impacted by the dev_dbg().
Implementing a deterministic solution requires support from ALSA core.
On Wed, Nov 11, 2009 at 2:34 PM, Jon Smirl jonsmirl@gmail.com wrote:
On Wed, Nov 11, 2009 at 2:24 PM, Grant Likely grant.likely@secretlab.ca wrote:
On Wed, Nov 11, 2009 at 11:37 AM, Mark Brown broonie@opensource.wolfsonmicro.com wrote:
On Wed, Nov 11, 2009 at 11:38:06AM -0500, Jon Smirl wrote:
Providing a final valid data point to the driver would possibly even make things worse since if it were used then you'd have the equivalent race where the application has initialized some data but not yet managed to update the driver to tell it it's being handed over; if the driver
That's an under run condition.
Yes, of course - the issue is that this approach encourages them, making the system less robust if things are on the edge. The mpc5200 seems to be not just on the edge but comfortably beyond it for some reason.
I can't reproduce the issue at all as long at the dev_dbg() statement in the trigger stop path is disabled. With it enabled, I hear the problem every time. The 5200 may not be a speedy beast, but it is plenty fast enough to shut down the audio stream before stale data starts getting played out.
"fast enough" - you just said it is a race. I've been saying it is a race too.
Yes, it is a race; but not the kind that is dangerous. Audio playout is always a real-time problem; whether in the middle of a stream or at the end. If the CPU gets nailed with an unbounded latency, then there will be audible artifacts - Regardless of whether the driver knows where the end of data is or not. If it does know, then audio will stutter. If it doesn't know, then there will be repeated samples. Both are nasty to the human ear. So, making the driver do extra work to keep the extra data in sync will probably force larger minimum latencies for playout (trouble for VoIP apps) so the CPU can keep up, and won't help one iota for making audio better.
The real solution is to fix the worst case latencies.
There are two options:
- Eliminate the race by developing a system to deterministically flag
the end of valid data. 2) Fudge everything around making it almost impossible to lose the race, but the race is still there.
3) eliminate the unbounded latencies (fix the PSC driver and/or use a real time kernel) 4) make sure userspace fills all the periods with silence before triggering stop. Gstreamer seems to already do this. I suspect pulseaudio does the same.
The dev_dbg() aggravates the race until it is obviously visible every time. A deterministic solution would not be impacted by the dev_dbg().
But it still wouldn't help a bit when the same latency occurs in the middle of playback.
g.
On Wed, Nov 11, 2009 at 4:57 PM, Grant Likely grant.likely@secretlab.ca wrote:
On Wed, Nov 11, 2009 at 2:34 PM, Jon Smirl jonsmirl@gmail.com wrote:
On Wed, Nov 11, 2009 at 2:24 PM, Grant Likely grant.likely@secretlab.ca wrote:
On Wed, Nov 11, 2009 at 11:37 AM, Mark Brown broonie@opensource.wolfsonmicro.com wrote:
On Wed, Nov 11, 2009 at 11:38:06AM -0500, Jon Smirl wrote:
Providing a final valid data point to the driver would possibly even make things worse since if it were used then you'd have the equivalent race where the application has initialized some data but not yet managed to update the driver to tell it it's being handed over; if the driver
That's an under run condition.
Yes, of course - the issue is that this approach encourages them, making the system less robust if things are on the edge. The mpc5200 seems to be not just on the edge but comfortably beyond it for some reason.
I can't reproduce the issue at all as long at the dev_dbg() statement in the trigger stop path is disabled. With it enabled, I hear the problem every time. The 5200 may not be a speedy beast, but it is plenty fast enough to shut down the audio stream before stale data starts getting played out.
"fast enough" - you just said it is a race. I've been saying it is a race too.
Yes, it is a race; but not the kind that is dangerous. Audio playout is always a real-time problem; whether in the middle of a stream or at the end. If the CPU gets nailed with an unbounded latency, then there will be audible artifacts - Regardless of whether the driver knows where the end of data is or not. If it does know, then audio will stutter. If it doesn't know, then there will be repeated samples. Both are nasty to the human ear. So, making the driver do extra work to keep the extra data in sync will probably force larger minimum latencies for playout (trouble for VoIP apps) so the CPU can keep up, and won't help one iota for making audio better.
I don't think it is that much more work for ALSA to provide an accessible field indicating the end of valid data. It's already tracking appl_ptr. Appl_ptr just needs to be translated into a physical DMA buffer address and we've been making mistakes doing that translation.
The real solution is to fix the worst case latencies.
There are two options:
- Eliminate the race by developing a system to deterministically flag
the end of valid data. 2) Fudge everything around making it almost impossible to lose the race, but the race is still there.
- eliminate the unbounded latencies (fix the PSC driver and/or use a
real time kernel) 4) make sure userspace fills all the periods with silence before triggering stop. Gstreamer seems to already do this. I suspect pulseaudio does the same.
The dev_dbg() aggravates the race until it is obviously visible every time. A deterministic solution would not be impacted by the dev_dbg().
But it still wouldn't help a bit when the same latency occurs in the middle of playback.
The deterministic solution of tracking the end of valid data ensures that under run will be silent instead of playing invalid data.
g.
-- Grant Likely, B.Sc., P.Eng. Secret Lab Technologies Ltd.
On Wed, Nov 11, 2009 at 06:13:25PM -0500, Jon Smirl wrote:
I don't think it is that much more work for ALSA to provide an accessible field indicating the end of valid data. It's already tracking appl_ptr. Appl_ptr just needs to be translated into a physical DMA buffer address and we've been making mistakes doing that translation.
I'm sure if you provide reasonable patches they'll get applied; it doesn't seem likely that anyone else will work on this.
On Wed, Nov 11, 2009 at 4:57 PM, Grant Likely grant.likely@secretlab.ca wrote
But it still wouldn't help a bit when the same latency occurs in the middle of playback.
The deterministic solution of tracking the end of valid data ensures that under run will be silent instead of playing invalid data.
In the middle of playback a sudden silence period is very noticable - applications that expect data loss generally have some means to try to cover it up since sudden silence is so noticable to the human ear. Mid playback there's a bit more chance that random data from earlier in the playback will be acceptable, and if the application has actually updated the data but not got round to telling the driver about that yet then even better.
I think you're getting too focused on the fact that there is a race condition here since the consequences of race conditions are usually so severe. There's always going to be some hardware for which there's a degree of racyness (free running hardware has its own set of issues here) so the system design takes this into account.
On Wed, Nov 11, 2009 at 1:37 PM, Mark Brown broonie@opensource.wolfsonmicro.com wrote:
On Wed, Nov 11, 2009 at 11:38:06AM -0500, Jon Smirl wrote:
There are two solutions:
- tell me where the end of the valid data is. That allows me to
program the hardware to not enqueue the invalid data. 2) For batched hardware, pad an extra period with silence after the end of the stream. (that what zeroing the buffer before handing it back to ALSA
You've also got the option of lying about where the hardware is in some form in order to give you more headroom.
I'm not sure what you mean by "batched hardware" here.
SNDRV_PCM_INFO_BATCH
Hardware that can't give you the DMA position except at the end of DMA transfers.
I believe this race is present in all ALSA drivers. It's just a lot harder to hit on different hardware. For example to hit it on Intel HDA which is non-batched hardware, the song would need to end right at the end of a period. Then the interrupt latency would need to be bad enough that some invalid data got played. But x86 CPUs are very fast so it is rare for the interrupt latency to be bad enough that the stream doesn't get stopped in time.
The potential is there for this to happen on any hardware, yes. On the other hand, it's not been a pressing issue elswhere - including on things like older ARM systems which aren't exactly noted for their snappy performance. It really does sound like there's something special going on with these systems that's at least somewhat unique to them.
Providing a final valid data point to the driver would possibly even make things worse since if it were used then you'd have the equivalent race where the application has initialized some data but not yet managed to update the driver to tell it it's being handed over; if the driver
That's an under run condition.
Yes, of course - the issue is that this approach encourages them, making the system less robust if things are on the edge. The mpc5200 seems to be not just on the edge but comfortably beyond it for some reason.
Signed-off-by: Grant Likely grant.likely@secretlab.ca ---
sound/soc/fsl/mpc5200_dma.c | 15 ++++++++++----- sound/soc/fsl/mpc5200_dma.h | 1 + 2 files changed, 11 insertions(+), 5 deletions(-)
diff --git a/sound/soc/fsl/mpc5200_dma.c b/sound/soc/fsl/mpc5200_dma.c index 4e47586..658e3fa 100644 --- a/sound/soc/fsl/mpc5200_dma.c +++ b/sound/soc/fsl/mpc5200_dma.c @@ -77,6 +77,7 @@ static irqreturn_t psc_dma_bcom_irq_tx(int irq, void *_psc_dma_stream) bcom_retrieve_buffer(s->bcom_task, NULL, NULL);
s->period_current = (s->period_current+1) % s->runtime->periods; + s->period_count++;
psc_dma_bcom_enqueue_next_buffer(s); } @@ -101,6 +102,7 @@ static irqreturn_t psc_dma_bcom_irq_rx(int irq, void *_psc_dma_stream) bcom_retrieve_buffer(s->bcom_task, NULL, NULL);
s->period_current = (s->period_current+1) % s->runtime->periods; + s->period_count++;
psc_dma_bcom_enqueue_next_buffer(s); } @@ -142,17 +144,17 @@ static int psc_dma_trigger(struct snd_pcm_substream *substream, int cmd) else s = &psc_dma->playback;
- dev_dbg(psc_dma->dev, "psc_dma_trigger(substream=%p, cmd=%i)" - " stream_id=%i\n", - substream, cmd, substream->pstr->stream); - switch (cmd) { case SNDRV_PCM_TRIGGER_START: + dev_dbg(psc_dma->dev, "START: stream=%i fbits=%u ps=%u #p=%u\n", + substream->pstr->stream, runtime->frame_bits, + (int)runtime->period_size, runtime->periods); s->period_bytes = frames_to_bytes(runtime, runtime->period_size); s->period_next = 0; s->period_current = 0; s->active = 1; + s->period_count = 0; s->runtime = runtime;
/* Fill up the bestcomm bd queue and enable DMA. @@ -177,6 +179,8 @@ static int psc_dma_trigger(struct snd_pcm_substream *substream, int cmd) break;
case SNDRV_PCM_TRIGGER_STOP: + dev_dbg(psc_dma->dev, "STOP: stream=%i periods_count=%i\n", + substream->pstr->stream, s->period_count); s->active = 0;
spin_lock_irqsave(&psc_dma->lock, flags); @@ -190,7 +194,8 @@ static int psc_dma_trigger(struct snd_pcm_substream *substream, int cmd) break;
default: - dev_dbg(psc_dma->dev, "invalid command\n"); + dev_dbg(psc_dma->dev, "unhandled trigger: stream=%i cmd=%i\n", + substream->pstr->stream, cmd); return -EINVAL; }
diff --git a/sound/soc/fsl/mpc5200_dma.h b/sound/soc/fsl/mpc5200_dma.h index d9c741b..c6f29e4 100644 --- a/sound/soc/fsl/mpc5200_dma.h +++ b/sound/soc/fsl/mpc5200_dma.h @@ -27,6 +27,7 @@ struct psc_dma_stream { int period_next; int period_current; int period_bytes; + int period_count; };
/**
Move the resolving of the psc_dma_stream pointer to a helper function to reduce duplicate code
Signed-off-by: Grant Likely grant.likely@secretlab.ca ---
sound/soc/fsl/mpc5200_dma.c | 7 +------ sound/soc/fsl/mpc5200_dma.h | 9 +++++++++ 2 files changed, 10 insertions(+), 6 deletions(-)
diff --git a/sound/soc/fsl/mpc5200_dma.c b/sound/soc/fsl/mpc5200_dma.c index 658e3fa..9c88e15 100644 --- a/sound/soc/fsl/mpc5200_dma.c +++ b/sound/soc/fsl/mpc5200_dma.c @@ -133,17 +133,12 @@ static int psc_dma_trigger(struct snd_pcm_substream *substream, int cmd) struct snd_soc_pcm_runtime *rtd = substream->private_data; struct psc_dma *psc_dma = rtd->dai->cpu_dai->private_data; struct snd_pcm_runtime *runtime = substream->runtime; - struct psc_dma_stream *s; + struct psc_dma_stream *s = to_psc_dma_stream(substream, psc_dma); struct mpc52xx_psc __iomem *regs = psc_dma->psc_regs; u16 imr; unsigned long flags; int i;
- if (substream->pstr->stream == SNDRV_PCM_STREAM_CAPTURE) - s = &psc_dma->capture; - else - s = &psc_dma->playback; - switch (cmd) { case SNDRV_PCM_TRIGGER_START: dev_dbg(psc_dma->dev, "START: stream=%i fbits=%u ps=%u #p=%u\n", diff --git a/sound/soc/fsl/mpc5200_dma.h b/sound/soc/fsl/mpc5200_dma.h index c6f29e4..956d6a5 100644 --- a/sound/soc/fsl/mpc5200_dma.h +++ b/sound/soc/fsl/mpc5200_dma.h @@ -68,6 +68,15 @@ struct psc_dma { } stats; };
+/* Utility for retrieving psc_dma_stream structure from a substream */ +inline struct psc_dma_stream * +to_psc_dma_stream(struct snd_pcm_substream *substream, struct psc_dma *psc_dma) +{ + if (substream->pstr->stream == SNDRV_PCM_STREAM_CAPTURE) + return &psc_dma->capture; + return &psc_dma->playback; +} + int mpc5200_audio_dma_create(struct of_device *op); int mpc5200_audio_dma_destroy(struct of_device *op);
On Sat, Nov 07, 2009 at 01:34:31AM -0700, Grant Likely wrote:
+/* Utility for retrieving psc_dma_stream structure from a substream */ +inline struct psc_dma_stream * +to_psc_dma_stream(struct snd_pcm_substream *substream, struct psc_dma *psc_dma) +{
- if (substream->pstr->stream == SNDRV_PCM_STREAM_CAPTURE)
return &psc_dma->capture;
- return &psc_dma->playback;
+}
Traditionally this'd be an if () else but it makes no difference either way to the generated code.
The MPC5200 AC97 driver is disabling the slots when a stop trigger is received, but not reenabling them if the stream is started again without processing the hw_params again.
This patch fixes the problem by caching the slot enable bit settings calculated at hw_params time so that they can be reapplied every time the start trigger is received.
Signed-off-by: Grant Likely grant.likely@secretlab.ca ---
sound/soc/fsl/mpc5200_dma.h | 4 ++++ sound/soc/fsl/mpc5200_psc_ac97.c | 39 ++++++++++++++++++++------------------ 2 files changed, 25 insertions(+), 18 deletions(-)
diff --git a/sound/soc/fsl/mpc5200_dma.h b/sound/soc/fsl/mpc5200_dma.h index 956d6a5..22208b3 100644 --- a/sound/soc/fsl/mpc5200_dma.h +++ b/sound/soc/fsl/mpc5200_dma.h @@ -16,6 +16,7 @@ * @period_end: physical address of end of DMA region * @period_next_pt: physical address of next DMA buffer to enqueue * @period_bytes: size of DMA period in bytes + * @ac97_slot_bits: Enable bits for turning on the correct AC97 slot */ struct psc_dma_stream { struct snd_pcm_runtime *runtime; @@ -28,6 +29,9 @@ struct psc_dma_stream { int period_current; int period_bytes; int period_count; + + /* AC97 state */ + u32 ac97_slot_bits; };
/** diff --git a/sound/soc/fsl/mpc5200_psc_ac97.c b/sound/soc/fsl/mpc5200_psc_ac97.c index c4ae3e0..3dbc7f7 100644 --- a/sound/soc/fsl/mpc5200_psc_ac97.c +++ b/sound/soc/fsl/mpc5200_psc_ac97.c @@ -130,6 +130,7 @@ static int psc_ac97_hw_analog_params(struct snd_pcm_substream *substream, struct snd_soc_dai *cpu_dai) { struct psc_dma *psc_dma = cpu_dai->private_data; + struct psc_dma_stream *s = to_psc_dma_stream(substream, psc_dma);
dev_dbg(psc_dma->dev, "%s(substream=%p) p_size=%i p_bytes=%i" " periods=%i buffer_size=%i buffer_bytes=%i channels=%i" @@ -140,20 +141,10 @@ static int psc_ac97_hw_analog_params(struct snd_pcm_substream *substream, params_channels(params), params_rate(params), params_format(params));
- - if (substream->pstr->stream == SNDRV_PCM_STREAM_CAPTURE) { - if (params_channels(params) == 1) - psc_dma->slots |= 0x00000100; - else - psc_dma->slots |= 0x00000300; - } else { - if (params_channels(params) == 1) - psc_dma->slots |= 0x01000000; - else - psc_dma->slots |= 0x03000000; - } - out_be32(&psc_dma->psc_regs->ac97_slots, psc_dma->slots); - + /* Determine the set of enable bits to turn on */ + s->ac97_slot_bits = (params_channels(params) == 1) ? 0x100 : 0x300; + if (substream->pstr->stream != SNDRV_PCM_STREAM_CAPTURE) + s->ac97_slot_bits <<= 16; return 0; }
@@ -163,6 +154,8 @@ static int psc_ac97_hw_digital_params(struct snd_pcm_substream *substream, { struct psc_dma *psc_dma = cpu_dai->private_data;
+ dev_dbg(psc_dma->dev, "%s(substream=%p)\n", __func__, substream); + if (params_channels(params) == 1) out_be32(&psc_dma->psc_regs->ac97_slots, 0x01000000); else @@ -176,14 +169,24 @@ static int psc_ac97_trigger(struct snd_pcm_substream *substream, int cmd, { struct snd_soc_pcm_runtime *rtd = substream->private_data; struct psc_dma *psc_dma = rtd->dai->cpu_dai->private_data; + struct psc_dma_stream *s = to_psc_dma_stream(substream, psc_dma);
switch (cmd) { + case SNDRV_PCM_TRIGGER_START: + dev_dbg(psc_dma->dev, "AC97 START: stream=%i\n", + substream->pstr->stream); + + /* Set the slot enable bits */ + psc_dma->slots |= s->ac97_slot_bits; + out_be32(&psc_dma->psc_regs->ac97_slots, psc_dma->slots); + break; + case SNDRV_PCM_TRIGGER_STOP: - if (substream->pstr->stream == SNDRV_PCM_STREAM_CAPTURE) - psc_dma->slots &= 0xFFFF0000; - else - psc_dma->slots &= 0x0000FFFF; + dev_dbg(psc_dma->dev, "AC97 STOP: stream=%i\n", + substream->pstr->stream);
+ /* Clear the slot enable bits */ + psc_dma->slots &= ~(s->ac97_slot_bits); out_be32(&psc_dma->psc_regs->ac97_slots, psc_dma->slots); break; }
ALSA playback seems to be more reliable if the .pointer() hook reports a value slightly into the period, rather than right on the period boundary. This patch adds a fudge factor of 1/4 the period size to better estimate the actual position of the DMA engine in the audio buffer.
Signed-off-by: Grant Likely grant.likely@secretlab.ca ---
sound/soc/fsl/mpc5200_dma.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/sound/soc/fsl/mpc5200_dma.c b/sound/soc/fsl/mpc5200_dma.c index 9c88e15..ee5a606 100644 --- a/sound/soc/fsl/mpc5200_dma.c +++ b/sound/soc/fsl/mpc5200_dma.c @@ -297,7 +297,7 @@ psc_dma_pointer(struct snd_pcm_substream *substream) else s = &psc_dma->playback;
- count = s->period_current * s->period_bytes; + count = (s->period_current * s->period_bytes) + (s->period_bytes >> 2);
return bytes_to_frames(substream->runtime, count); }
On Sat, Nov 07, 2009 at 01:34:55AM -0700, Grant Likely wrote:
ALSA playback seems to be more reliable if the .pointer() hook reports a value slightly into the period, rather than right on the period boundary. This patch adds a fudge factor of 1/4 the period size to better estimate the actual position of the DMA engine in the audio buffer.
It occurs to me that in terms of dealing with what's going on here this probably is achieving exactly the same effect as Jon's code in that it tells ALSA that things are a bit ahead of where the buffer started.
On Sat, Nov 7, 2009 at 11:11 AM, Mark Brown broonie@opensource.wolfsonmicro.com wrote:
On Sat, Nov 07, 2009 at 01:34:55AM -0700, Grant Likely wrote:
ALSA playback seems to be more reliable if the .pointer() hook reports a value slightly into the period, rather than right on the period boundary. This patch adds a fudge factor of 1/4 the period size to better estimate the actual position of the DMA engine in the audio buffer.
It occurs to me that in terms of dealing with what's going on here this probably is achieving exactly the same effect as Jon's code in that it tells ALSA that things are a bit ahead of where the buffer started.
Possibly, but I can both reproduce and eliminate the problem Jon is seeing regardless of whether or not this patch, so I'm not yet convinced.
g.
On Sat, Nov 07, 2009 at 11:19:54AM -0700, Grant Likely wrote:
On Sat, Nov 7, 2009 at 11:11 AM, Mark Brown
It occurs to me that in terms of dealing with what's going on here this probably is achieving exactly the same effect as Jon's code in that it tells ALSA that things are a bit ahead of where the buffer started.
Possibly, but I can both reproduce and eliminate the problem Jon is seeing regardless of whether or not this patch, so I'm not yet convinced.
That doesn't entirely surprise me; I'm not convinced that the original approach entirely deals with the issue rather than just making it much harder to see.
On Sat, Nov 7, 2009 at 12:33 PM, Mark Brown broonie@opensource.wolfsonmicro.com wrote:
On Sat, Nov 07, 2009 at 11:19:54AM -0700, Grant Likely wrote:
On Sat, Nov 7, 2009 at 11:11 AM, Mark Brown
It occurs to me that in terms of dealing with what's going on here this probably is achieving exactly the same effect as Jon's code in that it tells ALSA that things are a bit ahead of where the buffer started.
Possibly, but I can both reproduce and eliminate the problem Jon is seeing regardless of whether or not this patch, so I'm not yet convinced.
That doesn't entirely surprise me; I'm not convinced that the original approach entirely deals with the issue rather than just making it much harder to see.
Indeed. I'm at the point where I'm far more interested in achieving "correctness" than trying to hobble together a set of conditions that appears to work most of the time.
g.
On Sat, Nov 07, 2009 at 01:33:40AM -0700, Grant Likely wrote:
However, I was able to reproduce the noise problem when using aplay if I have DEBUG #defined at the top of the mpc5200_dma.c file with debug console logs being spit out the serial port. In that situation, the STOP trigger calls printk(), and a stale sample can be heard at the end of playback. However, I believe this is a bug with the serial console driver (in that it disables IRQs for a long time) causing unbounded latencies, so the trigger doesn't get processed fast enough.
Yes, that will always be a problem with free running DMA - if it's not shut down quickly enough then it'll just keep processing data. Serial ports don't tend to play well with instrumenting it, sadly.
So, please test. Let me know if these work or not. I've don't know if the last patch (Add fudge factor...) is needed or not.
I've applied patches 1-5 since they seem fairly clear code cleanups and fixes. If they've introduced any problems we can fix them incrementally. I'll wait to see what the outcome of testing is for patch 6.
As I mentioned on IRC testing with PulseAudio would be good for this - it makes more demands on the quality of the DMA implementation than most applications.
On Sat, Nov 7, 2009 at 5:57 AM, Mark Brown broonie@opensource.wolfsonmicro.com wrote:
On Sat, Nov 07, 2009 at 01:33:40AM -0700, Grant Likely wrote:
So, please test. Let me know if these work or not. I've don't know if the last patch (Add fudge factor...) is needed or not.
I've applied patches 1-5 since they seem fairly clear code cleanups and fixes. If they've introduced any problems we can fix them incrementally. I'll wait to see what the outcome of testing is for patch 6.
Cool, thanks!
As I mentioned on IRC testing with PulseAudio would be good for this - it makes more demands on the quality of the DMA implementation than most applications.
I had trouble getting pulseaudio to work on my headless system. Couldn't get clients to connect. I'll hack on it again next week.
g.
participants (4)
-
Grant Likely
-
Jon Smirl
-
Liam Girdwood
-
Mark Brown