[alsa-devel] Bug in alsa-lib causes high CPU with rate plugin

Takashi Iwai tiwai at suse.de
Fri Nov 9 17:57:05 CET 2012


At Thu, 8 Nov 2012 13:47:52 -0800,
Trent Piepho wrote:
> 
> There is a bug in the way alsa-lib sleeps when using a plugin that can
> cause very high CPU usage for PCM playback with using rate conversion.
> 
> I've attached a simple test program.  It simply uses blocking mode and
> calls snd_pcm_writei() with a buffer of zeros one thousand times and
> then exits.  One would think the CPU usage would be very low.  And
> indeed when the write sized passed to snd_pcm_writei is equal to the
> period size it is quite low (i5 with HDA):
> 
> lucid at Dashboard:~$ /usr/bin/time ./a.out > /dev/null
> period size 352, write size 352
> done, wrote 1000 buffers of 352 frames
> 0.10user 0.15system 0:09.01elapsed 2%CPU (0avgtext+0avgdata 9360maxresident)k
> 
> But, with one small change to have snd_pcm_writei() write buffers one
> frame larger than the period size, one gets this:
> 
> period size 352, write size 353
> done, wrote 1000 buffers of 353 frames
> 1.92user 2.00system 0:09.04elapsed 43%CPU (0avgtext+0avgdata 9360maxresident)k
> 
> Simply increasing the write size by one caused the CPU usage to jump
> to 43% from 2%!  Because more software writes in chunks equal to the
> period size it doesn't see this problem.  That's why it wasn't found
> and fixed long ago.
> 
> I've figured out why this is happening.  The problem is caused by how
> alsa-lib decides to sleep when waiting for available buffer space and
> the way the rate plugin can only process complete periods at a time.
> 
> Suppose we have the rate plugin with hw as its slave.  Both have a
> ring buffer that is 3+ periods in size.  Allowing for the rate
> conversion, the buffers and periods are the same size for both.
> avail_min is set to the normal value of one period for both.
> 
> Suppose there are two empty periods in the buffer and we call
> snd_pcm_writei() to write 1.5 periods.  That fits in the empty space
> in the rate plugin's buffer and so gets written immediately.  The rate
> plugin's buffer now has 0.5 empty periods of space.  The rate plugin
> will resample just one period of the newly written data, because it
> can only process complete periods at a time, and write that to the hw
> buffer.  So the hw buffer now has one free period.  This means the
> rate plugin has less free space in its buffer than the hw has, because
> the rate plugin has half a period of data it can't yet resample and
> pass to the slave.
> 
> Now we write another one period of data with another snd_pcm_writei()
> call.  This gets to snd_pcm_write_areas() in pcm.c and triggers this
> code (edited for brevity):
> 
>                 if ((state == SND_PCM_STATE_RUNNING &&
>                      (snd_pcm_uframes_t)avail < pcm->avail_min &&
>                      size > (snd_pcm_uframes_t)avail)) {
>                         err = snd_pcm_wait(pcm, -1);
>                         goto _again;
>                 }
> 
> avail is the available space in the rate plugin's buffer, which is
> half a period.  pcm->avail_min is also for the rate plugin.  size is
> the amount remaining in the writei() call, which is one period in this
> example.
> 
> As we can see, the conditions of the if() statement are true.  avail
> of 0.5 periods is less than avail_min of 1 period and the size of the
> write (1 period) is greater than avail.  So we call snd_pcm_wait() and
> then go back to the top when it returns.  avail will get updated to
> the current value and this code runs again.
> 
> Eventually snd_pcm_wait() will call poll() and try to sleep in the
> kernel while waiting for more room to be available.  But remember the
> hw buffer had one full period free!  That's more than tghe half period
> that's free in the rate buffer and, and this is the key here, it is
> *more than avail_min*.  So poll() will not sleep at all!  That's where
> the 43% CPU comes from.  alsa-lib is in a tight loop inside
> snd_pcm_write_areas() calling snd_pcm_wait() over and over again.
> 
> snd_pcm_wait() doesn't sleep, because it's based on the status of the
> hw buffer and that buffer does have free space.  snd_pcm_write_areas()
> would write to the hw buffer and not try to sleep.  But it's not
> looking at the hw buffer when it decides to sleep or write, it's
> looking at the rate buffer.  And the rate buffer's fill level says to
> sleep, not write.  And if snd_pcm_wait() was looking at the rate
> buffer, it would sleep.
> 
> We have the rate buffer that's current fill level says, "sleep first,
> then write" and the HW buffer that says the opposite, "write first,
> then sleep".  The code that writes looks at the former and decides to
> sleep but the code that sleeps looks at the latter and decides it's
> not time sleep.
> 
> A trivial patch that stops this behavior is below, which simply forces
> the rate plugin's avail_min to 1.
> 
> diff --git a/src/pcm/pcm_rate.c b/src/pcm/pcm_rate.c
> index 41089d7..86f57c4 100644
> --- a/src/pcm/pcm_rate.c
> +++ b/src/pcm/pcm_rate.c
> @@ -1142,6 +1142,7 @@ static int snd_pcm_rate_start(snd_pcm_t *pcm)
>                 return 0;
>         }
>         rate->start_pending = 0;
> +       pcm->avail_min = 1;
>         return snd_pcm_start(rate->gen.slave);
>  }
> 
> I think it would be better to have the rate plugin decrease avail_min
> by how much unresampled data it has yet to pass to the slave.  This
> preserves the idea of avail_min if it were set to more than one
> period.  This also fixes the problem.  I've attached a real patch to
> do that.

Yeah, this is a known longtime PITA in the rate plugin.

Tweaking avail_min is also an option, but I'm afraid it's too
hackish.

The primary problem is the discrepancy of the state "I have to wait
until period size becomes available" between the rate plugin and the
underlying PCM.  Since the poll is performed only on the slave PCM,
checking the rate plugin's avail is bogus in such a state.  In other
words, if the rate plugin puts the decision for the sleep simply
depending on the slave state, there will be no inconsistency between
them.

Below is a quick fix.  Does it work for you?


thanks,

Takashi

---
diff --git a/src/pcm/pcm.c b/src/pcm/pcm.c
index 5880057..33e39c6 100644
--- a/src/pcm/pcm.c
+++ b/src/pcm/pcm.c
@@ -2339,6 +2339,15 @@ int snd_pcm_open_named_slave(snd_pcm_t **pcmp, const char *name,
 }
 #endif
 
+/* return true if the PCM stream needs for waiting */
+static int check_for_wait(snd_pcm_t *pcm, snd_pcm_uframes_t avail)
+{
+	if (pcm->fast_ops->check_for_wait)
+		return pcm->fast_ops->check_for_wait(pcm, avail);
+	else
+		return avail < pcm->avail_min;
+}
+
 /**
  * \brief Wait for a PCM to become ready
  * \param pcm PCM handle
@@ -2352,7 +2361,7 @@ int snd_pcm_open_named_slave(snd_pcm_t **pcmp, const char *name,
  */
 int snd_pcm_wait(snd_pcm_t *pcm, int timeout)
 {
-	if (snd_pcm_mmap_avail(pcm) >= pcm->avail_min) {
+	if (check_for_wait(pcm, snd_pcm_mmap_avail(pcm))) {
 		/* check more precisely */
 		switch (snd_pcm_state(pcm)) {
 		case SND_PCM_STATE_XRUN:
@@ -6771,14 +6780,14 @@ snd_pcm_sframes_t snd_pcm_write_areas(snd_pcm_t *pcm, const snd_pcm_channel_area
 			goto _end;
 		}
 		if ((state == SND_PCM_STATE_RUNNING &&
-		     (snd_pcm_uframes_t)avail < pcm->avail_min &&
-		     size > (snd_pcm_uframes_t)avail)) {
+		     size > (snd_pcm_uframes_t)avail &&
+		     check_for_wait(pcm, (snd_pcm_uframes_t)avail))) {
 			if (pcm->mode & SND_PCM_NONBLOCK) {
 				err = -EAGAIN;
 				goto _end;
 			}
 
-			err = snd_pcm_wait(pcm, -1);
+			err = snd_pcm_wait_nocheck(pcm, -1);
 			if (err < 0)
 				break;
 			goto _again;			
diff --git a/src/pcm/pcm_local.h b/src/pcm/pcm_local.h
index e7798fd..1ce6a64 100644
--- a/src/pcm/pcm_local.h
+++ b/src/pcm/pcm_local.h
@@ -177,6 +177,7 @@ typedef struct {
 	int (*poll_descriptors_count)(snd_pcm_t *pcm);
 	int (*poll_descriptors)(snd_pcm_t *pcm, struct pollfd *pfds, unsigned int space);
 	int (*poll_revents)(snd_pcm_t *pcm, struct pollfd *pfds, unsigned int nfds, unsigned short *revents);
+	int (*check_for_wait)(snd_pcm_t *pcm, snd_pcm_uframes_t avail);
 } snd_pcm_fast_ops_t;
 
 struct _snd_pcm {
diff --git a/src/pcm/pcm_rate.c b/src/pcm/pcm_rate.c
index 4ba8521..c0a28aa 100644
--- a/src/pcm/pcm_rate.c
+++ b/src/pcm/pcm_rate.c
@@ -1067,6 +1067,17 @@ static int snd_pcm_rate_poll_revents(snd_pcm_t *pcm, struct pollfd *pfds, unsign
 	return snd_pcm_poll_descriptors_revents(rate->gen.slave, pfds, nfds, revents);
 }
 
+/* we rely on the available space in the slave instead of the rate PCM itself,
+ * since there might be pending samples in the rate plugin buffer although
+ * the slave has an empty period.
+ */
+static int snd_pcm_rate_check_for_wait(snd_pcm_t *pcm, snd_pcm_uframes_t avail)
+{
+	snd_pcm_rate_t *rate = pcm->private_data;
+	snd_pcm_t *slave = rate->gen.slave;
+	return snd_pcm_avail_update(slave) < slave->avail_min;
+}
+
 static int snd_pcm_rate_drain(snd_pcm_t *pcm)
 {
 	snd_pcm_rate_t *rate = pcm->private_data;
@@ -1234,6 +1245,7 @@ static const snd_pcm_fast_ops_t snd_pcm_rate_fast_ops = {
 	.poll_descriptors_count = snd_pcm_generic_poll_descriptors_count,
 	.poll_descriptors = snd_pcm_generic_poll_descriptors,
 	.poll_revents = snd_pcm_rate_poll_revents,
+	.check_for_wait = snd_pcm_rate_check_for_wait,
 };
 
 static const snd_pcm_ops_t snd_pcm_rate_ops = {


More information about the Alsa-devel mailing list