[alsa-devel] [PATCH 1/2] ALSA: firewire: process packets in 'struct snd_pcm_ops.ack' callback

Takashi Sakamoto o-takashi at sakamocchi.jp
Tue Jun 13 00:49:37 CEST 2017


Hi,

On Jun 8 2017 06:20, Takashi Iwai wrote:
> From: Takashi Iwai <tiwai at suse.de>
> Subject: [PATCH] ALSA: pcm: Suppress status/control mmap when ack ops is
>   present
> 
> The drivers using PCM ack ops require the notification whenever
> appl_ptr is updated in general.  But when the PCM status/control page
> is mmapped, this notification doesn't happen, per design, thus it's
> not guaranteed to receive the fine-grained updates.
> 
> For improving the situation, this patch simply suppresses the PCM
> status/control mmap when ack ops is defined.  At least, for all
> existing drivers with ack, this should give more benefit.
> 
> Once when we really need the full optimization with status/control
> mmap even using ack ops, we may reconsider the check, e.g. introducing
> a new flag.  But, so far, this should be good enough.
> 
> Signed-off-by: Takashi Iwai <tiwai at suse.de>
> ---
>   sound/core/pcm_native.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
> index 2bde07a4a87f..b993b0420411 100644
> --- a/sound/core/pcm_native.c
> +++ b/sound/core/pcm_native.c
> @@ -3189,6 +3189,10 @@ static int snd_pcm_mmap_status(struct snd_pcm_substream *substream, struct file
>   			       struct vm_area_struct *area)
>   {
>   	long size;
> +
> +	/* suppress status/control mmap when driver requires ack */
> +	if (substream->ops->ack)
> +		return -ENXIO;
>   	if (!(area->vm_flags & VM_READ))
>   		return -EINVAL;
>   	size = area->vm_end - area->vm_start;
> @@ -3225,6 +3229,10 @@ static int snd_pcm_mmap_control(struct snd_pcm_substream *substream, struct file
>   				struct vm_area_struct *area)
>   {
>   	long size;
> +
> +	/* suppress status/control mmap when driver requires ack */
> +	if (substream->ops->ack)
> +		return -ENXIO;
>   	if (!(area->vm_flags & VM_READ))
>   		return -EINVAL;
>   	size = area->vm_end - area->vm_start;

Let me evaluate this patch in a point of overhead at kernel/userspace 
interaction.

When considering about shape of ALSA PCM applications, I can show it by
below simple pseudo code:

```
while:
   poll(2)
   calculate how many PCM frames are available.
   memcpy(2)
```

To satisfy requirements of some drivers, we need to find a way to take
userspace applications to commit the number of handled PCM frames to
kernel space after the memcpy(2). For this purpose, in ALSA PCM
interface, SNDRV_PCM_IOCTL_SYNC_PTR is available.

```
while:
   poll(2)
   calculate how many PCM frames are available.
   memcpy(2)
   ioctl(2) with SNDRV_PCM_IOCTL_SYNC_PTR
```

Well, as an actual solution, this patch disallows userspace applications
to map page frame for the status/control data of PCM substream, because
alsa-lib has fallback mode for this case to utilize
SNDRV_PCM_IOCTL_SYNC_PTR. This is a similar situation to ARM platform.
As I described in a commit fccf53881e9b ("ALSA: pcm: add 'applptr' event
of tracepoint"), ALSA PCM core doesn't perform page frame mapping for
the status/control data[1].

In the commit message, I compared tracing data of tracepoints and log of
strace. In the case, the number of calls for ioctl(2) with
SNDRV_PCM_IOCTL_SYNC_PTR in a loop with a call of poll(2) is seven. One
of the sevel calls is actually used to commit the number of PCM frames.
There's 6 extra system calls. These 6 calls are just used to check
current status of runtime of PCM substream, and on cache coherent
architecture, they can be suppressed by the page frame mapping. The poll
loop repeats during runtime, thus it's important to analyze overhead of
the extra system calls.

I suggest 3 points to evaluate this patch.

1. CPU time for the system call
2. disabling kernel preemption and local IRQ
3. backward compatibility

Here, I describe my opinion to this patch for each of these points.

1. CPU time for the system call
System calls cost expensive than usual API calls. As long as
investigated on x86_64 architecture on perf-stat, additional 6 call of
ioctl(2) with SNDRV_PCM_IOCTL_SYNC_PTR increases 5,000-15,000
instructions in program execution. Recent hardware is enough fast. It
might not be so large disadvantage.

However, when drivers implement 'struct snd_pcm_ops.ack' callback, I
have different opinion; it's too frequent. In either drivers with
indirect layer and packet-oriented drivers, the callback is used to
process PCM frames in intermediate buffer for userspace. The frequent
call of the callback is not enough effective because the applications is
programmed with a manner to be accurate for actual time somehow, thus
they don't process more PCM frames in one poll loop.

2. disabling kernel preemption and local IRQ
In kernel land, service routine for SNDRV_PCM_IOCTL_SYNC_PTR partly runs
with acquired PCM stream lock, which disables kernel preemption and
local IRQ usually for running CPU. This lock is required because the PCM
stream is handled in any handler for hardware events, but this situation
can reduce realtime performance because no process or service routine for
interrupt can't use the CPU. If we have the other options, it's worth to
reconsider.

Intel SST-based drivers use nonatomic flag for the lock. In this case,
kernel preemption is enabled but IRQ is disabled. I think Intel
developers work is for their embedded solution. The care of realtime
performance might be good for their work, too.

3. backward compatibility
For backward compatibility, this patch is a simple solution. Even if
patched kernel runs with old version of relevant stuffs, tinyalsa or
alsa-lib, things work well as expected.

However, without this patch, things still work well. In either drivers
with indirect layer or packet-oriented drivers, queued PCM frames are
transferred according to any hardware events. For such drivers, commit
notification from applications just expands their probability to
improve performance; e.g. latency between queueing and transmission.

I think it better to back to your initial direction; adding info flag,
let stuffs in userspace parse it, perform the commit for the number of
handled PCM frames in 'snd_pcm_hw_mmap_commit()'.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git/commit/?id=fccf53881e9b564321326f62ed5f85130fa6e569


Regards

Takashi Sakamoto


More information about the Alsa-devel mailing list