[alsa-devel] [PATCH 0/2][RFC] ALSA: firewire-lib: change packet processing timing in process context
In IEC 61883-1/6 engine in ALSA firewire stack, a work to process packet is done in three places: 1. bottom-half of handler for interrupts scheduled on 1394OHCI controller 2. a callback of struct 'snd_pcm_ops.pointer()' 3. a callback of struct 'snd_pcm_ops.ack()'
This patchset adds optimization for 3., and remove 2. for efficiency.
In Linux kernel v2.6.39, 1. and 2. was committed in initial implementation of this engine at a commit 31ef9134eb52('ALSA: add LaCie FireWire Speakers/Griffin FireWave Surround driver'). This engine schedules hardware interrupts 16 packets (=INTERRUPT_INTERVAL) thus 1. appears in software IRQ context every 2 msec (= 16packets * 1sec / 8,000cycle).
In Linux kernel v3.5, an idea of 2. was introduced at a commit e9148dddc3c7('ALSA: firewire-lib: flush completed packets when reading PCM position') to reduce timing gap between queueing/dequeueing PCM frames and scheduling packet transmission. Additionally, it has an effect to reduce load of the software IRQ context. It's done in a call of ioctl(2) with SNDRV_PCM_IOCTL_HWSYNC. As a result, in mmap programming scenario, a call of snd_pcm_hwsync() can process packets for recent isochronous cycle.
while (1): snd_pcm_hwsync() ->ioctl(HWSYNC) ->struct snd_pcm_ops.pointer() = amdtp_stream_pcm_pointer() ->fw_iso_context_flush_completions() snd_pcm_mmap_begin() (queue/dequeue PCM frames here.) snd_pcm_mmap_commit()
In Linux kernel v4.13, ALSA PCM core was improved in an aspect of updating appl_ptr at a commit 9027c4639ef1('ALSA: pcm: Call ack() whenever appl_ptr is updated'). This commit ensures callback of 'struct snd_pcm_ops.ack()' when updating appl_ptr. This callback can be done in below ioctl(2) requests: - SNDRV_PCM_IOCTL_REWIND - SNDRV_PCM_IOCTL_FORWARD - SNDRV_PCM_IOCTL_SYNC_PTR - SNDRV_PCM_IOCTL_READ[I|N]_FRAMES - SNDRV_PCM_IOCTL_WRITE[I|N]_FRAMES
As I notes in a commit 875becf8412c('ALSA: firewire: process packets in 'struct snd_pcm_ops.ack' callback'), 3. works less efficiently because in usual PCM operation the above ioctl(2) requests are not used in event loop.
At the same time of the appl_ptr improvement, a new flag of PCM hardware definition was added at a commit 42f945970af9('ALSA: pcm: Add the explicit appl_ptr sync support'), and relevant changes were added to ALSA PCM core and alsa-lib v1.1.5. This flag expects userspace to call ioctl(2) with SNDRV_PCM_IOCTL_SYNC_PTR request per iteration of PCM frame queueing/dequeueing in mmap programming scenario below:
while (1): snd_pcm_hwsync() ->ioctl(HWSYNC) ->struct snd_pcm_ops.pointer() = amdtp_stream_pcm_pointer() ->fw_iso_context_flush_completions() snd_pcm_mmap_begin() (queue/dequeue PCM frames here.) snd_pcm_mmap_commit() ->ioctl(SYNC_PTR)
This patchset aims to support the new flag, SNDRV_PCM_INFO_SYNC_APPLPTR, to tell userspace to call ioctl(2) with SNDRV_PCM_IOCTL_SYNC_PTR after queueing/dequeueing PCM frames in each iteration of event dispatcher. As a result:
while (1): snd_pcm_hwsync() ->ioctl(HWSYNC) ->struct snd_pcm_ops.pointer() = amdtp_stream_pcm_pointer() ->fw_iso_context_flush_completions() snd_pcm_mmap_begin() (queue/dequeue PCM frames here.) snd_pcm_mmap_commit() ->ioctl(SYNC_PTR) ->struct snd_pcm_ops.ack() = amdtp_stream_pcm_ack() ->fw_iso_context_flush_completions()
This patchset also adds optimization for timing to process packets in process context. As the above call graph shows, support of the flag brings closest two calls to process packet against intervals of isochronous cycle and it is not enough efficient. This patchset also cancel packet processing in .pointer() callback. As a result:
while (1): snd_pcm_hwsync() snd_pcm_mmap_begin() (queue/dequeue PCM frames here.) snd_pcm_mmap_commit() ->ioctl(SYNC_PTR) ->struct snd_pcm_ops.ack() = amdtp_stream_pcm_ack() ->fw_iso_context_flush_completions()
Takashi Sakamoto (2): ALSA: firewire-lib: adopt SYNC_APPLPTR flag to flush packets after queueing/dequeueing PCM frames ALSA: firewire-lib: cancel processing packets in struct snd_pcm_ops.pointer() call
sound/firewire/amdtp-stream.c | 34 ++-------------------------------- sound/firewire/amdtp-stream.h | 12 +++++++++++- 2 files changed, 13 insertions(+), 33 deletions(-)
In Linux kernel v4.13, ALSA PCM core was improved in an aspect of updating appl_ptr at a commit 9027c4639ef1('ALSA: pcm: Call ack() whenever appl_ptr is updated'). This commit ensures a callback of 'struct snd_pcm_ops.ack()' when updating appl_ptr in process context. This callback can be done in below ioctl(2) requests: - SNDRV_PCM_IOCTL_REWIND - SNDRV_PCM_IOCTL_FORWARD - SNDRV_PCM_IOCTL_SYNC_PTR - SNDRV_PCM_IOCTL_READ[I|N]_FRAMES - SNDRV_PCM_IOCTL_WRITE[I|N]_FRAMES
In a commit 875becf8412c('ALSA: firewire: process packets in 'struct snd_pcm_ops.ack' callback'), I committed implementation of .ack() callback to reduce timing gap between queueing/dequeueing PCM frames and processing packets. However, in usual mmap programming scenario of ALSA PCM applications, the above operations are not always used in event dispatcher and the commit is not necessarily effective to this engine.
At the same time of the appl_ptr improvement, a new flag of PCM hardware definition was introduced at a commit 42f945970af9('ALSA: pcm: Add the explicit appl_ptr sync support'), and relevant changes were added to ALSA PCM core and alsa-lib v1.1.5. This flag expects userspace to call ioctl(2) with SNDRV_PCM_IOCTL_SYNC_PTR request per iteration of PCM frame handling in mmap programming scenario below:
while (1): ->poll() ->snd_pcm_hwsync() ->ioctl(HWSYNC) ->snd_pcm_mmap_begin() (queue/dequeue PCM frames here) ->snd_pcm_mmap_commit() ->ioctl(SYNC_PTR)
Although adoption of this flag to this engine increases the number of system calls in an iteration, it's useful in an aspect of packet processing.
This commit adopts the flag. As a result, this engine has two chances to process packets in process context at one iteration of the above calls.
while (1): ->poll() ->snd_pcm_hwsync() ->ioctl(HWSYNC) ->struct snd_pcm_ops.pointer() = amdtp_stream_pcm_pointer() ->fw_iso_context_flush_completions() ->snd_pcm_mmap_begin() (queue/dequeue PCM frames here) ->snd_pcm_mmap_commit() ->ioctl(SYNC_PTR) ->struct snd_pcm_ops.ack() = amdtp_stream_pcm_ack() ->fw_iso_context_flush_completions()
Signed-off-by: Takashi Sakamoto o-takashi@sakamocchi.jp --- sound/firewire/amdtp-stream.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/sound/firewire/amdtp-stream.c b/sound/firewire/amdtp-stream.c index cb9acfe60f6a..3520dc2bec61 100644 --- a/sound/firewire/amdtp-stream.c +++ b/sound/firewire/amdtp-stream.c @@ -156,7 +156,8 @@ int amdtp_stream_add_pcm_hw_constraints(struct amdtp_stream *s, SNDRV_PCM_INFO_INTERLEAVED | SNDRV_PCM_INFO_JOINT_DUPLEX | SNDRV_PCM_INFO_MMAP | - SNDRV_PCM_INFO_MMAP_VALID; + SNDRV_PCM_INFO_MMAP_VALID | + SNDRV_PCM_INFO_SYNC_APPLPTR;
/* SNDRV_PCM_INFO_BATCH */ hw->periods_min = 2;
In current implementation of IEC 61883-1/6 engine in ALSA firewire stack, there's two chances to handle isochronous packets in process contexts. One is a callback of struct snd_pcm_ops.pointer() and another is the one of .ack(). When alsa-lib applications queue/dequeue PCM frames according to mmap scenario, the isochronous packets are handled in below call graph.
while () ->poll() ->snd_pcm_hwsync() ->ioctl(HWSYNC) ->struct snd_pcm_ops.pointer() = amdtp_stream_pcm_pointer() ->fw_iso_context_flush_completions() ->snd_pcm_mmap_begin() (queue/dequeue PCM frames here.) ->snd_pcm_mmap_commit() ->ioctl(SYNC_PTR) ->struct snd_pcm_ops.ack() = amdtp_stream_pcm_ack() ->fw_iso_context_flush_completions()
In an above iteration, two calls of fw_iso_context_flush_completions() are executed with quite small interval, thus few packets can be handled in a call of .ack(). This is inefficient.
Although .ack() ensures to be called in process contexts, .pointer() is called in interrupt contexts as well as process context. Especially, this engine is designed to call snd_pcm_period_elapsed() in software IRQ context of tasklet, scheduled in both of process context and interrupt context of 1394 OHCI IR/IT context. To reduce closest two call of .pointer(), a condition to judge running context is used in .pointer. An idea to flush packets in a callback of .pointer() requires the above extra care.
On the other hand, a callback of .ack() has no care like that, except for being called within acquired lock. Therefore it's reasonable to cancel .pointer() side.
This commit cancels flushing packets in a callback of .pointer(). The callback of .pointer just returns the number of PCM frames in intermediate PCM buffer or SNDRV_PCM_POS_XRUN. As a result:
while () ->poll() ->snd_pcm_hwsync() ->ioctl(HWSYNC) ->struct snd_pcm_ops.pointer() = amdtp_stream_pcm_pointer() ->snd_pcm_mmap_begin() (queue/dequeue PCM frames here.) ->snd_pcm_mmap_commit() ->ioctl(SYNC_PTR) ->struct snd_pcm_ops.ack() = amdtp_stream_pcm_ack() ->fw_iso_context_flush_completions()
Signed-off-by: Takashi Sakamoto o-takashi@sakamocchi.jp --- sound/firewire/amdtp-stream.c | 31 ------------------------------- sound/firewire/amdtp-stream.h | 12 +++++++++++- 2 files changed, 11 insertions(+), 32 deletions(-)
diff --git a/sound/firewire/amdtp-stream.c b/sound/firewire/amdtp-stream.c index 3520dc2bec61..48f17264fc35 100644 --- a/sound/firewire/amdtp-stream.c +++ b/sound/firewire/amdtp-stream.c @@ -922,37 +922,6 @@ int amdtp_stream_start(struct amdtp_stream *s, int channel, int speed) } EXPORT_SYMBOL(amdtp_stream_start);
-/** - * amdtp_stream_pcm_pointer - get the PCM buffer position - * @s: the AMDTP stream that transports the PCM data - * - * Returns the current buffer position, in frames. - */ -unsigned long amdtp_stream_pcm_pointer(struct amdtp_stream *s) -{ - /* - * This function is called in software IRQ context of period_tasklet or - * process context. - * - * When the software IRQ context was scheduled by software IRQ context - * of IR/IT contexts, queued packets were already handled. Therefore, - * no need to flush the queue in buffer anymore. - * - * When the process context reach here, some packets will be already - * queued in the buffer. These packets should be handled immediately - * to keep better granularity of PCM pointer. - * - * Later, the process context will sometimes schedules software IRQ - * context of the period_tasklet. Then, no need to flush the queue by - * the same reason as described for IR/IT contexts. - */ - if (!in_interrupt() && amdtp_stream_running(s)) - fw_iso_context_flush_completions(s->context); - - return READ_ONCE(s->pcm_buffer_pointer); -} -EXPORT_SYMBOL(amdtp_stream_pcm_pointer); - /** * amdtp_stream_pcm_ack - acknowledge queued PCM frames * @s: the AMDTP stream that transfers the PCM frames diff --git a/sound/firewire/amdtp-stream.h b/sound/firewire/amdtp-stream.h index e45de3eecfe3..8a96419c462b 100644 --- a/sound/firewire/amdtp-stream.h +++ b/sound/firewire/amdtp-stream.h @@ -168,7 +168,6 @@ int amdtp_stream_add_pcm_hw_constraints(struct amdtp_stream *s, struct snd_pcm_runtime *runtime);
void amdtp_stream_pcm_prepare(struct amdtp_stream *s); -unsigned long amdtp_stream_pcm_pointer(struct amdtp_stream *s); int amdtp_stream_pcm_ack(struct amdtp_stream *s); void amdtp_stream_pcm_abort(struct amdtp_stream *s);
@@ -229,6 +228,17 @@ static inline bool cip_sfc_is_base_44100(enum cip_sfc sfc) return sfc & 1; }
+/** + * amdtp_stream_pcm_pointer - get the PCM buffer position + * @s: the AMDTP stream that transports the PCM data + * + * Returns the current buffer position, in frames. + */ +static inline unsigned long amdtp_stream_pcm_pointer(struct amdtp_stream *s) +{ + return READ_ONCE(s->pcm_buffer_pointer); +} + /** * amdtp_stream_wait_callback - sleep till callbacked or timeout * @s: the AMDTP stream
On Wed, 12 Sep 2018 17:04:48 +0200, Takashi Sakamoto wrote:
In IEC 61883-1/6 engine in ALSA firewire stack, a work to process packet is done in three places:
- bottom-half of handler for interrupts scheduled on 1394OHCI controller
- a callback of struct 'snd_pcm_ops.pointer()'
- a callback of struct 'snd_pcm_ops.ack()'
This patchset adds optimization for 3., and remove 2. for efficiency.
In Linux kernel v2.6.39, 1. and 2. was committed in initial implementation of this engine at a commit 31ef9134eb52('ALSA: add LaCie FireWire Speakers/Griffin FireWave Surround driver'). This engine schedules hardware interrupts 16 packets (=INTERRUPT_INTERVAL) thus 1. appears in software IRQ context every 2 msec (= 16packets * 1sec / 8,000cycle).
In Linux kernel v3.5, an idea of 2. was introduced at a commit e9148dddc3c7('ALSA: firewire-lib: flush completed packets when reading PCM position') to reduce timing gap between queueing/dequeueing PCM frames and scheduling packet transmission. Additionally, it has an effect to reduce load of the software IRQ context. It's done in a call of ioctl(2) with SNDRV_PCM_IOCTL_HWSYNC. As a result, in mmap programming scenario, a call of snd_pcm_hwsync() can process packets for recent isochronous cycle.
while (1): snd_pcm_hwsync() ->ioctl(HWSYNC) ->struct snd_pcm_ops.pointer() = amdtp_stream_pcm_pointer() ->fw_iso_context_flush_completions() snd_pcm_mmap_begin() (queue/dequeue PCM frames here.) snd_pcm_mmap_commit()
In Linux kernel v4.13, ALSA PCM core was improved in an aspect of updating appl_ptr at a commit 9027c4639ef1('ALSA: pcm: Call ack() whenever appl_ptr is updated'). This commit ensures callback of 'struct snd_pcm_ops.ack()' when updating appl_ptr. This callback can be done in below ioctl(2) requests:
- SNDRV_PCM_IOCTL_REWIND
- SNDRV_PCM_IOCTL_FORWARD
- SNDRV_PCM_IOCTL_SYNC_PTR
- SNDRV_PCM_IOCTL_READ[I|N]_FRAMES
- SNDRV_PCM_IOCTL_WRITE[I|N]_FRAMES
As I notes in a commit 875becf8412c('ALSA: firewire: process packets in 'struct snd_pcm_ops.ack' callback'), 3. works less efficiently because in usual PCM operation the above ioctl(2) requests are not used in event loop.
At the same time of the appl_ptr improvement, a new flag of PCM hardware definition was added at a commit 42f945970af9('ALSA: pcm: Add the explicit appl_ptr sync support'), and relevant changes were added to ALSA PCM core and alsa-lib v1.1.5. This flag expects userspace to call ioctl(2) with SNDRV_PCM_IOCTL_SYNC_PTR request per iteration of PCM frame queueing/dequeueing in mmap programming scenario below:
while (1): snd_pcm_hwsync() ->ioctl(HWSYNC) ->struct snd_pcm_ops.pointer() = amdtp_stream_pcm_pointer() ->fw_iso_context_flush_completions() snd_pcm_mmap_begin() (queue/dequeue PCM frames here.) snd_pcm_mmap_commit() ->ioctl(SYNC_PTR)
This patchset aims to support the new flag, SNDRV_PCM_INFO_SYNC_APPLPTR, to tell userspace to call ioctl(2) with SNDRV_PCM_IOCTL_SYNC_PTR after queueing/dequeueing PCM frames in each iteration of event dispatcher. As a result:
while (1): snd_pcm_hwsync() ->ioctl(HWSYNC) ->struct snd_pcm_ops.pointer() = amdtp_stream_pcm_pointer() ->fw_iso_context_flush_completions() snd_pcm_mmap_begin() (queue/dequeue PCM frames here.) snd_pcm_mmap_commit() ->ioctl(SYNC_PTR) ->struct snd_pcm_ops.ack() = amdtp_stream_pcm_ack() ->fw_iso_context_flush_completions()
This patchset also adds optimization for timing to process packets in process context. As the above call graph shows, support of the flag brings closest two calls to process packet against intervals of isochronous cycle and it is not enough efficient. This patchset also cancel packet processing in .pointer() callback. As a result:
while (1): snd_pcm_hwsync() snd_pcm_mmap_begin() (queue/dequeue PCM frames here.) snd_pcm_mmap_commit() ->ioctl(SYNC_PTR) ->struct snd_pcm_ops.ack() = amdtp_stream_pcm_ack() ->fw_iso_context_flush_completions()
Takashi Sakamoto (2): ALSA: firewire-lib: adopt SYNC_APPLPTR flag to flush packets after queueing/dequeueing PCM frames ALSA: firewire-lib: cancel processing packets in struct snd_pcm_ops.pointer() call
Both look like a good improvement. Looking forward to seeing the official patches once after tests by others pass.
thanks,
Takashi
Hi,
On Sep 13 2018 06:21, Takashi Iwai wrote:
On Wed, 12 Sep 2018 17:04:48 +0200, Takashi Sakamoto wrote:
In IEC 61883-1/6 engine in ALSA firewire stack, a work to process packet is done in three places:
- bottom-half of handler for interrupts scheduled on 1394OHCI controller
- a callback of struct 'snd_pcm_ops.pointer()'
- a callback of struct 'snd_pcm_ops.ack()'
This patchset adds optimization for 3., and remove 2. for efficiency.
In Linux kernel v2.6.39, 1. and 2. was committed in initial implementation of this engine at a commit 31ef9134eb52('ALSA: add LaCie FireWire Speakers/Griffin FireWave Surround driver'). This engine schedules hardware interrupts 16 packets (=INTERRUPT_INTERVAL) thus 1. appears in software IRQ context every 2 msec (= 16packets * 1sec / 8,000cycle).
In Linux kernel v3.5, an idea of 2. was introduced at a commit e9148dddc3c7('ALSA: firewire-lib: flush completed packets when reading PCM position') to reduce timing gap between queueing/dequeueing PCM frames and scheduling packet transmission. Additionally, it has an effect to reduce load of the software IRQ context. It's done in a call of ioctl(2) with SNDRV_PCM_IOCTL_HWSYNC. As a result, in mmap programming scenario, a call of snd_pcm_hwsync() can process packets for recent isochronous cycle.
while (1): snd_pcm_hwsync() ->ioctl(HWSYNC) ->struct snd_pcm_ops.pointer() = amdtp_stream_pcm_pointer() ->fw_iso_context_flush_completions() snd_pcm_mmap_begin() (queue/dequeue PCM frames here.) snd_pcm_mmap_commit()
In Linux kernel v4.13, ALSA PCM core was improved in an aspect of updating appl_ptr at a commit 9027c4639ef1('ALSA: pcm: Call ack() whenever appl_ptr is updated'). This commit ensures callback of 'struct snd_pcm_ops.ack()' when updating appl_ptr. This callback can be done in below ioctl(2) requests:
- SNDRV_PCM_IOCTL_REWIND
- SNDRV_PCM_IOCTL_FORWARD
- SNDRV_PCM_IOCTL_SYNC_PTR
- SNDRV_PCM_IOCTL_READ[I|N]_FRAMES
- SNDRV_PCM_IOCTL_WRITE[I|N]_FRAMES
As I notes in a commit 875becf8412c('ALSA: firewire: process packets in 'struct snd_pcm_ops.ack' callback'), 3. works less efficiently because in usual PCM operation the above ioctl(2) requests are not used in event loop.
At the same time of the appl_ptr improvement, a new flag of PCM hardware definition was added at a commit 42f945970af9('ALSA: pcm: Add the explicit appl_ptr sync support'), and relevant changes were added to ALSA PCM core and alsa-lib v1.1.5. This flag expects userspace to call ioctl(2) with SNDRV_PCM_IOCTL_SYNC_PTR request per iteration of PCM frame queueing/dequeueing in mmap programming scenario below:
while (1): snd_pcm_hwsync() ->ioctl(HWSYNC) ->struct snd_pcm_ops.pointer() = amdtp_stream_pcm_pointer() ->fw_iso_context_flush_completions() snd_pcm_mmap_begin() (queue/dequeue PCM frames here.) snd_pcm_mmap_commit() ->ioctl(SYNC_PTR)
This patchset aims to support the new flag, SNDRV_PCM_INFO_SYNC_APPLPTR, to tell userspace to call ioctl(2) with SNDRV_PCM_IOCTL_SYNC_PTR after queueing/dequeueing PCM frames in each iteration of event dispatcher. As a result:
while (1): snd_pcm_hwsync() ->ioctl(HWSYNC) ->struct snd_pcm_ops.pointer() = amdtp_stream_pcm_pointer() ->fw_iso_context_flush_completions() snd_pcm_mmap_begin() (queue/dequeue PCM frames here.) snd_pcm_mmap_commit() ->ioctl(SYNC_PTR) ->struct snd_pcm_ops.ack() = amdtp_stream_pcm_ack() ->fw_iso_context_flush_completions()
This patchset also adds optimization for timing to process packets in process context. As the above call graph shows, support of the flag brings closest two calls to process packet against intervals of isochronous cycle and it is not enough efficient. This patchset also cancel packet processing in .pointer() callback. As a result:
while (1): snd_pcm_hwsync() snd_pcm_mmap_begin() (queue/dequeue PCM frames here.) snd_pcm_mmap_commit() ->ioctl(SYNC_PTR) ->struct snd_pcm_ops.ack() = amdtp_stream_pcm_ack() ->fw_iso_context_flush_completions()
Takashi Sakamoto (2): ALSA: firewire-lib: adopt SYNC_APPLPTR flag to flush packets after queueing/dequeueing PCM frames ALSA: firewire-lib: cancel processing packets in struct snd_pcm_ops.pointer() call
Both look like a good improvement. Looking forward to seeing the official patches once after tests by others pass.
Thanks for your positive feedback.
To ease the others to test this patchset, I prepare for a remote branch in github.com. https://github.com/takaswie/snd-firewire-improve/tree/topic/applptr
From the branch, whole codes relevant to ALSA firewire stack can be backported to Linux kernel v4.17 or later.
This patchset allows userspace applications to queue isochronous packet with PCM frames without waiting for software IRQ of 1394 OHCI controller. In this development cycle I have a plan to allow applications to have a smaller PCM intermediate buffer than current one (=period-time greater than 5msec). Users look it as 'improvement in a point of latency', if things go well.
Thanks
Takashi Sakamoto
participants (2)
-
Takashi Iwai
-
Takashi Sakamoto