Em Thu, 2 May 2024 09:59:56 +0100 Mauro Carvalho Chehab mchehab@kernel.org escreveu:
Em Thu, 02 May 2024 09:46:14 +0200 Takashi Iwai tiwai@suse.de escreveu:
On Wed, 01 May 2024 03:56:15 +0200, Mark Brown wrote:
On Tue, Apr 30, 2024 at 05:27:52PM +0100, Mauro Carvalho Chehab wrote:
Mark Brown broonie@kernel.org escreveu:
On Tue, Apr 30, 2024 at 10:21:12AM +0200, Sebastian Fricke wrote:
The discussion around this originally was that all the audio APIs are very much centered around real time operations rather than completely
The media subsystem is also centered around real time. Without real time, you can't have a decent video conference system. Having mem2mem transfers actually helps reduce real-time delays, as it avoids extra latency due to CPU congestion and/or data transfers from/to userspace.
Real time means strongly tied to wall clock times rather than fast - the issue was that all the ALSA APIs are based around pushing data through the system based on a clock.
That doesn't sound like an immediate solution to maintainer overload issues... if something like this is going to happen the DRM solution does seem more general but I'm not sure the amount of stop energy is proportionate.
I don't think maintainer overload is the issue here. The main point is to avoid a fork at the audio uAPI, plus the burden of re-inventing the wheel with new codes for audio formats, new documentation for them, etc.
I thought that discussion had been had already at one of the earlier versions? TBH I've not really been paying attention to this since the very early versions where I raised some similar "why is this in media" points and I thought everyone had decided that this did actually make sense.
Yeah, it was discussed in v1 and v2 threads, e.g. https://patchwork.kernel.org/project/linux-media/cover/1690265540-25999-1-gi...
My argument at that time was about how the operation would work, and the point was that it'd be a "batch"-like operation via M2M without any timing control. It'd be a very special usage for ALSA, and if anything, it'd be hwdep -- that is, a very hardware-specific API implementation -- or the compress-offload API, which looks dubious.
OTOH, the argument was that there is already a framework for M2M in the media API, and that it also fits the batch-like operation. That is how the thread evolved until now.
M2M transfers are not a hardware-specific API, and such transfers are not new either. Old media devices like bttv internally have a way to do PCI2PCI transfers, allowing media streams to be transferred directly without involving the CPU. The media driver supports it for video, as this made a huge difference in performance back then.
In the embedded world, this is a pretty common scenario: different media IP blocks can communicate with each other directly via memory. This can happen for video capture, video display and audio.
With M2M, most of the control is offloaded to the hardware.
There is still timing control associated with it, as audio and video need to be in sync. This is done by controlling the buffer sizes, and it can be fine-tuned by checking when each buffer transfer is done.
On the media side, M2M buffer transfers are started via VIDIOC_QBUF, which is a request to do a frame transfer. A similar ioctl (VIDIOC_DQBUF) is used to monitor when the hardware finishes transferring the buffer. In other words, the CPU is responsible for time control.
Just complementing: in media, we do this per video buffer (or per half video buffer). A typical use case on cameras is to have buffers transferred 30 times per second, if the video is streamed at 30 frames per second.
I would assume that, on an audio/video stream, the audio data transfers will also be programmed to happen at a regular interval.
So, if the video stream is programmed at a 30 frames per second rate, I would assume that the associated audio stream will also be programmed to be grouped into 30 data transfers per second. In such a scenario, if the audio is sampled at 48 kHz, it means that:
1) each M2M transfer commanded by the CPU will copy 1600 samples;
2) the time between samples will remain 1/48000 s;
3) a notification event telling that 1600 samples were transferred will be generated when the last sample happens;
4) the CPU will do time control by looking at the notification events.
In other words, this is still real time. The main difference from a "sync" transfer is that the CPU doesn't need to copy data from/to different devices, as that operation is offloaded to the hardware.
Regards, Mauro