Re: [alsa-devel] [PATCH 1/2] ALSA: firewire: process packets in 'struct snd_pcm_ops.ack' callback

18 Jun 2017

Hi,
On Jun 17 2017 00:45, Takashi Iwai wrote:
...
On Fri, 16 Jun 2017 17:00:13 +0200,
Takashi Sakamoto wrote:
...
On Jun 16 2017 04:06, Takashi Iwai wrote:
...
...
Some devices/drivers request applications to perform this, due to their
design of hardware for data transmission.
Yes, and it's the default behavior of all non-x86 platforms, too.
Here, you have a mixture of two items: architectural differences in
cache coherency, and a device feature for data transmission. These
two items are clearly different things.
In other words, devices can be supported independently of platform
architectures. Why should I apply the solution prepared for cache
coherency issue to add better support for the device with new feature
for its data transmission? On x86 platform (although including some
exceptions such as old ATOM processors), status/control data of PCM
substream can successfully be mapped to process' VMA of applications.
Why should it be disabled even if it works well?
Because it doesn't work well for the new feature :)
In this point, I cannot understand your insistence.
The page frame for status/control data of PCM substream is mapped into
process' VMA of application with _read-only_ attributes. For the purpose
of delivering the status/control data from kernel space to user space,
it works well. On x86 platforms, this works exactly as intended.
On the other hand, what we should achieve for current issue is to 
deliver information from applications to hardware. This is not relevant 
to the page frame mapping.
These two are clearly different issues. You intend to apply a
solution for the former as a solution for the latter, however it's
against original aim of the latter. To me this is in a gray zone to
agree with it.
...
...
...
So... what's the "new"?  That is what I don't understand...
It's the already existing model deployed without mmap.  Nothing new.
This subsystem has no drivers with the similar feature, thus it's new
design for data transmission. The design supports below things:

Hardware features:

1.1. Data transmission is done by direct memory access (DMA).
1.2. Hardware cares about two points for its data transmission; one is
      hwptr and another is appl_ptr on PCM buffer dedicated for the
      data transmission.
1.3. (perhaps) The granularity of data transmission can differ,
      not fixed to the size of 'period'.

Drivers are designed with below items:

2.1. Return SNDRV_PCM_ACCESS_MMAP_XXX flag
2.2. Implement 'struct snd_pcm_ops.ack' to tell appl_ptr to
      hardware.

Applications should perform according to below items:

3.1.  Operate with SNDRV_PCM_IOCTL_[READ|WRITE][N|I] to drive kernel
       land stuff with changed appl_ptr.
3.2.  Or operate with SYNC_PTR to drive kernel land stuff when
       handling any PCM frames.
As of v4.12-rc5, nothing satisfies all of these items.
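
As a rough illustration of items 2.2 and 3.1/3.2, here is a minimal userspace model, not kernel code: the core applies a new appl_ptr and then calls the driver's ack hook, which forwards the value to a pretend device register. All names (model_substream, model_ops, model_ack, model_apply_appl_ptr, hw_appl_reg) are hypothetical stand-ins and are not the real ALSA structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel structures; not the real ALSA types. */
struct model_substream;

struct model_ops {
    /* Called after appl_ptr changes; returns 0 or a negative error. */
    int (*ack)(struct model_substream *sub);
};

struct model_substream {
    unsigned long appl_ptr;     /* position written by the application */
    unsigned long hw_appl_reg;  /* pretend device register */
    const struct model_ops *ops;
};

/* Driver side: forward the new appl_ptr to the (pretend) hardware. */
static int model_ack(struct model_substream *sub)
{
    sub->hw_appl_reg = sub->appl_ptr;
    return 0;
}

/* Core side: apply a new appl_ptr and let the driver know about it. */
static int model_apply_appl_ptr(struct model_substream *sub,
                                unsigned long appl_ptr)
{
    unsigned long old;
    int err;

    assert(sub != NULL && sub->ops != NULL);
    old = sub->appl_ptr;
    sub->appl_ptr = appl_ptr;
    if (sub->ops->ack) {
        err = sub->ops->ack(sub);
        if (err < 0) {
            sub->appl_ptr = old;  /* roll back when the driver refuses */
            return err;
        }
    }
    return 0;
}
```

In this model the hardware only learns the new position when something in kernel land runs model_apply_appl_ptr(), which is why the application has to enter the kernel (3.1 or 3.2) instead of only touching a mapped page.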
It works on non-x86 systems as is with 4.12.
It's a side effect, like a bonus, from a solution for cache
coherency issues that depend on the architecture, as I said.
...
...
...
In other words, such a driver already works fine on non-x86
platforms.  It's "broken" only on x86 due to the forced mmap usage.
It's better to distinguish two issues; architecture's support for
cache coherency and support for new type of device.
Well, that's not really true.  Most of drivers worked so far by
luck.  Fortunately, the hardware that doesn't have the mappable
hardware buffer can be woken up well enough via period setup, thus it
could synchronize appl_ptr that user-space already changed in the
past.  This, however, doesn't mean that the current x86 mmap is
perfectly working.
Imagine that you stop the stream at the middle of period chunk.  For
the hardware with the mapped h/w buffer, it can play up to the aborted
position.  OTOH, for the hardware without mapped buffer, it can't
reach that position because the sync of appl_ptr was missing before
the period elapsed.  If the appl_ptr could have been notified via
sync_ptr, the driver could copy the data, and it could reach the
aborted point.
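
To make the scenario above concrete, here is a small model under simplifying assumptions: the application writes one chunk per tick, a period elapses every period/chunk ticks, and the driver learns appl_ptr either on every write (sync_ptr style) or only when a period elapses. The function name and the model itself are illustrative and are not taken from any driver.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Return the position (in frames) up to which the driver could copy data
 * when the stream is stopped after 'writes' chunks have been written.
 */
static unsigned long reachable_at_stop(unsigned long period,
                                       unsigned long chunk,
                                       unsigned int writes,
                                       bool notify_on_write)
{
    unsigned long appl_ptr = 0;  /* frames written by the application */
    unsigned long known = 0;     /* appl_ptr as last seen by the driver */
    unsigned long ticks_per_period;
    unsigned int i;

    assert(chunk > 0 && period >= chunk && period % chunk == 0);
    ticks_per_period = period / chunk;

    for (i = 0; i < writes; i++) {
        appl_ptr += chunk;
        /* The driver sees appl_ptr on each write, or only per period. */
        if (notify_on_write || (i + 1) % ticks_per_period == 0)
            known = appl_ptr;
    }
    return known;
}
```

With period = 1024, chunk = 256 and 10 writes, the model leaves the driver at 2048 frames without notification and at 2560 frames with it; the missing 512 frames are the part of the period that can't be reached.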
So, currently it effectively enforces the BATCH style buffer although
it doesn't have to be so.  It's a known bug by the status / control
mmap, but we've ignored this just because such hardware is minor and
the problem itself is trivial.  But you see that the current code is
not even perfect for the existing hardware.
I'm a developer for drivers which use PCM buffer as intermediate buffer
for packet buffer. I understand what you explained correctly because
I've considered it for the past several years.
But this is outside my concern about your patch.
...
...
...
...
...
And, I think you miss a few points, thus the argument was twisted.

The primary goal is to achieve the notification of appl_ptr change
   to kernel.

It's for the purpose to support devices which have the new design for
data transmission. This is the reason that I agree with upgrading
version of PCM interface. I suggest adding new info flags for the
specific purpose.
Disabling mmap for status/control data of PCM substream is just to
support architectures which the ALSA PCM core judges to be non cache
coherent.
...
...
...
...
It's not good to utilize it as a solution of this issue because it
abuses the interfaces.
No, it's no abuse.  The sync_ptr is the correct and designed API to
notify appl_ptr update from the user space to kernel.  For most of
device drivers on x86, this isn't requested strictly, thus it's not
done when the PCM control is mmapped.  We've been just lucky, so far.
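
A minimal userspace sketch of that notification, assuming a Linux system with the UAPI header <sound/asound.h> and an already opened PCM character device descriptor; the helper name sync_appl_ptr is hypothetical. With SNDRV_PCM_SYNC_PTR_APPL left clear the kernel takes appl_ptr from the caller, and SNDRV_PCM_SYNC_PTR_AVAIL_MIN is set so that avail_min is read back instead of being overwritten.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sound/asound.h>

/*
 * Tell the kernel about a new appl_ptr and report the hw_ptr it returned.
 * 'fd' is assumed to be an open PCM character device.
 * Returns 0 on success or a negative errno value.
 */
static int sync_appl_ptr(int fd, snd_pcm_uframes_t appl_ptr,
                         snd_pcm_uframes_t *hw_ptr_out)
{
    struct snd_pcm_sync_ptr sync;

    assert(hw_ptr_out != NULL);

    memset(&sync, 0, sizeof(sync));
    /* APPL flag clear: the kernel applies our appl_ptr.
     * AVAIL_MIN flag set: the kernel keeps its own avail_min. */
    sync.flags = SNDRV_PCM_SYNC_PTR_AVAIL_MIN;
    sync.c.control.appl_ptr = appl_ptr;

    if (ioctl(fd, SNDRV_PCM_IOCTL_SYNC_PTR, &sync) < 0)
        return -errno;

    *hw_ptr_out = sync.s.status.hw_ptr;
    return 0;
}
```

An application using mmap for PCM frames would call such a helper after it finishes writing frames into the buffer, so that the kernel side sees the change without waiting for a period to elapse.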
I described that it was originally designed to solve architecture's
support for cache coherency. Don't depend on extra bonus from it.
Oh well...  Please stop such a fundamentalist argument.
The original purpose doesn't mean to limit its usage.  We aren't
arguing history or religion.
If an issue were encapsulated only in kernel land or user land,
I would have no objection to your patch. I'm willing to review it.
However, in a current case, it relates to interaction between
kernel/user. In my opinion, it's better to avoid changing fundamental
meaning of features which are already exposed to one side. Therefore
on x86/ppc/alpha architectures, userspace applications (not only
alsa-lib API applications but also applications with any I/O library)
can use status/control data on own VMA. Else, not. This is independent
of peripheral devices.
My intention to continue this discussion is the above. The scope of
issues is different; one is architecture-dependent, another is
device-dependent, thus solution should be different. Your patch has a
potential to puzzle developers for user land, like 'why the page frame
is not mapped for my application even if on the same architecture? why
it's device dependent?'
...
...
...
It's a trade-off, but still good enough for the driver requesting it
(the Intel one, which can achieve the deep sleep by that).
What's the 'deep sleep'? Please explain it when you introduce
new words into this discussion.
I thought Pierre (or other Intel people) already mentioned that.  Or
maybe it's in a different patchset.  Anyway...
The chip (DSP) can prefetch the data on the buffer and go to a deep
sleep mode.  It's the reason why appl_ptr update is needed.  Then we
can know how much data can be prefetched.  This should be a great
reduction of power, thus a slight increase of instructions would be
peanuts.
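
The amount available for prefetch can be sketched as the distance from hw_ptr to appl_ptr, taking wrap-around at the boundary into account. This is a simplified model which assumes both pointers are kept in the range [0, boundary); the function name is hypothetical.

```c
#include <assert.h>

/* Frames already written by the application but not yet consumed. */
static unsigned long prefetchable_frames(unsigned long appl_ptr,
                                         unsigned long hw_ptr,
                                         unsigned long boundary)
{
    assert(boundary > 0 && appl_ptr < boundary && hw_ptr < boundary);

    if (appl_ptr >= hw_ptr)
        return appl_ptr - hw_ptr;
    /* appl_ptr has wrapped past the boundary while hw_ptr has not. */
    return boundary - hw_ptr + appl_ptr;
}
```

Without an up-to-date appl_ptr this value is stale, so the chip can't decide how far it may safely prefetch before sleeping.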
In my mailbox, there's no message with the keyword. I guess that it was
introduced with another expression. But anyway, it's mostly what I've
imagined. Thanks for your confirmation.
(I omitted some texts to focus on the main issue.)
Regards
Takashi Sakamoto