Hi,
On Jun 17 2017 00:45, Takashi Iwai wrote:
On Fri, 16 Jun 2017 17:00:13 +0200, Takashi Sakamoto wrote:
On Jun 16 2017 04:06, Takashi Iwai wrote:
Some devices/drivers request applications to perform this, due to
their
design of hardware for data transmission.
Yes, and it's the default behavior of all non-x86 platforms, too.
Here, you have mixture of two items; architecture difference for cache coherent functionality, and device feature for data transmission. These two items are apparently different things.
In other words, devices can be supported independently of platform architectures. Why should I apply the solution prepared for cache coherency issue to add better support for the device with new feature for its data transmission? On x86 platform (although including some exceptions such as old ATOM processors), status/control data of PCM substream can successfully be mapped to process' VMA of applications. Why should it be disable it even if works well?
Because it doesn't work well for the new feature :)
In this point, I cannot understand your insistence.
The page frame for status/control data of PCM substream is mapped into process' VMA of application with _read-only_ attributes. In a point to deliver the status/control data from kernel space to user space, it works well. On x86 platforms, this works fine exactly as the aim.
On the other hand, what we should achieve for current issue is to deliver information from applications to hardware. This is not relevant to the page frame mapping.
These two are apparently different issues. You intend to apply a solution for the former as a solution for the latter, however it's against original aim of the latter. To me this is in a gray zone to agree with it.
So... what's the "new"? That is what I don't understand... It's the already existing model deployed without mmap. Nothing new.
This subsystem has no drivers with the similar feature, thus it's new design for data transmission. The design supports below things:
- Hardware features:
1.1. Data transmission is done by direct media access (DMA). 1.2. Hardware cares of two points for its data transmission; one is hwptr and another is appl_ptr on PCM buffer dedicated for the data transmission. 1.3. (perhaps)The granurarity of data transmission can be differed, not fixed to the size of 'period'.
- Drivers are designed with below items:
2.1. Return SNDRV_PCM_ACCESS_MMAP_XXX flag 2.2. Implement for 'struct snd_pcm_ops.ack' to tell appl_ptr to hardware.
- Applications should perform according to below items:
3.1. Operate with SNDRV_PCM_IOCTL_[READ|WRITE][N|I] to drive kernel land stuffs with changed appl_ptr. 3.2. Or operate with SYNC_PTR to driver kernel land stuffs when handling any PCM frames.
As of v4.12-rc5, there's no stuffs to satisfy all of these items.
It works on non-x86 systems as is with 4.12.
It's a sub-effect, like a bonus, from a solution for issues of cache coherency dependently of architecture, as I said.
In other words, such a driver already works fine on non-x86 platforms. It's "broken" only on x86 due to the forced mmap usage.
It's better to distinguish two issues; architecture's support for cache coherency and support for new type of device.
Well, that's not really true. Most of drivers worked so far with a luck. Fortunately, the hardware that doesn't have the mappable hardware buffer can be woken up well enough via period setup, thus it could synchronize appl_ptr that user-space already changed in the past. This, however, doesn't mean that the current x86 mmap is perfectly working.
Imagine that you stop the stream at the middle of period chunk. For the hardware with the mapped h/w buffer, it can play up to the aborted position. OTOH, for the hardware without mapped buffer, it can't reach at that position because the sync of appl_ptr was missing before the period elapsed. If the appl_ptr could have been notified via sync_ptr, the driver could copy the data, and it could reach to the aborted point.
So, currently it effectively enforces the BATCH style buffer although it doesn't have to be so. It's a known bug by the status / control mmap, but we've ignored this just because such hardware are minor and the problem itself is trivial. But you see that the current code is even not perfect for the existing hardware.
I'm a developer for drivers which use PCM buffer as intermediate buffer for packet buffer. I understand what you explained correctly because I've considered about it for recent several years.
But this is our of my concern about your patch.
And, I think you miss a few points, thus the argument was twisted.
- The primary goal is to achieve the notification of appl_ptr change to kernel.
It's for the purpose to support devices which have the new design for data transmission. This is the reason that I agree with upgrading version of PCM interface. I suggest adding new info flags for the specific purpose.
Disabling mmap for status/control data of PCM substream is just to support architectures to which ALSA PCM core judge non cache
coherency.
It's not good to utilize it as a solution of this issue because of abuse of interfaces.
No, it's no abuse. The sync_ptr is the correct and designed API to notify appl_ptr update from the user space to kernel. For most of device drivers on x86, this isn't requested strictly, thus it's not done when the PCM control is mmapped. We've been just lucky, so far.
I described that it was originally designed to solve architecture's support for cache coherency. Don't depend on extra bonus from it.
Oh well... Please stop such a fundamentalism argument. The original purpose doesn't mean to limit its usage. We aren't arguing history or religion.
If an issue were encapsulated only in kernel land or user land, I would have no objection to your patch. I'm willing to review it.
However, in a current case, it relates to interaction between kernel/user. In my opinion, it's better to avoid changing fundamental meaning of features which are already exposed to one side. Therefore on x86/ppc/alpha architectures, userspace applications (not only alsa-lib API applications but also applications with any I/O library) can use status/control data on own VMA. Else, not. This is independent on peripheral devices.
My intention to continue this discussion is the above. The scope of issues is different; one is architecture-dependent, another is device-dependent, thus solution should be different. Your patch has a potential to puzzle developers for user land, like 'why the page frame is not mapped for my application even if on the same architecture? why it's device dependent?'
It's a trade-off, but still good enough for the driver requesting it (the Intel one, which can achieve the deep sleep by that).
What's the 'deep sleep'? Please explain about it when you introduce new words into this discussion.
I thought Pierre (or other Intel people) already mentioned that. Or maybe it's in a different patchset. In anyway...
The chip (DSP) can prefetch the data on the buffer and go to a deep sleep mode. It's the reason why appl_ptr update is needed. Then we can know how much data can be prefetched. This should be the great reduction of power, thus a slight increase of instructions would be like a peanut.
In my mailbox, there's no message with the keyword. I guess that it was introduced with the other expression. But anyway, it's mostly what I've imagined. Thanks for your confirmation.
(I omitted some texts to focus on the main issue.)
Regards
Takashi Sakamoto