On Thu, Aug 06, 2020 at 11:34:12PM +0800, Tom Yan wrote:
On Thu, 6 Aug 2020 at 22:47, Takashi Sakamoto o-takashi@sakamocchi.jp wrote:
As long as I know, neither tools in alsa-utils/alsa-tools nor pulseaudio use the lock mechanism. In short, you are the first person to address to the issue. Thanks for your patience since the first post in 2015.
As for the compare-and-swap part, it's just a plus. Not that "double-looping" for *each* channel doesn't work. It just again seems silly and primitive (and was once confusing to me).
I prepare a rough kernel patch abount the compare-and-swap idea for our discussion. The compare-and-swap is done under lock acquisition of 'struct snd_card.controls_rwsem', therefore many types of operations to control element (e.g. read as well) get affects. This idea works well at first when alsa-lib supports corresponding API and userspace applications uses it. Therefore we need more work than changing just in userspace.
But in my opinion, if things can be solved just in userspace, it should be done with no change in kernel space.
I didn't know much about programming or so back then (even by now I know only a little) when I first noticed the problem, so I just avoided using amixer / my volume wheel / parts of pulse and resorted to alsamixer. For some reason the race didn't *appear to* exist with onboard or USB audio but only my Xonar STX (maybe because volume adjustments take a bit more time with it), so for a long time I thought it's some delicate bug in the oxygen/xonar driver that I failed to identify. Only until very recently it gets back to my head and becomes clear to me that it's a general design flaw in ALSA.
Just to confirm, does SNDRV_CTL_IOCTL_ELEM_LOCK currently prevent get()/read? Or is it just a write lock as I thought? If that's the case, and if the compare-and-swap approach doesn't involve a lot of changes (in all the kernel drivers, for example), I'd say we better start moving to something neat than using something which is less so and wasn't really ever adopted anyway.
In your case, SNDRV_CTL_IOCTL_ELEM_LOCK looks 'write-lock', therefore any userspace applications can do read operation to the control element locked by the other processes.
To solve the issue, the pair of read/write operations should be done between lock/unlock operations. I can consider about two cases.
A case is that all of applications implements the above. This is already proposed. The case is that the world is universe.
+-----------+-----------+ | process A | process B | +-----------+-----------+ | lock | | | read | | | |lock(EPERM)| reschedule lock/get/put/unlock actions | write | | | |lock(EPERM)| reschedule lock/get/put/unlock actions | unlock | | | | lock | | | read | calculate for new value | | write | | | unlock | +-----------+-----------+
Another case is that a part of application implements the above. Let me drill down the case into two cases. These cases are realistic (sign...):
+-----------+------------+ | process A | process B | +-----------+------------+ | lock | | | read | | | write | | | | read | calculate for new value | |write(EPERM)| | unlock | | | | write | <- expected value +-----------+------------+
+-----------+------------+ | process A | process B | +-----------+------------+ | lock | | | read | | | | read | calculate for new value | write | | | |write(EPERM)| | unlock | | | | write | <- unexpected value +-----------+------------+
The lock/unlock mechanism is not perfect solution as long as any applications perform like the process.
If we can expect the most of applications to be back to read operation at failure of write operation, thing goes well.
+-----------+------------+ | process A | process B | +-----------+------------+ | lock | | | read | | | | read | calculate for new value | write | | | |write(EPERM)| | unlock | | | | read | calculate for new value | | write | <- expected value +-----------+------------+
Thanks
Takashi Sakamoto