On Tue, 12 Apr 2016 22:24:08 +0200, Lars Lindqvist wrote:
On 2016-04-12, Takashi Iwai wrote:
On Tue, 12 Apr 2016 18:46:17 +0200, Lars Lindqvist wrote:
On 2016-04-12, Takashi Iwai wrote:
Actually we have a semaphore before shm access, so the race at two opens shouldn't happen. I noticed it after I sent my previous mail.
But the semaphore is also taken in snd_pcm_dmix_close(), so I wonder where the race actually happens. Both open and close must be protected while another stream is opening or closing.
Could you try to check where you get the exact error...?
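To make the locking argument concrete, here is a minimal sketch of a SysV semaphore guarding a shared user count on both the "open" and the "close" path. It is not the alsa-lib code: lock_create(), guarded_update() and lock_destroy() are hypothetical names, and the counter is a plain int here, whereas for dmix it would have to live in the shared memory segment.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* On Linux the caller has to define this union for semctl(). */
union sem_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Create a private one-slot semaphore initialised to 1 (unlocked). */
int lock_create(void)
{
    union sem_arg arg;
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);

    if (semid < 0)
        return -1;
    arg.val = 1;
    if (semctl(semid, 0, SETVAL, arg) < 0) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    return semid;
}

/* delta = -1 takes the lock, delta = +1 releases it.
 * SEM_UNDO lets the kernel release it if the process dies. */
static int lock_op(int semid, short delta)
{
    struct sembuf op;

    op.sem_num = 0;
    op.sem_op = delta;
    op.sem_flg = SEM_UNDO;
    return semop(semid, &op, 1);
}

/* Hypothetical stand-in for the bookkeeping done at open (+1) and
 * close (-1): the counter is only touched while the lock is held.
 * Returns the new count, or -1 if the semaphore operation failed. */
int guarded_update(int semid, int *users, int delta)
{
    int now;

    if (lock_op(semid, -1) < 0)
        return -1;
    *users += delta;
    now = *users;
    if (lock_op(semid, 1) < 0)
        return -1;
    return now;
}

/* Remove the semaphore set again. */
int lock_destroy(int semid)
{
    return semctl(semid, 0, IPC_RMID);
}
```

As long as every open and close goes through the same guarded section, two streams cannot update the count at the same time. A race can still appear if some step that depends on the count runs after the lock has been released.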
The execution tree, as far as I can tell, is the following:
snd_pcm_dmix_open:
- Line 1009: snd_pcm_direct_shm_create_or_connect (the code in question) returns 0.
- So we end up at line 1058; dmix->shmptr->use_server is 0, so we go to line 1072, which runs:
  snd_pcm_open_slave -> snd_pcm_open_named_slave -> snd_pcm_open_conf
- There, snd_dlobj_cache_get gives open_func = snd_pcm_hw_open, so:
  snd_pcm_hw_open() -> snd_open_device("/dev/snd/pcmC0D0p")
- where open() returns -1 with errno = EBADFD.
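For anyone who wants to reproduce the check at the bottom of that chain outside of alsa-lib, a minimal sketch follows. It only shows how a failing open() can be turned into a negative errno value that callers can compare against -EBADFD; open_device_errno() is a hypothetical helper, not an alsa-lib function.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: open a device node and return either a valid
 * file descriptor (>= 0) or the failure reason as a negative errno. */
int open_device_errno(const char *path, int flags)
{
    int fd = open(path, flags);

    if (fd < 0)
        return -errno;  /* e.g. -EBADFD in the case traced above */
    return fd;
}
```

A caller would test the result with something like `if (ret == -EBADFD)` and close the descriptor with close() on success.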
OK, then the question is why another stream could be closed while this one is being opened under semaphore protection. Maybe the semaphore protection isn't perfect?
In any case, when this happens we can retry opening the stream as the first element. This is safer than blindly assuming the first element via the nattach value (which is racy). An untested patch is below.
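The following is not the actual patch, only a minimal sketch of the retry idea in isolation. It assumes a hypothetical callback, open_slave_fn, that opens the slave PCM either as the first instance or as an additional one and returns 0 or a negative errno. stub_open_slave() is a made-up stand-in that fails with -EBADFD unless it is asked to open as the first instance.

```c
#include <errno.h>

/* Hypothetical callback type: opens the slave PCM and returns 0 on
 * success or a negative errno value on failure. */
typedef int (*open_slave_fn)(int first_instance, void *ctx);

/* Try to open with the role we believe we have. If we joined as a
 * non-first instance and the open fails with -EBADFD, assume all
 * other streams have gone away and retry once as the first instance. */
int open_with_retry(int first_instance, open_slave_fn open_slave, void *ctx)
{
    int ret = open_slave(first_instance, ctx);

    if (ret == -EBADFD && !first_instance)
        ret = open_slave(1, ctx);
    return ret;
}

/* Made-up stub for illustration: counts calls through ctx and only
 * succeeds when opened as the first instance. */
int stub_open_slave(int first_instance, void *ctx)
{
    int *calls = ctx;

    (*calls)++;
    return first_instance ? 0 : -EBADFD;
}
```

The point of the design is that the decision to become the first instance is driven by the observed open failure, not by a counter that may already be stale when it is read.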
Yes, this seems to work. I'm at least not able to trigger the problem myself anymore. I'll use this for a few days, and report back if I happen to get any unexpected EBADFDs.
OK, I have now applied the fix to the git tree. Let's see whether it works stably enough. Let me know if you still see a similar problem with it.
thanks,
Takashi