On 22/Ago/2016 17:11, Takashi Iwai wrote:
On Mon, 22 Aug 2016 17:02:41 +0200, Ismael Luceno wrote:
On 22/Ago/2016 11:26, Takashi Iwai wrote:
On Sun, 14 Aug 2016 02:28:52 +0200, Ismael Luceno wrote:
Easily seen when two threads try at the same time, one of them will fail.
The bug was identified by using apulse with Skype.
Fixes: dec428c35221 ("pcm: fix 'unable to create IPC shm instance' caused by fork from a thread") Fixes: https://github.com/i-rinat/apulse/issues/38 Signed-off-by: Ismael Luceno ismael@iodev.co.uk
src/pcm/pcm_direct.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/src/pcm/pcm_direct.c b/src/pcm/pcm_direct.c index c3925cc20fd3..b5215ba35406 100644 --- a/src/pcm/pcm_direct.c +++ b/src/pcm/pcm_direct.c @@ -101,6 +101,8 @@ retryget: if ((dmix->shmid = shmget(dmix->ipc_key, sizeof(snd_pcm_direct_share_t), IPC_CREAT | IPC_EXCL | dmix->ipc_perm)) != -1) first_instance = 1;
if (dmix->shmid < 0 && errno == EEXIST)
goto retryget;
Hrm, but this would result in an endless loop if the shm was already taken persistently.
If so, shouldn't the first call to shmget succeed?
To me it seems very unlikely that both calls continuosly fail.
Well, you are inserting a loop at the code below:
retryget: dmix->shmid = shmget(dmix->ipc_key, sizeof(snd_pcm_direct_share_t), dmix->ipc_perm); if (dmix->shmid < 0) { if (errno == ENOENT) if ((dmix->shmid = shmget(dmix->ipc_key, sizeof(snd_pcm_direct_share_t), IPC_CREAT | IPC_EXCL | dmix->ipc_perm)) != -1) first_instance = 1; ==> if (dmix->shmid < 0 && errno == EEXIST) ==> goto retryget; }
It's in the if block when the first shmget() fails. If there was already a shm with the given id that had been assigned by another (not necessarily by alsa-lib but by whatever program), the first shmget returns an error with EEXIST. Then it goes back again in a loop; and it can be endless if another program doesn't release the shm.
I will fix my patch, I see a problem now, but the spirit of the idea should be right: if the first call fails with ENOENT, and the second with EEXIST, it should retry, and getting EEXIST again should be very unlikely.
Also, which call does give a negative shmid, actually? It's from the first shmget() or the second shmget()?
What happens is that both threads go down that path but, of course, only one succeeds in the second shmget call.
This should have been protected by a sempahore beforehand. And if it's about threads, the application itself has to take care of the race. alsa-lib is no thread-safe, after all.
So, did you see the issue with multiple processes, or is it about the multi-threads?
Threads in this case, but could happen with processes as well.