[alsa-devel] [PATCH - pcm 1/1] pcm: fix "unable to create IPC shm instance" in some case
bsiice
bsiice at msn.com
Sat Mar 5 07:37:34 CET 2016
On 03/05/2016 02:08 AM, Takashi Iwai wrote:
> On Fri, 04 Mar 2016 18:36:46 +0100,
> bsiice wrote:
>>
>> On 03/05/2016 12:37 AM, Takashi Iwai wrote:
>>> On Fri, 04 Mar 2016 17:22:58 +0100,
>>> bsiice wrote:
>>>>
>>>> On 03/03/2016 09:53 PM, Takashi Iwai wrote:
>>>>> On Thu, 03 Mar 2016 13:40:42 +0100,
>>>>> IceBsi wrote:
>>>>>>
>>>>>> From: Qing Cai <caiqing at neusoft.com>
>>>>>>
>>>>>> As stated in manpage SHMCTL(2), shm_nattch is "No. of current attaches"
>>>>>> (i.e., number of processes attached to the shared memeory). If an
>>>>>> application uses alsa-lib and invokes fork() at some point, there should
>>>>>> be the following execution sequence:
>>>>>> 1. execute the following statement:
>>>>>> pcm_direct.c:110: dmix->shmptr = shmat(dmix->shmid, 0, 0)
>>>>>> (shm_nattch becomes 1)
>>>>>> 2. invoke fork() in some thread.
>>>>>> (shm_nattch becomes 2)
>>>>>> 3. execute the following statement:
>>>>>> pcm_direct.c:122: if (buf.shm_nattch == 1)
>>>>>> 4. execute the following statement:
>>>>>> pcm_direct.c:131: if (dmix->shmptr->magic != SND_PCM_DIRECT_MAGIC)
>>>>>> (As stated in manpage SHMGET(2), "When a new shared memory segment
>>>>>> is created, its contents are initialized to zero values", so
>>>>>> dmix->shmptr->magic is 0)
>>>>>> 5. execute the following statements:
>>>>>> pcm_direct.c:132: snd_pcm_direct_shm_discard(dmix)
>>>>>> pcm_direct.c:133: return -EINVAL
>>>>>> The above execution sequence will cause the following error:
>>>>>> unable to create IPC shm instance
>>>>>> This error causes multimedia application has no sound. This error rarely
>>>>>> occurs, probability is about 1%.
>>>>>> Because the first user of the shared memory will get that
>>>>>> dmix->shmptr->magic is 0, check dmix->shmptr->magic's value to determine
>>>>>> if "we're the first user" is OK.
>>>>>> Tests have been made 400+ times after this fix, and the issue no longer
>>>>>> exists.
>>>>>
>>>>> I think this is still racy. Multiple users can grab the shmem at the
>>>>> very same time. Maybe it looks as if working just because both users
>>>>> behavior as the first user and do clear and initialize.
>>>> I think this won't be a race condition. Since
>>>> snd_pcm_direct_shm_create_or_connect() is protected by
>>>> snd_pcm_direct_semaphore_down() and snd_pcm_direct_semaphore_up(),
>>>> multiple processes won't do clear and initialization concurrently.
>>>
>>> Hrm, OK, now understood the situation.
>>>
>>>>> The check of bus.shm_nattach=1 should be fine, per se. The problem is
>>>>> the magic key check of the secondary. In the current code, as you
>>>>> pointed out, this may happen before the first client finishes the
>>>>> initialization.
>>>> I think the check of dmix->shmptr->magic!=SND_PCM_DIRECT_MAGIC is more
>>>> robust. Assume there are two clients of shmem and shmem is initialized,
>>>> if one of the clients exits, shm_nattach will become 1 and shmem may be
>>>> initialized again.
>>>
>>> Yeah, in the fork from a thread, the shm_nattch check doesn't work
>>> reliably, indeed.
>>>
>>>> In the current code, because there is semaphore, magic key check won't
>>>> happen before initialization of shmem.
>>>> In the scenario that I want to tell, there is only one process, and only
>>>> one thread doning the clear and initialization of shmem, and the fork()
>>>> is invoked in another thread of non alsa-lib context (i.e., the fork()
>>>> code is not in alsa-lib). If the fork() is invoked before shmat(), the
>>>> problem won't happen; If the fork() is invoked after the check of
>>>> bus.shm_nattach=1, the problem won't happen too. Only if the fork() is
>>>> invoked just after shmat() and just before the check of
>>>> bus.shm_nattach=1, the problem happens.
>>>
>>> Well, the reason why the magic key check was introduced was about the
>>> safety. Since the shmid is given explicitly, it might override some
>>> shmem usages other than dmix accidentally. With your patch, it
>>> forcibly clears the region -- this is quite opposite to the intention
>>> of the magic key check.
>>
>> I just think it clears the region only once at the first time. Could you
>> please show me the scenario in detail?
>
> The shmid ipc key is given explicitly by an alsa-lib configuration,
> but there is no guarantee that this key has never been used by some
> other programs for a completely different purpose. So, for non-first
> user, it verifies whether the attached region really belongs to
> alsa-lib dmix.
>
>
> Takashi
>
Thank you for explanation! Now I understood.
I have tried changing ipc_key many times when I have been investigating
the problem, and changing ipc_key doesn't solve the problem I
encounterd.
I think there is more benefits to do magic key check instead of
shm_nattch check, because the situation that I encounterd happens more
often than the situation that other program has a same ipc key with
alsa-lib. The problem left is to make ipc key as unique as possible.
Every program/library should fulfil the responsibility of generating a
unique key. As stated on http://www.tldp.org/LDP/lpg/node24.html, ftok()
should be used to generate a unique key.
thanks,
Qing Cai
More information about the Alsa-devel
mailing list