[alsa-devel] [PATCH - pcm 1/1] pcm: fix "unable to create IPC shm instance" in some case

bsiice bsiice at msn.com
Sat Mar 5 15:21:22 CET 2016


On 03/05/2016 06:07 PM, Takashi Iwai wrote:
> On Sat, 05 Mar 2016 09:35:51 +0100,
> bsiice wrote:
>>
>> On 03/05/2016 03:36 PM, Takashi Iwai wrote:
>>> On Sat, 05 Mar 2016 07:37:34 +0100,
>>> bsiice wrote:
>>>>
>>>> On 03/05/2016 02:08 AM, Takashi Iwai wrote:
>>>>> On Fri, 04 Mar 2016 18:36:46 +0100,
>>>>> bsiice wrote:
>>>>>>
>>>>>> On 03/05/2016 12:37 AM, Takashi Iwai wrote:
>>>>>>> On Fri, 04 Mar 2016 17:22:58 +0100,
>>>>>>> bsiice wrote:
>>>>>>>>
>>>>>>>> On 03/03/2016 09:53 PM, Takashi Iwai wrote:
>>>>>>>>> On Thu, 03 Mar 2016 13:40:42 +0100,
>>>>>>>>> IceBsi wrote:
>>>>>>>>>>
>>>>>>>>>> From: Qing Cai <caiqing at neusoft.com>
>>>>>>>>>>
>>>>>>>>>> As stated in manpage SHMCTL(2), shm_nattch is "No. of current attaches"
>>>>>>>>>> (i.e., number of processes attached to the shared memeory). If an
>>>>>>>>>> application uses alsa-lib and invokes fork() at some point, there should
>>>>>>>>>> be the following execution sequence:
>>>>>>>>>>  1. execute the following statement:
>>>>>>>>>>       pcm_direct.c:110: dmix->shmptr = shmat(dmix->shmid, 0, 0)
>>>>>>>>>>     (shm_nattch becomes 1)
>>>>>>>>>>  2. invoke fork() in some thread.
>>>>>>>>>>     (shm_nattch becomes 2)
>>>>>>>>>>  3. execute the following statement:
>>>>>>>>>>       pcm_direct.c:122: if (buf.shm_nattch == 1)
>>>>>>>>>>  4. execute the following statement:
>>>>>>>>>>       pcm_direct.c:131: if (dmix->shmptr->magic != SND_PCM_DIRECT_MAGIC)
>>>>>>>>>>     (As stated in manpage SHMGET(2), "When a new shared memory segment
>>>>>>>>>>      is created, its contents are initialized to zero values", so
>>>>>>>>>>      dmix->shmptr->magic is 0)
>>>>>>>>>>  5. execute the following statements:
>>>>>>>>>>       pcm_direct.c:132: snd_pcm_direct_shm_discard(dmix)
>>>>>>>>>>       pcm_direct.c:133: return -EINVAL
>>>>>>>>>> The above execution sequence will cause the following error:
>>>>>>>>>>   unable to create IPC shm instance
>>>>>>>>>> This error causes multimedia application has no sound. This error rarely
>>>>>>>>>> occurs, probability is about 1%.
>>>>>>>>>> Because the first user of the shared memory will get that
>>>>>>>>>> dmix->shmptr->magic is 0, check dmix->shmptr->magic's value to determine
>>>>>>>>>> if "we're the first user" is OK.
>>>>>>>>>> Tests have been made 400+ times after this fix, and the issue no longer
>>>>>>>>>> exists.
>>>>>>>>>
>>>>>>>>> I think this is still racy.  Multiple users can grab the shmem at the
>>>>>>>>> very same time.  Maybe it looks as if working just because both users
>>>>>>>>> behavior as the first user and do clear and initialize.
>>>>>>>> I think this won't be a race condition. Since
>>>>>>>> snd_pcm_direct_shm_create_or_connect() is protected by
>>>>>>>> snd_pcm_direct_semaphore_down() and snd_pcm_direct_semaphore_up(),
>>>>>>>> multiple processes won't do clear and initialization concurrently.
>>>>>>>
>>>>>>> Hrm, OK, now understood the situation.
>>>>>>>
>>>>>>>>> The check of bus.shm_nattach=1 should be fine, per se.  The problem is
>>>>>>>>> the magic key check of the secondary.  In the current code, as you
>>>>>>>>> pointed out, this may happen before the first client finishes the
>>>>>>>>> initialization.
>>>>>>>> I think the check of dmix->shmptr->magic!=SND_PCM_DIRECT_MAGIC is more
>>>>>>>> robust. Assume there are two clients of shmem and shmem is initialized,
>>>>>>>> if one of the clients exits, shm_nattach will become 1 and shmem may be
>>>>>>>> initialized again.
>>>>>>>
>>>>>>> Yeah, in the fork from a thread, the shm_nattch check doesn't work
>>>>>>> reliably, indeed.
>>>>>>>
>>>>>>>> In the current code, because there is semaphore, magic key check won't
>>>>>>>> happen before initialization of shmem.
>>>>>>>> In the scenario that I want to tell, there is only one process, and only
>>>>>>>> one thread doning the clear and initialization of shmem, and the fork()
>>>>>>>> is invoked in another thread of non alsa-lib context (i.e., the fork()
>>>>>>>> code is not in alsa-lib). If the fork() is invoked before shmat(), the
>>>>>>>> problem won't happen; If the fork() is invoked after the check of
>>>>>>>> bus.shm_nattach=1, the problem won't happen too. Only if the fork() is
>>>>>>>> invoked just after shmat() and just before the check of
>>>>>>>> bus.shm_nattach=1, the problem happens.
>>>>>>>
>>>>>>> Well, the reason why the magic key check was introduced was about the
>>>>>>> safety.  Since the shmid is given explicitly, it might override some
>>>>>>> shmem usages other than dmix accidentally.  With your patch, it
>>>>>>> forcibly clears the region -- this is quite opposite to the intention 
>>>>>>> of the magic key check.
>>>>>>
>>>>>> I just think it clears the region only once at the first time. Could you
>>>>>> please show me the scenario in detail?
>>>>>
>>>>> The shmid ipc key is given explicitly by an alsa-lib configuration,
>>>>> but there is no guarantee that this key has never been used by some
>>>>> other programs for a completely different purpose.  So, for non-first
>>>>> user, it verifies whether the attached region really belongs to
>>>>> alsa-lib dmix.
>>>>>
>>>>>
>>>>> Takashi
>>>>>
>>>>
>>>> Thank you for explanation! Now I understood.
>>>> I have tried changing ipc_key many times when I have been investigating
>>>> the problem, and changing ipc_key doesn't solve the problem I
>>>> encounterd.
>>>>
>>>> I think there is more benefits to do magic key check instead of
>>>> shm_nattch check, because the situation that I encounterd happens more
>>>> often than the situation that other program has a same ipc key with
>>>> alsa-lib.
>>>
>>> Yes, possibly.  However, which may cause a severe problem?  It's the
>>> question, too.
>>
>> By 'severe problem', do you mean that alsa-lib may clear other program's
>> shmem? If the key is unique, will the 'severe problem' still happen?
> 
> If it were unique...  Actually it's not with SYSV shmem.
> 
>>>> The problem left is to make ipc key as unique as possible.
>>>> Every program/library should fulfil the responsibility of generating a
>>>> unique key. As stated on http://www.tldp.org/LDP/lpg/node24.html, ftok()
>>>> should be used to generate a unique key.
>>>
>>> This doesn't guarantee the uniqueness completely, obviously :)
>>> In addition, we'd need more than 256 variants...
>>
>> Oh, sorry, I thought it would be unique if all program/library use their
>> own full filename as arg to ftok(). Actually this doesn't guarantee
>> uniqueness. :)
>>
>> What if alsa-lib changed to use POSIX shmem (i.e. shm_open()...)?
> 
> This would be one option.  I thought it in the past, too, but
> postponed by some reason I forgot.
> 
> Another possible way I can think of is to determine the first user by
> shmget() call without IPC_CREAT.  If this succeeds, we are no first
> user.  I'm not sure, though, whether this method is more robust.

Sounds a good method, I think it can't be impacted by fork() from a
thread.

Assume there is other program/process uses same ipc key to create shmem,
and runs concurrently with alsa-lib, I can think of some methods and
situations (I only think of situations with failure).

Method 1: call shmget() without IPC_CREAT firstly, if fail then call
shmget(IPC_CREAT).
Possible situation 1.1:
(1) other program creates a shmem with same key
(2) alsa-lib calls shmget() without IPC_CREAT
(3) if (2) succeeds, alsa-lib connects to other program's shmem
(4) alsa-lib is not the first user, so alsa-lib checks
    dmix->shmptr->magic!=SND_PCM_DIRECT_MAGIC (true)
(5) alsa-lib return -EINVAL
Possible situation 1.2:
(1) alsa-lib calls shmget() without IPC_CREAT, and fails
(2) other program creates a shmem with same key
(3) alsa-lib calls shmget(IPC_CREAT), and succeeds, but connects to
    other program's shmem
(4) alsa-lib doesn't know if the shmem is created by alsa-lib or not, so
    method 1 is not applicable

Method 2: call shmget() without IPC_CREAT firstly, if fail then call
shmget(IPC_CREAT|IPC_EXCL).
Possible situation 2.1:
(1) alsa-lib calls shmget() without IPC_CREAT, and fails
(2) other program creates a shmem with same key
(3) alsa-lib calls shmget(IPC_CREAT|IPC_EXCL), and fails

Method 3: call shmget(IPC_CREAT|IPC_EXCL) firstly, if fail then call
shmget() without IPC_CREAT.
Possible situation 3.1:
(1) alsa-lib calls shmget(IPC_CREAT|IPC_EXCL), and succeeds
(2) other program tries to create a shmem with same key, and maybe fail
(3) alsa-lib is the first user, so alsa-lib does clear and
    initialization
Possible situation 3.2:
(1) other program creates a shmem with same key, and succeeds
(2) alsa-lib calls shmget(IPC_CREAT|IPC_EXCL), and fails
(3) alsa-lib calls shmget() without IPC_CREAT, and succeeds, but
    connects to other program's shmem
(4) alsa-lib is not the first user, so alsa-lib checks
    dmix->shmptr->magic!=SND_PCM_DIRECT_MAGIC (true)
(5) alsa-lib return -EINVAL

So, which method should be used? Method 2 or Method 3 or something else?


thanks,

Qing Cai


More information about the Alsa-devel mailing list