[alsa-devel] [PATCH - pcm 1/1] pcm: fix "unable to create IPC shm instance" in some case

Takashi Iwai tiwai at suse.de
Sat Mar 5 11:07:51 CET 2016


On Sat, 05 Mar 2016 09:35:51 +0100,
bsiice wrote:
> 
> On 03/05/2016 03:36 PM, Takashi Iwai wrote:
> > On Sat, 05 Mar 2016 07:37:34 +0100,
> > bsiice wrote:
> >>
> >> On 03/05/2016 02:08 AM, Takashi Iwai wrote:
> >>> On Fri, 04 Mar 2016 18:36:46 +0100,
> >>> bsiice wrote:
> >>>>
> >>>> On 03/05/2016 12:37 AM, Takashi Iwai wrote:
> >>>>> On Fri, 04 Mar 2016 17:22:58 +0100,
> >>>>> bsiice wrote:
> >>>>>>
> >>>>>> On 03/03/2016 09:53 PM, Takashi Iwai wrote:
> >>>>>>> On Thu, 03 Mar 2016 13:40:42 +0100,
> >>>>>>> IceBsi wrote:
> >>>>>>>>
> >>>>>>>> From: Qing Cai <caiqing at neusoft.com>
> >>>>>>>>
> >>>>>>>> As stated in manpage SHMCTL(2), shm_nattch is "No. of current attaches"
> >>>>>>>> (i.e., number of processes attached to the shared memeory). If an
> >>>>>>>> application uses alsa-lib and invokes fork() at some point, there should
> >>>>>>>> be the following execution sequence:
> >>>>>>>>  1. execute the following statement:
> >>>>>>>>       pcm_direct.c:110: dmix->shmptr = shmat(dmix->shmid, 0, 0)
> >>>>>>>>     (shm_nattch becomes 1)
> >>>>>>>>  2. invoke fork() in some thread.
> >>>>>>>>     (shm_nattch becomes 2)
> >>>>>>>>  3. execute the following statement:
> >>>>>>>>       pcm_direct.c:122: if (buf.shm_nattch == 1)
> >>>>>>>>  4. execute the following statement:
> >>>>>>>>       pcm_direct.c:131: if (dmix->shmptr->magic != SND_PCM_DIRECT_MAGIC)
> >>>>>>>>     (As stated in manpage SHMGET(2), "When a new shared memory segment
> >>>>>>>>      is created, its contents are initialized to zero values", so
> >>>>>>>>      dmix->shmptr->magic is 0)
> >>>>>>>>  5. execute the following statements:
> >>>>>>>>       pcm_direct.c:132: snd_pcm_direct_shm_discard(dmix)
> >>>>>>>>       pcm_direct.c:133: return -EINVAL
> >>>>>>>> The above execution sequence will cause the following error:
> >>>>>>>>   unable to create IPC shm instance
> >>>>>>>> This error causes multimedia application has no sound. This error rarely
> >>>>>>>> occurs, probability is about 1%.
> >>>>>>>> Because the first user of the shared memory will get that
> >>>>>>>> dmix->shmptr->magic is 0, check dmix->shmptr->magic's value to determine
> >>>>>>>> if "we're the first user" is OK.
> >>>>>>>> Tests have been made 400+ times after this fix, and the issue no longer
> >>>>>>>> exists.
> >>>>>>>
> >>>>>>> I think this is still racy.  Multiple users can grab the shmem at the
> >>>>>>> very same time.  Maybe it looks as if working just because both users
> >>>>>>> behavior as the first user and do clear and initialize.
> >>>>>> I think this won't be a race condition. Since
> >>>>>> snd_pcm_direct_shm_create_or_connect() is protected by
> >>>>>> snd_pcm_direct_semaphore_down() and snd_pcm_direct_semaphore_up(),
> >>>>>> multiple processes won't do clear and initialization concurrently.
> >>>>>
> >>>>> Hrm, OK, now understood the situation.
> >>>>>
> >>>>>>> The check of bus.shm_nattach=1 should be fine, per se.  The problem is
> >>>>>>> the magic key check of the secondary.  In the current code, as you
> >>>>>>> pointed out, this may happen before the first client finishes the
> >>>>>>> initialization.
> >>>>>> I think the check of dmix->shmptr->magic!=SND_PCM_DIRECT_MAGIC is more
> >>>>>> robust. Assume there are two clients of shmem and shmem is initialized,
> >>>>>> if one of the clients exits, shm_nattach will become 1 and shmem may be
> >>>>>> initialized again.
> >>>>>
> >>>>> Yeah, in the fork from a thread, the shm_nattch check doesn't work
> >>>>> reliably, indeed.
> >>>>>
> >>>>>> In the current code, because there is semaphore, magic key check won't
> >>>>>> happen before initialization of shmem.
> >>>>>> In the scenario that I want to tell, there is only one process, and only
> >>>>>> one thread doning the clear and initialization of shmem, and the fork()
> >>>>>> is invoked in another thread of non alsa-lib context (i.e., the fork()
> >>>>>> code is not in alsa-lib). If the fork() is invoked before shmat(), the
> >>>>>> problem won't happen; If the fork() is invoked after the check of
> >>>>>> bus.shm_nattach=1, the problem won't happen too. Only if the fork() is
> >>>>>> invoked just after shmat() and just before the check of
> >>>>>> bus.shm_nattach=1, the problem happens.
> >>>>>
> >>>>> Well, the reason why the magic key check was introduced was about the
> >>>>> safety.  Since the shmid is given explicitly, it might override some
> >>>>> shmem usages other than dmix accidentally.  With your patch, it
> >>>>> forcibly clears the region -- this is quite opposite to the intention 
> >>>>> of the magic key check.
> >>>>
> >>>> I just think it clears the region only once at the first time. Could you
> >>>> please show me the scenario in detail?
> >>>
> >>> The shmid ipc key is given explicitly by an alsa-lib configuration,
> >>> but there is no guarantee that this key has never been used by some
> >>> other programs for a completely different purpose.  So, for non-first
> >>> user, it verifies whether the attached region really belongs to
> >>> alsa-lib dmix.
> >>>
> >>>
> >>> Takashi
> >>>
> >>
> >> Thank you for explanation! Now I understood.
> >> I have tried changing ipc_key many times when I have been investigating
> >> the problem, and changing ipc_key doesn't solve the problem I
> >> encounterd.
> >>
> >> I think there is more benefits to do magic key check instead of
> >> shm_nattch check, because the situation that I encounterd happens more
> >> often than the situation that other program has a same ipc key with
> >> alsa-lib.
> > 
> > Yes, possibly.  However, which may cause a severe problem?  It's the
> > question, too.
> 
> By 'severe problem', do you mean that alsa-lib may clear other program's
> shmem? If the key is unique, will the 'severe problem' still happen?

If it were unique...  Actually it's not with SYSV shmem.

> >> The problem left is to make ipc key as unique as possible.
> >> Every program/library should fulfil the responsibility of generating a
> >> unique key. As stated on http://www.tldp.org/LDP/lpg/node24.html, ftok()
> >> should be used to generate a unique key.
> > 
> > This doesn't guarantee the uniqueness completely, obviously :)
> > In addition, we'd need more than 256 variants...
> 
> Oh, sorry, I thought it would be unique if all program/library use their
> own full filename as arg to ftok(). Actually this doesn't guarantee
> uniqueness. :)
> 
> What if alsa-lib changed to use POSIX shmem (i.e. shm_open()...)?

This would be one option.  I thought it in the past, too, but
postponed by some reason I forgot.

Another possible way I can think of is to determine the first user by
shmget() call without IPC_CREAT.  If this succeeds, we are no first
user.  I'm not sure, though, whether this method is more robust.


Takashi


More information about the Alsa-devel mailing list