On Sat, 05 Mar 2016 09:35:51 +0100, bsiice wrote:
On 03/05/2016 03:36 PM, Takashi Iwai wrote:
On Sat, 05 Mar 2016 07:37:34 +0100, bsiice wrote:
On 03/05/2016 02:08 AM, Takashi Iwai wrote:
On Fri, 04 Mar 2016 18:36:46 +0100, bsiice wrote:
On 03/05/2016 12:37 AM, Takashi Iwai wrote:
On Fri, 04 Mar 2016 17:22:58 +0100, bsiice wrote: > > On 03/03/2016 09:53 PM, Takashi Iwai wrote: >> On Thu, 03 Mar 2016 13:40:42 +0100, >> IceBsi wrote: >>> >>> From: Qing Cai caiqing@neusoft.com >>> >>> As stated in manpage SHMCTL(2), shm_nattch is "No. of current attaches" >>> (i.e., number of processes attached to the shared memeory). If an >>> application uses alsa-lib and invokes fork() at some point, there should >>> be the following execution sequence: >>> 1. execute the following statement: >>> pcm_direct.c:110: dmix->shmptr = shmat(dmix->shmid, 0, 0) >>> (shm_nattch becomes 1) >>> 2. invoke fork() in some thread. >>> (shm_nattch becomes 2) >>> 3. execute the following statement: >>> pcm_direct.c:122: if (buf.shm_nattch == 1) >>> 4. execute the following statement: >>> pcm_direct.c:131: if (dmix->shmptr->magic != SND_PCM_DIRECT_MAGIC) >>> (As stated in manpage SHMGET(2), "When a new shared memory segment >>> is created, its contents are initialized to zero values", so >>> dmix->shmptr->magic is 0) >>> 5. execute the following statements: >>> pcm_direct.c:132: snd_pcm_direct_shm_discard(dmix) >>> pcm_direct.c:133: return -EINVAL >>> The above execution sequence will cause the following error: >>> unable to create IPC shm instance >>> This error causes multimedia application has no sound. This error rarely >>> occurs, probability is about 1%. >>> Because the first user of the shared memory will get that >>> dmix->shmptr->magic is 0, check dmix->shmptr->magic's value to determine >>> if "we're the first user" is OK. >>> Tests have been made 400+ times after this fix, and the issue no longer >>> exists. >> >> I think this is still racy. Multiple users can grab the shmem at the >> very same time. Maybe it looks as if working just because both users >> behavior as the first user and do clear and initialize. > I think this won't be a race condition. Since > snd_pcm_direct_shm_create_or_connect() is protected by > snd_pcm_direct_semaphore_down() and snd_pcm_direct_semaphore_up(), > multiple processes won't do clear and initialization concurrently.
Hrm, OK, now understood the situation.
>> The check of bus.shm_nattach=1 should be fine, per se. The problem is >> the magic key check of the secondary. In the current code, as you >> pointed out, this may happen before the first client finishes the >> initialization. > I think the check of dmix->shmptr->magic!=SND_PCM_DIRECT_MAGIC is more > robust. Assume there are two clients of shmem and shmem is initialized, > if one of the clients exits, shm_nattach will become 1 and shmem may be > initialized again.
Yeah, in the fork from a thread, the shm_nattch check doesn't work reliably, indeed.
> In the current code, because there is semaphore, magic key check won't > happen before initialization of shmem. > In the scenario that I want to tell, there is only one process, and only > one thread doning the clear and initialization of shmem, and the fork() > is invoked in another thread of non alsa-lib context (i.e., the fork() > code is not in alsa-lib). If the fork() is invoked before shmat(), the > problem won't happen; If the fork() is invoked after the check of > bus.shm_nattach=1, the problem won't happen too. Only if the fork() is > invoked just after shmat() and just before the check of > bus.shm_nattach=1, the problem happens.
Well, the reason why the magic key check was introduced was about the safety. Since the shmid is given explicitly, it might override some shmem usages other than dmix accidentally. With your patch, it forcibly clears the region -- this is quite opposite to the intention of the magic key check.
I just think it clears the region only once at the first time. Could you please show me the scenario in detail?
The shmid ipc key is given explicitly by an alsa-lib configuration, but there is no guarantee that this key has never been used by some other programs for a completely different purpose. So, for non-first user, it verifies whether the attached region really belongs to alsa-lib dmix.
Takashi
Thank you for explanation! Now I understood. I have tried changing ipc_key many times when I have been investigating the problem, and changing ipc_key doesn't solve the problem I encounterd.
I think there is more benefits to do magic key check instead of shm_nattch check, because the situation that I encounterd happens more often than the situation that other program has a same ipc key with alsa-lib.
Yes, possibly. However, which may cause a severe problem? It's the question, too.
By 'severe problem', do you mean that alsa-lib may clear other program's shmem? If the key is unique, will the 'severe problem' still happen?
If it were unique... Actually it's not with SYSV shmem.
The problem left is to make ipc key as unique as possible. Every program/library should fulfil the responsibility of generating a unique key. As stated on http://www.tldp.org/LDP/lpg/node24.html, ftok() should be used to generate a unique key.
This doesn't guarantee the uniqueness completely, obviously :) In addition, we'd need more than 256 variants...
Oh, sorry, I thought it would be unique if all program/library use their own full filename as arg to ftok(). Actually this doesn't guarantee uniqueness. :)
What if alsa-lib changed to use POSIX shmem (i.e. shm_open()...)?
This would be one option. I thought it in the past, too, but postponed by some reason I forgot.
Another possible way I can think of is to determine the first user by shmget() call without IPC_CREAT. If this succeeds, we are no first user. I'm not sure, though, whether this method is more robust.
Takashi