[alsa-devel] EBADFD caused by commit dec428c352217010e4b8bd750d302b8062339d32

Lars Lindqvist lars.lindqvist at yandex.ru
Mon Apr 11 17:08:01 CEST 2016


On 2016-04-11 Takashi Iwai wrote:
> [Added Qing Cai to Cc, who was the author of the patch in question]
> 
> On Sun, 10 Apr 2016 23:57:11 +0200,
> Lars Lindqvist wrote:
> > 
> > Hi!
> > 
> > Since alsa-lib commit  dec428c352217010e4b8bd750d302b8062339d32, I've
> > occasionally been hit by an EBADFD whenever any program tries to play
> > sound. The situation I get is that the first shmget succeeds, so
> > dmix->shmid >= 0, therefore first_instance = 0.
> 
> I wonder how does this succeed?  It's a leftover shmem?
> But then why it contains the garbage...?

I seem to be able to trigger it by having one client open, starting another,
and quickly closing the first one. I'm not sufficiently familiar with the
alsa source, but I would guess that shmget succeeds because someone is still
attached, and the shmem then gets torn down when the first client closes,
leaving it in a bad state for the second client. So a new kind of race
condition is possible.
However, in this scenario, if I just stop playback or kill the processes,
everything is fine again the next time an alsa-lib user starts up:
first_instance == 1 and buf.shm_nattch == 1.

This is in contrast to the occasional problem I've been having, which I don't
know exactly how to trigger, where shmget apparently always succeeds, giving
first_instance == 0 and buf.shm_nattch == 1. I've only been able to fix that
by completely resetting the driver with rmmod + modprobe.
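
To make the two states concrete, here is a minimal standalone sketch of the
attach sequence as I understand it. It is not the alsa-lib code: the struct
and function names (demo_share, demo_attach, demo_attach_segment,
demo_should_clear, demo_clear_if_needed) and the 0666 permissions are
assumptions made up for illustration. It only uses the plain System V calls
shmget, shmat and shmctl, and assumes that an existing segment is tried
first and that a new one is created only on ENOENT.

```c
#define _XOPEN_SOURCE 700   /* expose the System V IPC declarations */
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical stand-in for snd_pcm_direct_share_t; only its size matters. */
struct demo_share {
    unsigned int magic;
    char payload[60];
};

/* Hypothetical record of how the segment was obtained. */
struct demo_attach {
    int shmid;
    void *shmptr;
    int first_instance;     /* 1 only if this call created the segment */
    unsigned long nattch;   /* buf.shm_nattch as seen right after shmat */
};

/* Attach to an existing segment, or create it if it does not exist yet. */
static int demo_attach_segment(key_t key, struct demo_attach *out)
{
    struct shmid_ds buf;

    assert(out != NULL);
    out->first_instance = 0;
    out->shmid = shmget(key, sizeof(struct demo_share), 0666);
    if (out->shmid < 0) {
        if (errno != ENOENT)
            return -errno;
        out->shmid = shmget(key, sizeof(struct demo_share),
                            IPC_CREAT | IPC_EXCL | 0666);
        if (out->shmid < 0)
            return -errno;
        out->first_instance = 1;
    }
    out->shmptr = shmat(out->shmid, NULL, 0);
    if (out->shmptr == (void *)-1)
        return -errno;
    if (shmctl(out->shmid, IPC_STAT, &buf) < 0) {
        int saved = errno;
        shmdt(out->shmptr);
        return -saved;
    }
    out->nattch = (unsigned long)buf.shm_nattch;
    return 0;
}

/* The condition from the proposed diff: creator, or sole attached user. */
static int demo_should_clear(const struct demo_attach *a)
{
    return a->first_instance || a->nattch == 1;
}

/* Zero the segment when the condition holds; returns 1 if it was cleared. */
static int demo_clear_if_needed(struct demo_attach *a)
{
    if (!demo_should_clear(a))
        return 0;
    memset(a->shmptr, 0, sizeof(struct demo_share));
    return 1;
}
```

If one process attaches, writes into the segment and detaches without
removing it, the next caller of demo_attach_segment() gets
first_instance == 0 and nattch == 1 and sees the stale contents, which
matches the persistent state above. With the first_instance check alone that
segment would not be cleared; with the combined condition it would be.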

> 
> > But buf.shm_nattch = 1,  so before the commit shmptr would have been
> > zeroed out, but isn't anymore. And since I still have:
> > dmix->shmptr->magic == SND_PCM_DIRECT_MAGIC,  I don't get EINVAL, but
> > EBADFD, somewhere down the line.
> 
> Could you give which line actually gives EBADFD?

In the case where I can trigger it at will, it is from pcm_dmix.c:1074, where
snd_pcm_open_slave returns -EBADFD.
But I don't know where it comes from in the spontaneous, more persistent
case. I'm sure it is not the same place, since SNDERR("unable to open slave")
is run when snd_pcm_open_slave() < 0, and I get no such message in that case.
I'll try to pinpoint it the next chance I get.

> 
> > From what I understand,  the race condition that was fixed would still
> > be avoided if shmptr was zeroed on (first_instance || buf.shm_nattch == 1).
> > If that is the case, would you please consider applying attached diff?
>  
> This may work, but I still would like to see how another unexpected
> situation happens.
> 
> 
> thanks,
> 
> Takashi
> 
> > Regards,
> > Lars Lindqvist
> > diff -Naur alsa-lib-1.1.1.orig/src/pcm/pcm_direct.c alsa-lib-1.1.1/src/pcm/pcm_direct.c
> > --- alsa-lib-1.1.1.orig/src/pcm/pcm_direct.c	2016-03-31 15:10:39.000000000 +0200
> > +++ alsa-lib-1.1.1/src/pcm/pcm_direct.c	2016-04-10 17:44:08.815456305 +0200
> > @@ -125,7 +125,7 @@
> >  		snd_pcm_direct_shm_discard(dmix);
> >  		return err;
> >  	}
> > -	if (first_instance) {	/* we're the first user, clear the segment */
> > +	if (first_instance || buf.shm_nattch == 1) {	/* we're the first user, clear the segment */
> >  		memset(dmix->shmptr, 0, sizeof(snd_pcm_direct_share_t));
> >  		if (dmix->ipc_gid >= 0) {
> >  			buf.shm_perm.gid = dmix->ipc_gid;

