Re: [alsa-devel] 1.1.4 bug report: dmix fails when using both 32-bit and 64-bit clients
On Thu, 25 May 2017 22:17:45 +0200, Cheng Sun wrote:
On 25 May 2017 at 19:55, Takashi Iwai tiwai@suse.de wrote:
On Thu, 25 May 2017 20:34:23 +0200, Cheng Sun wrote:
On 25 May 2017 at 19:17, Takashi Iwai tiwai@suse.de wrote:
On Thu, 25 May 2017 18:51:10 +0200, Cheng Sun wrote:
On 25 May 2017 at 17:31, Takashi Iwai tiwai@suse.de wrote:
On Thu, 25 May 2017 17:43:44 +0200, Cheng Sun wrote: > > On 25 May 2017 14:49, "Takashi Iwai" tiwai@suse.de wrote: > > > > On Wed, 24 May 2017 21:13:46 +0200, > > Cheng Sun wrote: > > > > > > Hi all, > > > > > > The following commit (released in v1.1.4) introduced an issue where a > > > client compiled against a 32-bit (x86) compile of alsa-lib cannot use > > > the same dmix as a client compiled against a 64-bit alsa-lib. > > > > > > 1a9bd0f0448106b917ae7f7bedccfcbf6ce84802 > > > > > > The error message is: > > > > > > ALSA lib pcm_dmix.c:1077:(snd_pcm_dmix_open) unable to create IPC > > > shm instance > > > > > > The following forum post (not mine) describes the symptoms of the > > > issue in more detail (although the git commit the poster specified is > > > not correct): > > > > > > > https://forums.gentoo.org/viewtopic-p-8072290.html?sid=c2fbca45102933ddf9ca7... > > > > There is a known problem when you mix up two alsa-lib versions because > > of the changes in dmix shmem struct. The solution should upgrade all > > users. > > (Apologies for duplicate email --accidentally sent the first reply to the > wrong address.) > > Arch Linux has upgraded both its 32 and 64 bit packages to v1.1.4, and I'm > still experiencing this problem. > > While bisecting the bug I linked against two compiles of the same alsa-lib > versions each time. > > Let me know if I can provide any other information.
OK, at first to see whether it's the issue of the struct size mismatch, check the following patch. If it is, you'll see the error message. To be sure, check both cases: 64bit -> 32bit, and 32bit -> 64bit.
The behaviour is different for the two cases.
- If 64bit client is started first, then the 32bit client prints the following:
magic mismatch: shmptr=a15ad4f0, expected=a15ad4ec ALSA lib pcm_dmix.c:1077:(snd_pcm_dmix_open) unable to create IPC shm instance
- If 32bit client is started first, then the 64bit client only prints:
ALSA lib pcm_dmix.c:1077:(snd_pcm_dmix_open) unable to create IPC shm instance
I've added some further print statements and it turns out that this is because the "return err;" on line 115 is triggered.
Hm, so at least the struct size difference is a problem. Doese the patch below change the behavior?
Your patch did not change the behaviour.
Hmm, then it's 8 bytes alignment.
However, as an experiment I added "__attribute__((__packed__))" to the struct, which fixed the problem. So this does seem to be caused by gcc padding structs differently when compiling 32-bit vs 64-bit.
I don't think "__attribute__((__packed__))" is necessarily the correct solution here though.
OK, that's at least one way to fix. Another way would be to put another 4 bytes integer as a padding. Could you give it a try, too?
Takashi
The following patch works. I'm a little worried that future rearrangement/additions to the struct will cause it to break again though...
Yes, maybe the combination of both is the better way. That is, we should add packed attribute, but OTOH we should avoid the incompatibility again from 1.1.4 at least on x86-64. If a padding + packed doesn't change the struct size on x86-64 from the unpatched struct, we can go with it.
Alternatively, we may move the new field into some of unused old fields. Below is an example. This will make it backward compatible with 1.1.3 and older versions.
thanks,
Takashi
-- 8< -- --- a/src/pcm/pcm_direct.c +++ b/src/pcm/pcm_direct.c @@ -615,7 +615,7 @@ int snd_pcm_direct_slave_recover(snd_pcm } return ret; } - direct->shmptr->recoveries++; + direct->shmptr->s.recoveries++; semerr = snd_pcm_direct_semaphore_up(direct, DIRECT_IPC_SEM_CLIENT); if (semerr < 0) { @@ -631,11 +631,11 @@ int snd_pcm_direct_slave_recover(snd_pcm */ int snd_pcm_direct_client_chk_xrun(snd_pcm_direct_t *direct, snd_pcm_t *pcm) { - if (direct->shmptr->recoveries != direct->recoveries) { + if (direct->shmptr->s.recoveries != direct->recoveries) { /* no matter how many xruns we missed - * so don't increment but just update to actual counter */ - direct->recoveries = direct->shmptr->recoveries; + direct->recoveries = direct->shmptr->s.recoveries; pcm->fast_ops->drop(pcm); /* trigger_tstamp update is missing in drop callbacks */ gettimestamp(&direct->trigger_tstamp, pcm->tstamp_type); @@ -1539,7 +1539,7 @@ int snd_pcm_direct_open_secondary_client dmix->slave_buffer_size = spcm->buffer_size; dmix->slave_period_size = dmix->shmptr->s.period_size; dmix->slave_boundary = spcm->boundary; - dmix->recoveries = dmix->shmptr->recoveries; + dmix->recoveries = dmix->shmptr->s.recoveries;
ret = snd_pcm_mmap(spcm); if (ret < 0) { --- a/src/pcm/pcm_direct.h +++ b/src/pcm/pcm_direct.h @@ -66,7 +66,6 @@ typedef struct { char socket_name[256]; /* name of communication socket */ snd_pcm_type_t type; /* PCM type (currently only hw) */ int use_server; - unsigned int recoveries; /* no of executed recoveries on slave*/ struct { unsigned int format; snd_interval_t rate; @@ -95,7 +94,7 @@ typedef struct { unsigned int stop_threshold; unsigned int silence_threshold; unsigned int silence_size; - unsigned int xfer_align; /* not used */ + unsigned int recoveries; /* no of executed recoveries on slave*/ unsigned long long boundary; unsigned int info; unsigned int msbits;
On 25 May 2017 at 22:32, Takashi Iwai tiwai@suse.de wrote:
On Thu, 25 May 2017 22:17:45 +0200, Cheng Sun wrote:
On 25 May 2017 at 19:55, Takashi Iwai tiwai@suse.de wrote:
On Thu, 25 May 2017 20:34:23 +0200, Cheng Sun wrote:
On 25 May 2017 at 19:17, Takashi Iwai tiwai@suse.de wrote:
On Thu, 25 May 2017 18:51:10 +0200, Cheng Sun wrote:
On 25 May 2017 at 17:31, Takashi Iwai tiwai@suse.de wrote: > On Thu, 25 May 2017 17:43:44 +0200, > Cheng Sun wrote: >> >> On 25 May 2017 14:49, "Takashi Iwai" tiwai@suse.de wrote: >> > >> > On Wed, 24 May 2017 21:13:46 +0200, >> > Cheng Sun wrote: >> > > >> > > Hi all, >> > > >> > > The following commit (released in v1.1.4) introduced an issue where a >> > > client compiled against a 32-bit (x86) compile of alsa-lib cannot use >> > > the same dmix as a client compiled against a 64-bit alsa-lib. >> > > >> > > 1a9bd0f0448106b917ae7f7bedccfcbf6ce84802 >> > > >> > > The error message is: >> > > >> > > ALSA lib pcm_dmix.c:1077:(snd_pcm_dmix_open) unable to create IPC >> > > shm instance >> > > >> > > The following forum post (not mine) describes the symptoms of the >> > > issue in more detail (although the git commit the poster specified is >> > > not correct): >> > > >> > > >> https://forums.gentoo.org/viewtopic-p-8072290.html?sid=c2fbca45102933ddf9ca7... >> > >> > There is a known problem when you mix up two alsa-lib versions because >> > of the changes in dmix shmem struct. The solution should upgrade all >> > users. >> >> (Apologies for duplicate email --accidentally sent the first reply to the >> wrong address.) >> >> Arch Linux has upgraded both its 32 and 64 bit packages to v1.1.4, and I'm >> still experiencing this problem. >> >> While bisecting the bug I linked against two compiles of the same alsa-lib >> versions each time. >> >> Let me know if I can provide any other information. > > OK, at first to see whether it's the issue of the struct size > mismatch, check the following patch. If it is, you'll see the error > message. To be sure, check both cases: 64bit -> 32bit, and 32bit -> > 64bit.
The behaviour is different for the two cases.
- If 64bit client is started first, then the 32bit client prints the following:
magic mismatch: shmptr=a15ad4f0, expected=a15ad4ec ALSA lib pcm_dmix.c:1077:(snd_pcm_dmix_open) unable to create IPC shm instance
- If 32bit client is started first, then the 64bit client only prints:
ALSA lib pcm_dmix.c:1077:(snd_pcm_dmix_open) unable to create IPC shm instance
I've added some further print statements and it turns out that this is because the "return err;" on line 115 is triggered.
Hm, so at least the struct size difference is a problem. Doese the patch below change the behavior?
Your patch did not change the behaviour.
Hmm, then it's 8 bytes alignment.
However, as an experiment I added "__attribute__((__packed__))" to the struct, which fixed the problem. So this does seem to be caused by gcc padding structs differently when compiling 32-bit vs 64-bit.
I don't think "__attribute__((__packed__))" is necessarily the correct solution here though.
OK, that's at least one way to fix. Another way would be to put another 4 bytes integer as a padding. Could you give it a try, too?
Takashi
The following patch works. I'm a little worried that future rearrangement/additions to the struct will cause it to break again though...
Yes, maybe the combination of both is the better way. That is, we should add packed attribute, but OTOH we should avoid the incompatibility again from 1.1.4 at least on x86-64. If a padding + packed doesn't change the struct size on x86-64 from the unpatched struct, we can go with it.
Alternatively, we may move the new field into some of unused old fields. Below is an example. This will make it backward compatible with 1.1.3 and older versions.
thanks,
Takashi
Yes, that sounds like a good idea. I can confirm that this patch works (32-bit, 64-bit and v1.1.3 all tested to be compatible).
Cheers, Cheng
participants (2)
-
Cheng Sun
-
Takashi Iwai