On Wed, 04 Sep 2024 11:29:05 +0200, Jaroslav Kysela wrote:
On 04. 09. 24 11:07, Chancel Liu wrote:
Hi Takashi,
Thanks for your reply and suggestions. Finally we have found the root cause. Seems it's related to both drivers and alsa-lib.
When two dmix clients run in parallel we get two direct dmix instances. 1st dmix instance: snd_pcm_dmix_open() snd_pcm_direct_initialize_slave() save_slave_setting() Since the driver we are using has SND_PCM_INFO_RESUME flag, dmix->spcm->info has this flag. Then this flag is cleared in dmix->shmptr->s.info. 2nd dmix instance: snd_pcm_dmix_open() snd_pcm_direct_open_secondary_client() copy_slave_setting() 2nd dmix->spcm->info is copied from dmix->shmptr->s.info so it doesn' has this flag.
If 1st dmix instance resumes firstly it should implement recovery of slave pcm in snd_pcm_direct_slave_recover(). Because 1st dmix->spcm->info has SND_PCM_INFO_RESUMEļ¼snd_pcm_resume(direct->spcm) can be called correctly to resume slave pcm.
However if 2nd dmix instance resumes firstly, snd_pcm_resume(direct->spcm) will not be called because it's spcm->info doesn't has SND_PCM_INFO_RESUME flag. The 1st dmix instance assumes someone else already did recovery so snd_pcm_resume(direct->spcm) won't be called neither. In result the slave pcm fails to resume.
The snd_pcm_direct_slave_recover() function should be called for both dmix instances. It calls snd_pcm_prepare() for the "driver" PCM, so the driver should recover from suspend in this case, too.
IIUC, it's called. snd_pcm_direct_check_xrun() is called in many places to get the state synced, and this sets the PCM state to SND_PCM_STATE_SUSPENDED when shmptr->s.recoveries changes. Then the application calls snd_pcm_resume(), and it calls snd_pcm_direct_slave_recovery().
See the "some buggy drivers" comment in snd_pcm_direct_slave_recover(). It looks like a driver issue, the "resume" flag mangling is just a workaround.
A sort of, yes. The clearance of INFO_RESUME flag should have been done no matter whether we do this workaround or not, though.
SND_PCM_INFO_RESUME flag has impact on the flow of dmix resume. In my opinion the first resumed dmix instance should make sure slave pcm can be recovered properly no matter it's the first opened instance or secondary opened instance. Do you know why the secondary opened instance clear the SND_PCM_INFO_RESUME flag? Can we do the following modification?
diff --git a/src/pcm/pcm_direct.c b/src/pcm/pcm_direct.c @@ -1183,8 +1226,6 @@ static void save_slave_setting(snd_pcm_direct_t *dmix, snd_pcm_t *spcm) COPY_SLAVE(buffer_time); COPY_SLAVE(sample_bits); COPY_SLAVE(frame_bits);
dmix->shmptr->s.info &= ~SND_PCM_INFO_RESUME;
Another option is to fix the buggy drivers and remove the workround (or make it configurable) from alsa-lib (revert commit 6d1d620eadf32c6d963468ce56ff52cc3a2f32e2).
I'm afraid that this problem is irrelevant with it.
Although I wrote it as "buggy", it might be better phrased as "fragile". We haven't defined strictly how the state should be changed when SUSPENDED or PAUSED to PREPARE. Ideally, we could just jump to PREPARE without clearing the state, but some devices seem assuming the clearance of those state at first.
Takashi