On Wed, 24 Mar 2021 15:53:33 +0100, Andrey Grodzovsky wrote:
Appreciate you investing the effort in helping on this. I will start to merge it now as it doesn't apply cleanly on my branch.
If I understand correctly your main HW access prevention mechanism during the PCI prepare-rescan period is by bailing out on IOCTLs with the check of power state == SNDRV_CTL_POWER_D0 or waiting when a user process closes it's device file descriptor in patches 2 and 5. For command submission prevention you use the freeze flag from patch 6. If I haven't missed anything I don't see how those all protect when new device is plugged while any of those operations are already in flight. What prevents concurrent HW access from an IOCTL already running and HW suspend and MMIO unampping in rescan_preapre which starts after IOCTL began ?
The call of snd_pcm_suspend_all() is the key. That puts all PCM streams in the stopped and suspended state. And the codec devices as well as the controller device will be put into the suspended state. So, for the PCM side, this should be fine with it, I guess.
However, after sending the patches, I noticed that they won't suffice for the pending control calls. For the control get/put callbacks that have been pending before setting the power_state=D3hot, they would still kick off the runtime PM and bad things may happen. Due to that, the bus.frozen flag won't work reliably, I'm afraid.
So, in that part, we need the code to sync the execution of the pending get/put calls in addition. Maybe refcounting the control ioctls that are involved with get/put calls (SNDRV_CTL_IOCTL_ELEM_READ, _WRITE, maybe with _TLV_READ, too) and wait for those ioctls finishing before actually starting the codec suspend procedure.
Takashi