Adding movable PCIe BARs support in snd_hda_intel
Hi, I am working on adding AMD related drivers support for PCIe BARs move feature developed by Sergei (https://linuxplumbersconf.org/event/7/contributions/847/attachments/584/1035...). His feature is still not upstream, all his code and mine on top can be seen here - https://cgit.freedesktop.org/~agrodzov/linux/log/?h=yadro/pcie_hotplug/movab... I did basic implementation fro amdgpu driver and now I am doing the same for snd_hda_intel to support our on GPU Azalia audio chips. This relevant commit is here - https://cgit.freedesktop.org/~agrodzov/linux/commit/?h=yadro/pcie_hotplug/mo... Any driver that wants to support movable BARs needs to implement rescan_prepare, rescan_done and bar_fixed callbacks where rescan_prepare needs to stop HW/SW and unamp all MMIO mappings and rescan_done needs to ioremap the BAR from it's new MMIO location and restart HW/SW.
I am able currently to trigger BARs move by sysfs using "/sys/bus/pci/rescan" and the driver will go through the sequence I described above without hangs. Problem is that after this when i try to use mplayer I am getting following errors:
andrey@andrey-test:~$ sudo mplayer -ao alsa:device=hw=0.3 Downloads/file_example_MP3_5MG.mp3 MPlayer 1.3.0 (Debian), built with gcc-9 (C) 2000-2016 MPlayer Team do_connect: could not connect to socket connect: No such file or directory Failed to open LIRC support. You will not be able to use your remote control.
Playing Downloads/file_example_MP3_5MG.mp3. libavformat version 58.29.100 (external) Audio only file format detected. Load subtitles in Downloads/ ========================================================================== Opening audio decoder: [mpg123] MPEG 1.0/2.0/2.5 layers I, II, III AUDIO: 44100 Hz, 2 ch, s16le, 320.0 kbit/22.68% (ratio: 40000->176400) Selected audio codec: [mpg123] afm: mpg123 (MPEG 1.0/2.0/2.5 layers I, II, III) ========================================================================== AO: [alsa] 44100Hz 2ch s16le (2 bytes per sample) Video: no video Starting playback... A: 0.1 (00.0) of 132.0 (02:12.0) ??,?% Audio device got stuck!
and in dmesg I see [ 365.355518] snd_hda_codec_hdmi hdaudioC0D0: Unable to sync register 0x2f0d00. -5
Also I see 296.619014] snd_hda_intel 0000:0a:00.1: CORB reset timeout#2, CORBRP = 65535 error during the rescan_done callback execution
I know it has to do with the move of BAR's MMIO address because when i disallow BAR migration by returning true from bar_fixed callback I have no such errors and mplayer works fine.
I enabled MMIO trace and didn't see that post BAR move there is a wrong MMIO access - all of them are from the new MMIO base offset - 0xfcd80000 (trace attached including mmio trace and dmesg)
I would be happy for any idea on this and any comment on the correctness of my sequence in the code
Andrey
Just an update, i found the issue which was actually to wake up the HW before doing stop/restart (using pm_runtime_get_sync), also handled protecting from concurrent snd_pcm_ioctls accessing the registers while the BAR is unmapped. Can go through BAR move while aplay is running now.
Once again, would be happy for any comments on the code - https://cgit.freedesktop.org/~agrodzov/linux/log/?h=yadro/pcie_hotplug/movab...
Andrey
On 2021-03-19 5:22 p.m., Andrey Grodzovsky wrote:
Hi, I am working on adding AMD related drivers support for PCIe BARs move feature developed by Sergei (https://linuxplumbersconf.org/event/7/contributions/847/attachments/584/1035...).
His feature is still not upstream, all his code and mine on top can be seen here - https://cgit.freedesktop.org/~agrodzov/linux/log/?h=yadro/pcie_hotplug/movab...
I did basic implementation fro amdgpu driver and now I am doing the same for snd_hda_intel to support our on GPU Azalia audio chips. This relevant commit is here - https://cgit.freedesktop.org/~agrodzov/linux/commit/?h=yadro/pcie_hotplug/mo...
Any driver that wants to support movable BARs needs to implement rescan_prepare, rescan_done and bar_fixed callbacks where rescan_prepare needs to stop HW/SW and unamp all MMIO mappings and rescan_done needs to ioremap the BAR from it's new MMIO location and restart HW/SW.
I am able currently to trigger BARs move by sysfs using "/sys/bus/pci/rescan" and the driver will go through the sequence I described above without hangs. Problem is that after this when i try to use mplayer I am getting following errors:
andrey@andrey-test:~$ sudo mplayer -ao alsa:device=hw=0.3 Downloads/file_example_MP3_5MG.mp3 MPlayer 1.3.0 (Debian), built with gcc-9 (C) 2000-2016 MPlayer Team do_connect: could not connect to socket connect: No such file or directory Failed to open LIRC support. You will not be able to use your remote control.
Playing Downloads/file_example_MP3_5MG.mp3. libavformat version 58.29.100 (external) Audio only file format detected. Load subtitles in Downloads/ ========================================================================== Opening audio decoder: [mpg123] MPEG 1.0/2.0/2.5 layers I, II, III AUDIO: 44100 Hz, 2 ch, s16le, 320.0 kbit/22.68% (ratio: 40000->176400) Selected audio codec: [mpg123] afm: mpg123 (MPEG 1.0/2.0/2.5 layers I, II, III) ========================================================================== AO: [alsa] 44100Hz 2ch s16le (2 bytes per sample) Video: no video Starting playback... A: 0.1 (00.0) of 132.0 (02:12.0) ??,?% Audio device got stuck!
and in dmesg I see [ 365.355518] snd_hda_codec_hdmi hdaudioC0D0: Unable to sync register 0x2f0d00. -5
Also I see 296.619014] snd_hda_intel 0000:0a:00.1: CORB reset timeout#2, CORBRP = 65535 error during the rescan_done callback execution
I know it has to do with the move of BAR's MMIO address because when i disallow BAR migration by returning true from bar_fixed callback I have no such errors and mplayer works fine.
I enabled MMIO trace and didn't see that post BAR move there is a wrong MMIO access - all of them are from the new MMIO base offset - 0xfcd80000 (trace attached including mmio trace and dmesg)
I would be happy for any idea on this and any comment on the correctness of my sequence in the code
Andrey
On Tue, 23 Mar 2021 12:23:16 +0100, Andrey Grodzovsky wrote:
Just an update, i found the issue which was actually to wake up the HW before doing stop/restart (using pm_runtime_get_sync), also handled protecting from concurrent snd_pcm_ioctls accessing the registers while the BAR is unmapped. Can go through BAR move while aplay is running now.
Once again, would be happy for any comments on the code - https://cgit.freedesktop.org/~agrodzov/linux/log/?h=yadro/pcie_hotplug/movab...
Hrm, this kind of change is a bit scary; we have some mechanism to stop the HD-audio hardware operation (e.g. for loading the DSP for CA0132 codec), but it's pretty hackish and error-prone.
Do we really need the BAR remap while operating the sound streaming? That is, can't we simply refuse the remap if it's in operation? If it can be refused, we may avoid a big locking or such.
thanks,
Takashi
Andrey
On 2021-03-19 5:22 p.m., Andrey Grodzovsky wrote:
Hi, I am working on adding AMD related drivers support for PCIe BARs move feature developed by Sergei (https://linuxplumbersconf.org/event/7/contributions/847/attachments/584/1035...).
His feature is still not upstream, all his code and mine on top can be seen here - https://cgit.freedesktop.org/~agrodzov/linux/log/?h=yadro/pcie_hotplug/movab...
I did basic implementation fro amdgpu driver and now I am doing the same for snd_hda_intel to support our on GPU Azalia audio chips. This relevant commit is here - https://cgit.freedesktop.org/~agrodzov/linux/commit/?h=yadro/pcie_hotplug/mo...
Any driver that wants to support movable BARs needs to implement rescan_prepare, rescan_done and bar_fixed callbacks where rescan_prepare needs to stop HW/SW and unamp all MMIO mappings and rescan_done needs to ioremap the BAR from it's new MMIO location and restart HW/SW.
I am able currently to trigger BARs move by sysfs using "/sys/bus/pci/rescan" and the driver will go through the sequence I described above without hangs. Problem is that after this when i try to use mplayer I am getting following errors:
andrey@andrey-test:~$ sudo mplayer -ao alsa:device=hw=0.3 Downloads/file_example_MP3_5MG.mp3 MPlayer 1.3.0 (Debian), built with gcc-9 (C) 2000-2016 MPlayer Team do_connect: could not connect to socket connect: No such file or directory Failed to open LIRC support. You will not be able to use your remote control.
Playing Downloads/file_example_MP3_5MG.mp3. libavformat version 58.29.100 (external) Audio only file format detected. Load subtitles in Downloads/ ========================================================================== Opening audio decoder: [mpg123] MPEG 1.0/2.0/2.5 layers I, II, III AUDIO: 44100 Hz, 2 ch, s16le, 320.0 kbit/22.68% (ratio: 40000->176400) Selected audio codec: [mpg123] afm: mpg123 (MPEG 1.0/2.0/2.5 layers I, II, III) ========================================================================== AO: [alsa] 44100Hz 2ch s16le (2 bytes per sample) Video: no video Starting playback... A: 0.1 (00.0) of 132.0 (02:12.0) ??,?% Audio device got stuck!
and in dmesg I see [ 365.355518] snd_hda_codec_hdmi hdaudioC0D0: Unable to sync register 0x2f0d00. -5
Also I see 296.619014] snd_hda_intel 0000:0a:00.1: CORB reset timeout#2, CORBRP = 65535 error during the rescan_done callback execution
I know it has to do with the move of BAR's MMIO address because when i disallow BAR migration by returning true from bar_fixed callback I have no such errors and mplayer works fine.
I enabled MMIO trace and didn't see that post BAR move there is a wrong MMIO access - all of them are from the new MMIO base offset - 0xfcd80000 (trace attached including mmio trace and dmesg)
I would be happy for any idea on this and any comment on the correctness of my sequence in the code
Andrey
On 2021-03-23 7:39 a.m., Takashi Iwai wrote:
On Tue, 23 Mar 2021 12:23:16 +0100, Andrey Grodzovsky wrote:
Just an update, i found the issue which was actually to wake up the HW before doing stop/restart (using pm_runtime_get_sync), also handled protecting from concurrent snd_pcm_ioctls accessing the registers while the BAR is unmapped. Can go through BAR move while aplay is running now.
Once again, would be happy for any comments on the code - https://nam11.safelinks.protection.outlook.com/?url=https:%2F%2Fcgit.freedes...
Hrm, this kind of change is a bit scary; we have some mechanism to stop the HD-audio hardware operation (e.g. for loading the DSP for CA0132 codec), but it's pretty hackish and error-prone.
At least I didn't have issue with HW stopping, the problem is with unampping hdac_bus->remap_addr - as long as it's not iorampped back no code besides the PCI BAR move should run at all as any register access from this address will cause fatal page fault. I took care of most of the IOCTls i think, the interrupts are disabled in azx_stop_chip but I am probably still missing some stuff like some background work items or other IOCTls code path.
Do we really need the BAR remap while operating the sound streaming? That is, can't we simply refuse the remap if it's in operation? If it can be refused, we may avoid a big locking or such.
Problem here is that this feature is for the sake of hot-plug of a device, this supposed to give enough MMIO space to place all the BARs of newly plugging in device. Refusing to move BARs because there is some app using the sound card on the background might then lead to failure of plugging in the new device and this doesn't sound as a reasonable behavior to me. Also note that the lock's impact is minimal as it's read side lock only in the IOCTLs an so as long as hot plug not taking place it has no impact.
Andrey
thanks,
Takashi
Andrey
On 2021-03-19 5:22 p.m., Andrey Grodzovsky wrote:
Hi, I am working on adding AMD related drivers support for PCIe BARs move feature developed by Sergei (https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flinuxplumb...).
His feature is still not upstream, all his code and mine on top can be seen here - https://nam11.safelinks.protection.outlook.com/?url=https:%2F%2Fcgit.freedes...
I did basic implementation fro amdgpu driver and now I am doing the same for snd_hda_intel to support our on GPU Azalia audio chips. This relevant commit is here - https://nam11.safelinks.protection.outlook.com/?url=https:%2F%2Fcgit.freedes...
Any driver that wants to support movable BARs needs to implement rescan_prepare, rescan_done and bar_fixed callbacks where rescan_prepare needs to stop HW/SW and unamp all MMIO mappings and rescan_done needs to ioremap the BAR from it's new MMIO location and restart HW/SW.
I am able currently to trigger BARs move by sysfs using "/sys/bus/pci/rescan" and the driver will go through the sequence I described above without hangs. Problem is that after this when i try to use mplayer I am getting following errors:
andrey@andrey-test:~$ sudo mplayer -ao alsa:device=hw=0.3 Downloads/file_example_MP3_5MG.mp3 MPlayer 1.3.0 (Debian), built with gcc-9 (C) 2000-2016 MPlayer Team do_connect: could not connect to socket connect: No such file or directory Failed to open LIRC support. You will not be able to use your remote control.
Playing Downloads/file_example_MP3_5MG.mp3. libavformat version 58.29.100 (external) Audio only file format detected. Load subtitles in Downloads/ ========================================================================== Opening audio decoder: [mpg123] MPEG 1.0/2.0/2.5 layers I, II, III AUDIO: 44100 Hz, 2 ch, s16le, 320.0 kbit/22.68% (ratio: 40000->176400) Selected audio codec: [mpg123] afm: mpg123 (MPEG 1.0/2.0/2.5 layers I, II, III) ========================================================================== AO: [alsa] 44100Hz 2ch s16le (2 bytes per sample) Video: no video Starting playback... A: 0.1 (00.0) of 132.0 (02:12.0) ??,?% Audio device got stuck!
and in dmesg I see [ 365.355518] snd_hda_codec_hdmi hdaudioC0D0: Unable to sync register 0x2f0d00. -5
Also I see 296.619014] snd_hda_intel 0000:0a:00.1: CORB reset timeout#2, CORBRP = 65535 error during the rescan_done callback execution
I know it has to do with the move of BAR's MMIO address because when i disallow BAR migration by returning true from bar_fixed callback I have no such errors and mplayer works fine.
I enabled MMIO trace and didn't see that post BAR move there is a wrong MMIO access - all of them are from the new MMIO base offset - 0xfcd80000 (trace attached including mmio trace and dmesg)
I would be happy for any idea on this and any comment on the correctness of my sequence in the code
Andrey
On Tue, 23 Mar 2021 15:22:01 +0100, Andrey Grodzovsky wrote:
On 2021-03-23 7:39 a.m., Takashi Iwai wrote:
On Tue, 23 Mar 2021 12:23:16 +0100, Andrey Grodzovsky wrote:
Just an update, i found the issue which was actually to wake up the HW before doing stop/restart (using pm_runtime_get_sync), also handled protecting from concurrent snd_pcm_ioctls accessing the registers while the BAR is unmapped. Can go through BAR move while aplay is running now.
Once again, would be happy for any comments on the code - https://nam11.safelinks.protection.outlook.com/?url=https:%2F%2Fcgit.freedes...
Hrm, this kind of change is a bit scary; we have some mechanism to stop the HD-audio hardware operation (e.g. for loading the DSP for CA0132 codec), but it's pretty hackish and error-prone.
At least I didn't have issue with HW stopping, the problem is with unampping hdac_bus->remap_addr - as long as it's not iorampped back no code besides the PCI BAR move should run at all as any register access from this address will cause fatal page fault. I took care of most of the IOCTls i think, the interrupts are disabled in azx_stop_chip but I am probably still missing some stuff like some background work items or other IOCTls code path.
Are the all running processes stopped before entering rescan_prepare callback? Otherwise, e.g. a process accessing PCM stream via mmap can run without ioctl and this may trigger the PCM period elapsed eventually. Similarly, a control interface may issue the verb that is processed via CORB/RIRB, too. So what you need to cover is not only about PCM.
Do we really need the BAR remap while operating the sound streaming? That is, can't we simply refuse the remap if it's in operation? If it can be refused, we may avoid a big locking or such.
Problem here is that this feature is for the sake of hot-plug of a device, this supposed to give enough MMIO space to place all the BARs of newly plugging in device. Refusing to move BARs because there is some app using the sound card on the background might then lead to failure of plugging in the new device and this doesn't sound as a reasonable behavior to me. Also note that the lock's impact is minimal as it's read side lock only in the IOCTLs an so as long as hot plug not taking place it has no impact.
But a hot-plugged device can't have any running files, no? So at which timing is the BAR movement _must_ happen?
And, even if we need some synchronization, the current lock implementation looks fragile to me. We likely overlook something when trying to cover each branch code path.
Basically it needs to wait until all running processes via PCM or control API are released. We have a mechanism to block the new opens like snd_hda_lock_devices(). We'll need an additional stuff to wait until all opened devices are released there, and this requires some help in the core code, I suppose. (Actually this hackish lock_devices/unlock_devices should have been in the core stuff from the beginning...)
thanks,
Takashi
Andrey
thanks,
Takashi
Andrey
On 2021-03-19 5:22 p.m., Andrey Grodzovsky wrote:
Hi, I am working on adding AMD related drivers support for PCIe BARs move feature developed by Sergei (https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flinuxplumb...).
His feature is still not upstream, all his code and mine on top can be seen here - https://nam11.safelinks.protection.outlook.com/?url=https:%2F%2Fcgit.freedes...
I did basic implementation fro amdgpu driver and now I am doing the same for snd_hda_intel to support our on GPU Azalia audio chips. This relevant commit is here - https://nam11.safelinks.protection.outlook.com/?url=https:%2F%2Fcgit.freedes...
Any driver that wants to support movable BARs needs to implement rescan_prepare, rescan_done and bar_fixed callbacks where rescan_prepare needs to stop HW/SW and unamp all MMIO mappings and rescan_done needs to ioremap the BAR from it's new MMIO location and restart HW/SW.
I am able currently to trigger BARs move by sysfs using "/sys/bus/pci/rescan" and the driver will go through the sequence I described above without hangs. Problem is that after this when i try to use mplayer I am getting following errors:
andrey@andrey-test:~$ sudo mplayer -ao alsa:device=hw=0.3 Downloads/file_example_MP3_5MG.mp3 MPlayer 1.3.0 (Debian), built with gcc-9 (C) 2000-2016 MPlayer Team do_connect: could not connect to socket connect: No such file or directory Failed to open LIRC support. You will not be able to use your remote control.
Playing Downloads/file_example_MP3_5MG.mp3. libavformat version 58.29.100 (external) Audio only file format detected. Load subtitles in Downloads/ ========================================================================== Opening audio decoder: [mpg123] MPEG 1.0/2.0/2.5 layers I, II, III AUDIO: 44100 Hz, 2 ch, s16le, 320.0 kbit/22.68% (ratio: 40000->176400) Selected audio codec: [mpg123] afm: mpg123 (MPEG 1.0/2.0/2.5 layers I, II, III) ========================================================================== AO: [alsa] 44100Hz 2ch s16le (2 bytes per sample) Video: no video Starting playback... A: 0.1 (00.0) of 132.0 (02:12.0) ??,?% Audio device got stuck!
and in dmesg I see [ 365.355518] snd_hda_codec_hdmi hdaudioC0D0: Unable to sync register 0x2f0d00. -5
Also I see 296.619014] snd_hda_intel 0000:0a:00.1: CORB reset timeout#2, CORBRP = 65535 error during the rescan_done callback execution
I know it has to do with the move of BAR's MMIO address because when i disallow BAR migration by returning true from bar_fixed callback I have no such errors and mplayer works fine.
I enabled MMIO trace and didn't see that post BAR move there is a wrong MMIO access - all of them are from the new MMIO base offset - 0xfcd80000 (trace attached including mmio trace and dmesg)
I would be happy for any idea on this and any comment on the correctness of my sequence in the code
Andrey
On 2021-03-23 10:54 a.m., Takashi Iwai wrote:
On Tue, 23 Mar 2021 15:22:01 +0100, Andrey Grodzovsky wrote:
On 2021-03-23 7:39 a.m., Takashi Iwai wrote:
On Tue, 23 Mar 2021 12:23:16 +0100, Andrey Grodzovsky wrote:
Just an update, i found the issue which was actually to wake up the HW before doing stop/restart (using pm_runtime_get_sync), also handled protecting from concurrent snd_pcm_ioctls accessing the registers while the BAR is unmapped. Can go through BAR move while aplay is running now.
Once again, would be happy for any comments on the code - https://nam11.safelinks.protection.outlook.com/?url=https:%2F%2Fcgit.freedes...
Hrm, this kind of change is a bit scary; we have some mechanism to stop the HD-audio hardware operation (e.g. for loading the DSP for CA0132 codec), but it's pretty hackish and error-prone.
At least I didn't have issue with HW stopping, the problem is with unampping hdac_bus->remap_addr - as long as it's not iorampped back no code besides the PCI BAR move should run at all as any register access from this address will cause fatal page fault. I took care of most of the IOCTls i think, the interrupts are disabled in azx_stop_chip but I am probably still missing some stuff like some background work items or other IOCTls code path.
Are the all running processes stopped before entering rescan_prepare callback? Otherwise, e.g. a process accessing PCM stream via mmap can run without ioctl and this may trigger the PCM period elapsed eventually.
Is any of them remaps the BAR into user process VMA ? The only two I see in the sound subsystem doing remapping are are had_pcm_mmap and snd_pcm_lib_default_mmap and both of them remap dma buffers which are in system memory and not MMIO space and hence should not be impacted by the BAR move. Or maybe I didn't get your concern correctly ? Why there would be a problem to access the stream otherwise during the BAR move ?
Similarly, a control interface may issue the verb that is
processed via CORB/RIRB, too. So what you need to cover is not only about PCM.
Can you point me to the code path for this - is this through another type of IOCTL or interrupt handlers ?
Do we really need the BAR remap while operating the sound streaming? That is, can't we simply refuse the remap if it's in operation? If it can be refused, we may avoid a big locking or such.
Problem here is that this feature is for the sake of hot-plug of a device, this supposed to give enough MMIO space to place all the BARs of newly plugging in device. Refusing to move BARs because there is some app using the sound card on the background might then lead to failure of plugging in the new device and this doesn't sound as a reasonable behavior to me. Also note that the lock's impact is minimal as it's read side lock only in the IOCTLs an so as long as hot plug not taking place it has no impact.
But a hot-plugged device can't have any running files, no? So at which timing is the BAR movement _must_ happen?
No, the devices this code is written for are those already in the system before the new device is plugged in, the new device needs enough free MMIO space for his own BARs and so this features enables existing devices to allow the PCI core to move their BARs within the physical address space such that enough free space will be available inside the MMIO window allocated for the new device's upstream PCI bridge.
And, even if we need some synchronization, the current lock implementation looks fragile to me. We likely overlook something when trying to cover each branch code path.
Basically it needs to wait until all running processes via PCM or control API are released.
This approach brings a couple of hard questions -
How we make them stop at once - they might run for long time, you suggest something like invalidate all their mmaps and on next page fault return SIGBUS to terminate them ? Then I need also to know they actually are done - should I wait for their device file instance release? What if they don't access any mmapped are for a long time ? Sending SIGKILL explicitly ? But do I have the task id's of all of them ?
We had those scenarios discussed during my initial work on GPU hot unplug and came to a conclusssion that trying to wait for current processes to die is not a good approach - see the this respone I got at the time for example - https://lists.freedesktop.org/archives/dri-devel/2020-May/265467.html
Andrey
We have a mechanism to block the new opens
like snd_hda_lock_devices(). We'll need an additional stuff to wait until all opened devices are released there, and this requires some help in the core code, I suppose. (Actually this hackish lock_devices/unlock_devices should have been in the core stuff from the beginning...)
thanks,
Takashi
Andrey
thanks,
Takashi
Andrey
On 2021-03-19 5:22 p.m., Andrey Grodzovsky wrote:
Hi, I am working on adding AMD related drivers support for PCIe BARs move feature developed by Sergei (https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flinuxplumb...).
His feature is still not upstream, all his code and mine on top can be seen here - https://nam11.safelinks.protection.outlook.com/?url=https:%2F%2Fcgit.freedes...
I did basic implementation fro amdgpu driver and now I am doing the same for snd_hda_intel to support our on GPU Azalia audio chips. This relevant commit is here - https://nam11.safelinks.protection.outlook.com/?url=https:%2F%2Fcgit.freedes...
Any driver that wants to support movable BARs needs to implement rescan_prepare, rescan_done and bar_fixed callbacks where rescan_prepare needs to stop HW/SW and unamp all MMIO mappings and rescan_done needs to ioremap the BAR from it's new MMIO location and restart HW/SW.
I am able currently to trigger BARs move by sysfs using "/sys/bus/pci/rescan" and the driver will go through the sequence I described above without hangs. Problem is that after this when i try to use mplayer I am getting following errors:
andrey@andrey-test:~$ sudo mplayer -ao alsa:device=hw=0.3 Downloads/file_example_MP3_5MG.mp3 MPlayer 1.3.0 (Debian), built with gcc-9 (C) 2000-2016 MPlayer Team do_connect: could not connect to socket connect: No such file or directory Failed to open LIRC support. You will not be able to use your remote control.
Playing Downloads/file_example_MP3_5MG.mp3. libavformat version 58.29.100 (external) Audio only file format detected. Load subtitles in Downloads/ ========================================================================== Opening audio decoder: [mpg123] MPEG 1.0/2.0/2.5 layers I, II, III AUDIO: 44100 Hz, 2 ch, s16le, 320.0 kbit/22.68% (ratio: 40000->176400) Selected audio codec: [mpg123] afm: mpg123 (MPEG 1.0/2.0/2.5 layers I, II, III) ========================================================================== AO: [alsa] 44100Hz 2ch s16le (2 bytes per sample) Video: no video Starting playback... A: 0.1 (00.0) of 132.0 (02:12.0) ??,?% Audio device got stuck!
and in dmesg I see [ 365.355518] snd_hda_codec_hdmi hdaudioC0D0: Unable to sync register 0x2f0d00. -5
Also I see 296.619014] snd_hda_intel 0000:0a:00.1: CORB reset timeout#2, CORBRP = 65535 error during the rescan_done callback execution
I know it has to do with the move of BAR's MMIO address because when i disallow BAR migration by returning true from bar_fixed callback I have no such errors and mplayer works fine.
I enabled MMIO trace and didn't see that post BAR move there is a wrong MMIO access - all of them are from the new MMIO base offset - 0xfcd80000 (trace attached including mmio trace and dmesg)
I would be happy for any idea on this and any comment on the correctness of my sequence in the code
Andrey
On Tue, 23 Mar 2021 17:11:20 +0100, Andrey Grodzovsky wrote:
On 2021-03-23 10:54 a.m., Takashi Iwai wrote:
On Tue, 23 Mar 2021 15:22:01 +0100, Andrey Grodzovsky wrote:
On 2021-03-23 7:39 a.m., Takashi Iwai wrote:
On Tue, 23 Mar 2021 12:23:16 +0100, Andrey Grodzovsky wrote:
Just an update, i found the issue which was actually to wake up the HW before doing stop/restart (using pm_runtime_get_sync), also handled protecting from concurrent snd_pcm_ioctls accessing the registers while the BAR is unmapped. Can go through BAR move while aplay is running now.
Once again, would be happy for any comments on the code - https://nam11.safelinks.protection.outlook.com/?url=https:%2F%2Fcgit.freedes...
Hrm, this kind of change is a bit scary; we have some mechanism to stop the HD-audio hardware operation (e.g. for loading the DSP for CA0132 codec), but it's pretty hackish and error-prone.
At least I didn't have issue with HW stopping, the problem is with unampping hdac_bus->remap_addr - as long as it's not iorampped back no code besides the PCI BAR move should run at all as any register access from this address will cause fatal page fault. I took care of most of the IOCTls i think, the interrupts are disabled in azx_stop_chip but I am probably still missing some stuff like some background work items or other IOCTls code path.
Are the all running processes stopped before entering rescan_prepare callback? Otherwise, e.g. a process accessing PCM stream via mmap can run without ioctl and this may trigger the PCM period elapsed eventually.
Is any of them remaps the BAR into user process VMA ? The only two I see in the sound subsystem doing remapping are are had_pcm_mmap and snd_pcm_lib_default_mmap and both of them remap dma buffers which are in system memory and not MMIO space and hence should not be impacted by the BAR move.
When a stream is running, it can cause an interrupt that may issue an unsolicited event that is deals with CORB/RIRB later in a work asynchronously.
Or maybe I didn't get your concern correctly ? Why there would be a problem to access the stream otherwise during the BAR move ?
Similarly, a control interface may issue the verb that is
processed via CORB/RIRB, too. So what you need to cover is not only about PCM.
Can you point me to the code path for this - is this through another type of IOCTL or interrupt handlers ?
Yes, it's handled in the ALSA core control ioctls. Many of the control *_get()/*_put() callbacks in HD-audio are involved with the amp or other HD-audio verbs (typically calling snd_hda_codec_amp*(), etc). Those are the mixer elements and handled via regmap with HD-audio backend that deals with CORB/RIRB eventually.
Do we really need the BAR remap while operating the sound streaming? That is, can't we simply refuse the remap if it's in operation? If it can be refused, we may avoid a big locking or such.
Problem here is that this feature is for the sake of hot-plug of a device, this supposed to give enough MMIO space to place all the BARs of newly plugging in device. Refusing to move BARs because there is some app using the sound card on the background might then lead to failure of plugging in the new device and this doesn't sound as a reasonable behavior to me. Also note that the lock's impact is minimal as it's read side lock only in the IOCTLs an so as long as hot plug not taking place it has no impact.
But a hot-plugged device can't have any running files, no? So at which timing is the BAR movement _must_ happen?
No, the devices this code is written for are those already in the system before the new device is plugged in, the new device needs enough free MMIO space for his own BARs and so this features enables existing devices to allow the PCI core to move their BARs within the physical address space such that enough free space will be available inside the MMIO window allocated for the new device's upstream PCI bridge.
Hmm, OK, so it's other way round as I thought of.
And, even if we need some synchronization, the current lock implementation looks fragile to me. We likely overlook something when trying to cover each branch code path.
Basically it needs to wait until all running processes via PCM or control API are released.
This approach brings a couple of hard questions -
How we make them stop at once - they might run for long time, you suggest something like invalidate all their mmaps and on next page fault return SIGBUS to terminate them ? Then I need also to know they actually are done - should I wait for their device file instance release? What if they don't access any mmapped are for a long time ? Sending SIGKILL explicitly ? But do I have the task id's of all of them ?
You shouldn't kill them but wait for them aborting gracefully, IMO. Essentially a kind of hot-unplug. Most of Linux drivers won't kill processes at the hot-unplug, AFAIK.
But if it has to be done so, we have a list of file descriptors that are attached to the PCM and the control devices, too...
We had those scenarios discussed during my initial work on GPU hot unplug and came to a conclusssion that trying to wait for current processes to die is not a good approach - see the this respone I got at the time for example - https://lists.freedesktop.org/archives/dri-devel/2020-May/265467.html
But should this be seen rather as an analogy of hot-replug? That won't kill processes.
Takashi
Andrey
We have a mechanism to block the new opens
like snd_hda_lock_devices(). We'll need an additional stuff to wait until all opened devices are released there, and this requires some help in the core code, I suppose. (Actually this hackish lock_devices/unlock_devices should have been in the core stuff from the beginning...)
thanks,
Takashi
Andrey
thanks,
Takashi
Andrey
On 2021-03-19 5:22 p.m., Andrey Grodzovsky wrote:
Hi, I am working on adding AMD related drivers support for PCIe BARs move feature developed by Sergei (https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flinuxplumb...).
His feature is still not upstream, all his code and mine on top can be seen here - https://nam11.safelinks.protection.outlook.com/?url=https:%2F%2Fcgit.freedes...
I did basic implementation fro amdgpu driver and now I am doing the same for snd_hda_intel to support our on GPU Azalia audio chips. This relevant commit is here - https://nam11.safelinks.protection.outlook.com/?url=https:%2F%2Fcgit.freedes...
Any driver that wants to support movable BARs needs to implement rescan_prepare, rescan_done and bar_fixed callbacks where rescan_prepare needs to stop HW/SW and unamp all MMIO mappings and rescan_done needs to ioremap the BAR from it's new MMIO location and restart HW/SW.
I am able currently to trigger BARs move by sysfs using "/sys/bus/pci/rescan" and the driver will go through the sequence I described above without hangs. Problem is that after this when i try to use mplayer I am getting following errors:
andrey@andrey-test:~$ sudo mplayer -ao alsa:device=hw=0.3 Downloads/file_example_MP3_5MG.mp3 MPlayer 1.3.0 (Debian), built with gcc-9 (C) 2000-2016 MPlayer Team do_connect: could not connect to socket connect: No such file or directory Failed to open LIRC support. You will not be able to use your remote control.
Playing Downloads/file_example_MP3_5MG.mp3. libavformat version 58.29.100 (external) Audio only file format detected. Load subtitles in Downloads/ ========================================================================== Opening audio decoder: [mpg123] MPEG 1.0/2.0/2.5 layers I, II, III AUDIO: 44100 Hz, 2 ch, s16le, 320.0 kbit/22.68% (ratio: 40000->176400) Selected audio codec: [mpg123] afm: mpg123 (MPEG 1.0/2.0/2.5 layers I, II, III) ========================================================================== AO: [alsa] 44100Hz 2ch s16le (2 bytes per sample) Video: no video Starting playback... A: 0.1 (00.0) of 132.0 (02:12.0) ??,?% Audio device got stuck!
and in dmesg I see [ 365.355518] snd_hda_codec_hdmi hdaudioC0D0: Unable to sync register 0x2f0d00. -5
Also I see 296.619014] snd_hda_intel 0000:0a:00.1: CORB reset timeout#2, CORBRP = 65535 error during the rescan_done callback execution
I know it has to do with the move of BAR's MMIO address because when i disallow BAR migration by returning true from bar_fixed callback I have no such errors and mplayer works fine.
I enabled MMIO trace and didn't see that post BAR move there is a wrong MMIO access - all of them are from the new MMIO base offset - 0xfcd80000 (trace attached including mmio trace and dmesg)
I would be happy for any idea on this and any comment on the correctness of my sequence in the code
Andrey
On 2021-03-23 12:50 p.m., Takashi Iwai wrote:
On Tue, 23 Mar 2021 17:11:20 +0100, Andrey Grodzovsky wrote:
On 2021-03-23 10:54 a.m., Takashi Iwai wrote:
On Tue, 23 Mar 2021 15:22:01 +0100, Andrey Grodzovsky wrote:
On 2021-03-23 7:39 a.m., Takashi Iwai wrote:
On Tue, 23 Mar 2021 12:23:16 +0100, Andrey Grodzovsky wrote:
Just an update, i found the issue which was actually to wake up the HW before doing stop/restart (using pm_runtime_get_sync), also handled protecting from concurrent snd_pcm_ioctls accessing the registers while the BAR is unmapped. Can go through BAR move while aplay is running now.
Once again, would be happy for any comments on the code - https://nam11.safelinks.protection.outlook.com/?url=https:%2F%2Fcgit.freedes...
Hrm, this kind of change is a bit scary; we have some mechanism to stop the HD-audio hardware operation (e.g. for loading the DSP for CA0132 codec), but it's pretty hackish and error-prone.
At least I didn't have issue with HW stopping, the problem is with unampping hdac_bus->remap_addr - as long as it's not iorampped back no code besides the PCI BAR move should run at all as any register access from this address will cause fatal page fault. I took care of most of the IOCTls i think, the interrupts are disabled in azx_stop_chip but I am probably still missing some stuff like some background work items or other IOCTls code path.
Are the all running processes stopped before entering rescan_prepare callback? Otherwise, e.g. a process accessing PCM stream via mmap can run without ioctl and this may trigger the PCM period elapsed eventually.
Is any of them remaps the BAR into user process VMA ? The only two I see in the sound subsystem doing remapping are are had_pcm_mmap and snd_pcm_lib_default_mmap and both of them remap dma buffers which are in system memory and not MMIO space and hence should not be impacted by the BAR move.
When a stream is running, it can cause an interrupt that may issue an unsolicited event that is deals with CORB/RIRB later in a work asynchronously.
On rescan_prepare I am calling snd_hdac_bus_stop_chip which in turn seems to disable interrupts - is this enough to guarantee all those won't be happening past that code ? Should I also flush any work queue items scheduled from within those ISRs before proceeding ?
Or maybe I didn't get your concern correctly ? Why there would be a problem to access the stream otherwise during the BAR move ?
Similarly, a control interface may issue the verb that is
processed via CORB/RIRB, too. So what you need to cover is not only about PCM.
Can you point me to the code path for this - is this through another type of IOCTL or interrupt handlers ?
Yes, it's handled in the ALSA core control ioctls. Many of the control *_get()/*_put() callbacks in HD-audio are involved with the amp or other HD-audio verbs (typically calling snd_hda_codec_amp*(), etc). Those are the mixer elements and handled via regmap with HD-audio backend that deals with CORB/RIRB eventually.
Do we really need the BAR remap while operating the sound streaming? That is, can't we simply refuse the remap if it's in operation? If it can be refused, we may avoid a big locking or such.
Problem here is that this feature is for the sake of hot-plug of a device, this supposed to give enough MMIO space to place all the BARs of newly plugging in device. Refusing to move BARs because there is some app using the sound card on the background might then lead to failure of plugging in the new device and this doesn't sound as a reasonable behavior to me. Also note that the lock's impact is minimal as it's read side lock only in the IOCTLs an so as long as hot plug not taking place it has no impact.
But a hot-plugged device can't have any running files, no? So at which timing is the BAR movement _must_ happen?
No, the devices this code is written for are those already in the system before the new device is plugged in, the new device needs enough free MMIO space for his own BARs and so this features enables existing devices to allow the PCI core to move their BARs within the physical address space such that enough free space will be available inside the MMIO window allocated for the new device's upstream PCI bridge.
Hmm, OK, so it's other way round as I thought of.
And, even if we need some synchronization, the current lock implementation looks fragile to me. We likely overlook something when trying to cover each branch code path.
Basically it needs to wait until all running processes via PCM or control API are released.
This approach brings a couple of hard questions -
How we make them stop at once - they might run for long time, you suggest something like invalidate all their mmaps and on next page fault return SIGBUS to terminate them ? Then I need also to know they actually are done - should I wait for their device file instance release? What if they don't access any mmapped are for a long time ? Sending SIGKILL explicitly ? But do I have the task id's of all of them ?
You shouldn't kill them but wait for them aborting gracefully, IMO. Essentially a kind of hot-unplug. Most of Linux drivers won't kill processes at the hot-unplug, AFAIK.
But see, the difference is such that on hot-plug I may just let those process pointing at zombie device to hang around indefently because I am not waiting for any output from them. That indeed what I am doing for amdgou hot-unplug work. Here i need to blocking wait for them since I need to finish the procedure so the PCI core can finish shuffling the existing BARs around and by this freeing enough space for the new device to be accepted. I user can't plug in his thunderbolt GPU and observe that nothing is happening actually because we are stuck waiting for some process just sitting idle.
But if it has to be done so, we have a list of file descriptors that are attached to the PCM and the control devices, too...
So you suggest iterate all of them, sigkill them and confirm all of them exit instead of using the rw_sem ? This will cover IOCTLs and any mmapped accesses i guess. Interrupts we discussed above. What above any possible background kernel work going on in dedicated threads or work items ? Any pointers there what should be blocked and waited for ?
Andrey
We had those scenarios discussed during my initial work on GPU hot unplug and came to a conclusssion that trying to wait for current processes to die is not a good approach - see the this respone I got at the time for example - https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.free...
But should this be seen rather as an analogy of hot-replug? That won't kill processes.
Takashi
Andrey
We have a mechanism to block the new opens
like snd_hda_lock_devices(). We'll need an additional stuff to wait until all opened devices are released there, and this requires some help in the core code, I suppose. (Actually this hackish lock_devices/unlock_devices should have been in the core stuff from the beginning...)
thanks,
Takashi
Andrey
thanks,
Takashi
Andrey
On 2021-03-19 5:22 p.m., Andrey Grodzovsky wrote: > Hi, I am working on adding AMD related drivers support for PCIe BARs > move feature developed by Sergei > (https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flinuxplumb...). > > His feature is still not upstream, all his code and mine on top can > be seen here - > https://nam11.safelinks.protection.outlook.com/?url=https:%2F%2Fcgit.freedes... > > I did basic implementation fro amdgpu driver and now I am doing the > same for snd_hda_intel to support our on GPU Azalia audio > chips. This relevant commit is here - > https://nam11.safelinks.protection.outlook.com/?url=https:%2F%2Fcgit.freedes... > > Any driver that wants to support movable BARs needs to implement > rescan_prepare, rescan_done and bar_fixed callbacks where > rescan_prepare needs to stop HW/SW and unamp all MMIO mappings and > rescan_done needs to ioremap the BAR from it's new MMIO location and > restart HW/SW. > > I am able currently to trigger BARs move by sysfs using > "/sys/bus/pci/rescan" and the driver will go through the sequence I > described above without hangs. Problem is that after this when i try > to use mplayer I am getting following errors: > > andrey@andrey-test:~$ sudo mplayer -ao alsa:device=hw=0.3 > Downloads/file_example_MP3_5MG.mp3 > MPlayer 1.3.0 (Debian), built with gcc-9 (C) 2000-2016 MPlayer Team > do_connect: could not connect to socket > connect: No such file or directory > Failed to open LIRC support. You will not be able to use your remote > control. > > Playing Downloads/file_example_MP3_5MG.mp3. > libavformat version 58.29.100 (external) > Audio only file format detected. > Load subtitles in Downloads/ > ========================================================================== > Opening audio decoder: [mpg123] MPEG 1.0/2.0/2.5 layers I, II, III > AUDIO: 44100 Hz, 2 ch, s16le, 320.0 kbit/22.68% (ratio: 40000->176400) > Selected audio codec: [mpg123] afm: mpg123 (MPEG 1.0/2.0/2.5 layers > I, II, III) > ========================================================================== > AO: [alsa] 44100Hz 2ch s16le (2 bytes per sample) > Video: no video > Starting playback... > A: 0.1 (00.0) of 132.0 (02:12.0) ??,?% > Audio device got stuck! > > and in dmesg I see > [ 365.355518] snd_hda_codec_hdmi hdaudioC0D0: Unable to sync > register 0x2f0d00. -5 > > Also I see 296.619014] snd_hda_intel 0000:0a:00.1: CORB reset > timeout#2, CORBRP = 65535 error during the rescan_done callback > execution > > I know it has to do with the move of BAR's MMIO address because when i > disallow BAR migration by returning true from bar_fixed callback I > have no such errors and mplayer works fine. > > I enabled MMIO trace and didn't see that post BAR move there is a > wrong MMIO access - all of them are from the new MMIO base offset - > 0xfcd80000 (trace attached including mmio trace and dmesg) > > I would be happy for any idea on this and any comment on the > correctness of my sequence in the code > > Andrey
On Tue, 23 Mar 2021 18:08:14 +0100, Andrey Grodzovsky wrote:
On 2021-03-23 12:50 p.m., Takashi Iwai wrote:
On Tue, 23 Mar 2021 17:11:20 +0100, Andrey Grodzovsky wrote:
On 2021-03-23 10:54 a.m., Takashi Iwai wrote:
On Tue, 23 Mar 2021 15:22:01 +0100, Andrey Grodzovsky wrote:
On 2021-03-23 7:39 a.m., Takashi Iwai wrote:
On Tue, 23 Mar 2021 12:23:16 +0100, Andrey Grodzovsky wrote: > > Just an update, i found the issue which was actually to wake up the HW > before doing stop/restart (using pm_runtime_get_sync), also handled > protecting from concurrent snd_pcm_ioctls accessing the registers > while the BAR is unmapped. Can go through BAR move while aplay is > running now. > > Once again, would be happy for any comments on the code - > https://nam11.safelinks.protection.outlook.com/?url=https:%2F%2Fcgit.freedes...
Hrm, this kind of change is a bit scary; we have some mechanism to stop the HD-audio hardware operation (e.g. for loading the DSP for CA0132 codec), but it's pretty hackish and error-prone.
At least I didn't have issue with HW stopping, the problem is with unampping hdac_bus->remap_addr - as long as it's not iorampped back no code besides the PCI BAR move should run at all as any register access from this address will cause fatal page fault. I took care of most of the IOCTls i think, the interrupts are disabled in azx_stop_chip but I am probably still missing some stuff like some background work items or other IOCTls code path.
Are the all running processes stopped before entering rescan_prepare callback? Otherwise, e.g. a process accessing PCM stream via mmap can run without ioctl and this may trigger the PCM period elapsed eventually.
Is any of them remaps the BAR into user process VMA ? The only two I see in the sound subsystem doing remapping are are had_pcm_mmap and snd_pcm_lib_default_mmap and both of them remap dma buffers which are in system memory and not MMIO space and hence should not be impacted by the BAR move.
When a stream is running, it can cause an interrupt that may issue an unsolicited event that is deals with CORB/RIRB later in a work asynchronously.
On rescan_prepare I am calling snd_hdac_bus_stop_chip which in turn seems to disable interrupts - is this enough to guarantee all those won't be happening past that code ? Should I also flush any work queue items scheduled from within those ISRs before proceeding ?
Well, calling snd_hdac_bus_stop_chip() during operation itself isn't safe at all. It's called safely in the power management because we know all things have been already stopped. But calling it out of sudden is a completely different matter.
Or maybe I didn't get your concern correctly ? Why there would be a problem to access the stream otherwise during the BAR move ?
Similarly, a control interface may issue the verb that is
processed via CORB/RIRB, too. So what you need to cover is not only about PCM.
Can you point me to the code path for this - is this through another type of IOCTL or interrupt handlers ?
Yes, it's handled in the ALSA core control ioctls. Many of the control *_get()/*_put() callbacks in HD-audio are involved with the amp or other HD-audio verbs (typically calling snd_hda_codec_amp*(), etc). Those are the mixer elements and handled via regmap with HD-audio backend that deals with CORB/RIRB eventually.
Do we really need the BAR remap while operating the sound streaming? That is, can't we simply refuse the remap if it's in operation? If it can be refused, we may avoid a big locking or such.
Problem here is that this feature is for the sake of hot-plug of a device, this supposed to give enough MMIO space to place all the BARs of newly plugging in device. Refusing to move BARs because there is some app using the sound card on the background might then lead to failure of plugging in the new device and this doesn't sound as a reasonable behavior to me. Also note that the lock's impact is minimal as it's read side lock only in the IOCTLs an so as long as hot plug not taking place it has no impact.
But a hot-plugged device can't have any running files, no? So at which timing is the BAR movement _must_ happen?
No, the devices this code is written for are those already in the system before the new device is plugged in, the new device needs enough free MMIO space for his own BARs and so this features enables existing devices to allow the PCI core to move their BARs within the physical address space such that enough free space will be available inside the MMIO window allocated for the new device's upstream PCI bridge.
Hmm, OK, so it's other way round as I thought of.
And, even if we need some synchronization, the current lock implementation looks fragile to me. We likely overlook something when trying to cover each branch code path.
Basically it needs to wait until all running processes via PCM or control API are released.
This approach brings a couple of hard questions -
How we make them stop at once - they might run for long time, you suggest something like invalidate all their mmaps and on next page fault return SIGBUS to terminate them ? Then I need also to know they actually are done - should I wait for their device file instance release? What if they don't access any mmapped are for a long time ? Sending SIGKILL explicitly ? But do I have the task id's of all of them ?
You shouldn't kill them but wait for them aborting gracefully, IMO. Essentially a kind of hot-unplug. Most of Linux drivers won't kill processes at the hot-unplug, AFAIK.
But see, the difference is such that on hot-plug I may just let those process pointing at zombie device to hang around indefently because I am not waiting for any output from them. That indeed what I am doing for amdgou hot-unplug work. Here i need to blocking wait for them since I need to finish the procedure so the PCI core can finish shuffling the existing BARs around and by this freeing enough space for the new device to be accepted. I user can't plug in his thunderbolt GPU and observe that nothing is happening actually because we are stuck waiting for some process just sitting idle.
Then we can see it from a different angle; read below.
But if it has to be done so, we have a list of file descriptors that are attached to the PCM and the control devices, too...
So you suggest iterate all of them, sigkill them and confirm all of them exit instead of using the rw_sem ?
Note that, for the sound applications, we don't kill at disconnection. But the ALSA core stuff shuts the all operations up but for closing (also some event notification for the disconnection was done), hence we just wait for applications releasing resources. In anyway...
This will cover IOCTLs and any mmapped accesses i guess. Interrupts we discussed above. What above any possible background kernel work going on in dedicated threads or work items ? Any pointers there what should be blocked and waited for ?
An alternative idea would be the analogy of the system suspend / resume. That is, we forcibly suspend the devices at first somehow, and also restricts the further accesses by some way. Then do remap, and resume. Not sure whether this will fly high, but if it works, it should be more systematic and more robust than papering over each piece manually, I suppose.
Takashi
Andrey
We had those scenarios discussed during my initial work on GPU hot unplug and came to a conclusssion that trying to wait for current processes to die is not a good approach - see the this respone I got at the time for example - https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.free...
But should this be seen rather as an analogy of hot-replug? That won't kill processes.
Takashi
Andrey
We have a mechanism to block the new opens
like snd_hda_lock_devices(). We'll need an additional stuff to wait until all opened devices are released there, and this requires some help in the core code, I suppose. (Actually this hackish lock_devices/unlock_devices should have been in the core stuff from the beginning...)
thanks,
Takashi
Andrey
thanks,
Takashi
> > > Andrey > > On 2021-03-19 5:22 p.m., Andrey Grodzovsky wrote: >> Hi, I am working on adding AMD related drivers support for PCIe BARs >> move feature developed by Sergei >> (https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flinuxplumb...). >> >> His feature is still not upstream, all his code and mine on top can >> be seen here - >> https://nam11.safelinks.protection.outlook.com/?url=https:%2F%2Fcgit.freedes... >> >> I did basic implementation fro amdgpu driver and now I am doing the >> same for snd_hda_intel to support our on GPU Azalia audio >> chips. This relevant commit is here - >> https://nam11.safelinks.protection.outlook.com/?url=https:%2F%2Fcgit.freedes... >> >> Any driver that wants to support movable BARs needs to implement >> rescan_prepare, rescan_done and bar_fixed callbacks where >> rescan_prepare needs to stop HW/SW and unamp all MMIO mappings and >> rescan_done needs to ioremap the BAR from it's new MMIO location and >> restart HW/SW. >> >> I am able currently to trigger BARs move by sysfs using >> "/sys/bus/pci/rescan" and the driver will go through the sequence I >> described above without hangs. Problem is that after this when i try >> to use mplayer I am getting following errors: >> >> andrey@andrey-test:~$ sudo mplayer -ao alsa:device=hw=0.3 >> Downloads/file_example_MP3_5MG.mp3 >> MPlayer 1.3.0 (Debian), built with gcc-9 (C) 2000-2016 MPlayer Team >> do_connect: could not connect to socket >> connect: No such file or directory >> Failed to open LIRC support. You will not be able to use your remote >> control. >> >> Playing Downloads/file_example_MP3_5MG.mp3. >> libavformat version 58.29.100 (external) >> Audio only file format detected. >> Load subtitles in Downloads/ >> ========================================================================== >> Opening audio decoder: [mpg123] MPEG 1.0/2.0/2.5 layers I, II, III >> AUDIO: 44100 Hz, 2 ch, s16le, 320.0 kbit/22.68% (ratio: 40000->176400) >> Selected audio codec: [mpg123] afm: mpg123 (MPEG 1.0/2.0/2.5 layers >> I, II, III) >> ========================================================================== >> AO: [alsa] 44100Hz 2ch s16le (2 bytes per sample) >> Video: no video >> Starting playback... >> A: 0.1 (00.0) of 132.0 (02:12.0) ??,?% >> Audio device got stuck! >> >> and in dmesg I see >> [ 365.355518] snd_hda_codec_hdmi hdaudioC0D0: Unable to sync >> register 0x2f0d00. -5 >> >> Also I see 296.619014] snd_hda_intel 0000:0a:00.1: CORB reset >> timeout#2, CORBRP = 65535 error during the rescan_done callback >> execution >> >> I know it has to do with the move of BAR's MMIO address because when i >> disallow BAR migration by returning true from bar_fixed callback I >> have no such errors and mplayer works fine. >> >> I enabled MMIO trace and didn't see that post BAR move there is a >> wrong MMIO access - all of them are from the new MMIO base offset - >> 0xfcd80000 (trace attached including mmio trace and dmesg) >> >> I would be happy for any idea on this and any comment on the >> correctness of my sequence in the code >> >> Andrey >
On 2021-03-23 1:29 p.m., Takashi Iwai wrote:
On Tue, 23 Mar 2021 18:08:14 +0100, Andrey Grodzovsky wrote:
On 2021-03-23 12:50 p.m., Takashi Iwai wrote:
On Tue, 23 Mar 2021 17:11:20 +0100, Andrey Grodzovsky wrote:
On 2021-03-23 10:54 a.m., Takashi Iwai wrote:
On Tue, 23 Mar 2021 15:22:01 +0100, Andrey Grodzovsky wrote:
On 2021-03-23 7:39 a.m., Takashi Iwai wrote: > On Tue, 23 Mar 2021 12:23:16 +0100, > Andrey Grodzovsky wrote: >> >> Just an update, i found the issue which was actually to wake up the HW >> before doing stop/restart (using pm_runtime_get_sync), also handled >> protecting from concurrent snd_pcm_ioctls accessing the registers >> while the BAR is unmapped. Can go through BAR move while aplay is >> running now. >> >> Once again, would be happy for any comments on the code - >> https://nam11.safelinks.protection.outlook.com/?url=https:%2F%2Fcgit.freedes... > > Hrm, this kind of change is a bit scary; we have some mechanism to > stop the HD-audio hardware operation (e.g. for loading the DSP for > CA0132 codec), but it's pretty hackish and error-prone.
At least I didn't have issue with HW stopping, the problem is with unampping hdac_bus->remap_addr - as long as it's not iorampped back no code besides the PCI BAR move should run at all as any register access from this address will cause fatal page fault. I took care of most of the IOCTls i think, the interrupts are disabled in azx_stop_chip but I am probably still missing some stuff like some background work items or other IOCTls code path.
Are the all running processes stopped before entering rescan_prepare callback? Otherwise, e.g. a process accessing PCM stream via mmap can run without ioctl and this may trigger the PCM period elapsed eventually.
Is any of them remaps the BAR into user process VMA ? The only two I see in the sound subsystem doing remapping are are had_pcm_mmap and snd_pcm_lib_default_mmap and both of them remap dma buffers which are in system memory and not MMIO space and hence should not be impacted by the BAR move.
When a stream is running, it can cause an interrupt that may issue an unsolicited event that is deals with CORB/RIRB later in a work asynchronously.
On rescan_prepare I am calling snd_hdac_bus_stop_chip which in turn seems to disable interrupts - is this enough to guarantee all those won't be happening past that code ? Should I also flush any work queue items scheduled from within those ISRs before proceeding ?
Well, calling snd_hdac_bus_stop_chip() during operation itself isn't safe at all. It's called safely in the power management because we know all things have been already stopped. But calling it out of sudden is a completely different matter.
Or maybe I didn't get your concern correctly ? Why there would be a problem to access the stream otherwise during the BAR move ?
Similarly, a control interface may issue the verb that is
processed via CORB/RIRB, too. So what you need to cover is not only about PCM.
Can you point me to the code path for this - is this through another type of IOCTL or interrupt handlers ?
Yes, it's handled in the ALSA core control ioctls. Many of the control *_get()/*_put() callbacks in HD-audio are involved with the amp or other HD-audio verbs (typically calling snd_hda_codec_amp*(), etc). Those are the mixer elements and handled via regmap with HD-audio backend that deals with CORB/RIRB eventually.
> Do we really need the BAR remap while operating the sound streaming? > That is, can't we simply refuse the remap if it's in operation? > If it can be refused, we may avoid a big locking or such.
Problem here is that this feature is for the sake of hot-plug of a device, this supposed to give enough MMIO space to place all the BARs of newly plugging in device. Refusing to move BARs because there is some app using the sound card on the background might then lead to failure of plugging in the new device and this doesn't sound as a reasonable behavior to me. Also note that the lock's impact is minimal as it's read side lock only in the IOCTLs an so as long as hot plug not taking place it has no impact.
But a hot-plugged device can't have any running files, no? So at which timing is the BAR movement _must_ happen?
No, the devices this code is written for are those already in the system before the new device is plugged in, the new device needs enough free MMIO space for his own BARs and so this features enables existing devices to allow the PCI core to move their BARs within the physical address space such that enough free space will be available inside the MMIO window allocated for the new device's upstream PCI bridge.
Hmm, OK, so it's other way round as I thought of.
And, even if we need some synchronization, the current lock implementation looks fragile to me. We likely overlook something when trying to cover each branch code path.
Basically it needs to wait until all running processes via PCM or control API are released.
This approach brings a couple of hard questions -
How we make them stop at once - they might run for long time, you suggest something like invalidate all their mmaps and on next page fault return SIGBUS to terminate them ? Then I need also to know they actually are done - should I wait for their device file instance release? What if they don't access any mmapped are for a long time ? Sending SIGKILL explicitly ? But do I have the task id's of all of them ?
You shouldn't kill them but wait for them aborting gracefully, IMO. Essentially a kind of hot-unplug. Most of Linux drivers won't kill processes at the hot-unplug, AFAIK.
But see, the difference is such that on hot-plug I may just let those process pointing at zombie device to hang around indefently because I am not waiting for any output from them. That indeed what I am doing for amdgou hot-unplug work. Here i need to blocking wait for them since I need to finish the procedure so the PCI core can finish shuffling the existing BARs around and by this freeing enough space for the new device to be accepted. I user can't plug in his thunderbolt GPU and observe that nothing is happening actually because we are stuck waiting for some process just sitting idle.
Then we can see it from a different angle; read below.
But if it has to be done so, we have a list of file descriptors that are attached to the PCM and the control devices, too...
So you suggest iterate all of them, sigkill them and confirm all of them exit instead of using the rw_sem ?
Note that, for the sound applications, we don't kill at disconnection. But the ALSA core stuff shuts the all operations up but for closing (also some event notification for the disconnection was done), hence we just wait for applications releasing resources. In anyway...
This will cover IOCTLs and any mmapped accesses i guess. Interrupts we discussed above. What above any possible background kernel work going on in dedicated threads or work items ? Any pointers there what should be blocked and waited for ?
An alternative idea would be the analogy of the system suspend / resume. That is, we forcibly suspend the devices at first somehow, and also restricts the further accesses by some way. Then do remap,
But that the point I guess, how you block further accesses without those big locks, during S3 i believe user mode gets suspended before the driver and so you don't need to worry about concurrent IOCTLs when going through suspend sequence
Andrey
and resume. Not sure whether this will fly high, but if it works, it should be more systematic and more robust than papering over each piece manually, I suppose.
Takashi
Andrey
We had those scenarios discussed during my initial work on GPU hot unplug and came to a conclusssion that trying to wait for current processes to die is not a good approach - see the this respone I got at the time for example - https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flists.free...
But should this be seen rather as an analogy of hot-replug? That won't kill processes.
Takashi
Andrey
We have a mechanism to block the new opens
like snd_hda_lock_devices(). We'll need an additional stuff to wait until all opened devices are released there, and this requires some help in the core code, I suppose. (Actually this hackish lock_devices/unlock_devices should have been in the core stuff from the beginning...)
thanks,
Takashi
Andrey
> > > thanks, > > Takashi > >> >> >> Andrey >> >> On 2021-03-19 5:22 p.m., Andrey Grodzovsky wrote: >>> Hi, I am working on adding AMD related drivers support for PCIe BARs >>> move feature developed by Sergei >>> (https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flinuxplumb...). >>> >>> His feature is still not upstream, all his code and mine on top can >>> be seen here - >>> https://nam11.safelinks.protection.outlook.com/?url=https:%2F%2Fcgit.freedes... >>> >>> I did basic implementation fro amdgpu driver and now I am doing the >>> same for snd_hda_intel to support our on GPU Azalia audio >>> chips. This relevant commit is here - >>> https://nam11.safelinks.protection.outlook.com/?url=https:%2F%2Fcgit.freedes... >>> >>> Any driver that wants to support movable BARs needs to implement >>> rescan_prepare, rescan_done and bar_fixed callbacks where >>> rescan_prepare needs to stop HW/SW and unamp all MMIO mappings and >>> rescan_done needs to ioremap the BAR from it's new MMIO location and >>> restart HW/SW. >>> >>> I am able currently to trigger BARs move by sysfs using >>> "/sys/bus/pci/rescan" and the driver will go through the sequence I >>> described above without hangs. Problem is that after this when i try >>> to use mplayer I am getting following errors: >>> >>> andrey@andrey-test:~$ sudo mplayer -ao alsa:device=hw=0.3 >>> Downloads/file_example_MP3_5MG.mp3 >>> MPlayer 1.3.0 (Debian), built with gcc-9 (C) 2000-2016 MPlayer Team >>> do_connect: could not connect to socket >>> connect: No such file or directory >>> Failed to open LIRC support. You will not be able to use your remote >>> control. >>> >>> Playing Downloads/file_example_MP3_5MG.mp3. >>> libavformat version 58.29.100 (external) >>> Audio only file format detected. >>> Load subtitles in Downloads/ >>> ========================================================================== >>> Opening audio decoder: [mpg123] MPEG 1.0/2.0/2.5 layers I, II, III >>> AUDIO: 44100 Hz, 2 ch, s16le, 320.0 kbit/22.68% (ratio: 40000->176400) >>> Selected audio codec: [mpg123] afm: mpg123 (MPEG 1.0/2.0/2.5 layers >>> I, II, III) >>> ========================================================================== >>> AO: [alsa] 44100Hz 2ch s16le (2 bytes per sample) >>> Video: no video >>> Starting playback... >>> A: 0.1 (00.0) of 132.0 (02:12.0) ??,?% >>> Audio device got stuck! >>> >>> and in dmesg I see >>> [ 365.355518] snd_hda_codec_hdmi hdaudioC0D0: Unable to sync >>> register 0x2f0d00. -5 >>> >>> Also I see 296.619014] snd_hda_intel 0000:0a:00.1: CORB reset >>> timeout#2, CORBRP = 65535 error during the rescan_done callback >>> execution >>> >>> I know it has to do with the move of BAR's MMIO address because when i >>> disallow BAR migration by returning true from bar_fixed callback I >>> have no such errors and mplayer works fine. >>> >>> I enabled MMIO trace and didn't see that post BAR move there is a >>> wrong MMIO access - all of them are from the new MMIO base offset - >>> 0xfcd80000 (trace attached including mmio trace and dmesg) >>> >>> I would be happy for any idea on this and any comment on the >>> correctness of my sequence in the code >>> >>> Andrey >>
On Tue, 23 Mar 2021 19:25:53 +0100, Andrey Grodzovsky wrote:
This will cover IOCTLs and any mmapped accesses i guess. Interrupts we discussed above. What above any possible background kernel work going on in dedicated threads or work items ? Any pointers there what should be blocked and waited for ?
An alternative idea would be the analogy of the system suspend / resume. That is, we forcibly suspend the devices at first somehow, and also restricts the further accesses by some way. Then do remap,
But that the point I guess, how you block further accesses without those big locks, during S3 i believe user mode gets suspended before the driver and so you don't need to worry about concurrent IOCTLs when going through suspend sequence
ALSA core still has some legacy card-level power management code, which was introduced many years ago at the time we still managed the power state via an extra ioctl (hence working individually from the base PM code), and a few pieces are still effective for this kind of purposes. Through a quick glance, a couple of places need band-aids, but the rest should work.
A bit more difficult problem is the floating control API calls. The get/put calls might be still in flight when we perform the PCI rescan. This has to be filtered out additionally.
Below are a patch series I cooked quickly. Totally untested, just checked the compilation. The first patch is a fix I'll merge in anyway, while the rest are RFC.
thanks,
Takashi
Appreciate you investing the effort in helping on this. I will start to merge it now as it doesn't apply cleanly on my branch.
If I understand correctly your main HW access prevention mechanism during the PCI prepare-rescan period is by bailing out on IOCTLs with the check of power state == SNDRV_CTL_POWER_D0 or waiting when a user process closes it's device file descriptor in patches 2 and 5. For command submission prevention you use the freeze flag from patch 6. If I haven't missed anything I don't see how those all protect when new device is plugged while any of those operations are already in flight. What prevents concurrent HW access from an IOCTL already running and HW suspend and MMIO unampping in rescan_preapre which starts after IOCTL began ?
Andrey
On 2021-03-24 6:00 a.m., Takashi Iwai wrote:
On Tue, 23 Mar 2021 19:25:53 +0100, Andrey Grodzovsky wrote:
This will cover IOCTLs and any mmapped accesses i guess. Interrupts we discussed above. What above any possible background kernel work going on in dedicated threads or work items ? Any pointers there what should be blocked and waited for ?
An alternative idea would be the analogy of the system suspend / resume. That is, we forcibly suspend the devices at first somehow, and also restricts the further accesses by some way. Then do remap,
But that the point I guess, how you block further accesses without those big locks, during S3 i believe user mode gets suspended before the driver and so you don't need to worry about concurrent IOCTLs when going through suspend sequence
ALSA core still has some legacy card-level power management code, which was introduced many years ago at the time we still managed the power state via an extra ioctl (hence working individually from the base PM code), and a few pieces are still effective for this kind of purposes. Through a quick glance, a couple of places need band-aids, but the rest should work.
A bit more difficult problem is the floating control API calls. The get/put calls might be still in flight when we perform the PCI rescan. This has to be filtered out additionally.
Below are a patch series I cooked quickly. Totally untested, just checked the compilation. The first patch is a fix I'll merge in anyway, while the rest are RFC.
thanks,
Takashi
On Wed, 24 Mar 2021 15:53:33 +0100, Andrey Grodzovsky wrote:
Appreciate you investing the effort in helping on this. I will start to merge it now as it doesn't apply cleanly on my branch.
If I understand correctly your main HW access prevention mechanism during the PCI prepare-rescan period is by bailing out on IOCTLs with the check of power state == SNDRV_CTL_POWER_D0 or waiting when a user process closes it's device file descriptor in patches 2 and 5. For command submission prevention you use the freeze flag from patch 6. If I haven't missed anything I don't see how those all protect when new device is plugged while any of those operations are already in flight. What prevents concurrent HW access from an IOCTL already running and HW suspend and MMIO unampping in rescan_preapre which starts after IOCTL began ?
The call of snd_pcm_suspend_all() is the key. That puts all PCM streams in the stopped and suspended state. And the codec devices as well as the controller device will be put into the suspended state. So, for the PCM side, this should be fine with it, I guess.
However, after sending the patches, I noticed that they won't suffice for the pending control calls. For the control get/put callbacks that have been pending before setting the power_state=D3hot, they would still kick off the runtime PM and bad things may happen. Due to that, the bus.frozen flag won't work reliably, I'm afraid.
So, in that part, we need the code to sync the execution of the pending get/put calls in addition. Maybe refcounting the control ioctls that are involved with get/put calls (SNDRV_CTL_IOCTL_ELEM_READ, _WRITE, maybe with _TLV_READ, too) and wait for those ioctls finishing before actually starting the codec suspend procedure.
Takashi
I merged the patches, since it looks like my branch is older I didn't have chip->init_failed member and also seems there was a typo on ur side as there is no flush_work_sync, only flush_work which waits synchronously anyway.
Everything works fine when testing with maplyer running concurrently to PCI rescan cycle.
Can you be more specific what are those get/put calls, I am thinking about some waitqueue to wait on in rescan_prepare after setting snd_hdac_bus_freeze, on wakeup it checks that a counter dropped back to zero. Not sure on which entity to hang this counter ?
Andrey
On 2021-03-24 11:06 a.m., Takashi Iwai wrote:
On Wed, 24 Mar 2021 15:53:33 +0100, Andrey Grodzovsky wrote:
Appreciate you investing the effort in helping on this. I will start to merge it now as it doesn't apply cleanly on my branch.
If I understand correctly your main HW access prevention mechanism during the PCI prepare-rescan period is by bailing out on IOCTLs with the check of power state == SNDRV_CTL_POWER_D0 or waiting when a user process closes it's device file descriptor in patches 2 and 5. For command submission prevention you use the freeze flag from patch 6. If I haven't missed anything I don't see how those all protect when new device is plugged while any of those operations are already in flight. What prevents concurrent HW access from an IOCTL already running and HW suspend and MMIO unampping in rescan_preapre which starts after IOCTL began ?
The call of snd_pcm_suspend_all() is the key. That puts all PCM streams in the stopped and suspended state. And the codec devices as well as the controller device will be put into the suspended state. So, for the PCM side, this should be fine with it, I guess.
However, after sending the patches, I noticed that they won't suffice for the pending control calls. For the control get/put callbacks that have been pending before setting the power_state=D3hot, they would still kick off the runtime PM and bad things may happen. Due to that, the bus.frozen flag won't work reliably, I'm afraid.
So, in that part, we need the code to sync the execution of the pending get/put calls in addition. Maybe refcounting the control ioctls that are involved with get/put calls (SNDRV_CTL_IOCTL_ELEM_READ, _WRITE, maybe with _TLV_READ, too) and wait for those ioctls finishing before actually starting the codec suspend procedure.
Takashi
On Wed, 24 Mar 2021 16:43:02 +0100, Andrey Grodzovsky wrote:
I merged the patches, since it looks like my branch is older I didn't have chip->init_failed member and also seems there was a typo on ur side as there is no flush_work_sync, only flush_work which waits synchronously anyway.
Ah that must be some typos I forgot to refresh.
Everything works fine when testing with maplyer running concurrently to PCI rescan cycle.
Can you be more specific what are those get/put calls, I am thinking about some waitqueue to wait on in rescan_prepare after setting snd_hdac_bus_freeze, on wakeup it checks that a counter dropped back to zero. Not sure on which entity to hang this counter ?
It's something like the patch 6 in the v2 series below. At this time, I dropped snd_hdac_bus_freeze() as this is basically useless.
Maybe this is no optimal implementation and we might improve a bit, but I guess you get an idea from that.
Also I added ifdef CONFIG_PM in the last patch as it obviously depends on it. It could be put in Kconfig somehow, too, but a simple ifdef should suffice.
Takashi
Few comments -
1) Why we don't use snd_power_wait_and_ref in patch 3 in the common handler ? Don't we want the PCI rescan sequence to 'wait for' any in flight taks that might be accessing registers and not only read/write/tlv accesses ?
2) Possible deadlock - In azx_rescan_prepare - you put the card into SNDRV_CTL_POWER_D3hot first and then 'wait for' all in flight tasks with the refcount. The in flight tasks on the other hand, using snd_power_wait_and_ref, may have already bumped up the refcount and now 'wait for' the card to go into SNDRV_CTL_POWER_D0 which can't happen since PCI rescan waits for the refocunt to drop to 0 before proceeding.
Instead of snd_power_wait_and_ref can't we just call snd_power_ref in common IOCTL before checking for power_state != SNDRV_CTL_POWER_D0 ? Or is it because you don't want to fail IOCTLs ?
Andrey
On 2021-03-24 4:36 p.m., Takashi Iwai wrote:
On Wed, 24 Mar 2021 16:43:02 +0100, Andrey Grodzovsky wrote:
I merged the patches, since it looks like my branch is older I didn't have chip->init_failed member and also seems there was a typo on ur side as there is no flush_work_sync, only flush_work which waits synchronously anyway.
Ah that must be some typos I forgot to refresh.
Everything works fine when testing with maplyer running concurrently to PCI rescan cycle.
Can you be more specific what are those get/put calls, I am thinking about some waitqueue to wait on in rescan_prepare after setting snd_hdac_bus_freeze, on wakeup it checks that a counter dropped back to zero. Not sure on which entity to hang this counter ?
It's something like the patch 6 in the v2 series below. At this time, I dropped snd_hdac_bus_freeze() as this is basically useless.
Maybe this is no optimal implementation and we might improve a bit, but I guess you get an idea from that.
Also I added ifdef CONFIG_PM in the last patch as it obviously depends on it. It could be put in Kconfig somehow, too, but a simple ifdef should suffice.
Takashi
On Wed, 24 Mar 2021 22:43:24 +0100, Andrey Grodzovsky wrote:
Few comments -
- Why we don't use snd_power_wait_and_ref in patch 3 in the common
handler ? Don't we want the PCI rescan sequence to 'wait for' any in flight taks that might be accessing registers and not only read/write/tlv accesses ?
Right, we don't need to block all ioctls, but only the ones that may access the hardware. So basically the patches 3-5 can be dropped if we take the patch 6. The current patch was written on top of the previous series, that's why it has both.
- Possible deadlock -
In azx_rescan_prepare - you put the card into SNDRV_CTL_POWER_D3hot first and then 'wait for' all in flight tasks with the refcount. The in flight tasks on the other hand, using snd_power_wait_and_ref, may have already bumped up the refcount and now 'wait for' the card to go into SNDRV_CTL_POWER_D0 which can't happen since PCI rescan waits for the refocunt to drop to 0 before proceeding.
Instead of snd_power_wait_and_ref can't we just call snd_power_ref in common IOCTL before checking for power_state != SNDRV_CTL_POWER_D0 ? Or is it because you don't want to fail IOCTLs ?
No, this is the intended behavior and should work as-is because snd_power_wait_and_ref() drops the refcount in the loop before sleeping. The inc before the state check is a must for covering the possible race, and ditto for changing the power_state to D3hot before syncing.
Takashi
On 2021-03-25 2:58 a.m., Takashi Iwai wrote:
On Wed, 24 Mar 2021 22:43:24 +0100, Andrey Grodzovsky wrote:
Few comments -
- Why we don't use snd_power_wait_and_ref in patch 3 in the common
handler ? Don't we want the PCI rescan sequence to 'wait for' any in flight taks that might be accessing registers and not only read/write/tlv accesses ?
Right, we don't need to block all ioctls, but only the ones that may access the hardware. So basically the patches 3-5 can be dropped if we take the patch 6. The current patch was written on top of the previous series, that's why it has both.
- Possible deadlock -
In azx_rescan_prepare - you put the card into SNDRV_CTL_POWER_D3hot first and then 'wait for' all in flight tasks with the refcount. The in flight tasks on the other hand, using snd_power_wait_and_ref, may have already bumped up the refcount and now 'wait for' the card to go into SNDRV_CTL_POWER_D0 which can't happen since PCI rescan waits for the refocunt to drop to 0 before proceeding.
Instead of snd_power_wait_and_ref can't we just call snd_power_ref in common IOCTL before checking for power_state != SNDRV_CTL_POWER_D0 ? Or is it because you don't want to fail IOCTLs ?
No, this is the intended behavior and should work as-is because snd_power_wait_and_ref() drops the refcount in the loop before sleeping. The inc before the state check is a must for covering the possible race, and ditto for changing the power_state to D3hot before syncing.
Ohh, missed the refcount dec in the wait loop... My bad.
On boot the latest patch-set was throwing refcount warnings since u are not supposed to inc from zerro count and so I fixed with the attached patch. That way seems to work fine.
Andrey
Takashi
On Thu, 25 Mar 2021 17:13:41 +0100, Andrey Grodzovsky wrote:
On 2021-03-25 2:58 a.m., Takashi Iwai wrote:
On Wed, 24 Mar 2021 22:43:24 +0100, Andrey Grodzovsky wrote:
Few comments -
- Why we don't use snd_power_wait_and_ref in patch 3 in the common
handler ? Don't we want the PCI rescan sequence to 'wait for' any in flight taks that might be accessing registers and not only read/write/tlv accesses ?
Right, we don't need to block all ioctls, but only the ones that may access the hardware. So basically the patches 3-5 can be dropped if we take the patch 6. The current patch was written on top of the previous series, that's why it has both.
- Possible deadlock -
In azx_rescan_prepare - you put the card into SNDRV_CTL_POWER_D3hot first and then 'wait for' all in flight tasks with the refcount. The in flight tasks on the other hand, using snd_power_wait_and_ref, may have already bumped up the refcount and now 'wait for' the card to go into SNDRV_CTL_POWER_D0 which can't happen since PCI rescan waits for the refocunt to drop to 0 before proceeding.
Instead of snd_power_wait_and_ref can't we just call snd_power_ref in common IOCTL before checking for power_state != SNDRV_CTL_POWER_D0 ? Or is it because you don't want to fail IOCTLs ?
No, this is the intended behavior and should work as-is because snd_power_wait_and_ref() drops the refcount in the loop before sleeping. The inc before the state check is a must for covering the possible race, and ditto for changing the power_state to D3hot before syncing.
Ohh, missed the refcount dec in the wait loop... My bad.
On boot the latest patch-set was throwing refcount warnings since u are not supposed to inc from zerro count and so I fixed with the attached patch. That way seems to work fine.
Ah, I completely forgot about the oddness of refcount_t. Then I guess it's simpler to replace with atomic_*() instead.
thanks,
Takashi
Attached.
Andrey
On 2021-03-25 12:38 p.m., Takashi Iwai wrote:
On Thu, 25 Mar 2021 17:13:41 +0100, Andrey Grodzovsky wrote:
On 2021-03-25 2:58 a.m., Takashi Iwai wrote:
On Wed, 24 Mar 2021 22:43:24 +0100, Andrey Grodzovsky wrote:
Few comments -
- Why we don't use snd_power_wait_and_ref in patch 3 in the common
handler ? Don't we want the PCI rescan sequence to 'wait for' any in flight taks that might be accessing registers and not only read/write/tlv accesses ?
Right, we don't need to block all ioctls, but only the ones that may access the hardware. So basically the patches 3-5 can be dropped if we take the patch 6. The current patch was written on top of the previous series, that's why it has both.
- Possible deadlock -
In azx_rescan_prepare - you put the card into SNDRV_CTL_POWER_D3hot first and then 'wait for' all in flight tasks with the refcount. The in flight tasks on the other hand, using snd_power_wait_and_ref, may have already bumped up the refcount and now 'wait for' the card to go into SNDRV_CTL_POWER_D0 which can't happen since PCI rescan waits for the refocunt to drop to 0 before proceeding.
Instead of snd_power_wait_and_ref can't we just call snd_power_ref in common IOCTL before checking for power_state != SNDRV_CTL_POWER_D0 ? Or is it because you don't want to fail IOCTLs ?
No, this is the intended behavior and should work as-is because snd_power_wait_and_ref() drops the refcount in the loop before sleeping. The inc before the state check is a must for covering the possible race, and ditto for changing the power_state to D3hot before syncing.
Ohh, missed the refcount dec in the wait loop... My bad.
On boot the latest patch-set was throwing refcount warnings since u are not supposed to inc from zerro count and so I fixed with the attached patch. That way seems to work fine.
Ah, I completely forgot about the oddness of refcount_t. Then I guess it's simpler to replace with atomic_*() instead.
thanks,
Takashi
On Fri, 26 Mar 2021 22:27:26 +0100, Andrey Grodzovsky wrote:
Attached.
Thanks. I found that there are a few other places that may enter the hardware access and might need the power_state blocking similarly.
The patch set below is a refreshed one including those.
Takashi
Tested and Looks good.
Thanks a lot for ur help! I will be moving now to work on USB driver as this is another on CARD controller we need to take care of.
Andrey
On 2021-03-27 4:17 a.m., Takashi Iwai wrote:
On Fri, 26 Mar 2021 22:27:26 +0100, Andrey Grodzovsky wrote:
Attached.
Thanks. I found that there are a few other places that may enter the hardware access and might need the power_state blocking similarly.
The patch set below is a refreshed one including those.
Takashi
On Mon, 29 Mar 2021 16:47:04 +0200, Andrey Grodzovsky wrote:
Tested and Looks good.
Thanks a lot for ur help! I will be moving now to work on USB driver as this is another on CARD controller we need to take care of.
OK, let me know if you need the final version some time later. The last (main) patch has no description, and I corrected a few minor things from the previous patch set.
Or, if you can give some branch I can put those patches on the top, I can keep it in my tree, too.
thanks,
Takashi
Andrey
On 2021-03-27 4:17 a.m., Takashi Iwai wrote:
On Fri, 26 Mar 2021 22:27:26 +0100, Andrey Grodzovsky wrote:
Attached.
Thanks. I found that there are a few other places that may enter the hardware access and might need the power_state blocking similarly.
The patch set below is a refreshed one including those.
Takashi
Yea, better give me final version now as I might forget later.
I keep my work here - https://cgit.freedesktop.org/~agrodzov/linux/log/?h=yadro/pcie_hotplug/movab... It's a clone of Segei's original branch here https://github.com/YADRO-KNS/linux/commits/yadro/pcie_hotplug/movable_bars_v... + My amdgpu work and your ALSA work on top, once I finish the USB part and resolve a few issues I still observe with AMDGPU i will ask Sergei to merge everything to his branch and form there I guess he will move on to submit another RFC.
Andrey
On 2021-03-29 10:52 a.m., Takashi Iwai wrote:
On Mon, 29 Mar 2021 16:47:04 +0200, Andrey Grodzovsky wrote:
Tested and Looks good.
Thanks a lot for ur help! I will be moving now to work on USB driver as this is another on CARD controller we need to take care of.
OK, let me know if you need the final version some time later. The last (main) patch has no description, and I corrected a few minor things from the previous patch set.
Or, if you can give some branch I can put those patches on the top, I can keep it in my tree, too.
thanks,
Takashi
Andrey
On 2021-03-27 4:17 a.m., Takashi Iwai wrote:
On Fri, 26 Mar 2021 22:27:26 +0100, Andrey Grodzovsky wrote:
Attached.
Thanks. I found that there are a few other places that may enter the hardware access and might need the power_state blocking similarly.
The patch set below is a refreshed one including those.
Takashi
Hi Takashi,
I love your patch! Perhaps something to improve:
[auto build test WARNING on v5.12-rc4] [also build test WARNING on next-20210324] [cannot apply to sound/for-next] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Takashi-Iwai/ALSA-hda-Re-add-droppe... base: 0d02ec6b3136c73c09e7859f0d0e4e2c4c07b49b config: arc-allyesconfig (attached as .config) compiler: arceb-elf-gcc (GCC) 9.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/9d72f7bfcc53f9e360ceba244596818bb00f... git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Takashi-Iwai/ALSA-hda-Re-add-dropped-snd_poewr_change_state/20210325-043958 git checkout 9d72f7bfcc53f9e360ceba244596818bb00f1f7d # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=arc
If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot lkp@intel.com
All warnings (new ones prefixed by >>):
sound/pci/hda/hda_intel.c: In function 'azx_rescan_prepare': sound/pci/hda/hda_intel.c:2419:3: error: implicit declaration of function 'azx_prepare'; did you mean 'azx_create'? [-Werror=implicit-function-declaration] 2419 | azx_prepare(&pdev->dev); | ^~~~~~~~~~~ | azx_create sound/pci/hda/hda_intel.c:2427:3: error: implicit declaration of function 'azx_suspend' [-Werror=implicit-function-declaration] 2427 | azx_suspend(&pdev->dev); | ^~~~~~~~~~~ sound/pci/hda/hda_intel.c: In function 'azx_rescan_done': sound/pci/hda/hda_intel.c:2466:3: error: implicit declaration of function 'azx_resume'; did you mean 'azx_remove'? [-Werror=implicit-function-declaration] 2466 | azx_resume(&pdev->dev); | ^~~~~~~~~~ | azx_remove sound/pci/hda/hda_intel.c:2471:3: error: implicit declaration of function 'azx_complete'; did you mean 'complete'? [-Werror=implicit-function-declaration] 2471 | azx_complete(&pdev->dev); | ^~~~~~~~~~~~ | complete sound/pci/hda/hda_intel.c: At top level: sound/pci/hda/hda_intel.c:2845:3: error: 'struct pci_driver' has no member named 'rescan_prepare' 2845 | .rescan_prepare = azx_rescan_prepare, | ^~~~~~~~~~~~~~
sound/pci/hda/hda_intel.c:2845:20: warning: initialization of 'unsigned int' from 'void (*)(struct pci_dev *)' makes integer from pointer without a cast [-Wint-conversion]
2845 | .rescan_prepare = azx_rescan_prepare, | ^~~~~~~~~~~~~~~~~~ sound/pci/hda/hda_intel.c:2845:20: note: (near initialization for 'azx_driver.dynids.lock.<anonymous>.rlock.raw_lock.slock') sound/pci/hda/hda_intel.c:2846:3: error: 'struct pci_driver' has no member named 'rescan_done' 2846 | .rescan_done = azx_rescan_done, | ^~~~~~~~~~~
sound/pci/hda/hda_intel.c:2846:17: warning: excess elements in struct initializer
2846 | .rescan_done = azx_rescan_done, | ^~~~~~~~~~~~~~~ sound/pci/hda/hda_intel.c:2846:17: note: (near initialization for 'azx_driver') sound/pci/hda/hda_intel.c:2847:3: error: 'struct pci_driver' has no member named 'bar_fixed' 2847 | .bar_fixed = azx_bar_fixed, | ^~~~~~~~~ sound/pci/hda/hda_intel.c:2847:15: warning: excess elements in struct initializer 2847 | .bar_fixed = azx_bar_fixed, | ^~~~~~~~~~~~~ sound/pci/hda/hda_intel.c:2847:15: note: (near initialization for 'azx_driver')
sound/pci/hda/hda_intel.c:2835:39: warning: missing braces around initializer [-Wmissing-braces]
2835 | static struct pci_driver azx_driver = { | ^ ...... 2845 | .rescan_prepare = azx_rescan_prepare, | {{{{{ }}}}} In file included from include/linux/perf_event.h:25, from include/linux/trace_events.h:10, from include/trace/trace_events.h:21, from include/trace/define_trace.h:102, from sound/pci/hda/hda_intel_trace.h:54, from sound/pci/hda/hda_intel.c:59: arch/arc/include/asm/perf_event.h:126:23: warning: 'arc_pmu_cache_map' defined but not used [-Wunused-const-variable=] 126 | static const unsigned arc_pmu_cache_map[C(MAX)][C(OP_MAX)][C(RESULT_MAX)] = { | ^~~~~~~~~~~~~~~~~ arch/arc/include/asm/perf_event.h:91:27: warning: 'arc_pmu_ev_hw_map' defined but not used [-Wunused-const-variable=] 91 | static const char * const arc_pmu_ev_hw_map[] = { | ^~~~~~~~~~~~~~~~~ cc1: some warnings being treated as errors
vim +2845 sound/pci/hda/hda_intel.c
2833 2834 /* pci_driver definition */
2835 static struct pci_driver azx_driver = {
2836 .name = KBUILD_MODNAME, 2837 .id_table = azx_ids, 2838 .probe = azx_probe, 2839 .remove = azx_remove, 2840 .shutdown = azx_shutdown, 2841 .driver = { 2842 .pm = AZX_PM_OPS, 2843 }, 2844 #ifdef CONFIG_PM
2845 .rescan_prepare = azx_rescan_prepare, 2846 .rescan_done = azx_rescan_done,
2847 .bar_fixed = azx_bar_fixed, 2848 #endif 2849 }; 2850
--- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
Hi Takashi,
I love your patch! Yet something to improve:
[auto build test ERROR on v5.12-rc4] [also build test ERROR on next-20210324] [cannot apply to sound/for-next] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Takashi-Iwai/ALSA-hda-Re-add-droppe... base: 0d02ec6b3136c73c09e7859f0d0e4e2c4c07b49b config: m68k-allmodconfig (attached as .config) compiler: m68k-linux-gcc (GCC) 9.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/9d72f7bfcc53f9e360ceba244596818bb00f... git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Takashi-Iwai/ALSA-hda-Re-add-dropped-snd_poewr_change_state/20210325-043958 git checkout 9d72f7bfcc53f9e360ceba244596818bb00f1f7d # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=m68k
If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot lkp@intel.com
All errors (new ones prefixed by >>):
In file included from sound/core/control.c:16: sound/core/control.c: In function 'snd_ctl_elem_read':
include/sound/core.h:176:43: error: expected expression before 'do'
176 | #define snd_power_wait_and_ref(card, ref) do { (void)(card); } while (0) | ^~ sound/core/control.c:1060:8: note: in expansion of macro 'snd_power_wait_and_ref' 1060 | ret = snd_power_wait_and_ref(card, true); | ^~~~~~~~~~~~~~~~~~~~~~ sound/core/control.c: In function 'snd_ctl_elem_write':
include/sound/core.h:176:43: error: expected expression before 'do'
176 | #define snd_power_wait_and_ref(card, ref) do { (void)(card); } while (0) | ^~ sound/core/control.c:1126:11: note: in expansion of macro 'snd_power_wait_and_ref' 1126 | result = snd_power_wait_and_ref(card, true); | ^~~~~~~~~~~~~~~~~~~~~~ sound/core/control.c: In function 'call_tlv_handler':
include/sound/core.h:176:43: error: expected expression before 'do'
176 | #define snd_power_wait_and_ref(card, ref) do { (void)(card); } while (0) | ^~ sound/core/control.c:1629:8: note: in expansion of macro 'snd_power_wait_and_ref' 1629 | ret = snd_power_wait_and_ref(file->card, true); | ^~~~~~~~~~~~~~~~~~~~~~
vim +/do +176 include/sound/core.h
172 173 static inline int snd_power_wait(struct snd_card *card, unsigned int state) { return 0; } 174 #define snd_power_get_state(card) ({ (void)(card); SNDRV_CTL_POWER_D0; }) 175 #define snd_power_change_state(card, state) do { (void)(card); } while (0)
176 #define snd_power_wait_and_ref(card, ref) do { (void)(card); } while (0)
177 #define snd_power_ref(card) do { (void)(card); } while (0) 178 #define snd_power_unref(card) do { (void)(card); } while (0) 179
--- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
participants (3)
-
Andrey Grodzovsky
-
kernel test robot
-
Takashi Iwai