On Mon, 27 Apr 2020 16:22:21 +0200, Deucher, Alexander wrote:
[AMD Public Use]
-----Original Message-----
From: Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
Sent: Sunday, April 26, 2020 12:02 PM
To: linux-kernel@vger.kernel.org
Cc: Deucher, Alexander <Alexander.Deucher@amd.com>; Koenig, Christian <Christian.Koenig@amd.com>; Zhou, David(ChunMing) <David1.Zhou@amd.com>; Nicholas Johnson <nicholas.johnson-opensource@outlook.com.au>
Subject: [PATCH 0/1] Fiji GPU audio register timeout when in BACO state
Hi all,
Since Linux v5.7-rc1 / commit 4fdda2e66de0 ("drm/amdgpu/runpm: enable runpm on baco capable VI+ asics"), my AMD R9 Nano has been using runpm / BACO. You can tell visually when it sleeps, because the fan on the graphics card is switched off to save power. It did not spin down the fan in v5.6.x.
This is great (I love it), except that when it is sleeping, the PCIe audio function of the GPU has issues if anything tries to access it. You get dmesg errors such as these:
snd_hda_intel 0000:08:00.1: spurious response 0x0:0x0, last cmd=0x170500
snd_hda_intel 0000:08:00.1: azx_get_response timeout, switching to polling mode: last cmd=0x001f0500
snd_hda_intel 0000:08:00.1: No response from codec, disabling MSI: last cmd=0x001f0500
snd_hda_intel 0000:08:00.1: No response from codec, resetting bus: last cmd=0x001f0500
snd_hda_codec_hdmi hdaudioC1D0: Unable to sync register 0x2f0d00. -11
The above is with the Fiji XT GPU at 0000:08:00.0 in a Thunderbolt enclosure (not that Thunderbolt should affect it, but I feel I should mention it just in case). I dropped a lot of duplicate dmesg lines, as some of them repeated a lot of times before the driver gave up.
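For anyone who wants to condense a similar log without dropping lines by hand, here is a minimal sketch that counts exact repeats while keeping first-seen order. The function name `summarize` and the timestamp pattern are assumptions made for illustration; it is not a tool used in this thread.

```python
import re

# Assumed format of the optional dmesg timestamp prefix, e.g. "[  123.456789] "
TIMESTAMP_RE = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def summarize(lines):
    """Return [(message, count)] in first-seen order, counting exact repeats."""
    counts = {}
    for raw in lines:
        # Drop the timestamp so identical messages compare equal.
        message = TIMESTAMP_RE.sub("", raw.strip())
        if not message:
            continue
        # dicts preserve insertion order, so first-seen order is kept.
        counts[message] = counts.get(message, 0) + 1
    return list(counts.items())


if __name__ == "__main__":
    import sys

    # Example use: dmesg | python3 summarize.py
    for message, count in summarize(sys.stdin):
        print(f"{count:5d}x {message}")
```

Only byte-identical messages are merged, so lines that differ in `last cmd=` stay separate, which is probably what you want when reporting codec timeouts.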
I offer this patch to disable runpm for Fiji while a fix is found, if you decide that is the best approach. Regardless, I will gladly test any patches you come up with instead and confirm that the above issue has been fixed.
I cannot tell if any other GPUs are affected. The only other cards to which I have access are a couple of AMD R9 280X (Tahiti XT), which use the radeon driver instead of the amdgpu driver.
Adding a few more people. Do you know what is accessing the audio? The audio should have a dependency on the GPU device. The GPU won't enter runtime pm until the audio has entered runtime pm and vice versa on resume. Please attach a copy of your dmesg output and lspci output.
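As a rough way to check the runtime PM state of both PCI functions alongside the dmesg and lspci output, the sketch below reads the standard `power/control` and `power/runtime_status` sysfs attributes. The helper name `runtime_pm_status` is hypothetical, and the addresses 0000:08:00.0 and 0000:08:00.1 are taken from the report above; adjust them for your system.

```python
from pathlib import Path


def runtime_pm_status(bdf, sysfs_root="/sys/bus/pci/devices"):
    """Read the runtime PM attributes of one PCI function from sysfs.

    bdf is the PCI address, e.g. "0000:08:00.1". Attributes that cannot
    be read (missing device, no permission) are reported as None.
    """
    power_dir = Path(sysfs_root) / bdf / "power"
    status = {}
    for name in ("control", "runtime_status"):
        try:
            status[name] = (power_dir / name).read_text().strip()
        except OSError:
            status[name] = None
    return status


if __name__ == "__main__":
    # GPU function and its HDMI audio function, as given in the report.
    for bdf in ("0000:08:00.0", "0000:08:00.1"):
        print(bdf, runtime_pm_status(bdf))
```

If the audio function reports `active` while the GPU reports `suspended`, that would suggest the dependency described above is not being honoured; this is only a diagnostic aid, not a fix.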
Also, please retest with the fresh 5.7-rc3. There was a known regression regarding HD-audio PM in 5.7-rc1/rc2, and it's been fixed there (commit 8d6762af302d).
thanks,
Takashi