-----Original Message----- From: Takashi Iwai tiwai@suse.de Sent: Tuesday, December 10, 2019 11:11 AM To: Deucher, Alexander Alexander.Deucher@amd.com Cc: Lukas Wunner lukas@wunner.de; Jaroslav Kysela perex@perex.cz; Mika Westerberg mika.westerberg@linux.intel.com; Bjorn Helgaas helgaas@kernel.org; Nicholas Johnson <nicholas.johnson- opensource@outlook.com.au>; alsa-devel@alsa-project.org; linux- kernel@vger.kernel.org; linux-pci@vger.kernel.org Subject: Re: [PATCH] ALSA: hda/hdmi - Fix duplicate unref of pci_dev
On Tue, 10 Dec 2019 16:53:20 +0100, Deucher, Alexander wrote:
-----Original Message----- From: Lukas Wunner lukas@wunner.de Sent: Tuesday, December 10, 2019 10:47 AM To: Deucher, Alexander Alexander.Deucher@amd.com Cc: Takashi Iwai tiwai@suse.de; Jaroslav Kysela perex@perex.cz; Mika Westerberg mika.westerberg@linux.intel.com; Bjorn Helgaas helgaas@kernel.org; Nicholas Johnson <nicholas.johnson- opensource@outlook.com.au>; alsa-devel@alsa-project.org; linux- kernel@vger.kernel.org; linux-pci@vger.kernel.org Subject: Re: [PATCH] ALSA: hda/hdmi - Fix duplicate unref of pci_dev
On Tue, Dec 10, 2019 at 03:34:27PM +0000, Deucher, Alexander wrote:
Nicholas Johnson reports a null pointer deref as well as a refcount underflow upon hot-removal of a Thunderbolt-attached
AMD eGPU.
He's bisected the issue down to commit 586bc4aab878 ("ALSA: hda/hdmi
- fix vgaswitcheroo detection for AMD").
The commit iterates over PCI devices using pci_get_class() and unreferences each device found, even though pci_get_class() subsequently unreferences the device as well. Fix it.
The pci_dev_put() a few lines above should probably be dropped as
well.
That one looks fine to me. The refcount is already increased in the caller get_bound_vga() via pci_get_domain_bus_and_slot() and it's increased again in atpx_present() via pci_get_class(). It needs to be decremented in atpx_present() to avoid leaking a ref.
I'm not following. This is part of the same loop as the one you removed. All
we are doing is checking whether the ATPX method exists or not om the platform. The pdev may not be the same one as the one in pci_get_domain_bus_and_slot(). The APTX method in the APU's ACPI namespace, not the dGPUs.
Well, the tricky part is that pci_get_class() itself does unrefeference the old object and reference the new object (if found). At the end of the loop, nothing is referenced, so it's fine. OTOH, if you go out of the loop in the middle, you're still keeping the pdev object reference, so you need to manually unreference it.
Ah, I see what you are saying. Thanks. Patch is: Reviewed-by: Alex Deucher alexander.deucher@amd.com
Takashi
Alex
Thanks,
Lukas
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index 35b4526f0d28..b856b89378ac 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -1419,7 +1419,6 @@ static bool atpx_present(void) return true; } }
} return false;pci_dev_put(pdev);
}
2.24.0