Warning due to "ALSA: hda: intel: More comprehensive PM runtime setup for controller driver"

Takashi Iwai tiwai at suse.de
Fri Nov 19 14:51:14 CET 2021


On Thu, 18 Nov 2021 23:13:50 +0100,
Heiner Kallweit wrote:
> 
> On 18.11.2021 22:28, Takashi Iwai wrote:
> > On Thu, 18 Nov 2021 21:33:34 +0100,
> > Heiner Kallweit wrote:
> >>
> >> I get the following warning caused by 4f66a9ef37d3 ("ALSA: hda: intel: More
> >> comprehensive PM runtime setup for controller driver"):
> >>
> >> snd_hda_intel 0000:00:1f.3: Unbalanced pm_runtime_enable!
> >>
> >> Not sure how this patch was tested because the warning is obvious.
> >> The patch doesn't consider what the PCI sub-system does with regard to
> >> RPM. Have a look at pci_pm_init().
> >>
> >> I'd understand to add the call to pm_runtime_dont_use_autosuspend(),
> >> but for all other added calls I see no justification.
> >>
> >> If being unsure about when to use which RPM call best involve
> >> linux-pm at vger.kernel.org.
> > 
> > Thanks for the notice.  It's been through Intel CI and tests on a few
> > local machines, maybe we haven't checked carefully those errors but
> > only concentrated on the other issues, as it seems.
> > 
> > There were two problems: one was the runtime PM being kicked off even
> > during the PCI driver remove call, and another was the proper runtime
> > PM setup after re-binding.
> > 
> 
> Having a look at the commit message of "ALSA: hda: fix general protection
> fault in azx_runtime_idle" the following sounds weird:
> 
>   - pci-driver.c:pm_runtime_put_sync() leads to a call
>     to rpm_idle() which again calls azx_runtime_idle()
> 
> rpm_idle() is only called if usage_count is 1 when entering
> pm_runtime_put_sync. And this should not be the case.
> pm_runtime_get_sync() increments the usage counter before remove()
> is called, and remove() should also increment the usage counter.
> This doesn't seem to happen. Maybe for whatever reason 
> pm_runtime_get_noresume() isn't called in azx_free(), or azx_free()
> isn't called from remove().
> I think you should trace the call chain from the PCI core calling
> remove() to pm_runtime_get_noresume() getting called or not.

Neither of them, supposedly.  Now I took a deeper look at the code
around it and dug into the git log, and found that the likely problem
was the recent PCI core code refactoring (removal of pci->driver, etc)
that have been already reverted; that was why linux-next-20211109 was
broken and linux-next-20211110 worked.  With the leftover pci->driver,
the stale runtime PM callback was called at the pm_runtime_put_sync()
call in pci_device_remove().

In anyway, I'll drop the invalid calls of pm_runtime_enable() /
disable() & co.  Maybe keeping pm_runtime_forbid() and
pm_runtime_dont_use_autosuspend() at remove still makes some sense as
a counter-part for the probe calls, though.


thanks,

Takashi


More information about the Alsa-devel mailing list