[PATCH 0/6] ALSA: hda: Fix/improve no-codec bus
Hi,
here is a patch set for fixing / improving the error handling of HD-audio bus without codecs, which can be seen on some GPUs, for example.
It's been debugged and tested in bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207043
Takashi
===
Roy Spliet (1): ALSA: hda: Explicitly permit using autosuspend if runtime PM is supported
Takashi Iwai (5): ALSA: hda: Don't release card at firmware loading error ALSA: hda: Honor PM disablement in PM freeze and thaw_noirq ops ALSA: hda: Release resources at error in delayed probe ALSA: hda: Keep the controller initialization even if no codecs found ALSA: hda: Skip controller resume if not needed
include/sound/hda_codec.h | 5 +++ sound/pci/hda/hda_codec.c | 2 +- sound/pci/hda/hda_intel.c | 106 +++++++++++++++++++++++++++------------------- sound/pci/hda/hda_intel.h | 1 + 4 files changed, 69 insertions(+), 45 deletions(-)
At the error path of the firmware loading error, the driver tries to release the card object and set NULL to drvdata. This may be referred badly at the possible PM action, as the driver itself is still bound and the PM callbacks read the card object.
Instead, we continue the probing as if it were no option set. This is often a better choice than the forced abort, too.
Fixes: 5cb543dba986 ("ALSA: hda - Deferred probing with request_firmware_nowait()") BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=207043 Signed-off-by: Takashi Iwai tiwai@suse.de --- sound/pci/hda/hda_intel.c | 19 +++++-------------- 1 file changed, 5 insertions(+), 14 deletions(-)
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index bd093593f8fb..a2e811375750 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -2027,24 +2027,15 @@ static void azx_firmware_cb(const struct firmware *fw, void *context) { struct snd_card *card = context; struct azx *chip = card->private_data; - struct pci_dev *pci = chip->pci; - - if (!fw) { - dev_err(card->dev, "Cannot load firmware, aborting\n"); - goto error; - }
- chip->fw = fw; + if (fw) + chip->fw = fw; + else + dev_err(card->dev, "Cannot load firmware, continue without patching\n"); if (!chip->disabled) { /* continue probing */ - if (azx_probe_continue(chip)) - goto error; + azx_probe_continue(chip); } - return; /* OK */ - - error: - snd_card_free(card); - pci_set_drvdata(pci, NULL); } #endif
freeze_noirq and thaw_noirq need to check the PM availability like other PM ops. There are cases where the device got disabled due to the error, and the PM operation should be ignored for that.
Fixes: 3e6db33aaf1d ("ALSA: hda - Set SKL+ hda controller power at freeze() and thaw()") BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=207043 Signed-off-by: Takashi Iwai tiwai@suse.de --- sound/pci/hda/hda_intel.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index a2e811375750..f41d8b7864c1 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -1071,6 +1071,8 @@ static int azx_freeze_noirq(struct device *dev) struct azx *chip = card->private_data; struct pci_dev *pci = to_pci_dev(dev);
+ if (!azx_is_pm_ready(card)) + return 0; if (chip->driver_type == AZX_DRIVER_SKL) pci_set_power_state(pci, PCI_D3hot);
@@ -1083,6 +1085,8 @@ static int azx_thaw_noirq(struct device *dev) struct azx *chip = card->private_data; struct pci_dev *pci = to_pci_dev(dev);
+ if (!azx_is_pm_ready(card)) + return 0; if (chip->driver_type == AZX_DRIVER_SKL) pci_set_power_state(pci, PCI_D0);
snd-hda-intel driver handles the most of its probe task in the delayed work (either via workqueue or via firmware loader). When an error happens in the later delayed probe, we can't deregister the device itself because the probe callback already returned success and the device was bound. So, for now, we set hda->init_failed flag and make the rest untouched until the device gets really unbound. However, this leaves the device up running, keeping the resources without any use that prevents other operations.
In this patch, we release the resources at first when a probe error happens in the delayed probe stage, but keeps the top-level object, so that the PM and other ops can still refer to the object itself.
Also for simplicity, snd_hda_intel object is allocated via devm, so that we can get rid of the explicit kfree calls.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=207043 Signed-off-by: Takashi Iwai tiwai@suse.de --- sound/pci/hda/hda_intel.c | 29 ++++++++++++++++------------- sound/pci/hda/hda_intel.h | 1 + 2 files changed, 17 insertions(+), 13 deletions(-)
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index f41d8b7864c1..692857904d49 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -1203,10 +1203,8 @@ static void azx_vs_set_state(struct pci_dev *pci, if (!disabled) { dev_info(chip->card->dev, "Start delayed initialization\n"); - if (azx_probe_continue(chip) < 0) { + if (azx_probe_continue(chip) < 0) dev_err(chip->card->dev, "initialization error\n"); - hda->init_failed = true; - } } } else { dev_info(chip->card->dev, "%s via vga_switcheroo\n", @@ -1339,12 +1337,15 @@ static int register_vga_switcheroo(struct azx *chip) /* * destructor */ -static int azx_free(struct azx *chip) +static void azx_free(struct azx *chip) { struct pci_dev *pci = chip->pci; struct hda_intel *hda = container_of(chip, struct hda_intel, chip); struct hdac_bus *bus = azx_bus(chip);
+ if (hda->freed) + return; + if (azx_has_pm_runtime(chip) && chip->running) pm_runtime_get_noresume(&pci->dev); chip->running = 0; @@ -1388,9 +1389,8 @@ static int azx_free(struct azx *chip)
if (chip->driver_caps & AZX_DCAPS_I915_COMPONENT) snd_hdac_i915_exit(bus); - kfree(hda);
- return 0; + hda->freed = 1; }
static int azx_dev_disconnect(struct snd_device *device) @@ -1406,7 +1406,8 @@ static int azx_dev_disconnect(struct snd_device *device)
static int azx_dev_free(struct snd_device *device) { - return azx_free(device->device_data); + azx_free(device->device_data); + return 0; }
#ifdef SUPPORT_VGA_SWITCHEROO @@ -1773,7 +1774,7 @@ static int azx_create(struct snd_card *card, struct pci_dev *pci, if (err < 0) return err;
- hda = kzalloc(sizeof(*hda), GFP_KERNEL); + hda = devm_kzalloc(&pci->dev, sizeof(*hda), GFP_KERNEL); if (!hda) { pci_disable_device(pci); return -ENOMEM; @@ -1814,7 +1815,6 @@ static int azx_create(struct snd_card *card, struct pci_dev *pci,
err = azx_bus_init(chip, model[dev]); if (err < 0) { - kfree(hda); pci_disable_device(pci); return err; } @@ -2340,13 +2340,16 @@ static int azx_probe_continue(struct azx *chip) pm_runtime_put_autosuspend(&pci->dev);
out_free: - if (err < 0 || !hda->need_i915_power) + if (err < 0) { + azx_free(chip); + return err; + } + + if (!hda->need_i915_power) display_power(chip, false); - if (err < 0) - hda->init_failed = 1; complete_all(&hda->probe_wait); to_hda_bus(bus)->bus_probing = 0; - return err; + return 0; }
static void azx_remove(struct pci_dev *pci) diff --git a/sound/pci/hda/hda_intel.h b/sound/pci/hda/hda_intel.h index 2acfff3da1a0..3fb119f09040 100644 --- a/sound/pci/hda/hda_intel.h +++ b/sound/pci/hda/hda_intel.h @@ -27,6 +27,7 @@ struct hda_intel { unsigned int use_vga_switcheroo:1; unsigned int vga_switcheroo_registered:1; unsigned int init_failed:1; /* delayed init failed */ + unsigned int freed:1; /* resources already released */
bool need_i915_power:1; /* the hda controller needs i915 power */ };
Currently, when the HD-audio controller driver doesn't detect any codecs, it tries to abort the probe. But this abort happens at the delayed probe, i.e. the primary probe call already returned success, hence the driver is never unbound until user does so explicitly. As a result, it may leave the HD-audio device in the running state without the runtime PM. More badly, if the device is a HD-audio bus that is tied with a GPU, GPU cannot reach to the full power down and consumes unnecessarily much power.
This patch changes the logic after no-codec situation; it continues probing without the further codec initialization but keep the controller driver running normally.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=207043 Signed-off-by: Takashi Iwai tiwai@suse.de --- sound/pci/hda/hda_intel.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index 692857904d49..aa0be85614b6 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -2009,7 +2009,7 @@ static int azx_first_init(struct azx *chip) /* codec detection */ if (!azx_bus(chip)->codec_mask) { dev_err(card->dev, "no codecs found!\n"); - return -ENODEV; + /* keep running the rest for the runtime PM */ }
if (azx_acquire_irq(chip, 0) < 0) @@ -2303,9 +2303,11 @@ static int azx_probe_continue(struct azx *chip) #endif
/* create codec instances */ - err = azx_probe_codecs(chip, azx_max_codecs[chip->driver_type]); - if (err < 0) - goto out_free; + if (bus->codec_mask) { + err = azx_probe_codecs(chip, azx_max_codecs[chip->driver_type]); + if (err < 0) + goto out_free; + }
#ifdef CONFIG_SND_HDA_PATCH_LOADER if (chip->fw) { @@ -2319,7 +2321,7 @@ static int azx_probe_continue(struct azx *chip) #endif } #endif - if ((probe_only[dev] & 1) == 0) { + if (bus->codec_mask && !(probe_only[dev] & 1)) { err = azx_codec_configure(chip); if (err < 0) goto out_free;
Op 13-04-2020 om 09:20 schreef Takashi Iwai:
Currently, when the HD-audio controller driver doesn't detect any codecs, it tries to abort the probe. But this abort happens at the delayed probe, i.e. the primary probe call already returned success, hence the driver is never unbound until user does so explicitly. As a result, it may leave the HD-audio device in the running state without the runtime PM. More badly, if the device is a HD-audio bus that is tied with a GPU, GPU cannot reach to the full power down and consumes unnecessarily much power.
This patch changes the logic after no-codec situation; it continues probing without the further codec initialization but keep the controller driver running normally.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=207043 Signed-off-by: Takashi Iwai tiwai@suse.de
Tested-by: Roy Spliet nouveau@spliet.org
sound/pci/hda/hda_intel.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index 692857904d49..aa0be85614b6 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -2009,7 +2009,7 @@ static int azx_first_init(struct azx *chip) /* codec detection */ if (!azx_bus(chip)->codec_mask) { dev_err(card->dev, "no codecs found!\n");
return -ENODEV;
/* keep running the rest for the runtime PM */
}
if (azx_acquire_irq(chip, 0) < 0)
@@ -2303,9 +2303,11 @@ static int azx_probe_continue(struct azx *chip) #endif
/* create codec instances */
- err = azx_probe_codecs(chip, azx_max_codecs[chip->driver_type]);
- if (err < 0)
goto out_free;
if (bus->codec_mask) {
err = azx_probe_codecs(chip, azx_max_codecs[chip->driver_type]);
if (err < 0)
goto out_free;
}
#ifdef CONFIG_SND_HDA_PATCH_LOADER if (chip->fw) {
@@ -2319,7 +2321,7 @@ static int azx_probe_continue(struct azx *chip) #endif } #endif
- if ((probe_only[dev] & 1) == 0) {
- if (bus->codec_mask && !(probe_only[dev] & 1)) { err = azx_codec_configure(chip); if (err < 0) goto out_free;
The HD-audio controller does system-suspend and resume operations by directly calling its helpers __azx_runtime_suspend() and __azx_runtime_resume(). However, in general, we don't have to resume always the device fully at the system resume; typically, if a device has been runtime-suspended, we can leave it to runtime resume.
Usually for achieving this, the driver would call pm_runtime_force_suspend() and pm_runtime_force_resume() pairs in the system suspend and resume ops. Unfortunately, this doesn't work for the resume path in our case. For handling the jack detection at the system resume, a child codec device may need the (literally) forcibly resume even if it's been runtime-suspended, and for that, the controller device must be also resumed even if it's been suspended.
This patch is an attempt to improve the situation. It replaces the direct __azx_runtime_suspend()/_resume() calls with with pm_runtime_force_suspend() and pm_runtime_force_resume() with a slight trick as we've done for the codec side. More exactly:
- azx_has_pm_runtime() check is dropped from azx_runtime_suspend() and azx_runtime_resume(), so that it can be properly executed from the system-suspend/resume path
- The WAKEEN handling depends on the card's power state now; it's set and cleared only for the runtime-suspend
- azx_resume() checks whether any codec may need the forcible resume beforehand. If the forcible resume is required, it does temporary PM refcount up/down for actually triggering the runtime resume.
- A new helper function, hda_codec_need_resume(), is introduced for checking whether the codec needs a forcible runtime-resume, and the existing code is rewritten with that.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=207043 Signed-off-by: Takashi Iwai tiwai@suse.de --- include/sound/hda_codec.h | 5 +++++ sound/pci/hda/hda_codec.c | 2 +- sound/pci/hda/hda_intel.c | 38 +++++++++++++++++++++++++++----------- 3 files changed, 33 insertions(+), 12 deletions(-)
diff --git a/include/sound/hda_codec.h b/include/sound/hda_codec.h index 3ee8036f5436..225154a4f2ed 100644 --- a/include/sound/hda_codec.h +++ b/include/sound/hda_codec.h @@ -494,6 +494,11 @@ void snd_hda_update_power_acct(struct hda_codec *codec); static inline void snd_hda_set_power_save(struct hda_bus *bus, int delay) {} #endif
+static inline bool hda_codec_need_resume(struct hda_codec *codec) +{ + return !codec->relaxed_resume && codec->jacktbl.used; +} + #ifdef CONFIG_SND_HDA_PATCH_LOADER /* * patch firmware diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c index a34a2c9f4bcf..86a632bf4d50 100644 --- a/sound/pci/hda/hda_codec.c +++ b/sound/pci/hda/hda_codec.c @@ -2951,7 +2951,7 @@ static int hda_codec_runtime_resume(struct device *dev) static int hda_codec_force_resume(struct device *dev) { struct hda_codec *codec = dev_to_hda_codec(dev); - bool forced_resume = !codec->relaxed_resume && codec->jacktbl.used; + bool forced_resume = hda_codec_need_resume(codec); int ret;
/* The get/put pair below enforces the runtime resume even if the diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index aa0be85614b6..02c6308502b1 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -1027,7 +1027,7 @@ static int azx_suspend(struct device *dev) chip = card->private_data; bus = azx_bus(chip); snd_power_change_state(card, SNDRV_CTL_POWER_D3hot); - __azx_runtime_suspend(chip); + pm_runtime_force_suspend(dev); if (bus->irq >= 0) { free_irq(bus->irq, chip); bus->irq = -1; @@ -1044,7 +1044,9 @@ static int azx_suspend(struct device *dev) static int azx_resume(struct device *dev) { struct snd_card *card = dev_get_drvdata(dev); + struct hda_codec *codec; struct azx *chip; + bool forced_resume = false;
if (!azx_is_pm_ready(card)) return 0; @@ -1055,7 +1057,20 @@ static int azx_resume(struct device *dev) chip->msi = 0; if (azx_acquire_irq(chip, 1) < 0) return -EIO; - __azx_runtime_resume(chip, false); + + /* check for the forced resume */ + list_for_each_codec(codec, &chip->bus) { + if (hda_codec_need_resume(codec)) { + forced_resume = true; + break; + } + } + + if (forced_resume) + pm_runtime_get_noresume(dev); + pm_runtime_force_resume(dev); + if (forced_resume) + pm_runtime_put(dev); snd_power_change_state(card, SNDRV_CTL_POWER_D0);
trace_azx_resume(chip); @@ -1102,12 +1117,12 @@ static int azx_runtime_suspend(struct device *dev) if (!azx_is_pm_ready(card)) return 0; chip = card->private_data; - if (!azx_has_pm_runtime(chip)) - return 0;
/* enable controller wake up event */ - azx_writew(chip, WAKEEN, azx_readw(chip, WAKEEN) | - STATESTS_INT_MASK); + if (snd_power_get_state(card) == SNDRV_CTL_POWER_D0) { + azx_writew(chip, WAKEEN, azx_readw(chip, WAKEEN) | + STATESTS_INT_MASK); + }
__azx_runtime_suspend(chip); trace_azx_runtime_suspend(chip); @@ -1118,17 +1133,18 @@ static int azx_runtime_resume(struct device *dev) { struct snd_card *card = dev_get_drvdata(dev); struct azx *chip; + bool from_rt = snd_power_get_state(card) == SNDRV_CTL_POWER_D0;
if (!azx_is_pm_ready(card)) return 0; chip = card->private_data; - if (!azx_has_pm_runtime(chip)) - return 0; - __azx_runtime_resume(chip, true); + __azx_runtime_resume(chip, from_rt);
/* disable controller Wake Up event*/ - azx_writew(chip, WAKEEN, azx_readw(chip, WAKEEN) & - ~STATESTS_INT_MASK); + if (from_rt) { + azx_writew(chip, WAKEEN, azx_readw(chip, WAKEEN) & + ~STATESTS_INT_MASK); + }
trace_azx_runtime_resume(chip); return 0;
From: Roy Spliet nouveau@spliet.org
This fixes runtime PM not working after a suspend-to-RAM cycle at least for the codec-less HDA device found on NVIDIA GPUs.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=207043 Signed-off-by: Roy Spliet nouveau@spliet.org Signed-off-by: Takashi Iwai tiwai@suse.de --- sound/pci/hda/hda_intel.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index 02c6308502b1..8519051a426e 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -2354,8 +2354,10 @@ static int azx_probe_continue(struct azx *chip)
set_default_power_save(chip);
- if (azx_has_pm_runtime(chip)) + if (azx_has_pm_runtime(chip)) { + pm_runtime_use_autosuspend(&pci->dev); pm_runtime_put_autosuspend(&pci->dev); + }
out_free: if (err < 0) {
participants (2)
-
Roy Spliet
-
Takashi Iwai