[alsa-devel] Timing issues between ALSA and i915 drivers
Hi,
I could use some feedback on HDMI audio issues exposed during the 4.21 merge window. By accident (misleading documentation) we ended up enabling the Skylake driver instead of the HDaudio legacy, and broke audio on a number of Skylake and ApolloLake devices where the HDMI/iDISP codec was not detected (bit 2 not set in the codec_mask). Linus' Dell XPS13 9350 was the first to be impacted of course...
After debugging a bit, this issue can be resolved by either
a) compiling both SOUND and DRM as built-ins (y instead of m)
b) moving the calls snd_hdac_i915_init() to the probe function instead of the worker queue (code at https://github.com/plbossart/sound/commits/fix/skl-hdmi)
Both solutions point to timing issues.
During internal reviews I was alerted to the fact that the suggested fix essentially reverts patch ab1b732d53c18 ('ASoC: Intel: Skylake: Move i915 registration to worker thread') which was introduced to solve DRM lockup issues.
In other words, we are at a point where we either have DRM lockups or can't detect the HDMI audio codec, none of which are too good.
I don't have the background for the DRM lockup stuff, nor do I understand why snd_hdac_i915_init has to be called from a thread context. Is this really a requirement?
I also don't get what sort of timing issues would come from an initialization. What happens on the i915 side and is there some sort of mandatory delay between the completion of the i915_init and issuing a snd_hdac_chip_readw(bus, STATESTS) to get the codec masks on the HDaudio links?
Thanks for any pointers or comments.
-Pierre
On Wed, 16 Jan 2019 18:48:25 +0100, Pierre-Louis Bossart wrote:
Hi,
I could use some feedback on HDMI audio issues exposed during the 4.21 merge window. By accident (misleading documentation) we ended up enabling the Skylake driver instead of the HDaudio legacy, and broke audio on a number of Skylake and ApolloLake devices where the HDMI/iDISP codec was not detected (bit 2 not set in the codec_mask). Linus' Dell XPS13 9350 was the first to be impacted of course...
After debugging a bit, this issue can be resolved by either
a) compiling both SOUND and DRM as built-ins (y instead of m)
b) moving the calls snd_hdac_i915_init() to the probe function instead of the worker queue (code at https://github.com/plbossart/sound/commits/fix/skl-hdmi)
This isn't guaranteed to work, as request_module() might be involved, I'm afraid.
Both solutions point to timing issues.
During internal reviews I was alerted to the fact that the suggested fix essentially reverts patch ab1b732d53c18 ('ASoC: Intel: Skylake: Move i915 registration to worker thread') which was introduced to solve DRM lockup issues.
In other words, we are at a point where we either have DRM lockups or can't detect the HDMI audio codec, none of which are too good.
I don't have the background for the DRM lockup stuff, nor do I understand why snd_hdac_i915_init has to be called from a thread context. Is this really a requirement?
I also don't get what sort of timing issues would come from an initialization. What happens on the i915 side and is there some sort of mandatory delay between the completion of the i915_init and issuing a snd_hdac_chip_readw(bus, STATESTS) to get the codec masks on the HDaudio links?
snd_hdac_i915_init() waits for the binding with the i915 audio component, so a possible solution would be just to delay the audio component registration at the really last stage, like the change below.
If this still doesn't work, it'll be more deeply inside, and I have little clue for now...
thanks,
Takashi
--- diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1577,8 +1577,6 @@ static void i915_driver_register(struct drm_i915_private *dev_priv) if (IS_GEN5(dev_priv)) intel_gpu_ips_init(dev_priv);
- intel_audio_init(dev_priv); - /* * Some ports require correctly set-up hpd registers for detection to * work properly (leading to ghost connected connector status), e.g. VGA @@ -1597,6 +1595,8 @@ static void i915_driver_register(struct drm_i915_private *dev_priv)
intel_power_domains_enable(dev_priv); intel_runtime_pm_enable(dev_priv); + + intel_audio_init(dev_priv); }
/**
I could use some feedback on HDMI audio issues exposed during the 4.21 merge window. By accident (misleading documentation) we ended up enabling the Skylake driver instead of the HDaudio legacy, and broke audio on a number of Skylake and ApolloLake devices where the HDMI/iDISP codec was not detected (bit 2 not set in the codec_mask). Linus' Dell XPS13 9350 was the first to be impacted of course...
After debugging a bit, this issue can be resolved by either
a) compiling both SOUND and DRM as built-ins (y instead of m)
b) moving the calls snd_hdac_i915_init() to the probe function instead of the worker queue (code at https://github.com/plbossart/sound/commits/fix/skl-hdmi)
This isn't guaranteed to work, as request_module() might be involved, I'm afraid.
Sorry, what would be the impact of the request_module? it'd just delay the probe on the audio side, no?
And if the request_module failed then HDMI wouldn't be more broken anyways...
Both solutions point to timing issues.
During internal reviews I was alerted to the fact that the suggested fix essentially reverts patch ab1b732d53c18 ('ASoC: Intel: Skylake: Move i915 registration to worker thread') which was introduced to solve DRM lockup issues.
In other words, we are at a point where we either have DRM lockups or can't detect the HDMI audio codec, none of which are too good.
I don't have the background for the DRM lockup stuff, nor do I understand why snd_hdac_i915_init has to be called from a thread context. Is this really a requirement?
I also don't get what sort of timing issues would come from an initialization. What happens on the i915 side and is there some sort of mandatory delay between the completion of the i915_init and issuing a snd_hdac_chip_readw(bus, STATESTS) to get the codec masks on the HDaudio links?
snd_hdac_i915_init() waits for the binding with the i915 audio component, so a possible solution would be just to delay the audio component registration at the really last stage, like the change below.
If this still doesn't work, it'll be more deeply inside, and I have little clue for now...
I tried this suggestion and no luck, same error with the iDISP codec not exposed.
I added a delay after snd_hdac_i915_init(), doesn't seem to do anything.
One possibility is that this is a side effect of the Skylake driver initializing the link twice for some odd reason.
And I still don't get what the motivation for moving this init to a work queue was in the first place.
Doesn't seem like an easy one to fix...
thanks,
Takashi
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c --- a/drivers/gpu/drm/i915/i915_drv.c +++ b/drivers/gpu/drm/i915/i915_drv.c @@ -1577,8 +1577,6 @@ static void i915_driver_register(struct drm_i915_private *dev_priv) if (IS_GEN5(dev_priv)) intel_gpu_ips_init(dev_priv);
- intel_audio_init(dev_priv);
- /*
- Some ports require correctly set-up hpd registers for detection to
- work properly (leading to ghost connected connector status), e.g. VGA
@@ -1597,6 +1595,8 @@ static void i915_driver_register(struct drm_i915_private *dev_priv)
intel_power_domains_enable(dev_priv); intel_runtime_pm_enable(dev_priv);
intel_audio_init(dev_priv); }
/**
a) compiling both SOUND and DRM as built-ins (y instead of m)
b) moving the calls snd_hdac_i915_init() to the probe function instead of the worker queue (code at https://github.com/plbossart/sound/commits/fix/skl-hdmi)
I added DRM+audio dmesg logs at the following link for reference:
I could use some feedback on HDMI audio issues exposed during the 4.21 merge window. By accident (misleading documentation) we ended up enabling the Skylake driver instead of the HDaudio legacy, and broke audio on a number of Skylake and ApolloLake devices where the HDMI/iDISP codec was not detected (bit 2 not set in the codec_mask). Linus' Dell XPS13 9350 was the first to be impacted of course...
After debugging a bit, this issue can be resolved by either
a) compiling both SOUND and DRM as built-ins (y instead of m)
b) moving the calls snd_hdac_i915_init() to the probe function instead of the worker queue (code at https://github.com/plbossart/sound/commits/fix/skl-hdmi)
Both solutions point to timing issues.
During internal reviews I was alerted to the fact that the suggested fix essentially reverts patch ab1b732d53c18 ('ASoC: Intel: Skylake: Move i915 registration to worker thread') which was introduced to solve DRM lockup issues.
I tried to narrow down the issue further and my current understanding is that the Skylake driver performs link reset operations without the display power turned on - which does not look like a very smart thing to do in hindsight.
In other words, it's not really when snd_hdac_i915_init() is called that matters as I assumed initially, but more when snd_hdac_display_power() is invoked. There are two cases where this happens, and for each of them turning the display power on results in HDMI detection. The attached diffs split the initialization from the power on, which provides a better understanding of the issue.
What would be really useful at this point is a confirmation that snd_hdac_i915_init() cannot be called in the initial probe but does need to be executed in a work queue. That would really impact the way the initialization sequence is reworked on the Skylake side as well as modify the way the SOF driver deals with i915 initialization.
Thanks
-Pierre
On Thu, 17 Jan 2019 20:53:06 +0100, Pierre-Louis Bossart wrote:
I could use some feedback on HDMI audio issues exposed during the 4.21 merge window. By accident (misleading documentation) we ended up enabling the Skylake driver instead of the HDaudio legacy, and broke audio on a number of Skylake and ApolloLake devices where the HDMI/iDISP codec was not detected (bit 2 not set in the codec_mask). Linus' Dell XPS13 9350 was the first to be impacted of course...
After debugging a bit, this issue can be resolved by either
a) compiling both SOUND and DRM as built-ins (y instead of m)
b) moving the calls snd_hdac_i915_init() to the probe function instead of the worker queue (code at https://github.com/plbossart/sound/commits/fix/skl-hdmi)
Both solutions point to timing issues.
During internal reviews I was alerted to the fact that the suggested fix essentially reverts patch ab1b732d53c18 ('ASoC: Intel: Skylake: Move i915 registration to worker thread') which was introduced to solve DRM lockup issues.
I tried to narrow down the issue further and my current understanding is that the Skylake driver performs link reset operations without the display power turned on - which does not look like a very smart thing to do in hindsight.
In other words, it's not really when snd_hdac_i915_init() is called that matters as I assumed initially, but more when snd_hdac_display_power() is invoked. There are two cases where this happens, and for each of them turning the display power on results in HDMI detection. The attached diffs split the initialization from the power on, which provides a better understanding of the issue.
OK, this makes some sense, and that's the very reason we have HDA_CODEC_IDX_CONTROLLER for snd_hdac_display_power(). IIRC, we needed to power on the display for probing of the legacy HDA, too. Once after that, for the normal operation, the display power is needed only when you output the HDMI stream.
What would be really useful at this point is a confirmation that snd_hdac_i915_init() cannot be called in the initial probe but does need to be executed in a work queue. That would really impact the way the initialization sequence is reworked on the Skylake side as well as modify the way the SOF driver deals with i915 initialization.
It's needed to be called in a work queue, yes.
Basically you shouldn't call request_module() in the driver's probe callback. When the probe callback is called from the module loading, it blocks the module loading itself, hence loading yet another module can't work. A situation might be easier than the past (which deadlocked), but still it's advised to use either the request_module_nowait() with the callback or call request_module() asynchronously from probe.
thanks,
Takashi
Thanks
-Pierre
diff --git a/sound/soc/intel/skylake/skl.c b/sound/soc/intel/skylake/skl.c index 60c94836bf5b..56556d06a17f 100644 --- a/sound/soc/intel/skylake/skl.c +++ b/sound/soc/intel/skylake/skl.c @@ -767,23 +767,6 @@ static const struct hdac_bus_ops bus_core_ops = { .get_response = snd_hdac_bus_get_response, };
-static int skl_i915_init(struct hdac_bus *bus) -{
- int err;
- /*
* The HDMI codec is in GPU so we need to ensure that it is powered
* up and ready for probe
*/
- err = snd_hdac_i915_init(bus);
- if (err < 0)
return err;
- snd_hdac_display_power(bus, HDA_CODEC_IDX_CONTROLLER, true);
- return 0;
-}
static void skl_probe_work(struct work_struct *work) { struct skl *skl = container_of(work, struct skl, probe_work); @@ -791,12 +774,6 @@ static void skl_probe_work(struct work_struct *work) struct hdac_ext_link *hlink = NULL; int err;
- if (IS_ENABLED(CONFIG_SND_SOC_HDAC_HDMI)) {
err = skl_i915_init(bus);
if (err < 0)
return;
- }
- err = skl_init_chip(bus, true); if (err < 0) { dev_err(bus->dev, "Init chip failed with err: %d\n", err);
@@ -899,6 +876,11 @@ static int skl_first_init(struct hdac_bus *bus) unsigned short gcap; int cp_streams, pb_streams, start_idx;
- err = snd_hdac_i915_init(bus);
- if (err < 0)
return err;
- err = pci_request_regions(pci, "Skylake HD audio"); if (err < 0) return err;
@@ -910,7 +892,10 @@ static int skl_first_init(struct hdac_bus *bus) return -ENXIO; }
- snd_hdac_bus_reset_link(bus, true);
/* this bus_reset_link is unnecessary, and without the display
* power turned on prevents HDMI from being detected
*/
//snd_hdac_bus_reset_link(bus, true);
snd_hdac_bus_parse_capabilities(bus);
@@ -962,7 +947,14 @@ static int skl_first_init(struct hdac_bus *bus) /* initialize chip */ skl_init_pci(skl);
- return skl_init_chip(bus, true);
- /* turning the display power here works */
- snd_hdac_display_power(bus, HDA_CODEC_IDX_CONTROLLER, true);
- err = skl_init_chip(bus, true);
- /* turning the display power here does not work, HDMI not detected */
- return err;
}
static int skl_probe(struct pci_dev *pci,
I tried to narrow down the issue further and my current understanding is that the Skylake driver performs link reset operations without the display power turned on - which does not look like a very smart thing to do in hindsight.
In other words, it's not really when snd_hdac_i915_init() is called that matters as I assumed initially, but more when snd_hdac_display_power() is invoked. There are two cases where this happens, and for each of them turning the display power on results in HDMI detection. The attached diffs split the initialization from the power on, which provides a better understanding of the issue.
OK, this makes some sense, and that's the very reason we have HDA_CODEC_IDX_CONTROLLER for snd_hdac_display_power(). IIRC, we needed to power on the display for probing of the legacy HDA, too. Once after that, for the normal operation, the display power is needed only when you output the HDMI stream.
What would be really useful at this point is a confirmation that snd_hdac_i915_init() cannot be called in the initial probe but does need to be executed in a work queue. That would really impact the way the initialization sequence is reworked on the Skylake side as well as modify the way the SOF driver deals with i915 initialization.
It's needed to be called in a work queue, yes.
Basically you shouldn't call request_module() in the driver's probe callback. When the probe callback is called from the module loading, it blocks the module loading itself, hence loading yet another module can't work. A situation might be easier than the past (which deadlocked), but still it's advised to use either the request_module_nowait() with the callback or call request_module() asynchronously from probe.
Thanks Takashi, this is very useful. I guess that will require a complete rework of the Skylake initialization sequence then, my simple code translation isn't enough indeed and the current partition between probe/work queue can't comply with both requirements (request module asynchronously from probe, display turned on before mucking with links).
We also need this changed for SOF, the i915_init is done in the probe.
-Pierre
On Thu, 17 Jan 2019 21:47:05 +0100, Pierre-Louis Bossart wrote:
I tried to narrow down the issue further and my current understanding is that the Skylake driver performs link reset operations without the display power turned on - which does not look like a very smart thing to do in hindsight.
In other words, it's not really when snd_hdac_i915_init() is called that matters as I assumed initially, but more when snd_hdac_display_power() is invoked. There are two cases where this happens, and for each of them turning the display power on results in HDMI detection. The attached diffs split the initialization from the power on, which provides a better understanding of the issue.
OK, this makes some sense, and that's the very reason we have HDA_CODEC_IDX_CONTROLLER for snd_hdac_display_power(). IIRC, we needed to power on the display for probing of the legacy HDA, too. Once after that, for the normal operation, the display power is needed only when you output the HDMI stream.
What would be really useful at this point is a confirmation that snd_hdac_i915_init() cannot be called in the initial probe but does need to be executed in a work queue. That would really impact the way the initialization sequence is reworked on the Skylake side as well as modify the way the SOF driver deals with i915 initialization.
It's needed to be called in a work queue, yes.
Basically you shouldn't call request_module() in the driver's probe callback. When the probe callback is called from the module loading, it blocks the module loading itself, hence loading yet another module can't work. A situation might be easier than the past (which deadlocked), but still it's advised to use either the request_module_nowait() with the callback or call request_module() asynchronously from probe.
Thanks Takashi, this is very useful. I guess that will require a complete rework of the Skylake initialization sequence then, my simple code translation isn't enough indeed and the current partition between probe/work queue can't comply with both requirements (request module asynchronously from probe, display turned on before mucking with links).
We also need this changed for SOF, the i915_init is done in the probe.
Well, snd_hdac_i915_init() itself also calls already request_module() if i915 is missing, so this should be done in the async code, too...
The complication comes from the fact that HD-audio driver doesn't necessarily bind with i915 graphics. Otherwise we could have a fixed hard dependency that assures more or less i915 probing before HD-audio (though, i915 probing became async, so it's now harder).
thanks,
Takashi
participants (2)
-
Pierre-Louis Bossart
-
Takashi Iwai