On Tue, 26 Mar 2019 12:02:13 +0100, Yang, Libin wrote:
Hi Takashi,
On Tue, 26 Mar 2019 06:42:15 +0100, Yang, Libin wrote: (snip)
Hi Takashi, Below is my draft patch, which doesn't work with pm_runtime_get_sync(). Is there anything wrong in my code?
(snip)
And the dmesg is: [ 36.087224] HDMI HDA Codec ehdaudio0D2: in hdmi_codec_resume
1890
ylb
[ 36.087230] HDMI HDA Codec ehdaudio0D2: in
hdac_hdmi_runtime_resume 2122 ylb 1
[ 36.087335] HDMI HDA Codec ehdaudio0D2: in
hdac_hdmi_cvt_output_widget_event 812 ylb
[ 40.097586] HDMI HDA Codec ehdaudio0D2: in
hdac_hdmi_runtime_resume 2135 ylb 2
[ 40.097766] HDMI HDA Codec ehdaudio0D2: in
hdac_hdmi_pin_output_widget_event 767 ylb
[ 45.108632] HDMI HDA Codec ehdaudio0D2: in
hdac_hdmi_pin_mux_widget_event 857 ylb
From the dmesg, hdac_hdmi_cvt_output_widget_event()
is called between "ylb 1" and "ylb 2" in runtime_resume(). This means xxx_event() registers setting runs before display power is turned on.
OK, now I understood what went wrong. It's a similar issue as we've hit for the legacy HD-audio and figured out recently.
If my understanding is correct, the problem is the call of pm_runtime_force_resume() in PM resume callback combined with an async work. pm_runtime_force_resume() sets the runtime state immediately to RPM_ACTIVE even before finishing the task. The code seems assuming blindly that there is no other conflicting task while S3/S4 resume, but this is a problem. That's why the next pm_runtime_get_sync() wasn't blocked but just went through; since the RPM flag was already set to RPM_ACTIVE in pm_runtime_force_resume(), it thought as if it were already active, while the real runtime resume code
was still processing the callback.
Yes, exactly right. And I have checked dev->power.runtime_status, it's RPM_ACTIVE when xxx_event() calls pm_runtime_get_sync().
In the case of legacy HD-audio, we "fixed" the problem by avoiding the trigger of async work at resume, but let it be triggered from runtime resume. In ASoC case, however, it's no option.
Maybe a possible solution in the sound driver side would be to move from system suspend/resume to ASoC component suspend/resume. The runtime suspend/resume can be kept as is, and call pm_runtime_force_suspend and resume from the component
suspend/resume
callbacks. Since the component callbacks are certainly processed before DAPM events, this should assure the order.
I have worked out another patch. This patch decouples the xxx_event() and runtime suspend/resume. This means in whichever case, xxx_event() can make sure display power is in the correct status. What do you think? On the same time, I will try the component suspend/resume. My understanding is that snd_hdac_display_power() should be called either in runtime_suspend/ resume or in component suspend/resume.
This might work around the particular case, yes. However it still makes me uneasy as the root cause isn't full covered -- the other part of runtime resume might be still pending while executing the DAPM event.
Now, considering the idea with device_link_add() again: I guess it's snd_soc_resume() who kicks off the DAPM even work? If so, and if snd_soc_resume() gets called from the machine driver, we can assure the PM order via device_link_add() so that the codec driver is resumed before the machine driver. Then at the time the deferred resume work is executed, the codec is ready, so the display power is on.
Yes, snd_soc_resume() kicks off the DAPM event work. The device PM sequence is good. The root cause is when hdmi runtime_resume calls snd_hdac_display_power(), it may sleep. The flow is:
- HDMI runtime_resume runs
- HDMI runtime_resume calls snd_hdac_display_power(), but display
driver is also operating display power and the mutex_lock is hold. So HDMI runtime_resume sleeps. 3. the deferred work is scheduled and xxx_event() is called.
This is wrong. All these happen because of the mutex_lock is hold by display driver, which causes HDMI runtime_resume sleep.
Well, the mutex lock itself is OK, and it's designed to behave so.
As mentioned, the root cause is pm_runtime_force_resume() that sets RPM_ACTIVE while a concurrent task is running.
OTOH, this can be seen indeed as a PM dependency between devices, and if we set the order explicitly, the problem can be avoided, too. Then the runtime resume of codec finishes before snd_soc_resume() is called from the machine driver resume.
Also, moving to component PM would solve the problem, too, since it also assures finishing the codec resume before the schedule_work() call in snd_soc_resume().
thanks,
Takashi