At Fri, 5 Apr 2013 16:42:43 +0000, Lin, Mengdong wrote:
Hi Takashi and David,
-----Original Message----- From: Takashi Iwai [mailto:tiwai@suse.de] Sent: Friday, April 05, 2013 9:02 PM To: David Henningsson Cc: Lin, Mengdong; alsa-devel@alsa-project.org Subject: Re: [alsa-devel] [PATCH] ALSA: hda - delay resume haswell hdmi codec in system resume
At Tue, 02 Apr 2013 14:49:40 +0200, David Henningsson wrote:
On 04/02/2013 02:18 PM, Takashi Iwai wrote:
At Tue, 02 Apr 2013 13:53:16 +0200, David Henningsson wrote:
On 03/27/2013 11:01 AM, Lin, Mengdong wrote:
> -----Original Message----- > From: David Henningsson [mailto:david.henningsson@canonical.com] > Sent: Wednesday, March 27, 2013 4:19 PM
>> The Haswell codec set_power_state ops >> (intel_haswell_set_power_state) will > only wait if this is a delayed resume and clear this flag after > waiting. And actually, there is no waiting even in this case. > Because when 1st user operation after system resume happens, Gfx > already finishes resuming and audio initialization, so as long as > intel_haswell_wait_ready_to_resume() enable the unsol event, the > unsol event comes and so so waiting finishes. The 300ms time out > is set for safety consideration in case unsol event is lost. I've not
observed any unsol event lost till now.
> > As for the timeout, I suggest you use the codec->bus->workq > instead of creating a new workq. I think that will also give us > some serialisation, i e, protection against race conditions if > the timeout happen at the same time as the unsol event.
Hi David,
The new added "resume_wq" for hdmi codec is a wait queue, not a work
queue like codec->bus->workq.
It's expected to wake up as soon as the unsol event is got.
Sure; but I don't see why you need a wait queue for that? Why don't you just call the resume path from the unsol event handler (hdmi_present_sense, or its caller), and then also cancel the timeout handler (which can then be in the normal workq)?
Because the delayed resume actually fakes as if the resume is done. This is necessary not to block other device's resume operation.
Since it looks as if ready, user-space might restart things soon again before the delayed resume is really finished. So, some serialization is required there.
Lin, would it be possible to add some chain of events description to the bug commit? The different contexts are just boggling my mind :-)
Yeah, a nice code flow picture would be helpful :)
Assume we have two cases, system idle after S3 and immediate playback after S3.
Is this correct:
System idle:
- System skips hda_call_codec_resume and sets codec->resume_delayed.
- Since the codec is not reinitialized, nothing powers up the
codec, so this is all that happens. 3. Since unsol events are not enabled (?), we have a bug that jack detection does not work and cannot be detected from userspace...?
Yes. This is a side effect of this patch. The unsol event is delayed until the user operation triggers the delayed resume. Then intel_haswell_wait_ready_to_resume() will enable unsol events.
Immediate playback:
- System skips hda_call_codec_resume and sets codec->resume_delayed.
- Process context wants to start playback, which powers up the
codec and calls intel_haswell_set_power_state. 3. intel_haswell_wait_ready_to_resume enables unsol events and starts to wait on ready_to_resume 4. Gfx init finishes, and workq context fires the unsol event, which calls hdmi_intrinsic_event, which triggers the resume_wq. 5. Process context and workq context now continues in parallel, potentially (but hopefully not) leading to race conditions?
The current patch has a small race, but it can be avoided by clearing spec->ready_to_resume early enough, e.g. in suspend callback or in init callback, like the patch below.
I think there is no race for Haswell. In my test I observed that the unsol evnet will not be received until intel_haswell_wait_ready_to_resume() enables the unsol event on the pins. So I put "spec->ready_to_resume = 0" before enabling the unsol event.
Practically your patch would work well, but theoretically the unsol event might be issued and handled by hdmi_intrinsic_event() before you go into intel_haswell_wait_ready_to_resume(). That is, you _clear_ spec->ready_to_resume again and wait for the unsol event that had been already processed. That's the race David pointed.
The fix is simply not to clear spec->ready_to_resume in intel_haswell_wait_ready_to_resume() but clear it much earlier already before the resume path as my patch does. Then wait_event() passes immediately (and the unsol enablement of is anyway harmless).
Takashi