[Intel-gfx] [PATCH v2] ALSA: hda/i915 - avoid hung task timeout in i915 wait

Tvrtko Ursulin tvrtko.ursulin at linux.intel.com
Wed Mar 9 10:48:49 CET 2022


On 09/03/2022 09:23, Takashi Iwai wrote:
> On Wed, 09 Mar 2022 10:02:13 +0100,
> Tvrtko Ursulin wrote:
>>
>>
>> On 09/03/2022 08:39, Kai Vehmanen wrote:
>>> Hi,
>>>
>>> On Wed, 9 Mar 2022, Tvrtko Ursulin wrote:
>>>
>>>>> -			/* 60s timeout */
>>>>
>>>> Where does this 60s come from and why is the fix to work around
>>>> DEFAULT_HUNG_TASK_TIMEOUT in a hacky way deemed okay? For instance would
>>>> limiting the wait here to whatever the kconfig is set to be an option?
>>>
>>> this was discussed in
>>> https://lists.freedesktop.org/archives/intel-gfx/2022-February/290821.html
>>> ... and that thread concluded it's cleaner to split the wait than try
>>> to figure out hung-task configuration from middle of audio driver.
>>>
>>> The 60sec timeout comes from 2019 patch "ALSA: hda: Extend i915 component
>>> bind timeout" to fix an issue reported by Paul Menzel (cc'ed).
>>>
>>> This patch keeps the timeout intact.
>>
>> I did not spot discussion touching on the point I raised.
>>
>> How about not fight the hung task detector but mark your wait context
>> as "I really know what I'm doing - not stuck trust me".
> 
> The question is how often this problem hits.  Basically it's a very
> corner case, and I even think we may leave as is; that's a matter of
> configuration, and lowering such a bar should expect some
> side-effect. OTOH, if the problem happens in many cases, it's
> beneficial to fix in the core part, indeed.

Yes, I agree that argument can be made.

>> Maybe using
>> wait_for_completion_killable_timeout would do it since
>> snd_hdac_i915_init is allowed to fail with an error already?
> 
> It makes it killable -- which is a complete behavior change.

A complete behaviour change how? Isn't this something run on probe, so
the only party likely to send SIGKILL to the modprobe process is init?
And in that case, what is the fundamental difference between init giving
up and the HDA driver's internal 60s timeout expiring first? I don't see
one. Either party decides to abort the wait, and the code can just
unwind and propagate the respective error code.
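
To illustrate the unwind I have in mind, here is a rough userspace model
of how the return value of wait_for_completion_killable_timeout() could
map to an error code. It is a sketch, not the actual sound/hda code: the
helper name bind_wait_to_errno() is made up, and returning -ENODEV on
timeout is my assumption about what the driver would want to keep doing.

```c
#include <errno.h>

/* ERESTARTSYS is kernel-internal and absent from userspace <errno.h>. */
#ifndef ERESTARTSYS
#define ERESTARTSYS 512
#endif

/*
 * Hypothetical helper: translate the return value of
 * wait_for_completion_killable_timeout() into a probe error code.
 *   ret <  0 : wait aborted by a fatal signal (-ERESTARTSYS)
 *   ret == 0 : timeout elapsed, completion never signalled
 *   ret >  0 : completed, value is the remaining jiffies (at least 1)
 */
static int bind_wait_to_errno(long ret)
{
	if (ret < 0)
		return (int)ret;	/* someone (e.g. init) sent SIGKILL */
	if (ret == 0)
		return -ENODEV;		/* assumed: driver's own timeout error */
	return 0;			/* component bound in time */
}
```

Both abort paths end up in the same place, only the error code differs.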

Regards,

Tvrtko

