On 2022/1/29 12:27, Takashi Sakamoto wrote:
Hi,
On Sat, Jan 29, 2022 at 11:33:26AM +0800, Jia-Ju Bai wrote:
Hello,
My static analysis tool reports a possible deadlock in the sound driver in Linux 5.10:
snd_card_disconnect_sync()
  spin_lock_irq(&card->files_lock); --> Line 461 (Lock A)
  wait_event_lock_irq(card->remove_sleep, ...); --> Line 462 (Wait X)
  spin_unlock_irq(&card->files_lock); --> Line 465 (Unlock A)

snd_hwdep_release()
  mutex_lock(&hw->open_mutex); --> Line 152 (Lock B)
  mutex_unlock(&hw->open_mutex); --> Line 157 (Unlock B)
  snd_card_file_remove()
    wake_up_all(&card->remove_sleep); --> Line 976 (Wake X)

snd_hwdep_open()
  mutex_lock(&hw->open_mutex); --> Line 95 (Lock B)
  snd_card_file_add()
    spin_lock(&card->files_lock); --> Line 932 (Lock A)
    spin_unlock(&card->files_lock); --> Line 940 (Unlock A)
  mutex_unlock(&hw->open_mutex); --> Line 139 (Unlock B)
When snd_card_disconnect_sync() is executed, "Wait X" is performed while holding "Lock A". If snd_hwdep_open() is executed at this time, it holds "Lock B" and then waits to acquire "Lock A". If snd_hwdep_release() is executed at this time, it waits to acquire "Lock B", and thus "Wake X" cannot be performed to wake up "Wait X" in snd_card_disconnect_sync(), causing a possible deadlock.
I am not quite sure whether this possible problem is real and how to fix it if it is real. Any feedback would be appreciated, thanks :)
I was interested in your report about the deadlock and looked for the cause of the issue. I then realized that we should take into account the replacement of the file operations, which happens before the spinlock is acquired in snd_card_disconnect_sync().
snd_card_disconnect_sync()
->snd_card_disconnect()
  ->spin_lock()
  ->list_for_each_entry()
    mfile->file->f_op = snd_shutdown_f_ops
  ->spin_unlock()
->spin_lock_irq()
->wait_event_lock_irq()
->spin_unlock_irq()
The implementation of snd_shutdown_f_ops has no value for .open, therefore snd_hwdep_open() is no longer called while waiting for the event. The mutex (Lock B) is not acquired in the process context of an ALSA hwdep application.
The original .release function can still be called by snd_disconnect_release() via the replaced snd_shutdown_f_ops. In that case, as you can see, the spinlock (Lock A) is not acquired.
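The effect of the f_op replacement described above can be sketched in userspace C. This is only a loose model, not the kernel code: every model_* name is hypothetical, the ops table is reduced to two callbacks, and the real snd_shutdown_f_ops and snd_disconnect_release() do more than shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct file_operations. */
struct model_fops {
    int (*open)(void);
    int (*release)(void);
};

/* Hypothetical stand-in for an already opened struct file. */
struct model_file {
    const struct model_fops *f_op;
};

int open_calls;
int release_calls;

int model_hwdep_open(void)    { open_calls++; return 0; }
int model_hwdep_release(void) { release_calls++; return 0; }

/* Ops in effect while the card is connected. */
const struct model_fops model_hwdep_fops = {
    .open = model_hwdep_open,
    .release = model_hwdep_release,
};

/* After disconnect, .release forwards to the original release handler. */
int model_disconnect_release(void) { return model_hwdep_release(); }

/* Replacement ops: .open is deliberately left unset (NULL). */
const struct model_fops model_shutdown_fops = {
    .release = model_disconnect_release,
};

/* Models the f_op swap done for each file during disconnection. */
void model_disconnect(struct model_file *f)
{
    f->f_op = &model_shutdown_fops;
}

/* Returns -1 when the current ops table has no open handler. */
int model_try_open(struct model_file *f)
{
    if (f->f_op->open == NULL)
        return -1;
    return f->f_op->open();
}

/* Release stays reachable both before and after the swap. */
int model_close(struct model_file *f)
{
    assert(f->f_op->release != NULL);
    return f->f_op->release();
}
```

In this model, model_try_open() succeeds before model_disconnect() and returns -1 afterwards, while model_close() still reaches the original release handler through the replaced table.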
I think there are no race conditions between Lock A and Lock B in the process context of an ALSA hwdep application after card disconnection. But I may have overlooked other cases. I would be glad if you could check the above procedure.
Thanks a lot for the quick reply :) Your explanation is reasonable, because snd_shutdown_f_ops indeed has no value for .open.
However, my static analysis tool finds another possible deadlock in the mentioned code:
snd_card_disconnect_sync()
  spin_lock_irq(&card->files_lock); --> Line 461 (Lock A)
  wait_event_lock_irq(card->remove_sleep, ...); --> Line 462 (Wait X)
  spin_unlock_irq(&card->files_lock); --> Line 465 (Unlock A)

snd_hwdep_release()
  snd_card_file_remove()
    spin_lock(&card->files_lock); --> Line 962 (Lock A)
    wake_up_all(&card->remove_sleep); --> Line 976 (Wake X)
    spin_unlock(&card->files_lock); --> Line 977 (Unlock A)
When snd_card_disconnect_sync() is executed, "Wait X" is performed while holding "Lock A". If snd_hwdep_release() is executed at this time, "Wake X" cannot be performed to wake up "Wait X", because "Lock A" is already held by snd_card_disconnect_sync().
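One point worth checking for this second report is whether wait_event_lock_irq() really sleeps with "Lock A" held. According to its description in include/linux/wait.h, the lock is released before sleeping and reacquired afterwards, which is the same pattern as pthread_cond_wait() in userspace. A minimal userspace sketch of that pattern follows. It is an analogy only: a pthread mutex and condition variable stand in for card->files_lock and card->remove_sleep, and the model_* names and the open_files counter are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins: files_lock ~ "Lock A", remove_sleep ~ the wait queue. */
pthread_mutex_t files_lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t remove_sleep = PTHREAD_COND_INITIALIZER;
int open_files = 1;

/* Models the release path: take Lock A, drop the last file, wake the waiter. */
void *model_file_remove(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&files_lock);        /* Lock A */
    open_files--;
    pthread_cond_broadcast(&remove_sleep);  /* Wake X */
    pthread_mutex_unlock(&files_lock);      /* Unlock A */
    return NULL;
}

/* Models the waiting side; returns the number of files still open. */
int model_disconnect_sync(void)
{
    pthread_t releaser;
    int rc, remaining;

    pthread_mutex_lock(&files_lock);        /* Lock A */

    /* Start the releaser while Lock A is held, so it must block at first. */
    rc = pthread_create(&releaser, NULL, model_file_remove, NULL);
    assert(rc == 0);

    /* Wait X: the mutex is dropped while sleeping and retaken on wakeup,
     * which is what lets the releaser get Lock A and issue Wake X. */
    while (open_files > 0)
        pthread_cond_wait(&remove_sleep, &files_lock);

    remaining = open_files;
    pthread_mutex_unlock(&files_lock);      /* Unlock A */
    pthread_join(releaser, NULL);
    return remaining;
}
```

If the waiter kept the mutex while sleeping, model_disconnect_sync() would never return. In this sketch it returns 0, because the releaser can take the lock once the waiter is asleep.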
I am not quite sure whether this possible problem is real. Any feedback would be appreciated, thanks :)
Best wishes, Jia-Ju Bai