On Mon, 07 Mar 2022 09:31:16 +0100 Takashi Iwai wrote:
On Mon, 07 Mar 2022 09:05:20 +0100 Hillf Danton wrote:
Work around the deadlock by trying to take tasklist_lock for write in the timer IRQ, and scheduling workqueue work if any lock owner is detected.
Oh no, that's toooo ugly.
And the problem isn't only here; take a look at commits f671a691e299 and 2f488f698fda. There are other users of kill_fasync() with hard IRQs disabled, too.
So, IMO, the handling of tasklist_lock around kill_fasync() looks broken, and the fix belongs there (or in another core part), instead of messing around with each caller's code.
Besides the hard-IRQ issue mentioned above, the lock reported in this case is a global rwlock, not one of the non-global locks addressed in the commits above, and thus we need a different fix.
Replace the read lock on tasklist_lock with the RCU read lock.
Hillf
#syz test: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/ 38f80f42147f
--- x/fs/fcntl.c
+++ y/fs/fcntl.c
@@ -807,11 +807,11 @@ void send_sigio(struct fown_struct *fown
 			send_sigio_to_task(p, fown, fd, band, type);
 		rcu_read_unlock();
 	} else {
-		read_lock(&tasklist_lock);
+		rcu_read_lock();
 		do_each_pid_task(pid, type, p) {
 			send_sigio_to_task(p, fown, fd, band, type);
 		} while_each_pid_task(pid, type, p);
-		read_unlock(&tasklist_lock);
+		rcu_read_unlock();
 	}
  out_unlock_fown:
 	read_unlock_irqrestore(&fown->lock, flags);
--