[PATCH] ALSA: seq: Fix RCU stall in snd_seq_write()
Zqiang
qiang.zhang1211 at gmail.com
Tue Nov 2 10:41:57 CET 2021
On 2021/11/2 4:33 PM, Takashi Iwai wrote:
> On Tue, 02 Nov 2021 04:32:22 +0100,
> Zqiang wrote:
>> If we have a lot of cell objects, this loop may take a long time and
>> trigger an RCU stall. Insert a conditional reschedule point to fix it.
>>
>> rcu: INFO: rcu_preempt self-detected stall on CPU
>> rcu: 1-....: (1 GPs behind) idle=9f5/1/0x4000000000000000
>> softirq=16474/16475 fqs=4916
>> (t=10500 jiffies g=19249 q=192515)
>> NMI backtrace for cpu 1
>> ......
>> asm_sysvec_apic_timer_interrupt
>> RIP: 0010:_raw_spin_unlock_irqrestore+0x38/0x70
>> spin_unlock_irqrestore
>> snd_seq_prioq_cell_out+0x1dc/0x360
>> snd_seq_check_queue+0x1a6/0x3f0
>> snd_seq_enqueue_event+0x1ed/0x3e0
>> snd_seq_client_enqueue_event.constprop.0+0x19a/0x3c0
>> snd_seq_write+0x2db/0x510
>> vfs_write+0x1c4/0x900
>> ksys_write+0x171/0x1d0
>> do_syscall_64+0x35/0xb0
>>
>> Reported-by: syzbot+bb950e68b400ab4f65f8 at syzkaller.appspotmail.com
>> Signed-off-by: Zqiang <qiang.zhang1211 at gmail.com>
>> ---
>> sound/core/seq/seq_queue.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/sound/core/seq/seq_queue.c b/sound/core/seq/seq_queue.c
>> index d6c02dea976c..f5b1e4562a64 100644
>> --- a/sound/core/seq/seq_queue.c
>> +++ b/sound/core/seq/seq_queue.c
>> @@ -263,6 +263,7 @@ void snd_seq_check_queue(struct snd_seq_queue *q, int atomic, int hop)
>> if (!cell)
>> break;
>> snd_seq_dispatch_event(cell, atomic, hop);
>> + cond_resched();
>> }
>>
>> /* Process time queue... */
>> @@ -272,6 +273,7 @@ void snd_seq_check_queue(struct snd_seq_queue *q, int atomic, int hop)
>> if (!cell)
>> break;
>> snd_seq_dispatch_event(cell, atomic, hop);
>> + cond_resched();
>
> It's good to have cond_resched() in those places, but it must be done
> more carefully, as the code path may be called from atomic context,
> too. That is, it must check the atomic argument, and cond_resched()
> must be applied only when atomic==false.
>
> But I still wonder how this gets an RCU stall all of a sudden. Looking
> through https://syzkaller.appspot.com/bug?extid=bb950e68b400ab4f65f8
> it's been triggered in many cases since the end of September...
I did not find anything useful in the log. Judging from the call trace, I
guess it is triggered by the loop running for a long time, so that the
RCU quiescent state is not reported in time.
I overlooked the atomic parameter check; I will resend a v2. In non-atomic
context we can insert cond_resched() to avoid this situation, but in atomic
context the RCU stall may still trigger.
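
For illustration only, the guarded reschedule discussed above could be
sketched in userspace C as below. This is a minimal sketch, not the v2
patch: demo_queue, demo_cell_out(), demo_check_queue() and
cond_resched_stub() are hypothetical stand-ins for the kernel structures
and for cond_resched(); only the check of the atomic flag mirrors what
was discussed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's cond_resched(); it only counts calls. */
static int resched_calls;

static void cond_resched_stub(void)
{
	resched_calls++;
}

/* Hypothetical miniature queue: 'pending' events still wait for dispatch. */
struct demo_queue {
	int pending;
	int dispatched;
};

/* Take one event off the queue; returns false once the queue is empty. */
static bool demo_cell_out(struct demo_queue *q)
{
	if (q->pending == 0)
		return false;
	q->pending--;
	return true;
}

/* Drain the queue, yielding after each event only when sleeping is allowed. */
static void demo_check_queue(struct demo_queue *q, bool atomic)
{
	assert(q != NULL);

	while (demo_cell_out(q)) {
		q->dispatched++;	/* stands in for snd_seq_dispatch_event() */
		if (!atomic)
			cond_resched_stub();	/* never reached in atomic context */
	}
}
```

With atomic == true the loop still runs to completion without yielding,
which is why the stall may remain possible on that path.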
thanks
Zqiang
>
>
> thanks,
>
> Takashi