Message-ID: <2d05ceab-b8b7-0c7b-f847-69950c6db14e@gmail.com>
Date: Tue, 2 Nov 2021 17:41:57 +0800
From: Zqiang <qiang.zhang1211@...il.com>
To: Takashi Iwai <tiwai@...e.de>
Cc: tiwai@...e.com, alsa-devel@...a-project.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] ALSA: seq: Fix RCU stall in snd_seq_write()

On 2021/11/2 4:33 PM, Takashi Iwai wrote:
> On Tue, 02 Nov 2021 04:32:22 +0100,
> Zqiang wrote:
>> If we have a lot of cell object, this cycle may take a long time, and
>> trigger RCU stall. insert a conditional reschedule point to fix it.
>>
>> rcu: INFO: rcu_preempt self-detected stall on CPU
>> rcu:     1-....: (1 GPs behind) idle=9f5/1/0x4000000000000000
>>         softirq=16474/16475 fqs=4916
>>         (t=10500 jiffies g=19249 q=192515)
>> NMI backtrace for cpu 1
>> ......
>> asm_sysvec_apic_timer_interrupt
>> RIP: 0010:_raw_spin_unlock_irqrestore+0x38/0x70
>> spin_unlock_irqrestore
>> snd_seq_prioq_cell_out+0x1dc/0x360
>> snd_seq_check_queue+0x1a6/0x3f0
>> snd_seq_enqueue_event+0x1ed/0x3e0
>> snd_seq_client_enqueue_event.constprop.0+0x19a/0x3c0
>> snd_seq_write+0x2db/0x510
>> vfs_write+0x1c4/0x900
>> ksys_write+0x171/0x1d0
>> do_syscall_64+0x35/0xb0
>>
>> Reported-by: syzbot+bb950e68b400ab4f65f8@...kaller.appspotmail.com
>> Signed-off-by: Zqiang <qiang.zhang1211@...il.com>
>> ---
>>  sound/core/seq/seq_queue.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/sound/core/seq/seq_queue.c b/sound/core/seq/seq_queue.c
>> index d6c02dea976c..f5b1e4562a64 100644
>> --- a/sound/core/seq/seq_queue.c
>> +++ b/sound/core/seq/seq_queue.c
>> @@ -263,6 +263,7 @@ void snd_seq_check_queue(struct snd_seq_queue *q, int atomic, int hop)
>>  		if (!cell)
>>  			break;
>>  		snd_seq_dispatch_event(cell, atomic, hop);
>> +		cond_resched();
>>  	}
>>
>>  	/* Process time queue... */
>> @@ -272,6 +273,7 @@ void snd_seq_check_queue(struct snd_seq_queue *q, int atomic, int hop)
>>  		if (!cell)
>>  			break;
>>  		snd_seq_dispatch_event(cell, atomic, hop);
>> +		cond_resched();
>
> It's good to have cond_resched() in those places but it must be done
> more carefully, as the code path may be called from the non-atomic
> context, too.  That is, it must have a check of atomic argument, and
> cond_resched() is applied only when atomic==false.
>
> But I still wonder how this gets a RCU stall out of sudden.  Looking
> through https://syzkaller.appspot.com/bug?extid=bb950e68b400ab4f65f8
> it's triggered by many cases since the end of September...

I did not find useful information in the log. Judging from the call
trace, I guess the stall is triggered by the loop running for a long
time, which keeps the RCU quiescent state from being reported in time.

I overlooked the atomic parameter check; I will resend v2.

In non-atomic context we can insert cond_resched() to avoid this
situation, but in atomic context the RCU stall may still be triggered.

thanks
Zqiang

>
>
> thanks,
>
> Takashi
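
For readers of the archive, the guard discussed in this thread (reschedule only when atomic is false) can be illustrated with a small user-space sketch. This is not the v2 patch and not kernel code: struct demo_queue, demo_cell_out() and demo_check_queue() are hypothetical names made up for the illustration, and sched_yield() only stands in for the kernel's cond_resched().

```c
#include <assert.h>
#include <sched.h>    /* sched_yield(): user-space stand-in for cond_resched() */
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queue of pending cells. */
struct demo_queue {
	size_t pending;   /* cells still waiting to be dispatched */
	size_t yields;    /* how many times the loop offered to reschedule */
};

/* Take one cell out of the queue; returns false when the queue is empty. */
static bool demo_cell_out(struct demo_queue *q)
{
	if (q->pending == 0)
		return false;
	q->pending--;
	return true;
}

/*
 * Drain the queue. A reschedule point is offered only when the caller
 * reports that it is NOT in atomic context, which is the check requested
 * in the review. Returns the number of cells dispatched.
 */
static size_t demo_check_queue(struct demo_queue *q, bool atomic)
{
	size_t dispatched = 0;

	assert(q != NULL);
	for (;;) {
		if (!demo_cell_out(q))
			break;
		dispatched++;            /* stands in for snd_seq_dispatch_event() */
		if (!atomic) {
			sched_yield();   /* kernel code would call cond_resched() here */
			q->yields++;
		}
	}
	return dispatched;
}
```

Calling demo_check_queue(&q, false) drains the queue and yields once per cell, while demo_check_queue(&q, true) drains it without ever yielding. The second case mirrors the limitation noted in the reply: in atomic context there is no reschedule point, so a very long loop can still stall.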