Open Source and information security mailing list archives
Date: Tue, 18 Aug 2020 17:30:37 -0700
From: Yonghong Song <yhs@...com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
CC: <bpf@...r.kernel.org>, <netdev@...r.kernel.org>,
	Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
	<kernel-team@...com>, "Paul E . McKenney" <paulmck@...nel.org>
Subject: Re: [PATCH bpf v2 1/3] bpf: fix a rcu_sched stall issue with bpf task/task_file iterator

On 8/18/20 5:05 PM, Alexei Starovoitov wrote:
> On Tue, Aug 18, 2020 at 03:23:09PM -0700, Yonghong Song wrote:
>>
>> We did not use cond_resched() since for some iterators, e.g., the
>> netlink iterator, the rcu read_lock critical section spans
>> consecutive seq_ops->next() calls, which makes it impossible to do
>> cond_resched() in the key while loop of function bpf_seq_read().
>
> but after this patch we can, right?

We can do cond_resched() after seq->op->stop(). See more below.

>
>>
>> +/* maximum visited objects before bailing out */
>> +#define MAX_ITER_OBJECTS 1000000
>> +
>>  /* bpf_seq_read, a customized and simpler version for bpf iterator.
>>   * no_llseek is assumed for this file.
>>   * The following are differences from seq_read():
>> @@ -79,7 +82,7 @@ static ssize_t bpf_seq_read(struct file *file, char __user *buf, size_t size,
>>  {
>>  	struct seq_file *seq = file->private_data;
>>  	size_t n, offs, copied = 0;
>> -	int err = 0;
>> +	int err = 0, num_objs = 0;
>>  	void *p;
>>
>>  	mutex_lock(&seq->lock);
>> @@ -135,6 +138,7 @@ static ssize_t bpf_seq_read(struct file *file, char __user *buf, size_t size,
>>  	while (1) {
>>  		loff_t pos = seq->index;
>>
>> +		num_objs++;
>>  		offs = seq->count;
>>  		p = seq->op->next(seq, p, &seq->index);
>>  		if (pos == seq->index) {
>> @@ -153,6 +157,15 @@ static ssize_t bpf_seq_read(struct file *file, char __user *buf, size_t size,
>>  		if (seq->count >= size)
>>  			break;
>>
>> +		if (num_objs >= MAX_ITER_OBJECTS) {
>> +			if (offs == 0) {
>> +				err = -EAGAIN;
>> +				seq->op->stop(seq, p);
>> +				goto done;
>> +			}
>> +			break;
>> +		}
>> +
>
> should this block be after op->show() and error processing?
> Otherwise bpf_iter_inc_seq_num() will be incorrectly incremented?

The purpose of op->next() is to calculate the position of the "next" object
and store it in the seq private data. On the next read() syscall, op->start()
will then fetch the object based on that saved position. The same reasoning
applies to the "if (seq->count >= size) break" condition above, which also
sits right after op->next() so the next op->start() can locate the correct
object. The -EAGAIN bailout here follows the same pattern.

>
>>  		err = seq->op->show(seq, p);
>>  		if (err > 0) {
>>  			bpf_iter_dec_seq_num(seq);
>
> After op->stop() we can do cond_resched() in all cases,
> since rhashtable walk does rcu_unlock in stop() callback, right?

Yes, we can. I am thinking that since we return to user space,
cond_resched() might not be needed, as returning to user space will trigger
some kind of scheduling anyway. This patch fixed the rcu stall issue. But if
my understanding is incorrect, I am happy to add cond_resched().

> I think copy_to_user() and mutex_unlock() don't do cond_resched()
> equivalent work.
>