[<prev] [next>] [day] [month] [year] [list]
Message-ID: <57dc6a0e-6ba9-e77c-80ac-6bb0a6e2650a@meta.com>
Date: Sat, 20 May 2023 20:44:44 -0700
From: Yonghong Song <yhs@...a.com>
To: Teng Qi <starmiku1207184332@...il.com>
Cc: ast@...nel.org, daniel@...earbox.net, john.fastabend@...il.com,
andrii@...nel.org, martin.lau@...ux.dev, song@...nel.org, yhs@...com,
kpsingh@...nel.org, sdf@...gle.com, haoluo@...gle.com,
jolsa@...nel.org, davem@...emloft.net, kuba@...nel.org,
hawk@...nel.org, bpf@...r.kernel.org, linux-kernel@...r.kernel.org,
netdev@...r.kernel.org
Subject: Re: [bug] kernel: bpf: syscall: a possible sleep-in-atomic bug in
__bpf_prog_put()
On 5/19/23 7:18 AM, Teng Qi wrote:
> Thank you for your response.
> > Looks like you only have suspicion here. Could you find a real violation
> > here where __bpf_prog_put() is called with !in_irq() &&
> > !irqs_disabled(), but inside spin_lock or rcu read lock? I have not seen
> > things like that.
>
> For the complex conditions to call bpf_prog_put() with 1 refcnt, we have
> been
> unable to really trigger this atomic violation after trying to construct
> test cases manually. But we found that it is possible to show cases with
> !in_irq() && !irqs_disabled(), but inside spin_lock or rcu read lock.
> For example, even a failed case, one of selftest cases of bpf, netns_cookie,
> calls bpf_sock_map_update() and may indirectly call bpf_prog_put()
> only inside rcu read lock: The possible call stack is:
> net/core/sock_map.c: 615 bpf_sock_map_update()
> net/core/sock_map.c: 468 sock_map_update_common()
> net/core/sock_map.c: 217 sock_map_link()
> kernel/bpf/syscall.c: 2111 bpf_prog_put()
>
> The files about netns_cookie include
> tools/testing/selftests/bpf/progs/netns_cookie_prog.c and
> tools/testing/selftests/bpf/prog_tests/netns_cookie.c. We inserted the
> following code in
> ‘net/core/sock_map.c: 468 sock_map_update_common()’:
> static int sock_map_update_common(..)
> {
> int inIrq = in_irq();
> int irqsDisabled = irqs_disabled();
> int preemptBits = preempt_count();
> int inAtomic = in_atomic();
> int rcuHeld = rcu_read_lock_held();
> printk("in_irq() %d, irqs_disabled() %d, preempt_count() %d,
> in_atomic() %d, rcu_read_lock_held() %d", inIrq, irqsDisabled,
> preemptBits, inAtomic, rcuHeld);
> }
>
> The output message is as follows:
> root@(none):/root/bpf# ./test_progs -t netns_cookie
> [ 137.639188] in_irq() 0, irqs_disabled() 0, preempt_count() 0,
> in_atomic() 0,
> rcu_read_lock_held() 1
> #113 netns_cookie:OK
> Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
>
> We notice that there are numerous callers in kernel/, net/ and drivers/,
> so we
> highly suggest modifying __bpf_prog_put() to address this gap. The gap
> exists
> because __bpf_prog_put() is only safe under in_irq() || irqs_disabled()
> but not in_atomic() || rcu_read_lock_held(). The following code snippet may
> mislead developers into thinking that bpf_prog_put() is safe in all
> contexts.
> if (in_irq() || irqs_disabled()) {
> INIT_WORK(&aux->work, bpf_prog_put_deferred);
> schedule_work(&aux->work);
> } else {
> bpf_prog_put_deferred(&aux->work);
> }
>
> Implicit dependency may lead to issues.
>
> > Any problem here?
> We mentioned it to demonstrate the possibility of kvfree() being
> called by __bpf_prog_put_noref().
>
> Thanks.
> -- Teng Qi
>
> On Wed, May 17, 2023 at 1:08 AM Yonghong Song <yhs@...a.com
> <mailto:yhs@...a.com>> wrote:
>
>
>
> On 5/16/23 4:18 AM, starmiku1207184332@...il.com
> <mailto:starmiku1207184332@...il.com> wrote:
> > From: Teng Qi <starmiku1207184332@...il.com
> <mailto:starmiku1207184332@...il.com>>
> >
> > Hi, bpf developers,
> >
> > We are developing a static tool to check the matching between
> helpers and the
> > context of hooks. During our analysis, we have discovered some
> important
> > findings that we would like to report.
> >
> > ‘kernel/bpf/syscall.c: 2097 __bpf_prog_put()’ shows that function
> > bpf_prog_put_deferred() won`t be called in the condition of
> > ‘in_irq() || irqs_disabled()’.
> > if (in_irq() || irqs_disabled()) {
> > INIT_WORK(&aux->work, bpf_prog_put_deferred);
> > schedule_work(&aux->work);
> > } else {
> >
> > bpf_prog_put_deferred(&aux->work);
> > }
> >
> > We suspect this condition exists because there might be sleepable
> operations
> > in the callees of the bpf_prog_put_deferred() function:
> > kernel/bpf/syscall.c: 2097 __bpf_prog_put()
> > kernel/bpf/syscall.c: 2084 bpf_prog_put_deferred()
> > kernel/bpf/syscall.c: 2063 __bpf_prog_put_noref()
> > kvfree(prog->aux->jited_linfo);
> > kvfree(prog->aux->linfo);
>
> Looks like you only have suspicion here. Could you find a real
> violation
> here where __bpf_prog_put() is called with !in_irq() &&
> !irqs_disabled(), but inside spin_lock or rcu read lock? I have not seen
> things like that.
>
> >
> > Additionally, we found that array prog->aux->jited_linfo is
> initialized in
> > ‘kernel/bpf/core.c: 157 bpf_prog_alloc_jited_linfo()’:
> > prog->aux->jited_linfo = kvcalloc(prog->aux->nr_linfo,
> > sizeof(*prog->aux->jited_linfo), bpf_memcg_flags(GFP_KERNEL |
> __GFP_NOWARN));
>
> Any problem here?
>
> >
> > Our question is whether the condition 'in_irq() ||
> irqs_disabled() == false' is
> > sufficient for calling 'kvfree'. We are aware that calling
> 'kvfree' within the
> > context of a spin lock or an RCU lock is unsafe.
Your above analysis makes sense if indeed that kvfree cannot appear
inside a spin lock region or RCU read lock region. But is it true?
I checked a few code paths in kvfree/kfree. It is either guarded
with local_irq_save/restore or by
spin_lock_irqsave/spin_unlock_irqrestore, etc. Did I miss
anything? Are you talking about RT kernel here?
> >
> > Therefore, we propose modifying the condition to include
> in_atomic(). Could we
> > update the condition as follows: "in_irq() || irqs_disabled() ||
> in_atomic()"?
> >
> > Thank you! We look forward to your feedback.
> >
> > Signed-off-by: Teng Qi <starmiku1207184332@...il.com
> <mailto:starmiku1207184332@...il.com>>
>
Powered by blists - more mailing lists