Message-ID: <CAPRMd3m9NtGXfH3kDWLq-Lu63i1ww4znDJ9aG6ho5J3+Ow_bnQ@mail.gmail.com>
Date: Wed, 16 Jul 2025 12:02:00 +0530
From: Shankari Anand <shankari.ak0208@...il.com>
To: Martin KaFai Lau <martin.lau@...ux.dev>
Cc: Alexei Starovoitov <alexei.starovoitov@...il.com>, bpf <bpf@...r.kernel.org>, 
	LKML <linux-kernel@...r.kernel.org>, Martin KaFai Lau <martin.lau@...nel.org>, 
	Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>, 
	John Fastabend <john.fastabend@...il.com>, Andrii Nakryiko <andrii@...nel.org>, 
	Eduard Zingerman <eddyz87@...il.com>, Song Liu <song@...nel.org>, 
	Yonghong Song <yonghong.song@...ux.dev>, KP Singh <kpsingh@...nel.org>, 
	Stanislav Fomichev <sdf@...ichev.me>, Hao Luo <haoluo@...gle.com>, Jiri Olsa <jolsa@...nel.org>, 
	syzbot+ad4661d6ca888ce7fe11@...kaller.appspotmail.com
Subject: Re: [PATCH] bpf: restrict verifier access to bpf_lru_node.ref

Hello,

> Also you misread the kcsan report.
>
> It says that 'read' comes from:
>
> read to 0xffff888118f3d568 of 4 bytes by task 4719 on cpu 1:
>   lookup_nulls_elem_raw kernel/bpf/hashtab.c:643 [inline]
>
> which is reading hash and key of htab_elem while
> write side actually writes hash too:
> *(u32 *)((void *)node + lru->hash_offset) = hash;

Thanks for the clarification. I misattributed the race to the ref
field, but the KCSAN report indeed points to a data race between a
reader, lookup_nulls_elem_raw(), accessing the hash or key fields, and
a writer, bpf_lru_pop_free(), reinitializing and reusing the same
element from the LRU freelist without waiting for an RCU grace period.

> I think it is possible. The elem in the lru's freelist currently does not wait
> for a rcu gp before reuse. There is a chance that the rcu reader is still
> reading the hash value that was put in the freelist, while the writer is reusing
> and updating it.
>
> I think the percpu_freelist used in the regular hashmap should have similar
> behavior, so may be worth finding a common solution, such as waiting for a rcu
> gp before reusing it.

To resolve this, would it make sense to ensure that elements popped
from the free list are only reused after an RCU grace period has
elapsed, similar to how other parts of the kernel manage safe object
reuse?
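
As a rough illustration of the idea, here is a minimal userspace model,
not kernel code. Every name in it (struct elem, struct gp_freelist,
gp_advance() and so on) is hypothetical, and a plain counter stands in
for RCU grace periods. Each freed element records the counter value at
free time, and pop refuses to hand it out until the counter has moved
past that value. In the kernel, the polled grace-period helpers
get_state_synchronize_rcu() and poll_state_synchronize_rcu() might play
the role of the counter, but whether that fits the LRU and percpu
freelists is an open question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct elem {
	uint32_t hash;
	uint64_t freed_gp;	/* grace-period count when the element was freed */
	struct elem *next;
};

struct gp_freelist {
	struct elem *head;	/* oldest freed element */
	struct elem *tail;	/* most recently freed element */
	uint64_t completed_gp;	/* grace periods completed so far (stand-in for RCU) */
};

void gp_freelist_init(struct gp_freelist *fl)
{
	fl->head = NULL;
	fl->tail = NULL;
	fl->completed_gp = 0;
}

/* Stand-in for "a full grace period has elapsed". */
void gp_advance(struct gp_freelist *fl)
{
	fl->completed_gp++;
}

/* Free an element: stamp it with the current count and queue it FIFO. */
void gp_freelist_push(struct gp_freelist *fl, struct elem *e)
{
	e->freed_gp = fl->completed_gp;
	e->next = NULL;
	if (fl->tail)
		fl->tail->next = e;
	else
		fl->head = e;
	fl->tail = e;
}

/*
 * Reuse the oldest element only if a grace period has completed since it
 * was freed. Until then the hash is left untouched, so a reader that still
 * holds a pointer to the element never sees it change underneath it.
 */
struct elem *gp_freelist_pop(struct gp_freelist *fl, uint32_t hash)
{
	struct elem *e = fl->head;

	if (!e || fl->completed_gp <= e->freed_gp)
		return NULL;	/* nothing is safe to reuse yet */

	fl->head = e->next;
	if (!fl->head)
		fl->tail = NULL;
	e->next = NULL;
	e->hash = hash;	/* the write that raced in the report */
	return e;
}
```

In this model a pop immediately after a push returns NULL and leaves the
old hash in place; only after gp_advance() does the element become
reusable. The obvious cost is that allocation can fail or stall while
elements wait out their grace period, which is presumably the trade-off
any common solution would have to address.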

--
Regards,
Shankari



On Wed, Jul 16, 2025 at 2:57 AM Martin KaFai Lau <martin.lau@...ux.dev> wrote:
>
> On 7/15/25 7:49 AM, Alexei Starovoitov wrote:
> > Also you misread the kcsan report.
> >
> > It says that 'read' comes from:
> >
> > read to 0xffff888118f3d568 of 4 bytes by task 4719 on cpu 1:
> >   lookup_nulls_elem_raw kernel/bpf/hashtab.c:643 [inline]
> >
> > which is reading hash and key of htab_elem while
> > write side actually writes hash too:
> > *(u32 *)((void *)node + lru->hash_offset) = hash;
> >
> > Martin,
> > is it really possible for these read/write to race ?
>
> I think it is possible. The elem in the lru's freelist currently does not wait
> for a rcu gp before reuse. There is a chance that the rcu reader is still
> reading the hash value that was put in the freelist, while the writer is reusing
> and updating it.
>
> I think the percpu_freelist used in the regular hashmap should have similar
> behavior, so may be worth finding a common solution, such as waiting for a rcu
> gp before reusing it.
