lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAC1LvL0ntdrWh_1y0EcVR6C1_WyqOQ15EhihfQRs=ai7pcE-Sw@mail.gmail.com>
Date: Wed, 26 Feb 2025 15:17:14 -0800
From: Zvi Effron <zeffron@...tgames.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Hou Tao <houtao@...weicloud.com>, Toke Høiland-Jørgensen <toke@...nel.org>, 
	bpf <bpf@...r.kernel.org>, rcu@...r.kernel.org, 
	LKML <linux-kernel@...r.kernel.org>, Alexei Starovoitov <ast@...nel.org>, 
	Daniel Borkmann <daniel@...earbox.net>, Andrii Nakryiko <andrii@...nel.org>, 
	Martin KaFai Lau <martin.lau@...ux.dev>, Eduard Zingerman <eddyz87@...il.com>, Song Liu <song@...nel.org>, 
	Yonghong Song <yonghong.song@...ux.dev>, John Fastabend <john.fastabend@...il.com>, 
	KP Singh <kpsingh@...nel.org>, Stanislav Fomichev <sdf@...ichev.me>, Hao Luo <haoluo@...gle.com>, 
	Jiri Olsa <jolsa@...nel.org>, "Paul E . McKenney" <paulmck@...nel.org>, Cody Haas <chaas@...tgames.com>, 
	Hou Tao <hotforest@...il.com>
Subject: Re: [RESEND] [PATCH bpf-next 2/3] bpf: Overwrite the element in hash
 map atomically

On Tue, Feb 25, 2025 at 9:42 PM Alexei Starovoitov
<alexei.starovoitov@...il.com> wrote:
>
> On Tue, Feb 25, 2025 at 8:05 PM Hou Tao <houtao@...weicloud.com> wrote:
> >
> > Hi,
> >
> > On 2/26/2025 11:24 AM, Alexei Starovoitov wrote:
> > > On Sat, Feb 8, 2025 at 2:17 AM Hou Tao <houtao@...weicloud.com> wrote:
> > >> Hi Toke,
> > >>
> > >> On 2/6/2025 11:05 PM, Toke Høiland-Jørgensen wrote:
> > >>> Hou Tao <houtao@...weicloud.com> writes:
> > >>>
> > >>>> +cc Cody Haas
> > >>>>
> > >>>> Sorry for the resend. I sent the reply in the HTML format.
> > >>>>
> > >>>> On 2/4/2025 4:28 PM, Hou Tao wrote:
> > >>>>> Currently, the update of existing element in hash map involves two
> > >>>>> steps:
> > >>>>> 1) insert the new element at the head of the hash list
> > >>>>> 2) remove the old element
> > >>>>>
> > >>>>> It is possible that the concurrent lookup operation may fail to find
> > >>>>> either the old element or the new element if the lookup operation starts
> > >>>>> before the addition and continues after the removal.
> > >>>>>
> > >>>>> Therefore, replacing the two-step update with an atomic update. After
> > >>>>> the change, the update will be atomic in the perspective of the lookup
> > >>>>> operation: it will either find the old element or the new element.
> > > I'm missing the point.
> > > This "atomic" replacement doesn't really solve anything.
> > > lookup will see one element.
> > > That element could be deleted by another thread.
> > > bucket lock and either two step update or single step
> > > don't change anything from the pov of bpf prog doing lookup.
> >
> > The point is that overwriting an existed element may lead to concurrent
> > lookups return ENOENT as demonstrated by the added selftest and the
> > patch tried to "fix" that. However, it seems using
> > hlist_nulls_replace_rcu() for the overwriting update is still not
> > enough. Because when the lookup procedure found the old element, the old
> > element may be reusing, the comparison of the map key may fail, and the
> > lookup procedure may still return ENOENT.
>
> you mean l_old == l_new ? I don't think it's possible
> within htab_map_update_elem(),
> but htab_map_delete_elem() doing hlist_nulls_del_rcu()
> then free_htab_elem, htab_map_update_elem, alloc, hlist_nulls_add_head_rcu
> and just deleted elem is reused to be inserted
> into another bucket.
>
> I'm not sure whether this new hlist_nulls_replace_rcu()
> primitive works with nulls logic.
>
> So back to the problem statement..
> Are you saying that adding new to a head while lookup is in the middle
> is causing it to miss an element that
> is supposed to be updated assuming atomicity of the update?
> While now update_elem() is more like a sequence of delete + insert?
>
> Hmm.

Yes, exactly that. Because update_elem is actually a delete + insert (actually
an insert + delete, I think?), it is possible for a concurrent lookup to see no
element instead of either the old or new value.

>
> > I see. In v2 I will fallback to the original idea: adding a standalone
> > update procedure for htab of maps in which it will atomically overwrite
> > the map_ptr just like array of maps does.
>
> hold on. is this only for hash-of-maps ?

I believe this was also replicated for hash as well as hash-of-maps. Cody can
confirm, or use the reproducer he has to test that case.

> How that non-atomic update manifested in real production?
>

See [1] (in the cover letter for this series, also replicated below).

[1] : https://lore.kernel.org/xdp-newbies/07a365d8-2e66-2899-4298-b8b158a928fa@huaweicloud.com/T/#m06fcd687c6cfdbd0f9b643b227e69b479fc8c2f6

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ