lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250417224219.29946-1-kuniyu@amazon.com>
Date: Thu, 17 Apr 2025 15:41:48 -0700
From: Kuniyuki Iwashima <kuniyu@...zon.com>
To: <jordan@...fe.io>
CC: <aditi.ghag@...valent.com>, <bpf@...r.kernel.org>, <daniel@...earbox.net>,
	<kuniyu@...zon.com>, <martin.lau@...ux.dev>, <netdev@...r.kernel.org>,
	<willemdebruijn.kernel@...il.com>
Subject: Re: [PATCH v3 bpf-next 2/6] bpf: udp: Make sure iter->batch always contains a full bucket snapshot

From: Jordan Rife <jordan@...fe.io>
Date: Wed, 16 Apr 2025 16:36:17 -0700
> Require that iter->batch always contains a full bucket snapshot. This
> invariant is important to avoid skipping or repeating sockets during
> iteration when combined with the next few patches. Before, there were
> two cases where a call to bpf_iter_udp_batch may only capture part of a
> bucket:
> 
> 1. When bpf_iter_udp_realloc_batch() returns -ENOMEM [1].
> 2. When more sockets are added to the bucket while calling
>    bpf_iter_udp_realloc_batch(), making the updated batch size
>    insufficient [2].
> 
> In cases where the batch size only covers part of a bucket, it is
> possible to forget which sockets were already visited, especially if we
> have to process a bucket in more than two batches. This forces us to
> choose between repeating or skipping sockets, so don't allow this:
> 
> 1. Stop iteration and propagate -ENOMEM up to userspace if reallocation
>    fails instead of continuing with a partial batch.
> 2. Retry bpf_iter_udp_realloc_batch() up to two times if we fail to
>    capture the full bucket. On the third attempt, hold onto the bucket
>    lock (hslot2->lock) through bpf_iter_udp_realloc_batch() with
>    GFP_ATOMIC to guarantee that the bucket size doesn't change before
>    our next attempt. Try with GFP_USER first to improve the chances
>    that memory allocation succeeds; only use GFP_ATOMIC as a last
>    resort.

kvmalloc() tries the kmalloc path, 1. slab and 2. page allocator, and
if both of them fails, then tries 3. vmalloc().  But, vmalloc() does not
support GFP_ATOMIC, __kvmalloc_node_noprof() returns at
!gfpflags_allow_blocking().

So, falling back to GFP_ATOMIC is most unlikely to work as the last resort.

GFP_ATOMIC first and falling back to GFP_USER few more times, or not using
GFP_ATOMIC makes more sense to me.



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ