Message-ID: <20250520171814.GC773385@cmpxchg.org>
Date: Tue, 20 May 2025 13:18:14 -0400
From: Johannes Weiner <hannes@...xchg.org>
To: Suren Baghdasaryan <surenb@...gle.com>
Cc: Usama Arif <usamaarif642@...il.com>,
	Kent Overstreet <kent.overstreet@...ux.dev>,
	Andrew Morton <akpm@...ux-foundation.org>, shakeel.butt@...ux.dev,
	vlad.wing@...il.com, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, kernel-team@...a.com
Subject: Re: [PATCH 1/2] mm: slub: allocate slab object extensions
 non-contiguously

On Tue, May 20, 2025 at 08:20:38AM -0700, Suren Baghdasaryan wrote:
> On Tue, May 20, 2025 at 7:13 AM Usama Arif <usamaarif642@...il.com> wrote:
> >
> >
> >
> > On 20/05/2025 14:46, Usama Arif wrote:
> > >
> > >
> > > On 20/05/2025 14:44, Kent Overstreet wrote:
> > >> On Tue, May 20, 2025 at 01:25:46PM +0100, Usama Arif wrote:
> > >>> When memory allocation profiling is running on memory-bound services,
> > >>> allocations greater than order 0 for slab object extensions can fail,
> > >>> e.g. for the zswap zs_handle slab, where 512 objs_per_slab x 16 bytes
> > >>> per slabobj_ext = 8 KiB (an order-1 allocation). Use kvcalloc to
> > >>> improve the chances of the allocation succeeding.
> > >>>
> > >>> Signed-off-by: Usama Arif <usamaarif642@...il.com>
> > >>> Reported-by: Vlad Poenaru <vlad.wing@...il.com>
> > >>> Closes: https://lore.kernel.org/all/17fab2d6-5a74-4573-bcc3-b75951508f0a@gmail.com/
> > >>> ---
> > >>>  mm/slub.c | 2 +-
> > >>>  1 file changed, 1 insertion(+), 1 deletion(-)
> > >>>
> > >>> diff --git a/mm/slub.c b/mm/slub.c
> > >>> index dc9e729e1d26..bf43c403ead2 100644
> > >>> --- a/mm/slub.c
> > >>> +++ b/mm/slub.c
> > >>> @@ -1989,7 +1989,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
> > >>>     gfp &= ~OBJCGS_CLEAR_MASK;
> > >>>     /* Prevent recursive extension vector allocation */
> > >>>     gfp |= __GFP_NO_OBJ_EXT;
> > >>> -   vec = kcalloc_node(objects, sizeof(struct slabobj_ext), gfp,
> > >>> +   vec = kvcalloc_node(objects, sizeof(struct slabobj_ext), gfp,
> > >>>                        slab_nid(slab));
> > >>
> > >> And what's the latency going to be on a vmalloc() allocation when we're
> > >> low on memory?
> > >
> > > Would it not be better to get the allocation slightly slower than to not
> > > get it at all?
> >
> > Also, the majority of them are less than 1 page. kvmalloc of less than
> > 1 page is served by kmalloc and never reaches vmalloc. So vmalloc will
> > only be used for those greater than 1 page in size, which are in the
> > minority (e.g. zs_handle, request_sock_subflow_v6,
> > request_sock_subflow_v4...).
> 
> Not just the majority. For all of these kvmalloc allocations, kmalloc
> will be tried first, and vmalloc will be used only if the former
> fails: https://elixir.bootlin.com/linux/v6.14.7/source/mm/util.c#L665
> That's why I think this should not regress the normal case, when the
> slab allocator has enough memory to satisfy the allocation.

Alexei raised a good point offline that having slab enter vmalloc
messes with the slab re-entrancy and NMI-safety work he has been
pursuing for bpf/probing.

Add that to the other concerns around vmalloc, and I think we should
just drop that part.

IMO, the more important takeaway is that we accept that this
allocation is optimistic: it can and does fail in practice, even if
the slab allocation itself succeeded.

So it probably makes sense to 1) axe the warning entirely, since it's
not indicative of a bug, and 2) accept that the numbers can have a
fudge factor in practice, and mark line items in the report
accordingly when they do.
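For 2), something like the following could work. This is purely a
hypothetical sketch: the failure counter and the "(imprecise)" marker
are made up here, not existing alloc_tag code:

	/*
	 * Hypothetical: count lost extension vectors so the report can
	 * flag line items whose numbers may be undercounted.
	 */
	static atomic_long_t objext_failures = ATOMIC_LONG_INIT(0);

	/* in alloc_slab_obj_exts(), on allocation failure: */
	atomic_long_inc(&objext_failures);

	/* when emitting a report line item: */
	seq_printf(m, "%s%s\n", line,
		   atomic_long_read(&objext_failures) ? " (imprecise)" : "");

Per-cache tracking would let us flag only the affected line items, but
a global counter is enough to get the idea across.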
