Message-ID: <sfh57nqbhxaycdlyitiughqc7ul3xuix5kis65l4grrnxwfqz3@gch2dlf3fnxo>
Date: Tue, 20 May 2025 12:41:20 -0400
From: Kent Overstreet <kent.overstreet@...ux.dev>
To: Suren Baghdasaryan <surenb@...gle.com>
Cc: Usama Arif <usamaarif642@...il.com>, 
	Andrew Morton <akpm@...ux-foundation.org>, hannes@...xchg.org, shakeel.butt@...ux.dev, vlad.wing@...il.com, 
	linux-mm@...ck.org, linux-kernel@...r.kernel.org, kernel-team@...a.com
Subject: Re: [PATCH 1/2] mm: slub: allocate slab object extensions
 non-contiguously

On Tue, May 20, 2025 at 08:20:38AM -0700, Suren Baghdasaryan wrote:
> On Tue, May 20, 2025 at 7:13 AM Usama Arif <usamaarif642@...il.com> wrote:
> >
> >
> >
> > On 20/05/2025 14:46, Usama Arif wrote:
> > >
> > >
> > > On 20/05/2025 14:44, Kent Overstreet wrote:
> > >> On Tue, May 20, 2025 at 01:25:46PM +0100, Usama Arif wrote:
> > >>> When memory allocation profiling is running on memory bound services,
> > >>> allocations greater than order 0 for slab object extensions can fail,
> > >>> for e.g. zs_handle zswap slab which will be 512 objsperslab x 16 bytes
> > >>> per slabobj_ext (order 1 allocation). Use kvcalloc to improve chances
> > >>> of the allocation being successful.
> > >>>
> > >>> Signed-off-by: Usama Arif <usamaarif642@...il.com>
> > >>> Reported-by: Vlad Poenaru <vlad.wing@...il.com>
> > >>> Closes: https://lore.kernel.org/all/17fab2d6-5a74-4573-bcc3-b75951508f0a@gmail.com/
> > >>> ---
> > >>>  mm/slub.c | 2 +-
> > >>>  1 file changed, 1 insertion(+), 1 deletion(-)
> > >>>
> > >>> diff --git a/mm/slub.c b/mm/slub.c
> > >>> index dc9e729e1d26..bf43c403ead2 100644
> > >>> --- a/mm/slub.c
> > >>> +++ b/mm/slub.c
> > >>> @@ -1989,7 +1989,7 @@ int alloc_slab_obj_exts(struct slab *slab, struct kmem_cache *s,
> > >>>     gfp &= ~OBJCGS_CLEAR_MASK;
> > >>>     /* Prevent recursive extension vector allocation */
> > >>>     gfp |= __GFP_NO_OBJ_EXT;
> > >>> -   vec = kcalloc_node(objects, sizeof(struct slabobj_ext), gfp,
> > >>> +   vec = kvcalloc_node(objects, sizeof(struct slabobj_ext), gfp,
> > >>>                        slab_nid(slab));
> > >>
> > >> And what's the latency going to be on a vmalloc() allocation when we're
> > >> low on memory?
> > >
> > > > Would it not be better to get the allocation slightly slower than to not get
> > > > it at all?
> >
> > Also a majority of them are less than 1 page. kvmalloc of less than 1 page
> > falls back to kmalloc. So vmalloc will only be on those greater than 1 page
> > size, which are in the minority (for e.g. zs_handle, request_sock_subflow_v6,
> > request_sock_subflow_v4...).
> 
> Not just the majority. For all of these kvmalloc allocations kmalloc
> will be tried first and vmalloc will be used only if the former
> failed: https://elixir.bootlin.com/linux/v6.14.7/source/mm/util.c#L665
> That's why I think this should not regress normal case when slab has
> enough space to satisfy the allocation.

And you really should consider just letting the extension vector
allocation fail if we're under that much memory pressure.

Failing allocations is an important mechanism for load shedding;
otherwise stuff just piles up. It's a big cause of our terrible
behaviour when we're thrashing.

It's equivalent to bufferbloat in the networking world.
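
For reference, the arithmetic and the fallback order discussed in this
thread can be sketched in plain userspace C. This is only an
illustration, not kernel code: the sketch_* names are hypothetical, a
4 KiB page size is assumed, and the real kvmalloc path also checks gfp
flags, which is ignored here. The 512 objects and 16 bytes per
slabobj_ext come from the patch description above.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SIZE 4096UL

/* Bytes needed for an extension vector of `objects` entries. */
static size_t sketch_ext_vec_size(size_t objects, size_t ext_size)
{
	return objects * ext_size;
}

/*
 * Smallest order such that (SKETCH_PAGE_SIZE << order) >= size.
 * Same idea as the kernel's get_order(), reimplemented for illustration.
 */
static unsigned int sketch_get_order(size_t size)
{
	unsigned int order = 0;
	size_t span = SKETCH_PAGE_SIZE;

	assert(size > 0);
	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

enum sketch_path { SKETCH_KMALLOC, SKETCH_VMALLOC, SKETCH_FAIL };

/*
 * Simplified model of the kvmalloc fallback order: kmalloc is always
 * tried first; vmalloc is attempted only if kmalloc failed and the
 * request is larger than one page. Otherwise the allocation fails.
 */
static enum sketch_path sketch_kvmalloc_path(size_t size, int kmalloc_ok)
{
	if (kmalloc_ok)
		return SKETCH_KMALLOC;
	if (size <= SKETCH_PAGE_SIZE)
		return SKETCH_FAIL;
	return SKETCH_VMALLOC;
}
```

Under these assumptions, sketch_get_order(sketch_ext_vec_size(512, 16))
gives 1, matching the order 1 allocation for zs_handle, and a failed
kmalloc attempt of that size maps to SKETCH_VMALLOC, while a failed
sub-page request maps to SKETCH_FAIL.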
