Message-ID: 
 <PH0PR11MB5192C3A3806D89D0CACC2CEEEC3D2@PH0PR11MB5192.namprd11.prod.outlook.com>
Date: Wed, 3 Apr 2024 00:37:25 +0000
From: "Song, Xiongwei" <Xiongwei.Song@...driver.com>
To: Vlastimil Babka <vbabka@...e.cz>,
        "rientjes@...gle.com"
	<rientjes@...gle.com>,
        "cl@...ux.com" <cl@...ux.com>,
        "penberg@...nel.org"
	<penberg@...nel.org>,
        "iamjoonsoo.kim@....com" <iamjoonsoo.kim@....com>,
        "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
        "roman.gushchin@...ux.dev" <roman.gushchin@...ux.dev>,
        "42.hyeyoo@...il.com"
	<42.hyeyoo@...il.com>
CC: "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>,
        "chengming.zhou@...ux.dev"
	<chengming.zhou@...ux.dev>
Subject: RE: [PATCH 3/4] mm/slub: simplify get_partial_node()

> 
> On 3/31/24 4:19 AM, xiongwei.song@...driver.com wrote:
> > From: Xiongwei Song <xiongwei.song@...driver.com>
> >
> > The break conditions can be made simpler and more readable.
> >
> > We can check whether we need to fill the cpu partial list right after
> > getting the first partial slab. If kmem_cache_has_cpu_partial() returns
> > true, we fill the cpu partial list in the following iterations; otherwise
> > we break out of the loop.
> >
> > Then we can remove the preprocessor check on CONFIG_SLUB_CPU_PARTIAL.
> > Use the dummy slub_get_cpu_partial() to keep the compiler quiet.
> >
> > Signed-off-by: Xiongwei Song <xiongwei.song@...driver.com>
> > ---
> >  mm/slub.c | 22 ++++++++++++----------
> >  1 file changed, 12 insertions(+), 10 deletions(-)
> >
> > diff --git a/mm/slub.c b/mm/slub.c
> > index 590cc953895d..ec91c7435d4e 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -2614,18 +2614,20 @@ static struct slab *get_partial_node(struct kmem_cache *s,
> >               if (!partial) {
> >                       partial = slab;
> >                       stat(s, ALLOC_FROM_PARTIAL);
> > -             } else {
> > -                     put_cpu_partial(s, slab, 0);
> > -                     stat(s, CPU_PARTIAL_NODE);
> > -                     partial_slabs++;
> > +
> > +                     /* Fill cpu partial if needed from next iteration, or break */
> > +                     if (kmem_cache_has_cpu_partial(s))
> 
> That kinda puts back the check removed in patch 1, although only in the
> first iteration. Still not ideal.
> 
> > +                             continue;
> > +                     else
> > +                             break;
> >               }
> > -#ifdef CONFIG_SLUB_CPU_PARTIAL
> > -             if (partial_slabs > s->cpu_partial_slabs / 2)
> > -                     break;
> > -#else
> > -             break;
> > -#endif
> 
> I'd suggest instead of the changes done in this patch, only change this part
> above to:
> 
>         if ((slub_get_cpu_partial(s) == 0) ||
>             (partial_slabs > slub_get_cpu_partial(s) / 2))
>                 break;
> 
> That gets rid of the #ifdef and also fixes a weird corner case where, if we
> set cpu_partial_slabs to 0 from sysfs, we still allocate at least one here.

Oh, yes. Will update.
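
For reference, with that suggestion the tail of the loop in get_partial_node()
would look roughly like this (just a sketch on top of this series, assuming the
slub_get_cpu_partial() helper added earlier in the series returns 0 when
CONFIG_SLUB_CPU_PARTIAL is disabled):

		if (!partial) {
			partial = slab;
			stat(s, ALLOC_FROM_PARTIAL);
		} else {
			put_cpu_partial(s, slab, 0);
			stat(s, CPU_PARTIAL_NODE);
			partial_slabs++;
		}

		/* Stop early if cpu partial is disabled or already half full. */
		if ((slub_get_cpu_partial(s) == 0) ||
		    (partial_slabs > slub_get_cpu_partial(s) / 2))
			break;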

> 
> It could be tempting to use >= instead of > to achieve the same effect but
> that would have unintended performance effects that would best be evaluated
> separately.

I can run a test to measure the Amean changes. But in terms of x86 assembly, there
should not be any extra instructions with ">=".

I did a simple test: for ">=" the compiler uses a "jle" instruction, while "jl" is
used for ">". No additional instructions are involved, so there should be no
performance effect on x86.
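
For the record, this is what I compared (only a local sketch, not part of the
patch); both forms compile to one compare plus one conditional branch, and only
the branch condition differs:

	/* ">" variant: stop once more than half of cpu_partial_slabs is queued */
	if (partial_slabs > slub_get_cpu_partial(s) / 2)
		break;

	/* ">=" variant: stops one slab earlier, same number of instructions */
	if (partial_slabs >= slub_get_cpu_partial(s) / 2)
		break;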

Thanks,
Xiongwei

> 
> >
> > +             put_cpu_partial(s, slab, 0);
> > +             stat(s, CPU_PARTIAL_NODE);
> > +             partial_slabs++;
> > +
> > +             if (partial_slabs > slub_get_cpu_partial(s) / 2)
> > +                     break;
> >       }
> >       spin_unlock_irqrestore(&n->list_lock, flags);
> >       return partial;
