lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 23 Sep 2022 07:36:25 -0400
From:   Phil Auld <pauld@...hat.com>
To:     Yury Norov <yury.norov@...il.com>
Cc:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Petr Štetiar <ynezz@...e.cz>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        linux-kernel@...r.kernel.org,
        Rasmus Villemoes <linux@...musvillemoes.dk>
Subject: Re: [PATCH v2 1/1] cpumask: Don't waste memory for sysfs cpulist
 nodes

On Thu, Sep 22, 2022 at 06:45:27PM -0700 Yury Norov wrote:
> On Thu, Sep 22, 2022 at 5:38 PM Phil Auld <pauld@...hat.com> wrote:
> >
> > Hi Andy,
> >
> > On Thu, Sep 22, 2022 at 10:49:54PM +0300 Andy Shevchenko wrote:
> > > Currently the approximation is used which wastes the more memory
> > > the more CPUs are present on the system. Proposed change calculates
> > > the exact maximum needed in the worst case:
> > >
> > >   NR_CPUS     old             new
> > >   -------     ---             ---
> > >   1 .. 1170   4096            4096
> > >   1171 .. 1860        4098 .. 6510    4096
> > >   ...         ...             ...
> > >   2*4096      28672           19925
> > >   4*4096      57344           43597
> > >   8*4096      114688          92749
> > >   16*4096     229376          191053
> > >   32*4096     458752          403197
> > >   64*4096     917504          861949
> > >   128*4096    1835008         1779453
> > >   256*4096    3670016         3670016
> > >
> > > Under the hood the reccurent formula is being used:
> > >   (5 - 0) * 2 +
> > >     (50 - 5) * 3 +
> > >       (500 - 50) * 4 +
> > >         (5000 - 500) * 5 +
> > >           ...
> > >             (X[i] - X[i-1]) * i
> > >
> > > which allows to count the exact maximum length in the worst case,
> > > i.e. when each second CPU is being listed. For backward compatibility
> > > for more than 1170 and less than 1861 CPUs the page size is preserved.
> > >
> > > For less than 1171 and more than 1 million CPUs the old is being used.
> >
> > The memory is not really wasted since it's probably temporary in userspace
> > and in the kernel it _is_ temporary and is only the length of the kasprintf
> > string, which is most of the time much less.
> >
> > But that said, it is more accurate than the previous estimate.
> >
> > I was wondering if you were going to try to come up with a suitable
> > compile time macro :)
> >
> > I tested 2, 8192 and 16k since the kernel does not want to build for other
> > reasons with NR_CPUS at 32k.
> >
> > Reviewed-by: Phil Auld <pauld@...hat.com>
> > Tested-by: Phil Auld <pauld@...hat.com>
> 
> By the way, why don't we use nr_cpu_ids? It's set at boot-time, or even
> at compile-time in some cases and never changed at runtime.
> 
> nr_cpu_ids is set very early, when ACPI tables are parsed. I don't think
> it's possible for a userspace to observe it uninitialized. Am I wrong?
> 

I very much wanted it to be compile time so there was no overhead.

Is there a good place to init these bin_attrs at boottime, anyway?
I didn't dig into it farther, but it looks like the size is set
when declared. I suppose we could adjust it at the first read if
there was a way to do that. It's sort of all abstracted away to
common code though.

The fact that the values are larger than actually needed doesn't really
cause any harm.  In the kernel no extra memory is allocated since it uses
the size of the sprintf of the actual data.  And it's only allocated
while being read.

In userspace it may be a little extra memory while doing a read. But that's
also likely to be just temporary.

And if you have NR_CPUS 10000 you probably have a lot of wasted percpu
space anyway...


Cheers,
Phil

-- 

Powered by blists - more mailing lists