[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZALcbQoKA7K8k2gJ@yury-laptop>
Date: Fri, 3 Mar 2023 21:51:41 -0800
From: Yury Norov <yury.norov@...il.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Mateusz Guzik <mjguzik@...il.com>,
Alexander Potapenko <glider@...gle.com>,
Al Viro <viro@...iv.linux.org.uk>,
Kees Cook <keescook@...omium.org>,
Eric Biggers <ebiggers@...gle.com>,
Christian Brauner <brauner@...nel.org>, serge@...lyn.com,
paul@...l-moore.com, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-security-module@...r.kernel.org
Subject: Re: [PATCH v3 2/2] vfs: avoid duplicating creds in faccessat if
possible
On Fri, Mar 03, 2023 at 07:42:36PM -0800, Linus Torvalds wrote:
> ing: quoted-printable
> Status: O
> Content-Length: 1593
> Lines: 41
>
> On Fri, Mar 3, 2023 at 7:25 PM Yury Norov <yury.norov@...il.com> wrote:
> >
> > Did you enable CONFIG_FORCE_NR_CPUS? If you pick it, the kernel will
> > bind nr_cpu_ids to NR_CPUS at compile time, and the memset() call
> > should disappear.
>
> I do not believe CONFIG_FORCE_NR_CPUS makes any sense, and I think I
> told you so at the time.
At that time you was OK with CONFIG_FORCE_NR_CPUS, only suggested to
hide it behind CONFIG_EXPERT:
https://lore.kernel.org/all/Yzx4fSmmr8bh6gdl@yury-laptop/T/#m92d405527636154c3b2000e0105379170d988315
> This all used to just work *without* some kind of config thing, First
> removing the automatic "do the right thing", and then adding a config
> option to "force" doing the right thing seems more than a bit silly to
> me.
>
> I think CONFIG_FORCE_NR_CPUS should go away, and - once more - become
> just the "is the cpumask small enough to be just allocated directly"
> thing.
This all was just broken. For example, as I mentioned in commit message,
cpumask_full() was broken. I know because I wrote a test. There were no
a single user for the function, and nobody complained. Now we have one
in BPF code. So if we simply revert the aa47a7c215e, it will hurt real
users.
The pre-CONFIG_FORCE_NR_CPUS cpumask machinery would work only if you
set NR_CPUS to the number that matches to the actual number of CPUs as
detected at boot time.
In your example, if you have NR_CPUS == 64, and for some reason disable
hyper threading, nr_cpumask_bits will be set to 64 at compile time, but
nr_cpu_ids will be set to 32 at boot time, assuming
CONFIG_CPUMASK_OFFSTACK is disabled.
And the following code will be broken:
cpumask_t m1, m2;
cpumask_setall(m1); // m1 is ffff ffff ffff ffff because it uses
// compile-time optimized nr_cpumask_bits
for_each_cpu(cpu, m1) // 32 iterations because it relied on nr_cpu_ids
cpumask_set_cpu(cpu, m2); // m2 is ffff ffff XXXX XXXX
BUG_ON(!cpumask_equal(m1, m2)); // Bug because it will test all 64 bits
Today with CONFIG_FORCE_NR_CPUS disabled, kernel consistently relies
on boot-time defined nr_cpu_ids in functions like cpumask_equal()
with the cost of disabled runtime optimizations.
If CONFIG_FORCE_NR_CPUS is enabled, it wires nr_cpu_ids to NR_CPUS
at compile time, which allows compile-time optimization.
If CONFIG_FORCE_NR_CPUS is enabled, but actual number of CPUs doesn't
match to NR_CPUS, the kernel throws a warning at boot time - better
than nothing.
I'm not happy bothering people with a new config parameter in such a
simple case. I just don't know how to fix it better. Is there a safe
way to teach compiler to optimize against NR_CPUS other than telling
it explicitly?
> Of course, the problem for others remain that distros will do that
> CONFIG_CPUMASK_OFFSTACK thing, and then things will suck regardless.
>
> I was *so* happy with our clever "you can have large cpumasks, and
> we'll just allocate them off the stack" long long ago, because it
> meant that we could have one single source tree where this was all
> cleanly abstracted away, and we even had nice types and type safety
> for it all.
>
> That meant that we could support all the fancy SGI machines with
> several thousand cores, and it all "JustWorked(tm)", and didn't make
> the normal case any worse.
>
> I didn't expect distros to then go "ooh, we want that too", and enable
> it all by default, and make all our clever "you only see this
> indirection if you need it" go away, and now the normal case is the
> *bad* case, unless you just build your own kernel and pick sane
> defaults.
>
> Oh well.
>From distro people's perspective, 'one size fits all' is the best
approach. It's hard to blame them.
Thanks,
Yury
Powered by blists - more mailing lists