Date:   Mon, 6 Mar 2023 10:02:27 -0800
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Vernon Yang <vernon2gm@...il.com>
Cc:     tytso@....edu, Jason@...c4.com, davem@...emloft.net,
        edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com,
        jejb@...ux.ibm.com, martin.petersen@...cle.com,
        yury.norov@...il.com, andriy.shevchenko@...ux.intel.com,
        linux@...musvillemoes.dk, james.smart@...adcom.com,
        dick.kennedy@...adcom.com, linux-kernel@...r.kernel.org,
        wireguard@...ts.zx2c4.com, netdev@...r.kernel.org,
        linux-scsi@...r.kernel.org
Subject: Re: [PATCH 5/5] cpumask: fix comment of cpumask_xxx

On Mon, Mar 6, 2023 at 9:47 AM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> The drivers/char/random.c code is very wrong, and does
>
>              if (cpu == nr_cpumask_bits)
>                              cpu = cpumask_first(&timer_cpus);
>
> which fails miserably exactly because it doesn't use ">=".

Turns out this "cpu == nr_cpumask_bits" pattern exists in a couple of
other places too.

It was always wrong, but it always just happened to work. The lpfc
SCSI driver in particular seems to *love* this pattern:

        start_cpu = cpumask_next(new_cpu, cpu_present_mask);
        if (start_cpu == nr_cpumask_bits)
                start_cpu = first_cpu;

and has repeated it multiple times, all incorrect.
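
To illustrate why the "==" test is fragile: if the "no more CPUs" return
value of the search is only guaranteed to be at or above the number of valid
CPU ids, an equality check against one particular value can miss it. What
follows is a user-space toy model, not kernel code. toy_next(), pick_eq(),
pick_ge(), TOY_SCAN_BITS and toy_nr_ids are made-up names, and the values 64
and 8 are assumptions chosen only to make the mismatch visible.

```c
/* Toy user-space model; none of these names exist in the kernel. */
#define TOY_SCAN_BITS 64u                   /* assumed size the search scans */
static const unsigned int toy_nr_ids = 8u;  /* assumed number of valid ids   */

/* Next set bit strictly after n, or TOY_SCAN_BITS if there is none. */
static unsigned int toy_next(int n, unsigned long long mask)
{
	for (unsigned int i = (unsigned int)(n + 1); i < TOY_SCAN_BITS; i++)
		if (mask & (1ULL << i))
			return i;
	return TOY_SCAN_BITS;
}

/* Fragile pattern: only catches the end marker if it equals toy_nr_ids. */
static unsigned int pick_eq(int cur, unsigned long long mask, unsigned int first)
{
	unsigned int cpu = toy_next(cur, mask);

	if (cpu == toy_nr_ids)
		cpu = first;
	return cpu;
}

/* Robust pattern: any value at or past the valid range means "none left". */
static unsigned int pick_ge(int cur, unsigned long long mask, unsigned int first)
{
	unsigned int cpu = toy_next(cur, mask);

	if (cpu >= toy_nr_ids)
		cpu = first;
	return cpu;
}
```

With ids 0-3 set in the mask and a search starting after id 3,
pick_eq(3, 0x0F, 0) hands back the out-of-range value 64, while
pick_ge(3, 0x0F, 0) falls back to id 0 as intended.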

We do have "cpumask_next_wrap()", and that *seems* to be what the lpfc
driver actually wants to do.
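
As a rough picture of what "next, with wrap-around" means, here is a
user-space sketch. It is deliberately not the kernel helper: toy_next_wrap()
and TOY_NBITS are hypothetical names, the signature is invented for the
illustration, and the real cpumask_next_wrap() should be checked in
include/linux/cpumask.h before use.

```c
/* Toy user-space model of a wrapping search; not the kernel's helper. */
#define TOY_NBITS 8u  /* assumed number of valid ids */

/*
 * Next set bit strictly after n, continuing from bit 0 once the end is
 * reached. Returns TOY_NBITS only when the mask has no bits set at all.
 */
static unsigned int toy_next_wrap(int n, unsigned int mask)
{
	unsigned int start = (unsigned int)(n + 1);

	for (unsigned int step = 0; step < TOY_NBITS; step++) {
		unsigned int i = (start + step) % TOY_NBITS;

		if (mask & (1u << i))
			return i;
	}
	return TOY_NBITS;
}
```

In this model the caller needs no separate "did I run off the end" check
followed by a manual reset to the first id: with ids 0, 1 and 3 set,
toy_next_wrap(3, 0x0B) wraps around and yields 0. Only the empty-mask case
has to be handled by comparing against the size.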

.. and then we have kernel/sched/fair.c, which is actually not buggy,
just odd. It uses nr_cpumask_bits too, but it uses it purely for its
own internal nefarious reasons - it's not actually related to the
cpumask functions at all, it's just used as a "not valid CPU number".

I think that scheduler use is still very *wrong*, but it doesn't look
actively buggy.

The other cases all look very buggy indeed, but yes, they happened to
work, and now they don't. So commit 596ff4a09b89 ("cpumask:
re-introduce constant-sized cpumask optimizations") did break them.

I'd rather fix these bad users than revert, but there does seem to be
an alarming number of these things, which worries me:

     git grep '== nr_cpumask_bits'

and that's just checking for this *exact* thing.

                Linus
