Message-ID: <87sfbdh3ag.ffs@tglx>
Date: Wed, 31 May 2023 00:17:59 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: Chuck Lever III <chuck.lever@...cle.com>
Cc: Eli Cohen <elic@...dia.com>, Leon Romanovsky <leon@...nel.org>,
    Saeed Mahameed <saeedm@...dia.com>,
    linux-rdma <linux-rdma@...r.kernel.org>,
    "open list:NETWORKING [GENERAL]" <netdev@...r.kernel.org>
Subject: Re: system hang on start-up (mlx5?)

On Tue, May 30 2023 at 21:48, Chuck Lever III wrote:
>> On May 30, 2023, at 3:46 PM, Thomas Gleixner <tglx@...utronix.de> wrote:
>> cpumask_copy(d, s)
>>    bitmap_copy(d, s, nbits = 32)
>>       len = BITS_TO_LONGS(nbits) * sizeof(unsigned long);
>>
>> So it copies as many longs as required to cover nbits, i.e. it copies
>> any clobbered bits beyond nbits too. While that looks odd at the first
>> glance, that's just an optimization which is harmless.
>>
>> for_each_cpu() finds the next set bit in a mask and breaks the loop once
>> bitnr >= small_cpumask_bits, which is nr_cpu_ids and should be 32 too.
>>
>> I just booted a kernel with NR_CPUS=32:
>
> My system has only 12 CPUs. So every bit in your mask represents
> a present CPU, but on my system, only 0x00000fff are ever present.
>
> Therefore, on my system, any bit higher than bit 11 in a CPU mask
> will reference a CPU that is not present.

Correct.... Sorry, I missed the part that your machine has only 12
CPUs....

Now I can reproduce the wreckage even with that trivial test I did:

[    0.210089] setup_percpu: NR_CPUS:32 nr_cpumask_bits:12 nr_cpu_ids:12 nr_node_ids:1
...
[    0.606591] smp: MASKBITS: 5555555555555555
[    0.607026] smp: CPUs: 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30

I'm way too tired to make sense of that right now. Will have a look at
it tomorrow with brain awake unless you beat me to it.

That's one mystery but the other one is this:

[   71.273798][ T1185] irq_matrix_reserve_managed: MASKBITS: ffffb1a74686bcd8

That's clearly a kernel address within the direct map. How does that end
up as content of a cpumask?

Thanks,

        tglx
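
The copy and iteration behaviour discussed above can be illustrated with a small userspace sketch. This is not kernel code: sketch_bitmap_copy() and sketch_count_cpus() are hypothetical stand-ins for bitmap_copy() and a for_each_cpu()-style walk, and the single-word mask is an assumption made for brevity. It only shows that a whole-long copy carries bits beyond nbits along, and that whether those bits matter depends on the limit the walker stops at. It is not offered as a diagnosis of the output quoted above.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace equivalents of the kernel's BITS_PER_LONG and
 * BITS_TO_LONGS(); the names are made up for this sketch. */
#define SK_BITS_PER_LONG    (sizeof(unsigned long) * CHAR_BIT)
#define SK_BITS_TO_LONGS(n) (((n) + SK_BITS_PER_LONG - 1) / SK_BITS_PER_LONG)

/* Copy as many whole longs as are needed to cover nbits, mirroring the
 * "len = BITS_TO_LONGS(nbits) * sizeof(unsigned long)" line quoted above.
 * Bits at positions >= nbits in the last long are copied as well. */
static void sketch_bitmap_copy(unsigned long *dst, const unsigned long *src,
                               unsigned int nbits)
{
    size_t len = SK_BITS_TO_LONGS(nbits) * sizeof(unsigned long);

    memcpy(dst, src, len);
}

/* Count the set bits below 'limit' in a single-word mask. The loop stops
 * once the bit number reaches 'limit', so bits at or above it are never
 * looked at. Assumption: the mask fits in one unsigned long. */
static unsigned int sketch_count_cpus(const unsigned long *mask,
                                      unsigned int limit)
{
    unsigned int n = 0;

    assert(limit <= SK_BITS_PER_LONG);
    for (unsigned int bit = 0; bit < limit; bit++)
        if (mask[0] & (1UL << bit))
            n++;
    return n;
}
```

With a source word of 0x55555555, a copy for nbits = 12 still transfers every set bit in the word. A walk limited to 12 then sees only bits 0, 2, 4, 6, 8 and 10, while a walk limited to 32 sees all sixteen even-numbered bits up to 30.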