Message-ID: <YRvbS5ypWhcsBzzU@hirez.programming.kicks-ass.net>
Date:   Tue, 17 Aug 2021 17:52:43 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Eugene Syromiatnikov <esyr@...hat.com>
Cc:     joel@...lfernandes.org, chris.hyser@...cle.com, joshdon@...gle.com,
        mingo@...nel.org, vincent.guittot@...aro.org,
        valentin.schneider@....com, mgorman@...e.de,
        linux-kernel@...r.kernel.org, tglx@...utronix.de,
        Christian Brauner <christian.brauner@...ntu.com>, ldv@...ace.io
Subject: Re: [PATCH 18/19] sched: prctl() core-scheduling interface

On Tue, Aug 17, 2021 at 05:15:42PM +0200, Eugene Syromiatnikov wrote:
> [76195.611570] BUG: kernel NULL pointer dereference, address: 0000000000000000
> [76195.613059] #PF: supervisor read access in kernel mode
> [76195.614174] #PF: error_code(0x0000) - not-present page
> [76195.615329] PGD 800000005f27e067 P4D 800000005f27e067 PUD 3f7a3067 PMD 0 
> [76195.616801] Oops: 0000 [#67] SMP PTI
> [76195.617586] CPU: 2 PID: 239821 Comm: prctl-sched-cor Tainted: G      D W        --------- ---  5.14.0-0.rc5.20210813gitf8e6dfc64f61.46.fc36.x86_64 #1
> [76195.620374] Hardware name: HP ProLiant BL480c G1, BIOS I14 10/04/2007
> [76195.621771] RIP: 0010:do_raw_spin_trylock+0x5/0x40
> [76195.622821] Code: c6 a4 12 5f 9f 48 89 ef e8 c8 fe ff ff eb a9 89 c6 48 89 ef e8 0c f5 ff ff 66 90 eb a9 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 <8b> 07 85 c0 75 28 ba 01 00 00 00 f0 0f b1 17 75 1d 65 8b 05 fb 98
> [76195.626797] RSP: 0018:ffffa366014abe58 EFLAGS: 00010086
> [76195.627936] RAX: 0000000000000001 RBX: 0000000000000004 RCX: 0000000000000000
> [76195.629470] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> [76195.631048] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
> [76195.632585] R10: 0000000000000000 R11: ffff98292b21ad48 R12: 0000000000000018
> [76195.634078] R13: 0000000000000000 R14: ffff98292b7ef940 R15: ffff982813938e00
> [76195.635621] FS:  00007f271f8755c0(0000) GS:ffff98292b200000(0000) knlGS:0000000000000000
> [76195.637354] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [76195.638606] CR2: 0000000000000000 CR3: 000000000fdd0000 CR4: 00000000000006e0
> [76195.640144] Call Trace:
> [76195.640706]  _raw_spin_lock_nested+0x37/0x80
> [76195.641645]  ? raw_spin_rq_lock_nested+0x4b/0x80
> [76195.642693]  raw_spin_rq_lock_nested+0x4b/0x80
> [76195.643669]  online_fair_sched_group+0x39/0x240

Urgh... lemme guess, your HP BIOS is funny and reports more possible
CPUs than you actually have resulting in cpu_possible_mask !=
cpu_online_mask. Alternatively, you booted with nr_cpus= or something
daft like that.

That code does for_each_possible_cpu(i) { rq_lock_irq(cpu_rq(i)); },
which, because of core-sched, needs rq->core set up, but because these
CPUs have never been online, that's not done and *BOOM*.
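
To make the suspected failure mode concrete, here is a minimal user-space
sketch of it. It is a toy model and not kernel code: struct toy_rq,
toy_cpu_online(), toy_rq_lock() and toy_online_group() are hypothetical
names invented for illustration, and NR_POSSIBLE is an assumed value. It
only assumes what is guessed above: the lock is reached through rq->core,
and rq->core is only set once a CPU has come online.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_POSSIBLE 4   /* assumption: CPUs the firmware reports as possible */

/* Toy stand-in for the kernel's struct rq; only the fields that matter here. */
struct toy_rq {
    struct toy_rq *core;   /* core-wide rq; stays NULL until the CPU is onlined */
    int lock;              /* toy lock word, 0 = unlocked */
};

/* Zero-initialised, so every ->core starts out NULL. */
static struct toy_rq runqueues[NR_POSSIBLE];

/* Model of CPU bring-up: the only place ->core gets a valid pointer. */
static void toy_cpu_online(int cpu)
{
    assert(cpu >= 0 && cpu < NR_POSSIBLE);
    runqueues[cpu].core = &runqueues[cpu];
}

/*
 * Model of taking the rq lock with core scheduling: the lock is found
 * through ->core. The toy checks for NULL and reports failure; under the
 * guess above, the kernel would dereference the pointer and oops instead.
 */
static bool toy_rq_lock(struct toy_rq *rq)
{
    if (rq->core == NULL)
        return false;
    rq->core->lock = 1;
    return true;
}

static void toy_rq_unlock(struct toy_rq *rq)
{
    rq->core->lock = 0;
}

/*
 * Walks every *possible* CPU, not just the online ones, mirroring the
 * loop described above. Returns how many locks could not be taken.
 */
static int toy_online_group(void)
{
    int failures = 0;

    for (int i = 0; i < NR_POSSIBLE; i++) {
        if (toy_rq_lock(&runqueues[i]))
            toy_rq_unlock(&runqueues[i]);
        else
            failures++;
    }
    return failures;
}
```

With only CPUs 0 and 1 onlined via toy_cpu_online(), toy_online_group()
reports two failures, one for each possible-but-never-online CPU. Once all
four are onlined it reports none.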

Or something like that.. I'll try and have a look tomorrow, I'm in dire
need of sleep.
