Message-ID: <20251117190641.GCaRtyQXdOhKrlAF7Y@fat_crate.local>
Date: Mon, 17 Nov 2025 20:06:41 +0100
From: Borislav Petkov <bp@...en8.de>
To: Shrikanth Hegde <sshegde@...ux.ibm.com>,
	"Peter Zijlstra (Intel)" <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, Tim Chen <tim.c.chen@...ux.intel.com>,
	linux-tip-commits@...r.kernel.org, Chen Yu <yu.c.chen@...el.com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	K Prateek Nayak <kprateek.nayak@....com>,
	Srikar Dronamraju <srikar@...ux.ibm.com>,
	Mohini Narkhede <mohini.narkhede@...el.com>, x86@...nel.org
Subject: Re: [tip: sched/core] sched/fair: Skip sched_balance_running cmpxchg
 when balance is not due

On Sun, Nov 16, 2025 at 02:26:13AM +0530, Shrikanth Hegde wrote:
> 
> Hi Peter.
> 
> On 11/14/25 5:49 PM, tip-bot2 for Tim Chen wrote:
> > The following commit has been merged into the sched/core branch of tip:
> > 
> > Commit-ID:     2265c5d4deeff3bfe4580d9ffe718fd80a414cac
> > Gitweb:        https://git.kernel.org/tip/2265c5d4deeff3bfe4580d9ffe718fd80a414cac
> > Author:        Tim Chen <tim.c.chen@...ux.intel.com>
> > AuthorDate:    Mon, 10 Nov 2025 10:47:35 -08:00
> > Committer:     Peter Zijlstra <peterz@...radead.org>
> > CommitterDate: Fri, 14 Nov 2025 13:03:05 +01:00
> > 
> > sched/fair: Skip sched_balance_running cmpxchg when balance is not due
> > 
> > 
> 
> > +	if (!need_unlock && (sd->flags & SD_SERIALIZE) && idle != CPU_NEWLY_IDLE) {
> > +		if (!atomic_try_cmpxchg_acquire(&sched_balance_running, 0, 1))
> 
> This should be atomic_cmpxchg_acquire?
> 
> I booted the system with latest sched/core and it crashes at the boot.
> 
> BUG: Kernel NULL pointer dereference on read at 0x00000000
> Faulting instruction address: 0xc0000000001db57c
> Oops: Kernel access of bad area, sig: 7 [#1]
> LE PAGE_SIZE=64K MMU=Radix  SMP NR_CPUS=8192 NUMA pSeries
> Modules linked in:
> CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Not tainted 6.18.0-rc3+ #242 PREEMPT(lazy)
> NIP [c0000000001db57c] sched_balance_rq+0x560/0x92c
> LR [c0000000001db198] sched_balance_rq+0x17c/0x92c
> Call Trace:
> [c00000111ffdfd10] [c0000000001db198] sched_balance_rq+0x17c/0x92c (unreliable)
> [c00000111ffdfe50] [c0000000001dc598] sched_balance_domains+0x2c4/0x3d0
> [c00000111ffdff00] [c000000000168958] handle_softirqs+0x138/0x414
> [c00000111ffdffe0] [c000000000017d80] do_softirq_own_stack+0x3c/0x50
> [c000000008a57a60] [c000000000168048] __irq_exit_rcu+0x18c/0x1b4
> [c000000008a57a90] [c0000000001691a8] irq_exit+0x20/0x38
> [c000000008a57ab0] [c000000000028c18] timer_interrupt+0x174/0x394
> [c000000008a57b10] [c000000000009f8c] decrementer_common_virt+0x28c/0x290
> 
> 
> Bisect pointed to:
> git bisect bad 2265c5d4deeff3bfe4580d9ffe718fd80a414cac
> # first bad commit: [2265c5d4deeff3bfe4580d9ffe718fd80a414cac] sched/fair: Skip sched_balance_running cmpxchg when balance is not due
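
A NULL read is consistent with the argument mismatch questioned above. In the
kernel API, atomic_try_cmpxchg_acquire() takes a pointer to the expected value
(int *old), while atomic_cmpxchg_acquire() takes the expected value itself, so
a literal 0 in the second position of the try variant is a NULL pointer. The
userspace sketch below only illustrates the two calling conventions with C11
atomics. It is not the kernel implementation and not necessarily the fix that
was applied; balance_running and the helper names are stand-ins assumed for
illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's sched_balance_running. */
static atomic_int balance_running;

/* Value style: expected value passed by value, previous value returned. */
static int cmpxchg_acquire(atomic_int *v, int old, int new_val)
{
	/* On failure C11 writes the value it found into 'old'. */
	atomic_compare_exchange_strong_explicit(v, &old, new_val,
						memory_order_acquire,
						memory_order_acquire);
	return old;
}

/* Try style: expected value passed by pointer, success returned as bool. */
static bool try_cmpxchg_acquire(atomic_int *v, int *old, int new_val)
{
	return atomic_compare_exchange_strong_explicit(v, old, new_val,
						       memory_order_acquire,
						       memory_order_acquire);
}

/* Correct use of the try style: a real int to point at, never a literal 0. */
static bool take_with_try(void)
{
	int expected = 0;

	return try_cmpxchg_acquire(&balance_running, &expected, 1);
}

/* Equivalent use of the value style: succeeded if the old value was 0. */
static bool take_with_value(void)
{
	return cmpxchg_acquire(&balance_running, 0, 1) == 0;
}

/* Drop the flag again so another caller can take it. */
static void release_flag(void)
{
	atomic_store_explicit(&balance_running, 0, memory_order_release);
}
```

Either helper takes the flag only when it was 0. Calling the try style as
try_cmpxchg_acquire(&balance_running, 0, 1) would hand it a NULL pointer to
read the expected value from.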

Dammit, I spent a whole day bisecting exactly the same issue and I missed your
mail.

Oh well, it is fixed now... should pay more attention next time.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
