Message-ID: <aJtCSjsCEtN1csjg@gpd4>
Date: Tue, 12 Aug 2025 15:31:54 +0200
From: Andrea Righi <arighi@...dia.com>
To: Jake Hillion <jake@...lion.co.uk>
Cc: Tejun Heo <tj@...nel.org>, Christian Loehle <christian.loehle@....com>,
	void@...ifault.com, linux-kernel@...r.kernel.org,
	sched-ext@...ts.linux.dev, changwoo@...lia.com, hodgesd@...a.com,
	mingo@...hat.com, peterz@...radead.org
Subject: Re: [PATCH v3 3/3] sched_ext: Guarantee rq lock on scx_bpf_cpu_rq()

On Mon, Aug 11, 2025 at 03:35:05PM +0100, Jake Hillion wrote:
> On Sun, Aug 10, 2025 at 12:52:53PM +0200, Andrea Righi wrote:
> > Yeah, this is not nice, but they would be still broken though, in PATCH 1/3
> > we force schedulers to check for NULL and, if they don't, the verifier
> > won't be happy, so this already breaks existing binaries.
> 
> I ran some testing on the sched_ext for-next branch, and scx_cosmos is
> breaking in cosmos_init including the latest changes. I believe it kicks
> off a timer in init, which indirectly calls
> `scx_bpf_cpu_rq(cpu)->curr->flags & PF_IDLE`. This should be NULL
> checked, but old binaries breaking is pretty inconvenient for new users.
> 
> As Andrea says, this is the already merged patch triggering this.

We should provide a compat helper in common.bpf.h and fix the schedulers to
use this helper. Something like the following (untested):

static inline struct task_struct *
__COMPAT_scx_bpf_task_acquire_remote_curr(s32 cpu)
{
	struct rq *rq;

	if (bpf_ksym_exists(scx_bpf_task_acquire_remote_curr))
		return scx_bpf_task_acquire_remote_curr(cpu);

	rq = scx_bpf_cpu_rq(cpu);

	return rq ? rq->curr : NULL;
}

Then we can drop this after a couple of kernel releases (like in v6.20).

-Andrea
