Message-ID: <aMNnLenCytO_KEKg@slm.duckdns.org>
Date: Thu, 11 Sep 2025 14:19:57 -1000
From: Tejun Heo <tj@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, mingo@...hat.com, juri.lelli@...hat.com,
	vincent.guittot@...aro.org, dietmar.eggemann@....com,
	rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
	vschneid@...hat.com, longman@...hat.com, hannes@...xchg.org,
	mkoutny@...e.com, void@...ifault.com, arighi@...dia.com,
	changwoo@...lia.com, cgroups@...r.kernel.org,
	sched-ext@...ts.linux.dev, liuwenfang@...or.com, tglx@...utronix.de
Subject: Re: [PATCH 12/14] sched: Add shared runqueue locking to
 __task_rq_lock()

Hello,

On Wed, Sep 10, 2025 at 05:44:21PM +0200, Peter Zijlstra wrote:
> @@ -703,17 +703,24 @@ void double_rq_lock(struct rq *rq1, stru
>  struct rq *__task_rq_lock(struct task_struct *p, struct rq_flags *rf)
>  	__acquires(rq->lock)
>  {
> +	raw_spinlock_t *slock;
>  	struct rq *rq;
>  
>  	lockdep_assert_held(&p->pi_lock);
>  
>  	for (;;) {
>  		rq = task_rq(p);
> +		slock = p->srq_lock;
>  		raw_spin_rq_lock(rq);
> -		if (likely(rq == task_rq(p) && !task_on_rq_migrating(p))) {
> +		if (slock)
> +			raw_spin_lock(slock);
> +		if (likely(rq == task_rq(p) && !task_on_rq_migrating(p) &&
> +			   (!slock || p->srq_lock == slock))) {
>  			rq_pin_lock(rq, rf);
>  			return rq;
>  		}

With the !slock condition, the following scenario is possible:

  __task_rq_lock()
     slock = p->srq_lock; /* NULL */
                                                dispatch_enqueue()
                                                  p->srq_lock = &dsq->lock;
                                                enqueue finishes
     raw_spin_rq_lock(rq);
     rq is the same, $slock is NULL, return
  do something assuming p is locked down        p gets dispatched to another rq
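
One way to close that particular window might be to re-read p->srq_lock
unconditionally under the rq lock instead of only when $slock was non-NULL.
A minimal sketch of what I mean (illustrative only, not a tested fix, and
it doesn't address the ordering of the p->srq_lock store against the
enqueue itself):

	for (;;) {
		rq = task_rq(p);
		slock = READ_ONCE(p->srq_lock);
		raw_spin_rq_lock(rq);
		if (slock)
			raw_spin_lock(slock);
		/*
		 * Re-read p->srq_lock even when $slock was NULL: if
		 * dispatch_enqueue() set it between the read above and
		 * raw_spin_rq_lock(), retry instead of returning.
		 */
		if (likely(rq == task_rq(p) && !task_on_rq_migrating(p) &&
			   READ_ONCE(p->srq_lock) == slock)) {
			rq_pin_lock(rq, rf);
			return rq;
		}
		if (slock)
			raw_spin_unlock(slock);
		raw_spin_rq_unlock(rq);

		while (unlikely(task_on_rq_migrating(p)))
			cpu_relax();
	}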

I'm unclear on when p->srq_lock would be safe to set and clear. The goal,
IIUC, is that whoever does [__]task_rq_lock() ends up waiting on the DSQ
lock that the task is queued on, and, if we can exclude other sched
operations that way, we don't have to hold the source rq lock when moving
the task to another rq for execution, right?
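
IOW, restating my understanding of the intended exclusion as a diagram
(illustrative only):

  task_rq_lock() on CPU1                dispatch on CPU0
    lock p->pi_lock, rq lock              lock dsq->lock
    lock p->srq_lock (== &dsq->lock)
    ... waits on dsq->lock ...            move p w/o the source rq lock
                                          unlock dsq->lock
    recheck, retry against the new rq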

In the last patch, it's set in dispatch_enqueue() and cleared when the task
leaves the DSQ. Let's consider a simple scenario where a task gets enqueued,
gets put on a non-local DSQ and then dispatched to a local DSQ. Assuming
everything works out and we don't have to lock the source rq for migration,
we'd be depending on task_rq_lock() reliably hitting p->srq_lock to avoid
races, but I'm not sure how this would work. Let's say p is currently
associated with CPU1 and on a non-local DSQ, w/ p->srq_lock set to its
source DSQ's lock.

  pick_task_ext() on CPU0               task property change on CPU1
    locks the DSQ
    picks p      
    task_unlink_from_dsq()              task_rq_lock();
      p->srq_lock = NULL;                 lock rq on CPU1
    p is moved to local DSQ               sees p->srq_lock == NULL
                                          return
  p starts running
  anything can happen
                                        proceed with property change
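
FWIW, if p->srq_lock were kept pointing at the source DSQ's lock across the
move and only cleared once the task is fully on the destination, the CPU1
side above would block on the DSQ lock and then fail the task_rq() recheck
and retry against the new rq. A purely illustrative ordering (not from the
posted series, and maybe it conflicts with how the DSQ lock is dropped
during the move elsewhere):

	raw_spin_lock(&dsq->lock);
	task_unlink_from_dsq(p, dsq);	/* but leave p->srq_lock set */
	/* ... move p to CPU0's local DSQ ... */
	smp_store_release(&p->srq_lock, NULL);	/* clear only after the move */
	raw_spin_unlock(&dsq->lock);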

What am I missing?

Thanks.

-- 
tejun
