linux-kernel - Re: [patch 4/4] sched/mmcid: Optimize transitional CIDs when scheduling out

Message-ID: <87343nkrix.ffs@tglx>
Date: Fri, 30 Jan 2026 17:13:26 +0100
From: Thomas Gleixner <tglx@...nel.org>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, LKML
 <linux-kernel@...r.kernel.org>
Cc: Ihor Solodrai <ihor.solodrai@...ux.dev>, Shrikanth Hegde
 <sshegde@...ux.ibm.com>, Peter Zijlstra <peterz@...radead.org>, Michael
 Jeanson <mjeanson@...icios.com>
Subject: Re: [patch 4/4] sched/mmcid: Optimize transitional CIDs when
 scheduling out

On Fri, Jan 30 2026 at 10:50, Mathieu Desnoyers wrote:
> On 2026-01-29 16:20, Thomas Gleixner wrote:
>> During the investigation of the various transition mode issues
>> instrumentation revealed that the amount of bitmap operations can be
>> significantly reduced when a task with a transitional CID schedules out
>> after the fixup function completed and disabled the transition mode.
>> 
>> At that point the mode is stable and therefore it is not required to drop
>> the transitional CID back into the pool. As the fixup is complete the
>> potential exhaustion of the CID pool is no longer possible, so the CID can
>> be transferred to the scheduling out task or to the CPU depending on the
>> current ownership mode. This is now possible because mm_cid::mode contains
>> both the ownership state and the transition bit so the racy snapshot is
>> valid under all circumstances because a subsequent modification of the
>> mode is serialized by the corresponding runqueue lock.
>
> AFAIU the mc->mode updates are serialized by the mm->mm_cid.lock
> and not the runqueue locks. What am I missing ?

Actually the mode updates are serialized by the mutex. They happen under
the lock as well, but the lock is not a serialization requirement for
mode changes.

What I meant to write with tired brain is:

  The racy snapshot is valid under runqueue lock even when there is a
  concurrent mode update going on because the subsequent fixup function
  is serialized with runqueue lock. That means in the following
  scenario:

  CPU0                  CPU1
  clear TRANSIT
  ....
                        lock(rq)
                        sched_out()
                          CID has TRANSIT set
                          ...
                          // observes TRANSIT=0
                          localmode = READ_ONCE(...mode);
  // sets TRANSIT
  switch mode
                          transfer CID according to localmode
  fixup()
    lock(rq)    <- Blocked until the schedule on CPU1 is complete

So both sched_out() and fixup() observe consistent state and everything
just works.
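
The interleaving above can be replayed in a small user-space C model. This
is only a sketch under stated assumptions, not the kernel code: the MODE_*
bit values, struct model, snapshot_mode(), transfer_cid(), fixup() and
replay_scenario() are hypothetical names, the runqueue lock is reduced to a
flag, and the two CPUs are replayed sequentially in the order shown in the
diagram.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical encoding: one ownership bit plus the transition bit. */
#define MODE_PER_CPU	0x1u	/* set: CPUs own CIDs, clear: tasks own CIDs */
#define MODE_TRANSIT	0x2u	/* a mode switch is in progress */

enum cid_owner { OWNER_POOL, OWNER_TASK, OWNER_CPU };

struct model {
	unsigned int mode;	/* stands in for mm_cid::mode */
	bool rq_locked;		/* stands in for CPU1's runqueue lock */
	bool cid_transit;	/* the outgoing task's CID is transitional */
	enum cid_owner owner;	/* current owner of that CID */
};

static enum cid_owner owner_for(unsigned int mode)
{
	return (mode & MODE_PER_CPU) ? OWNER_CPU : OWNER_TASK;
}

/* CPU1, under rq lock: the racy snapshot, READ_ONCE() in the real code. */
static unsigned int snapshot_mode(const struct model *m)
{
	assert(m->rq_locked);
	return m->mode;
}

/* CPU1, under rq lock: act on the snapshot only, never on m->mode. */
static void transfer_cid(struct model *m, unsigned int localmode)
{
	assert(m->rq_locked);
	if (!m->cid_transit)
		return;
	if (localmode & MODE_TRANSIT)
		m->owner = OWNER_POOL;	/* transition running: drop into the pool */
	else
		m->owner = owner_for(localmode);
	m->cid_transit = false;
}

/* CPU0: fixup takes the rq lock, so it runs after CPU1's sched_out(). */
static void fixup(struct model *m)
{
	assert(!m->rq_locked);	/* a real lock would block here instead */
	m->rq_locked = true;
	if (m->owner != OWNER_POOL)
		m->owner = owner_for(m->mode);
	m->rq_locked = false;
	m->mode &= ~MODE_TRANSIT;
}

/* Replays the diagram step by step and returns the final state. */
struct model replay_scenario(void)
{
	struct model m = { .mode = 0, .rq_locked = false,
			   .cid_transit = true, .owner = OWNER_TASK };
	unsigned int localmode;

	m.rq_locked = true;			/* CPU1: lock(rq) */
	localmode = snapshot_mode(&m);		/* CPU1: observes TRANSIT=0 */
	m.mode = MODE_PER_CPU | MODE_TRANSIT;	/* CPU0: sets TRANSIT, switches mode */
	transfer_cid(&m, localmode);		/* CPU1: transfer per stale snapshot */
	m.rq_locked = false;			/* CPU1: schedule is complete */
	fixup(&m);				/* CPU0: sees CPU1's result */
	return m;
}
```

In this model sched_out() hands the CID to the task based on the stale
snapshot, and fixup() then moves it to the CPU because it cannot run before
the runqueue lock is released. The stale snapshot is therefore harmless as
long as the fixup is ordered after it by the same lock.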

Thanks,

        tglx