Message-ID: <87343nkrix.ffs@tglx>
Date: Fri, 30 Jan 2026 17:13:26 +0100
From: Thomas Gleixner <tglx@...nel.org>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, LKML
<linux-kernel@...r.kernel.org>
Cc: Ihor Solodrai <ihor.solodrai@...ux.dev>, Shrikanth Hegde
<sshegde@...ux.ibm.com>, Peter Zijlstra <peterz@...radead.org>, Michael
Jeanson <mjeanson@...icios.com>
Subject: Re: [patch 4/4] sched/mmcid: Optimize transitional CIDs when
scheduling out
On Fri, Jan 30 2026 at 10:50, Mathieu Desnoyers wrote:
> On 2026-01-29 16:20, Thomas Gleixner wrote:
>> During the investigation of the various transition mode issues
>> instrumentation revealed that the amount of bitmap operations can be
>> significantly reduced when a task with a transitional CID schedules out
>> after the fixup function completed and disabled the transition mode.
>>
>> At that point the mode is stable and therefore it is not required to drop
>> the transitional CID back into the pool. As the fixup is complete the
>> potential exhaustion of the CID pool is no longer possible, so the CID can
>> be transferred to the scheduling out task or to the CPU depending on the
>> current ownership mode. This is now possible because mm_cid::mode contains
>> both the ownership state and the transition bit so the racy snapshot is
>> valid under all circumstances because a subsequent modification of the
>> mode is serialized by the corresponding runqueue lock.
>
> AFAIU the mc->mode updates are serialized by the mm->mm_cid.lock
> and not the runqueue locks. What am I missing ?

Actually the mode updates are serialized by the mutex. They happen under
the lock as well, but the lock is not a serialization requirement for
mode changes.

What I meant to write with a tired brain is:

The racy snapshot is valid under the runqueue lock even when there is a
concurrent mode update going on, because the subsequent fixup function
is serialized by the runqueue lock. That means in the following
scenario:

   CPU0                         CPU1
   clear TRANSIT
   ....
                                lock(rq)
                                sched_out()
                                  CID has TRANSIT set
                                  ...
                                  // observes TRANSIT=0
                                  localmode = READ_ONCE(...mode);
   // sets TRANSIT
   switch mode
                                  transfer CID according to localmode
   fixup()
     lock(rq)   <- Blocked until the schedule on CPU1 is complete

So both sched_out() and fixup() observe consistent state and everything
just works.
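
For illustration only, here is a small user-space model of that
interleaving. It is a sketch under stated assumptions, not the kernel
code: the bit values MODE_PER_CPU and MODE_TRANSIT, struct model,
transfer_cid(), fixup() and run_scenario() are all hypothetical names, a
pthread mutex stands in for the runqueue lock, and a relaxed atomic load
stands in for READ_ONCE(). The calling thread plays CPU1 and also issues
CPU0's mode switch, so the ordering from the scenario is forced.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical encoding, not the kernel's: ownership and transition
 * bits share one word, as mm_cid::mode is described to do. */
#define MODE_PER_CPU 0x1u
#define MODE_TRANSIT 0x2u

enum owner { OWNER_POOL, OWNER_TASK, OWNER_CPU };

struct model {
	pthread_mutex_t rq_lock;  /* stands in for the runqueue lock */
	atomic_uint mode;         /* stands in for mm_cid::mode */
	enum owner cid_owner;     /* protected by rq_lock */
	bool cid_transit;         /* protected by rq_lock */
};

/* Second half of the modeled sched-out path: act on the earlier snapshot. */
static void transfer_cid(struct model *m, unsigned int localmode)
{
	if (!m->cid_transit)
		return;
	if (localmode & MODE_TRANSIT)
		m->cid_owner = OWNER_POOL;  /* transition running: drop the CID */
	else
		m->cid_owner = (localmode & MODE_PER_CPU) ? OWNER_CPU : OWNER_TASK;
	m->cid_transit = false;
}

/* Modeled fixup after a mode switch: takes rq_lock before touching the CID. */
static void *fixup(void *arg)
{
	struct model *m = arg;
	unsigned int mode;

	pthread_mutex_lock(&m->rq_lock);  /* blocks while sched-out holds it */
	mode = atomic_load(&m->mode);
	if ((mode & MODE_PER_CPU) && m->cid_owner == OWNER_TASK)
		m->cid_owner = OWNER_CPU;
	else if (!(mode & MODE_PER_CPU) && m->cid_owner == OWNER_CPU)
		m->cid_owner = OWNER_TASK;
	pthread_mutex_unlock(&m->rq_lock);

	atomic_fetch_and(&m->mode, ~MODE_TRANSIT);  /* fixup complete */
	return NULL;
}

/* Replays the interleaving from the scenario and returns the final owner. */
enum owner run_scenario(void)
{
	/* mode is zero-initialized: per-task ownership, TRANSIT cleared */
	struct model m = { .cid_owner = OWNER_TASK, .cid_transit = true };
	unsigned int localmode;
	pthread_t cpu0;

	pthread_mutex_init(&m.rq_lock, NULL);

	/* CPU1: lock(rq), sched_out(), snapshot observes TRANSIT=0 */
	pthread_mutex_lock(&m.rq_lock);
	localmode = atomic_load_explicit(&m.mode, memory_order_relaxed);

	/* CPU0: switch mode (sets TRANSIT), then fixup() blocks on rq_lock */
	atomic_store(&m.mode, MODE_PER_CPU | MODE_TRANSIT);
	pthread_create(&cpu0, NULL, fixup, &m);

	/* CPU1: transfer the CID according to the stale snapshot */
	transfer_cid(&m, localmode);
	pthread_mutex_unlock(&m.rq_lock);

	pthread_join(cpu0, NULL);
	pthread_mutex_destroy(&m.rq_lock);
	return m.cid_owner;
}
```

In this model the stale snapshot hands the CID to the task, and the
fixup, which cannot run until the lock is dropped, then moves it to the
CPU to match the new mode, so run_scenario() returns OWNER_CPU.
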
Thanks,
tglx