Message-ID: <87ed4pq953.ffs@tglx>
Date: Wed, 09 Oct 2024 23:06:00 +0200
From: Thomas Gleixner <tglx@...utronix.de>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Peter Zijlstra <peterz@...radead.org>, Sebastian Andrzej Siewior
 <bigeasy@...utronix.de>, Ankur Arora <ankur.a.arora@...cle.com>,
 mingo@...nel.org, linux-kernel@...r.kernel.org, juri.lelli@...hat.com,
 vincent.guittot@...aro.org, dietmar.eggemann@....com, bsegall@...gle.com,
 mgorman@...e.de, vschneid@...hat.com, efault@....de
Subject: Re: [PATCH 0/5] sched: Lazy preemption muck

On Wed, Oct 09 2024 at 16:43, Steven Rostedt wrote:
> On Wed, 09 Oct 2024 22:13:51 +0200
> Thomas Gleixner <tglx@...utronix.de> wrote:
>> The reason why this works is that preempt_enable() actually has a
>> meaning while it does not with NONE.
>
> Looking around, I see the pattern of checking need_resched() from within a
> loop where a spinlock is held. Then after the break of the loop and release
> of the spinlock, cond_resched() is checked, and the loop is entered again.
>
> Thus, I guess this is the reason you are saying that it should just check
> NEED_RESCHED and not the LAZY variant. Because if we remove that
> cond_resched() then it would just re-enter the loop again with the LAZY
> being set.
>
> Hmm, but then again...
>
> Perhaps these cond_resched() calls are proper? That is, the
> need_resched()/cond_resched() is not something done only for
> PREEMPT_NONE, but for preempt/voluntary kernels too. Maybe these
> cond_resched() calls should stay? If we spin in the loop for one more
> tick, that actually changes the behavior of PREEMPT_NONE and
> PREEMPT_VOLUNTARY, as the need_resched()/cond_resched() helps with
> latency. If we just wait for the next tick, these loops (and there are
> a lot of them) will all now run for one tick longer than they do with
> PREEMPT_NONE or PREEMPT_VOLUNTARY today.

You are looking at it from the wrong perspective. You are trying to
preserve the status quo. I know that's the path of least effort, but it's
the fundamentally wrong approach.

    spin_lock(L);
    while ($cond) {
          do_stuff();
          if (need_resched()) {
                spin_unlock(L);
                resched();
                spin_lock(L);
          }
    }
    spin_unlock(L);

is the bogus pattern which was introduced to deal with the NONE
shortcomings. That's what we want to get rid of and not proliferate.
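
The variant you describe above - break out of the loop when need_resched()
is set, drop the lock, cond_resched(), then re-enter - is the same pattern
in a slightly different shape. Roughly (same placeholder names as above):

    again:
          spin_lock(L);
          while ($cond) {
                do_stuff();
                if (need_resched()) {
                      spin_unlock(L);
                      cond_resched();
                      goto again;
                }
          }
          spin_unlock(L);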

For the transition phase we obviously need to do:

    while ($cond) {
          spin_lock(L);
          do_stuff();
          spin_unlock(L);
          cond_resched();
    }

And once all the problems with LAZY are sorted out, this cond_resched()
line just goes away and the loop looks like this:

    while ($cond) {
          spin_lock(L);
          do_stuff();
          spin_unlock(L);
    }
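
The reason the cond_resched() can go away is the point above about
preempt_enable() having a meaning: spin_unlock(L) ends up in
preempt_enable(), and on a preemptible kernel that is a preemption point
once NEED_RESCHED is set. Roughly (paraphrasing include/linux/preempt.h
for preemptible kernels; details elided):

    #define preempt_enable() \
    do { \
          barrier(); \
          if (unlikely(preempt_count_dec_and_test())) \
                __preempt_schedule(); \
    } while (0)

With NONE, preempt_enable() is not a scheduling point at all, which is why
the cond_resched() hack was needed in the first place.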

There is absolutely no argument that the spinlock-held section needs to
span the loop. We fixed up several of these constructs over the years and
none of them caused a performance regression. Quite the contrary: quite a
few of them improved performance, because dropping the lock lets other
CPUs interleave.
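
To make "interleave" concrete: while one CPU runs the loop above, another
CPU that needs the same lock now waits for at most one do_stuff() instead
of the whole loop (do_other_stuff() below is just a placeholder):

    /* some other CPU, concurrently */
    spin_lock(L);
    do_other_stuff();
    spin_unlock(L);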

Seriously, this crap-preserving mindset has to stop. If we go new ways,
then we go all the way.

There is a cost involved in cleaning the existing crap up, but that cost
is well worth it, because anything else just adds to the ever-increasing
accumulation of technical debt. We have enough of that already.

Thanks,

        tglx
