linux-kernel - Re: [PATCH v2 7/9] sched: define TIF_ALLOW

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <878r90lyai.fsf@oracle.com>
Date:   Wed, 20 Sep 2023 07:22:29 -0700
From:   Ankur Arora <ankur.a.arora@...cle.com>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ankur Arora <ankur.a.arora@...cle.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org, x86@...nel.org,
        akpm@...ux-foundation.org, luto@...nel.org, bp@...en8.de,
        dave.hansen@...ux.intel.com, hpa@...or.com, mingo@...hat.com,
        juri.lelli@...hat.com, vincent.guittot@...aro.org,
        willy@...radead.org, mgorman@...e.de, rostedt@...dmis.org,
        jon.grimm@....com, bharata@....com, raghavendra.kt@....com,
        boris.ostrovsky@...cle.com, konrad.wilk@...cle.com,
        jgross@...e.com, andrew.cooper3@...rix.com
Subject: Re: [PATCH v2 7/9] sched: define TIF_ALLOW_RESCHED


Thomas Gleixner <tglx@...utronix.de> writes:

> So the decision matrix would be:
>
>                 Ret2user        Ret2kernel      PreemptCnt=0
>
> NEED_RESCHED       Y                Y               Y
> LAZY_RESCHED       Y                N               N
>
> That is completely independent of the preemption model and the
> differentiation of the preemption models happens solely at the scheduler
> level:

This is relatively minor, but do we need two flags? Seems to me we
can get to the same decision matrix by letting the scheduler fold
into the preempt-count based on current preemption model.

> PREEMPT_NONE sets only LAZY_RESCHED unless it needs to enforce the time
> slice where it sets NEED_RESCHED.

PREEMPT_NONE sets up TIF_NEED_RESCHED. For the time-slice expiry case,
also fold into preempt-count.

> PREEMPT_VOLUNTARY extends the NONE model so that the wakeup of RT class
> tasks or sporadic event tasks sets NEED_RESCHED too.

PREEMPT_NONE sets up TIF_NEED_RESCHED and also folds it for the
RT/sporadic tasks.

> PREEMPT_FULL always sets NEED_RESCHED like today.

Always fold the TIF_NEED_RESCHED into the preempt-count.

> We should be able merge the PREEMPT_NONE/VOLUNTARY behaviour so that we
> only end up with two variants or even subsume PREEMPT_FULL into that
> model because that's what is closer to the RT LAZY preempt behaviour,
> which has two goals:
>
>       1) Make low latency guarantees for RT workloads
>
>       2) Preserve the throughput for non-RT workloads
>
> But in any case this decision happens solely in the core scheduler code
> and nothing outside of it needs to be changed.
>
> So we not only get rid of the cond/might_resched() muck, we also get rid
> of the static_call/static_key machinery which drives PREEMPT_DYNAMIC.
> The only place which still needs that runtime tweaking is the scheduler
> itself.

True. The dynamic preemption could just become a scheduler tunable.

> Though it just occured to me that there are dragons lurking:
>
> arch/alpha/Kconfig:     select ARCH_NO_PREEMPT
> arch/hexagon/Kconfig:   select ARCH_NO_PREEMPT
> arch/m68k/Kconfig:      select ARCH_NO_PREEMPT if !COLDFIRE
> arch/um/Kconfig:        select ARCH_NO_PREEMPT
>
> So we have four architectures which refuse to enable preemption points,
> i.e. the only model they allow is NONE and they rely on cond_resched()
> for breaking large computations.
>
> But they support PREEMPT_COUNT, so we might get away with a reduced
> preemption point coverage:
>
>                 Ret2user        Ret2kernel      PreemptCnt=0
>
> NEED_RESCHED       Y                N               Y
> LAZY_RESCHED       Y                N               N

So from the discussion in the other thread, for the ARCH_NO_PREEMPT
configs that don't support preemption, we probably need a fourth
preemption model, say PREEMPT_UNSAFE.

These could use only the Ret2user preemption points and just fallback
to the !PREEMPT_COUNT primitives.

Thanks

--
ankur