lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHk-=wi0bXpgULVVLc2AdJcta-fvQP7yyFQ_JtaoHUiPrqf--A@mail.gmail.com>
Date:   Sat, 9 Sep 2023 21:35:54 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Ankur Arora <ankur.a.arora@...cle.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org, x86@...nel.org,
        akpm@...ux-foundation.org, luto@...nel.org, bp@...en8.de,
        dave.hansen@...ux.intel.com, hpa@...or.com, mingo@...hat.com,
        juri.lelli@...hat.com, vincent.guittot@...aro.org,
        willy@...radead.org, mgorman@...e.de, rostedt@...dmis.org,
        tglx@...utronix.de, jon.grimm@....com, bharata@....com,
        raghavendra.kt@....com, boris.ostrovsky@...cle.com,
        konrad.wilk@...cle.com
Subject: Re: [PATCH v2 7/9] sched: define TIF_ALLOW_RESCHED

On Sat, 9 Sept 2023 at 20:49, Ankur Arora <ankur.a.arora@...cle.com> wrote:
>
> I think we can keep these checks, but with this fixed definition of
> resched_allowed(). This might be better:
>
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2260,7 +2260,8 @@ static inline void disallow_resched(void)
>
>  static __always_inline bool resched_allowed(void)
>  {
> -       return unlikely(test_tsk_thread_flag(current, TIF_RESCHED_ALLOW));
> +       return unlikely(!preempt_count() &&
> +                        test_tsk_thread_flag(current, TIF_RESCHED_ALLOW));
>  }

I'm not convinced (at all) that the preempt count is the right thing.

It works for interrupts, yes, because interrupts will increment the
preempt count even on non-preempt kernels (since the preempt count is
also the interrupt context level).

But what about any synchronous trap handling?

In other words, just something like a page fault? A page fault doesn't
increment the preemption count (and in fact many page faults _will_
obviously re-schedule as part of waiting for IO).

A page fault can *itself* say "feel free to preempt me", and that's one thing.

But a page fault can also *interupt* something that said "feel free to
preempt me", and that's a completely *different* thing.

So I feel like the "tsk_thread_flag" was sadly completely the wrong
place to add this bit to, and the wrong place to test it in. What we
really want is "current kernel entry context".

So the right thing to do would basically be to put it in the stack
frame at kernel entry - whether that kernel entry was a system call
(which is doing some big copy that should be preemptible without us
having to add "cond_resched()" in places), or is a page fault (which
will also do things like big page clearings for hugepages)

And we don't have that, do we?

We have "on_thread_stack()", which checks for "are we on the system
call stack". But that doesn't work for page faults.

PeterZ - I feel like I might be either very confused, or missing
something You probably go "Duh, Linus, you're off on one of your crazy
tangents, and none of this is relevant because..."

                 Linus

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ