Message-ID: <87h6o01w1a.fsf@oracle.com>
Date: Mon, 11 Sep 2023 10:04:17 -0700
From: Ankur Arora <ankur.a.arora@...cle.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Ankur Arora <ankur.a.arora@...cle.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org, x86@...nel.org,
akpm@...ux-foundation.org, luto@...nel.org, bp@...en8.de,
dave.hansen@...ux.intel.com, hpa@...or.com, mingo@...hat.com,
juri.lelli@...hat.com, vincent.guittot@...aro.org,
willy@...radead.org, mgorman@...e.de, rostedt@...dmis.org,
tglx@...utronix.de, jon.grimm@....com, bharata@....com,
raghavendra.kt@....com, boris.ostrovsky@...cle.com,
konrad.wilk@...cle.com, jgross@...e.com, andrew.cooper3@...rix.com
Subject: Re: [PATCH v2 7/9] sched: define TIF_ALLOW_RESCHED
Peter Zijlstra <peterz@...radead.org> writes:
> On Sun, Sep 10, 2023 at 11:32:32AM -0700, Linus Torvalds wrote:
>
>> I was hoping that we'd have some generic way to deal with this where
>> we could just say "this thing is reschedulable", and get rid of - or
>> at least not increasingly add to - the cond_resched() mess.
>
> Isn't that called PREEMPT=y ? That tracks precisely all the constraints
> required to know when/if we can preempt.
>
> The whole voluntary preempt model is basically the traditional
> co-operative preemption model and that fully relies on manual yields.
Yeah, but as Linus says, this means a lot of code is just full of
cond_resched(). For instance, a loop in process_huge_page() uses
this pattern:
   for (...) {
       cond_resched();
       clear_page(i);
       cond_resched();
       clear_page(j);
   }
> The problem with the REP prefix (and Xen hypercalls) is that
> they're long running instructions and it becomes fundamentally
> impossible to put a cond_resched() in.
>
>> Yes. I'm starting to think that the only sane solution is to
>> limit cases that can do this a lot, and the "instruction pointer
>> region" approach would certainly work.
>
> From a code locality / I-cache POV, I think a sorted list of
> (non overlapping) ranges might be best.
Yeah, agreed. There are a few problems with doing that though.
I was thinking of using a check of this kind to schedule out when
it is executing in this "reschedulable" section:
   !preempt_count() && in_resched_function(regs->rip);
For preemption=full, this should mostly work.
For preemption=voluntary, though, this'll only work with out-of-line
locks, not if the lock is inlined.
(Both should have problems with __this_cpu_* and the like, but
maybe we can handwave that away with sparse/objtool etc.)
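To make the range check concrete, here is a rough userspace sketch of
what an in_resched_function() lookup over a sorted, non-overlapping
range table could look like. Everything in it is an assumption for
illustration: struct resched_range, the resched_ranges[] table and its
made-up addresses, and the can_resched_here() wrapper (which takes the
preempt count as a parameter instead of calling preempt_count()). It
is not code from the patch series.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one "reschedulable" text range, half-open [start, end). */
struct resched_range {
	uintptr_t start;
	uintptr_t end;
};

/*
 * Hypothetical table. It must be sorted by start address and the
 * ranges must not overlap. The addresses are made up.
 */
static const struct resched_range resched_ranges[] = {
	{ 0x1000, 0x1040 },
	{ 0x2000, 0x2100 },
	{ 0x8000, 0x8010 },
};

#define NR_RESCHED_RANGES \
	(sizeof(resched_ranges) / sizeof(resched_ranges[0]))

/* Binary search: does ip fall inside any reschedulable range? */
static bool in_resched_function(uintptr_t ip)
{
	size_t lo = 0, hi = NR_RESCHED_RANGES;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (ip < resched_ranges[mid].start)
			hi = mid;		/* ip is below this range */
		else if (ip >= resched_ranges[mid].end)
			lo = mid + 1;		/* ip is above this range */
		else
			return true;		/* start <= ip < end */
	}
	return false;
}

/*
 * Stand-in for the check above: only reschedule when nothing holds a
 * preempt count and the interrupted ip is in a marked range.
 */
static bool can_resched_here(unsigned int preempt_count, uintptr_t ip)
{
	return !preempt_count && in_resched_function(ip);
}
```

With a small table the lookup touches only a few entries, which is the
locality argument for keeping the ranges in one sorted array.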
How expensive would it be to always have PREEMPT_COUNT=y?
--
ankur