lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJZ5v0jz1aZdNHPH5x=VoDjfdMnD_+iE2CuYN5=nk=CMxSkXpA@mail.gmail.com>
Date:   Mon, 31 Jul 2023 19:27:27 +0200
From:   "Rafael J. Wysocki" <rafael@...nel.org>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     anna-maria@...utronix.de, tglx@...utronix.de, frederic@...nel.org,
        gautham.shenoy@....com, linux-kernel@...r.kernel.org,
        daniel.lezcano@...aro.org, linux-pm@...r.kernel.org,
        mingo@...hat.com, juri.lelli@...hat.com,
        vincent.guittot@...aro.org, dietmar.eggemann@....com,
        rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
        bristot@...hat.com, vschneid@...hat.com
Subject: Re: [RFC][PATCH 1/3] cpuidle: Inject tick boundary state

On Mon, Jul 31, 2023 at 6:55 PM Rafael J. Wysocki <rafael@...nel.org> wrote:
>
> On Mon, Jul 31, 2023 at 1:39 PM Peter Zijlstra <peterz@...radead.org> wrote:
> >
> > On Mon, Jul 31, 2023 at 12:35:20PM +0200, Rafael J. Wysocki wrote:
> >
> > > > So I agree with 1.
> > > >
> > > > I do not agree with 2. Disabling the tick is costly, doubly so with the
> > > > timer-pull thing, but even today. Simply disabling it because we picked
> > > > the deepest idle state, irrespective of the expected duration is wrong
> > > > as it will incur this significant cost.
> > > >
> > > > With 3 there is the question of how we get the expected sleep duration;
> > > > this is especially important with timer-pull, where we have this
> > > > chicken-and-egg thing.
> > > >
> > > > Notably: tick_nohz_get_sleep_length() wants to know if the tick gets
> > > > disabled
> > >
> > > Well, it shouldn't.  Or at least it didn't before.
> >
> > Correct, this is new in the timer-pull thing.
> >
> > > It is expected to produce two values, one with the tick stopped (this
> > > is the return value of the function) and the other with the tick
> > > ticking (this is the one written under the address passed as the arg).
> > > This cannot depend on whether or not the tick will be stopped.  Both
> > > are good to know.
> > >
> > > Now, I understand that getting these two values may be costly, so
> > > there is an incentive to avoid calling it, but then the governor needs
> > > to figure this out from its crystal ball and so care needs to be taken
> > > to limit the possible damage in case the crystal ball is not right.
> >
> > If we can get the governor to decide the tick state up-front we can
> > avoid a lot of the expensive parts.
>
> I claim that in the vast majority of cases this is the same as
> deciding the state.
>
> The case when it is not is when the target residency of the deepest
> idle state is below the tick period length and the governor is about
> to select that state.  According to the data I've seen so far this is
> a tiny fraction of all the cases.
>
> > > > and cpuilde wants to use tick_nohz_get_sleep_length() to
> > > > determine if to disable the tick. This cycle needs to be broken for
> > > > timer-pull.
> > > >
> > > > Hence my proposal to introduce the extra tick state, that allows fixing
> > > > both 2 and 3.
> > >
> > > I'm not sure about 3 TBH.
> > >
> > > Say there are 2 idle states, one shallow (say its target residency is
> > > 10 us) and one deep (say its target residency is T = 2 * TICK_NSEC).
> >
> > This is the easy case and that actually 'works' today.
>
> But this is my case 3 which you said you didn't agree with.  I don't
> see why it needs to be fixed.
>
> > The interesting case is where your deepest state has a target residency that
> > is below the tick (because for HZ=100, we have a 10ms tick and pretty
> > much all idle states are below that).
>
> What about HZ=1000, though?
>
> > In that case you cannot tell the difference between I'm good to use this
> > state and I'm good to disable the tick and still use this state.
>
> No, you don't, but is it really worth the fuss?
>
> The state is high-latency anyway and tick_nohz_get_sleep_length()
> needs to be called anyway in that case in order to determine if a
> shallower state wouldn't be better.  At this point the statistics have
> already told the governor otherwise and a misprediction would be a
> double loss.
>
> So yes, you can gain performance by avoiding to call
> tick_nohz_get_sleep_length(), but then you can also lose it by
> selecting a state that is too deep (and that can be determined exactly
> with respect to timers).

BTW, note that teo records timers as "hits", because it has an idea
about when the next timer should occur and that's because it calls
tick_nohz_get_sleep_length().

If it doesn't call that function, it will not be able to tell the
difference between a non-tick timer and any other wakeup, so the
non-tick timer wakeups will then be recorded as "intercepts" which
will skew it towards shallow states over time.  That's one of the
reasons why I would prefer to only avoid calling
tick_nohz_get_sleep_length() when the candidate idle state is really
shallow.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ