linux-kernel - Re: [RFC PATCH 1/1] sched/pelt: Introduce PELT multiplier

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <YwzL1eJUIReAEv0l@google.com>
Date:   Mon, 29 Aug 2022 14:23:17 +0000
From:   Quentin Perret <qperret@...gle.com>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Ingo Molnar <mingo@...nel.org>,
        Morten Rasmussen <morten.rasmussen@....com>,
        Vincent Donnefort <vdonnefort@...gle.com>,
        Patrick Bellasi <patrick.bellasi@...bug.net>,
        Abhijeet Dharmapurikar <adharmap@...cinc.com>,
        Jian-Min <Jian-Min.Liu@...iatek.com>,
        Qais Yousef <qais.yousef@....com>, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 1/1] sched/pelt: Introduce PELT multiplier

On Monday 29 Aug 2022 at 12:13:26 (+0200), Vincent Guittot wrote:
> On Mon, 29 Aug 2022 at 12:03, Peter Zijlstra <peterz@...radead.org> wrote:
> >
> > On Mon, Aug 29, 2022 at 10:08:13AM +0200, Peter Zijlstra wrote:
> > > On Mon, Aug 29, 2022 at 07:54:50AM +0200, Dietmar Eggemann wrote:
> > > > From: Vincent Donnefort <vincent.donnefort@....com>
> > > >
> > > > The new sysctl sched_pelt_multiplier allows a user to set a clock
> > > > multiplier to x2 or x4 (x1 being the default). This clock multiplier
> > > > artificially speeds up PELT ramp up/down similarly to use a faster
> > > > half-life than the default 32ms.
> > > >
> > > >   - x1: 32ms half-life
> > > >   - x2: 16ms half-life
> > > >   - x4: 8ms  half-life
> > > >
> > > > Internally, a new clock is created: rq->clock_task_mult. It sits in the
> > > > clock hierarchy between rq->clock_task and rq->clock_pelt.
> > > >
> > > > Signed-off-by: Vincent Donnefort <vincent.donnefort@....com>
> > > > Signed-off-by: Dietmar Eggemann <dietmar.eggemann@....com>
> > >
> > > > +extern unsigned int sched_pelt_lshift;
> > > > +
> > > > +/*
> > > > + * absolute time   |1      |2      |3      |4      |5      |6      |
> > > > + * @ mult = 1      --------****************--------****************-
> > > > + * @ mult = 2      --------********----------------********---------
> > > > + * @ mult = 4      --------****--------------------****-------------
> > > > + * clock task mult
> > > > + * @ mult = 2      |   |   |2  |3  |   |   |   |   |5  |6  |   |   |
> > > > + * @ mult = 4      | | | | |2|3| | | | | | | | | | |5|6| | | | | | |
> > > > + *
> > > > + */
> > > > +static inline void update_rq_clock_task_mult(struct rq *rq, s64 delta)
> > > > +{
> > > > +   delta <<= READ_ONCE(sched_pelt_lshift);
> > > > +
> > > > +   rq->clock_task_mult += delta;
> > > > +
> > > > +   update_rq_clock_pelt(rq, delta);
> > > > +}
> > >
> > > Hurmph... I'd almost go write you something like
> > > static_call()/static_branch() but for immediates.
> > >
> > > That said; given there's only like 3 options, perhaps a few
> > > static_branch() instances work just fine ?
> >
> > Also, I'm not at all sure about exposing that as an official sysctl.
> 
> Me too, I would even make it a boot time parameter so we can remove
> the new clock_task_mult clock and left shift clock_taslk or the delta
> before passing it to clock_pelt

I'll let folks in CC comment about their use-case in more details, but
there's definitely been an interest in tuning this thing at run-time
FWIW. Typically a larger half-life will be fine with predictable
workloads with little inputs from users (e.g. fullscreen video playback)
while a lower one can be preferred in highly interactive cases (games,
...). The transient state is fun to reason about, but it really
shouldn't be too common AFAIK.

With that said I'd quite like to see numbers to back that claim.
Measuring power while running a video (for instance) with various HL
values should help. And similarly it shouldn't be too hard to get
performance numbers.