Date:   Mon, 29 Aug 2022 14:23:17 +0000
From:   Quentin Perret <qperret@...gle.com>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Ingo Molnar <mingo@...nel.org>,
        Morten Rasmussen <morten.rasmussen@....com>,
        Vincent Donnefort <vdonnefort@...gle.com>,
        Patrick Bellasi <patrick.bellasi@...bug.net>,
        Abhijeet Dharmapurikar <adharmap@...cinc.com>,
        Jian-Min <Jian-Min.Liu@...iatek.com>,
        Qais Yousef <qais.yousef@....com>, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 1/1] sched/pelt: Introduce PELT multiplier

On Monday 29 Aug 2022 at 12:13:26 (+0200), Vincent Guittot wrote:
> On Mon, 29 Aug 2022 at 12:03, Peter Zijlstra <peterz@...radead.org> wrote:
> >
> > On Mon, Aug 29, 2022 at 10:08:13AM +0200, Peter Zijlstra wrote:
> > > On Mon, Aug 29, 2022 at 07:54:50AM +0200, Dietmar Eggemann wrote:
> > > > From: Vincent Donnefort <vincent.donnefort@....com>
> > > >
> > > > The new sysctl sched_pelt_multiplier allows a user to set a clock
> > > > multiplier to x2 or x4 (x1 being the default). This clock multiplier
> > > > artificially speeds up PELT ramp up/down, similar to using a shorter
> > > > half-life than the default 32ms.
> > > >
> > > >   - x1: 32ms half-life
> > > >   - x2: 16ms half-life
> > > >   - x4: 8ms  half-life
> > > >
> > > > Internally, a new clock is created: rq->clock_task_mult. It sits in the
> > > > clock hierarchy between rq->clock_task and rq->clock_pelt.
> > > >
> > > > Signed-off-by: Vincent Donnefort <vincent.donnefort@....com>
> > > > Signed-off-by: Dietmar Eggemann <dietmar.eggemann@....com>
> > >
> > > > +extern unsigned int sched_pelt_lshift;
> > > > +
> > > > +/*
> > > > + * absolute time   |1      |2      |3      |4      |5      |6      |
> > > > + * @ mult = 1      --------****************--------****************-
> > > > + * @ mult = 2      --------********----------------********---------
> > > > + * @ mult = 4      --------****--------------------****-------------
> > > > + * clock task mult
> > > > + * @ mult = 2      |   |   |2  |3  |   |   |   |   |5  |6  |   |   |
> > > > + * @ mult = 4      | | | | |2|3| | | | | | | | | | |5|6| | | | | | |
> > > > + *
> > > > + */
> > > > +static inline void update_rq_clock_task_mult(struct rq *rq, s64 delta)
> > > > +{
> > > > +   delta <<= READ_ONCE(sched_pelt_lshift);
> > > > +
> > > > +   rq->clock_task_mult += delta;
> > > > +
> > > > +   update_rq_clock_pelt(rq, delta);
> > > > +}
> > >
> > > Hurmph... I'd almost go write you something like
> > > static_call()/static_branch() but for immediates.
> > >
> > > That said; given there's only like 3 options, perhaps a few
> > > static_branch() instances work just fine ?
> >
> > Also, I'm not at all sure about exposing that as an official sysctl.
> 
> Me too, I would even make it a boot time parameter so we can remove
> the new clock_task_mult clock and left shift clock_task or the delta
> before passing it to clock_pelt

I'll let folks in CC comment about their use-case in more detail, but
there's definitely been an interest in tuning this thing at run-time
FWIW. Typically a larger half-life will be fine with predictable
workloads with little user input (e.g. fullscreen video playback),
while a smaller one can be preferred in highly interactive cases (games,
...). The transient state is fun to reason about, but it really
shouldn't be too common AFAIK.

With that said I'd quite like to see numbers to back that claim.
Measuring power while running a video (for instance) with various HL
values should help. And similarly it shouldn't be too hard to get
performance numbers.
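
For anyone wanting a feel for what the shift does to the half-life before
measuring anything, here is a rough userspace sketch. It is not kernel code
and not part of the patch: decay_after() and PELT_Y are made up for
illustration, one PELT period is approximated as 1ms (the real period is
1024us), and floating point stands in for the kernel's fixed-point math. Only
the left shift of the delta is taken from the quoted
update_rq_clock_task_mult().

```c
#include <stdint.h>

/*
 * Illustrative decay factor per period, chosen so that PELT_Y^32 ~= 0.5,
 * i.e. the default 32ms half-life. Hypothetical name, not from the patch.
 */
#define PELT_Y 0.97857206

/* Same scaling step as the quoted patch: the delta is left-shifted. */
static inline int64_t scale_delta(int64_t delta, unsigned int lshift)
{
	return delta << lshift;
}

/*
 * Hypothetical helper: decay 'util' over 'wall_ms' of wall-clock time,
 * assuming one period per millisecond of (scaled) PELT clock.
 */
static double decay_after(double util, int64_t wall_ms, unsigned int lshift)
{
	int64_t periods = scale_delta(wall_ms, lshift);
	int64_t i;

	for (i = 0; i < periods; i++)
		util *= PELT_Y;

	return util;
}
```

Under those assumptions, decay_after(1024.0, 32, 0), decay_after(1024.0, 16, 1)
and decay_after(1024.0, 8, 2) all land at roughly 512, which matches the
32ms/16ms/8ms table in the changelog.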
