Message-ID: <20220601151831.abbo3fxuua5lqj23@wubuntu>
Date:   Wed, 1 Jun 2022 16:18:31 +0100
From:   Qais Yousef <qais.yousef@....com>
To:     Paul Bone <pbone@...illa.com>
Cc:     linux-kernel@...r.kernel.org
Subject: Re: Scheduling for heterogeneous computers

On 05/27/22 15:45, Paul Bone wrote:
> On Wed, May 25, 2022 at 04:29:56PM +0100, Qais Yousef wrote:
> > Hi Paul
> > 
> > On 05/24/22 15:23, Paul Bone wrote:
> > > Hi Qais,
> > > 
> > > That's excellent.
> > > 
> > > I'll definitely check out those links.  This could be very interesting for
> > > people using firefox on a phone/tablet, where we can run background tasks with
> > > a lower UCLAMP_MAX
> > 
> > If you're running on Android, you might find that you won't have permission to
> > use uclamp directly. Android restricts access and requires you to use higher
> > level APIs sometimes.
> > 
> > And I'm not sure if they have API to allow you to do what you want. I've seen
> > they have the concept of creating Foreground and Background jobs in one of
> > their Google IO presentations. But not sure if this will be tied to uclamp_max.
> > It might give you similar results still though regardless of the underlying
> > mechanism.
> 
> We want to support both desktop and android.  I have been assuming there's
> an API for android already, I vaguely remember hearing about it before.  We
> might already be using it (at least for processes but not yet for individual
> threads).
> 
> I was searching online now, but not for very long, and didn't find something
> like this, so maybe Android doesn't expose it, or at least not in one of the
> APIs they encourage you to use.

I am not an Android developer, so don't take this as guidance :-)

But what I've seen that seemed related is this:

	* https://developer.android.com/guide/background
	* https://www.youtube.com/watch?v=IqnCqHyu1E4

I don't know the inner plumbing of these APIs; this is just some relevant stuff
I've come across. I hope these jobs get attached to the background cgroup and
benefit from uclamp indirectly that way.

> 
> What I'd really like is an API where I can choose one of:
> 
>  * This task is user-interactive, as-quick-as-possible please.
>  * This task is not user-interactive, but does have a deadline.

What's the difference between the two?

Is as-quick-as-possible about wake-up latency or DVFS latency?

If the former, then we had several discussions about that at OSPM and LPC. The
latest proposal, which tries to help tag tasks that care about wake-up latency,
is here:

	https://lore.kernel.org/lkml/86066641739c4897b0001153e598a261@AcuMS.aculab.com/

If the latter, then uclamp_min should help you tell the kernel what performance
you need to get your work done in time. You can dynamically adjust it, or set
it once after a short discovery period assuming your workload is constant for
the duration of its lifetime. The goal is to keep it as small as possible, to
avoid wasting power while still meeting the deadline.
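
As a rough illustration of telling the kernel about uclamp_min from userspace,
here is a minimal sketch that goes through the sched_setattr() syscall with
ctypes. The names SchedAttr, make_uclamp_attr and set_uclamp are made up for
this example, and the syscall table only lists x86_64 and aarch64. The call
itself can be expected to fail if the kernel was built without uclamp support
or the caller is not allowed to change the clamp, so treat this as a starting
point rather than a reference implementation.

```python
import ctypes
import os
import platform

# Flag values from the kernel's include/uapi/linux/sched.h.
SCHED_FLAG_KEEP_POLICY = 0x08
SCHED_FLAG_KEEP_PARAMS = 0x10
SCHED_FLAG_UTIL_CLAMP_MIN = 0x20
SCHED_FLAG_UTIL_CLAMP_MAX = 0x40

# sched_setattr syscall numbers; only these two architectures are covered here.
SYS_SCHED_SETATTR = {"x86_64": 314, "aarch64": 274}


class SchedAttr(ctypes.Structure):
    """Mirror of the kernel's struct sched_attr, including the uclamp fields."""
    _fields_ = [
        ("size", ctypes.c_uint32),
        ("sched_policy", ctypes.c_uint32),
        ("sched_flags", ctypes.c_uint64),
        ("sched_nice", ctypes.c_int32),
        ("sched_priority", ctypes.c_uint32),
        ("sched_runtime", ctypes.c_uint64),
        ("sched_deadline", ctypes.c_uint64),
        ("sched_period", ctypes.c_uint64),
        ("sched_util_min", ctypes.c_uint32),
        ("sched_util_max", ctypes.c_uint32),
    ]


def make_uclamp_attr(util_min=None, util_max=None):
    """Build a sched_attr that changes only the requested clamp values."""
    attr = SchedAttr()
    attr.size = ctypes.sizeof(SchedAttr)
    # Keep the current policy and parameters; touch nothing but the clamps.
    attr.sched_flags = SCHED_FLAG_KEEP_POLICY | SCHED_FLAG_KEEP_PARAMS
    requests = (
        (util_min, SCHED_FLAG_UTIL_CLAMP_MIN, "sched_util_min"),
        (util_max, SCHED_FLAG_UTIL_CLAMP_MAX, "sched_util_max"),
    )
    for value, flag, field in requests:
        if value is None:
            continue
        if not 0 <= value <= 1024:
            raise ValueError("uclamp values must be in the range 0..1024")
        setattr(attr, field, value)
        attr.sched_flags |= flag
    return attr


def set_uclamp(util_min=None, util_max=None, tid=0):
    """Apply the clamps to thread `tid` (0 means the calling thread)."""
    nr = SYS_SCHED_SETATTR[platform.machine()]  # KeyError on other machines
    libc = ctypes.CDLL(None, use_errno=True)
    attr = make_uclamp_attr(util_min, util_max)
    ret = libc.syscall(ctypes.c_long(nr), ctypes.c_int(tid),
                       ctypes.byref(attr), ctypes.c_uint(0))
    if ret != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

A worker with a deadline could call something like set_uclamp(util_min=256)
once after its discovery period, or call it again whenever its needs change.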

>  * This task doesn't have a deadline.

I think we have enough plumbing in the kernel to provide these classifications.
It'd be nice to have a library that provides a higher-level API for end users.

> Rather than choosing a suitable UCLAMP_MAX, I'll experiment with the numbers
> but choosing "400" on one system might mean something different from "400"
> on another system.  But I guess that's the problem, there are gray areas
> between my discrete options above.  A deadline could be "finish doing GC
> before we run out of memory" (which can have feedback from the GC about if
> it's on target), or "Finish encoding this video before the client wants to
> publish it", or "finish rendering this frame of a video game before the next
> VBLANK".  Depending on how on-target any of these are we could decrease or
> increase clock speed, because decreasing will always save power as long as
> things get done by their deadline.

Yep. I'm glad you're aware that "400" could mean anything and depends on the
target.

Any feedback on how to make this more useful will be appreciated. I think
picking the middle (512) and then expanding or shrinking based on how much
headroom you have (or how much performance you're willing to sacrifice) might
be a good starting point. I think steering away from the top perf points will
yield good results in general. To squeeze out more, maybe we'll need to expose
more info to allow for portable code. If you know your system, you can make
some assumptions.
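
The "start in the middle, then expand or shrink" idea could be sketched as a
simple feedback step like the one below. Everything here is an assumption made
for illustration: the function name next_clamp, the step of 64, and the 10% and
30% headroom thresholds are arbitrary and would need tuning per workload.

```python
CAPACITY_SCALE = 1024          # uclamp values range over 0..1024
START = CAPACITY_SCALE // 2    # the suggested starting point, 512


def next_clamp(current, elapsed, deadline, step=64, low=0.1, high=0.3):
    """Return the next clamp value from how much of the deadline was left.

    elapsed and deadline are in the same time unit. Headroom is the
    fraction of the deadline that went unused (negative if it was missed).
    """
    headroom = (deadline - elapsed) / deadline
    if headroom < low:
        current += step    # too close to (or past) the deadline: expand
    elif headroom > high:
        current -= step    # plenty of slack: shrink to save power
    return max(0, min(CAPACITY_SCALE, current))
```

Starting from START, a task that finishes a 10 ms job in 5 ms would drop to
448, while one that takes 9.5 ms would rise to 576.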

I'd be interested to know how well you can do with simple controls like these.

Cheers

--
Qais Yousef
