lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANDhNCr=S8b5MyDa9xp9D08FcsG6VGrHQjkj5CW3iFzuFO-4Xg@mail.gmail.com>
Date: Wed, 10 Apr 2024 10:30:56 -0700
From: John Stultz <jstultz@...gle.com>
To: Qais Yousef <qyousef@...alina.io>
Cc: Vincent Guittot <vincent.guittot@...aro.org>, Ingo Molnar <mingo@...nel.org>, 
	Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>, 
	Steven Rostedt <rostedt@...dmis.org>, Daniel Bristot de Oliveira <bristot@...hat.com>, 
	Thomas Gleixner <tglx@...utronix.de>, "Paul E. McKenney" <paulmck@...nel.org>, 
	Joel Fernandes <joel@...lfernandes.org>, Dietmar Eggemann <dietmar.eggemann@....com>, 
	linux-kernel@...r.kernel.org, Yabin Cui <yabinc@...gle.com>
Subject: Re: [PATCH] sched/pi: Reweight fair_policy() tasks when inheriting prio

On Tue, Apr 9, 2024 at 11:59 PM Qais Yousef <qyousef@...alina.io> wrote:
>
> On 04/09/24 14:35, Vincent Guittot wrote:
> > On Tue, 9 Apr 2024 at 08:19, Qais Yousef <qyousef@...alina.io> wrote:
> > >
> > > On 04/08/24 12:51, John Stultz wrote:
> > > > On Mon, Apr 8, 2024 at 12:17 AM Vincent Guittot
> > > > <vincent.guittot@...aro.org> wrote:
> > > > >
> > > > > On Sun, 7 Apr 2024 at 14:27, Qais Yousef <qyousef@...alina.io> wrote:
> > > > > >
> > > > > > On 04/05/24 18:16, Qais Yousef wrote:
> > > > > >
> > > > > > > >
> > > > > > > > All that to say that I think the weight is not applied on purpose.
> > > > > > > > This might work for your particular case but there are more changes to
> > > > > > > > be done if you want to apply prio inheritance between cfs tasks.
> > > > > > > >
> > > > > > > > As an example, what about the impact of cgroup on the actual weight
> > > > > > > > and the inherited priority of a task ? If the owner and the waiter
> > > > > > > > don't belong to the same cgroup their own prio is meaningless... task
> > > > > > > > nice -20 in a group with a weight equal to nice 19 vs a task nice 19
> > > > > > > > in a group with a weight equals to nice -20
> > > > > > >
> > > > > > > That is on my mind actually. But I thought it's a separate problem. That has to
> > > > > > > do with how we calculate the effective priority of the pi_task. And probably
> > > > > > > the sorting order to if we agree we need to revert the above. If that is done
> > > > > >
> > > > > > Thinking more about it the revert is not the right thing to do. We want fair
> > > > > > tasks to stay ordered in FIFO for better fairness and avoid potential
> > > > > > starvation issues. It's just the logic for searching the top_waiter need to be
> > > > > > different. If the top_waiter is fair, then we need to traverse the tree to find
> > > > > > the highest nice value. We probably can keep track of this while adding items
> > > > > > to the tree to avoid the search.
> > > > > >
> > > > > > For cgroup; is it reasonable (loosely speaking) to keep track of pi_cfs_rq and
> > > > > > detach_attach_task_cfs_rq() before the reweight? This seems the most
> > > > > > straightforward solution and will contain the complexity to keeping track of
> > > > > > cfs_rq. But it'll have similar issue to proxy execution where a task that
> > > > > > doesn't belong to the cgroup will consume its share..
> > > > >
> > > > > That's a good point, Would proxy execution be the simplest way to fix all this ?
> > >
> > > Is it? Over 4.5 years ago Unity reported to me about performance inversion
> > > problem and that's when proxy execution work was revived as simplest way to fix
> > > all of this. But still no end in sight from what I see. I was and still think
> > > an interim solution in rt_mutex could help a lot of use cases already without
> > > being too complex. Not as elegant and comprehensive like proxy execution, but
> > > given the impact on both userspace and out of tree kernel hacks are growing
> > > waiting for this to be ready, the cost of waiting is high IMHO.
> > >
> > > FWIW, I already heard several feedbacks that PTHREAD_PRIO_INHERIT does nothing.
> > > I think this reweight issue is more serious problem and likely why I heard this
> > > feedback. I could be underestimating the complexity of the fix though So I'll
> >
> > Without cgroup, the solution could be straightforward but android uses
> > extensively cgroup AFAICT and update_cfs_group() makes impossible to
> > track the top cfs waiter and its "prio"
>
> :(
>
> IIUC the issue is that we can't easily come up with a single number of
> 'effective prio' for N level hierarchy and compare it with another M level
> hierarchy..
>
> Does proxy execution fix this problem then? If we can't find the top waiter,
> I can't see how proxy execution would work here too. To my understanding it's
> more about how we apply inheritance (by donating execution context of the top
> waiter) instead of manually applying inheritance like we're doing now.

So, while proxy provides a sort of generalized inheritance, it isn't
deep enough in the class scheduler logic to need to really think about
priority/cgroups.

It just looks at what gets selected to run. That's the most important
task at that moment. It doesn't really need to care about how/why,
that's left to pick_next_task().

Since it leaves mutex blocked tasks on the RQ, it allows the class
scheduler logic to pick the most important task (mutex-blocked or not)
to run. Then if a mutex-blocked task gets selected, we will then find
the mutex owner and run it instead so it can release the lock.  When
locks are released, if the owner has a "donor" task, the lock is
handed off to the donor.  So, this basically uses the
pick_next_task()'s evaluation of what it wanted to run to effectively
provide the "top waiter".

thanks
-john

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ