Message-ID: <20171115130111.qq5jjgdqc55xpu3i@hirez.programming.kicks-ass.net>
Date:   Wed, 15 Nov 2017 14:01:11 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Martijn Coenen <maco@...roid.com>
Cc:     gregkh@...uxfoundation.org, john.stultz@...aro.org,
        tkjos@...gle.com, arve@...roid.com, sherryy@...roid.com,
        tglx@...utronix.de, amit.pundir@...aro.org,
        linux-kernel@...r.kernel.org, devel@...verdev.osuosl.org,
        maco@...gle.com
Subject: Re: [PATCH v3 1/6] ANDROID: binder: add support for RT prio
 inheritance.

On Thu, Oct 26, 2017 at 04:07:45PM +0200, Martijn Coenen wrote:
> +/**
> + * binder_set_priority() - sets the scheduler priority of a task
> + * @task:	task to set priority on
> + * @desired:	desired priority to run at
> + *
> + * The scheduler policy of tasks is changed explicitly, because we want to
> + * support a few distinct features:
> + * 1) If the requested priority is higher than the maximum allowed priority,
> + *    we want to "clip" at the highest supported priority.
> + * 2) For a future patch, we need to allow changing the priority of a task
> + *    with a different UID; when we make a binder transaction from process A
> + *    to process B with different UIDs, A must be able to set B's priority
> + *    before B wakes up to handle the call. If B were to raise its own priority
> + *    after waking up, a race condition exists where B gets preempted before
> + *    it can raise its own priority.
> + *
> + * Feature 2) sounds like something a rt_mutex would solve, for example by
> + * having the caller proxy lock an rt_mutex on behalf of the callee, and then
> + * sleeping on it. But we have a few requirements that don't work with this
> + * approach:
> + * 1) binder supports a "minimum node priority", meaning that all transactions
> + *    into a node must run at this priority at a minimum. This means that the
> + *    desired priority for handling a transaction is not necessarily equal to
> + *    the priority of the caller.

Isn't that the same as running everything in that node (whatever that
is) at that given prio?
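
For concreteness, the clipping and minimum-node-priority rules quoted
above combine to something like this minimal sketch (names are
illustrative, not binder's actual code, and "priority" follows the
userspace convention where a higher number is stronger):

#include <linux/kernel.h>	/* min(), max() */

/* Illustrative only: the effective priority for handling a transaction
 * is the stronger of the caller's priority and the node minimum,
 * clipped at the highest priority the target is allowed to run at. */
static int effective_prio(int caller_prio, int node_min_prio,
			  int max_allowed_prio)
{
	int desired = max(caller_prio, node_min_prio);

	return min(desired, max_allowed_prio);	/* clip, don't fail */
}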

> + * 2) binder supports asynchronous transactions, where the caller is not blocked
> + *    on transaction completion; so, it also can't be blocked on an rt_mutex.

This is not a theoretically sound thing to do... For a system to be
real-time, it must not only guarantee forward progress but also have
bounded runtime.

Elevating another task's priority to your own and then not blocking
effectively injects time. At the very least, this time should be taken
into consideration when doing runtime analysis of the system.

Also, since it's async, why do we care? The original task is still
making progress. Only at the point where we start to wait for completion
does it make sense to boost; otherwise you end up having to charge one
task double time.
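
Compare what a PI mutex gives you in userspace: with PTHREAD_PRIO_INHERIT,
the owner inherits a waiter's priority only while that waiter is actually
blocked on the lock, which is exactly the bounded-runtime property above.
A minimal sketch (the lock name is made up):

#include <pthread.h>

static pthread_mutex_t reply_lock;

static void init_pi_lock(void)
{
	pthread_mutexattr_t attr;

	pthread_mutexattr_init(&attr);
	/* The owner inherits the priority of blocked waiters only; a
	 * caller that sends async and never blocks donates nothing. */
	pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
	pthread_mutex_init(&reply_lock, &attr);
	pthread_mutexattr_destroy(&attr);
}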

> + * 3) similarly, there may not necessarily be a thread waiting for
> + *    transactions at the time the call is made, so we don't know who to proxy-
> + *    lock the lock for.

Non-issue; you have the exact same problem now. Your interface takes a
@task argument, so you have to have a @task either way around.

> + * 4) binder supports nested transactions, where A can call into B, and B can
> + *    call back into A before returning a reply to the original transaction.
> + *    This means that if A is blocked on an rt_mutex B holds, B must first wake
> + *    up A to handle a new transaction, and only then can it proxy-lock and try
> + *    to acquire the new rt_mutex. This leaves a race condition where B
> + *    temporarily runs at its original priority.

That doesn't sound like something a well-formed (RT) program should do
in the first place.

You could play horrible games with lock ownership, but YUCK!!

> + * 5) rt_mutex does not currently support PI for CFS tasks.

Neither do you... just inheriting the nice value is not correct for a
WFQ-style scheduler.
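
For reference, CFS maps nice levels to weights (roughly
weight = 1024 / 1.25^nice; the kernel keeps the precomputed
sched_prio_to_weight[] table), and a runnable task then gets
weight / sum-of-weights of the CPU. That is a proportional share, not a
latency bound, so copying a nice value across is not inheritance in any
RT sense. A standalone illustration (build with -lm):

#include <stdio.h>
#include <math.h>

/* Approximates the kernel's sched_prio_to_weight[] table: each nice
 * level differs by roughly a factor of 1.25 in weight. */
static double weight(int nice)
{
	return 1024.0 / pow(1.25, nice);
}

int main(void)
{
	double w0 = weight(0), w10 = weight(10);

	/* With both tasks runnable, each gets weight/total of the CPU:
	 * a share, not a deadline. Prints roughly 90% vs 10%. */
	printf("nice 0: %.0f%%, nice 10: %.0f%%\n",
	       100.0 * w0 / (w0 + w10), 100.0 * w10 / (w0 + w10));
	return 0;
}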

> + */


What you're proposing is a bunch of make-work-ish hacks on a system that
isn't designed for RT.
