Message-ID: <CAMP5XgfDybsfz-v2NTYqAX8K48xCJ_zkH65AHosAOmT7P=Gyxg@mail.gmail.com>
Date:   Mon, 12 Sep 2016 20:44:09 -0700
From:   Arve Hjønnevåg <arve@...roid.com>
To:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>,
        "devel@...verdev.osuosl.org" <devel@...verdev.osuosl.org>,
        Riley Andrews <riandrews@...roid.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Christoph Hellwig <hch@...radead.org>,
        Todd Kjos <tkjos@...gle.com>
Subject: Re: [PATCH] android: binder: Disable preemption while holding the
 global binder lock

On Sat, Sep 10, 2016 at 10:28 AM, Greg Kroah-Hartman
<gregkh@...uxfoundation.org> wrote:
> On Sat, Sep 10, 2016 at 06:37:29PM +0200, Thomas Gleixner wrote:
>> On Sat, 10 Sep 2016, Peter Zijlstra wrote:
>>
>> > On Sat, Sep 10, 2016 at 09:16:59AM -0700, Christoph Hellwig wrote:
>> > > On Thu, Sep 08, 2016 at 09:12:50AM -0700, Todd Kjos wrote:
>> > > > In Android systems, the display pipeline relies on low
>> > > > latency binder transactions and is therefore sensitive to
>> > > > delays caused by contention for the global binder lock.
>> > > > Jank is significantly reduced by disabling preemption
>> > > > while the global binder lock is held.
>> > >
>> > > That's now how preempt_disable is supposed to use.  It is for critical
>> >
>> > not, that's supposed to be _not_. Just to be absolutely clear, this is
>> > NOT how you're supposed to use preempt_disable().
>> >
>> > > sections that use per-cpu or similar resources.
>> > >
>> > > >
>> > > > Originally-from: Riley Andrews <riandrews@...gle.com>
>> > > > Signed-off-by: Todd Kjos <tkjos@...gle.com>
>> >
>> > > > @@ -389,7 +390,11 @@ static int task_get_unused_fd_flags(struct binder_proc *proc, int flags)
>> > > >   rlim_cur = task_rlimit(proc->tsk, RLIMIT_NOFILE);
>> > > >   unlock_task_sighand(proc->tsk, &irqs);
>> > > >
>> > > > - return __alloc_fd(files, 0, rlim_cur, flags);
>> > > > + preempt_enable_no_resched();
>> > > > + ret = __alloc_fd(files, 0, rlim_cur, flags);
>> > > > + preempt_disable();
>> >
>> > And the fact that people want to use preempt_enable_no_resched() shows
>> > that they're absolutely clueless.
>> >
>> > This is so broken it's not funny.
>> >
>> > NAK NAK NAK
>>
>> Indeed. Sprinkling random preempt_enable/disable() pairs all over the place
>> documents clearly that this is tinkering and not proper software
>> engineering.
>
> I have pointed out in the other thread for this patch (the one that had
> a patch that could be applied) that the single lock in the binder code
> is the main problem here; that should be solved instead of this messing
> around with priorities.
>

While removing the single lock in the binder driver would help reduce
the problem this patch tries to work around, it would not fix it. The
worst cases occur when a very low priority thread gets preempted while
holding the lock; when a high priority thread then needs the same lock,
it cannot acquire it until the preempted thread runs again. Changing the
driver to use more fine-grained locking would shrink the set of threads
that can trigger this problem, but some processes receive work from both
high and low priority threads and could still end up in the same
situation.
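
To make the failure mode concrete, here is a minimal sketch (illustrative
only, not the actual binder code; the names are made up) of how a single
global lock produces this inversion:

	static DEFINE_MUTEX(global_lock);	/* one lock shared by every caller */

	static void do_transaction(void)
	{
		mutex_lock(&global_lock);
		/*
		 * If a low priority caller is preempted here, every other
		 * caller -- including a high priority display thread --
		 * blocks on global_lock until the scheduler runs the
		 * preempted task again. A plain mutex does nothing to
		 * boost the holder's priority.
		 */
		/* ... handle one binder transaction ... */
		mutex_unlock(&global_lock);
	}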

A previous attempt to fix this problem changed the lock from a mutex to
an rt_mutex, but that apparently did not work as well as this patch. I
believe the added overhead was noticeable, and the priority inheritance
did not help when the preempted thread was in a different cgroup (I
don't know if this is still the case).
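
For reference, that attempt amounted to something like the sketch below
(again illustrative, not the actual patch). An rt_mutex implements
priority inheritance, so a blocked high priority waiter temporarily
lends its priority to the lock holder:

	#include <linux/rtmutex.h>

	static DEFINE_RT_MUTEX(global_lock);

	static void do_transaction(void)
	{
		rt_mutex_lock(&global_lock);
		/*
		 * While a high priority thread waits on global_lock, the
		 * current holder runs with the waiter's priority, so it
		 * is not stuck behind medium priority tasks. This does
		 * not help if the holder cannot run for other reasons,
		 * e.g. cgroup CPU throttling, which matches the behavior
		 * described above.
		 */
		/* ... handle one binder transaction ... */
		rt_mutex_unlock(&global_lock);
	}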

It would be useful to have a generic solution to this problem.

> So don't worry, I'm not taking this change :)
>
> thanks,
>
> greg k-h



-- 
Arve Hjønnevåg
