Message-ID: <Ydw4hWCRjAhGfCAv@cmpxchg.org>
Date:   Mon, 10 Jan 2022 14:45:41 +0100
From:   Johannes Weiner <hannes@...xchg.org>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Eric Biggers <ebiggers@...nel.org>, Tejun Heo <tj@...nel.org>,
        Zefan Li <lizefan.x@...edance.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Ingo Molnar <mingo@...hat.com>,
        Hillf Danton <hdanton@...a.com>,
        syzbot <syzbot+cdb5dd11c97cc532efad@...kaller.appspotmail.com>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
        Linux-MM <linux-mm@...ck.org>,
        Suren Baghdasaryan <surenb@...gle.com>
Subject: Re: psi_trigger_poll() is completely broken

On Wed, Jan 05, 2022 at 11:13:30AM -0800, Linus Torvalds wrote:
> On Wed, Jan 5, 2022 at 11:07 AM Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
> >
> > Whoever came up with that stupid "replace existing trigger with a
> > write()" model should feel bad. It's garbage, and it's actively buggy
> > in multiple ways.
> 
> What are the users? Can we make the rule for -EBUSY simply be that you
> can _install_ a trigger, but you can't replace an existing one (except
> with NULL, when you close it).

Apologies for the delay, I'm traveling right now.

The primary user of the poll interface is still Android userspace OOM
killing. I'm CCing Suren, who is the most familiar with this use case.

Suren, the way the refcounting is written right now assumes that
poll_wait() is the actual blocking wait. That's not true: it just
queues the waiter and saves &t->event_wait, and the *caller* of
psi_trigger_poll() continues to interact with it afterwards.

If at all possible, I would also prefer the simplicity of one trigger
setup per fd; if you need a new trigger, close the fd and open again.
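
As a rough userspace sketch of the "one trigger per fd" model: the helper
names below (psi_format_trigger, psi_trigger_open, psi_trigger_change,
psi_trigger_wait) are hypothetical and not taken from Android or any
library. The trigger line format ("some 150000 1000000") and the use of
POLLPRI follow the kernel's psi documentation. Changing a trigger here
never rewrites an fd; it closes the old one and opens a fresh one.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Format a trigger line such as "some 150000 1000000".
 * Returns the string length, or -1 if it does not fit in buf. */
static int psi_format_trigger(char *buf, size_t len, const char *kind,
                              unsigned threshold_us, unsigned window_us)
{
    assert(buf != NULL && kind != NULL);
    int n = snprintf(buf, len, "%s %u %u", kind, threshold_us, window_us);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Open a pressure file and install exactly one trigger on the new fd.
 * Returns the fd, or -1 on any failure. */
static int psi_trigger_open(const char *path, const char *kind,
                            unsigned threshold_us, unsigned window_us)
{
    char buf[64];
    int n = psi_format_trigger(buf, sizeof(buf), kind, threshold_us, window_us);
    if (n < 0)
        return -1;

    int fd = open(path, O_RDWR | O_NONBLOCK);
    if (fd < 0)
        return -1;

    /* The write includes the terminating NUL, as in the psi docs example. */
    if (write(fd, buf, (size_t)n + 1) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* "Replace" a trigger by dropping the old fd and creating a new one. */
static int psi_trigger_change(int old_fd, const char *path, const char *kind,
                              unsigned threshold_us, unsigned window_us)
{
    if (old_fd >= 0)
        close(old_fd);
    return psi_trigger_open(path, kind, threshold_us, window_us);
}

/* Wait for the trigger to fire. Returns 1 on event, 0 on timeout,
 * -1 on error (including POLLERR, which signals a dead event source). */
static int psi_trigger_wait(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLPRI };
    int n = poll(&pfd, 1, timeout_ms);
    if (n <= 0)
        return n;
    if (pfd.revents & POLLERR)
        return -1;
    return (pfd.revents & POLLPRI) ? 1 : 0;
}
```

A caller would do something like fd = psi_trigger_open("/proc/pressure/memory",
"some", 150000, 1000000) and later fd = psi_trigger_change(fd, ...) when the
thresholds need to change.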

Can you please take a look at whether that is workable from the Android side?

(I'm going to follow up on the static branch issue Linus pointed out
later this week when I'm back home. I also think we should add Suren
as an additional psi maintainer, since the polling code is a good chunk
of the codebase and he shouldn't miss threads like these.)

> That would fix the poll() lifetime issue, and would make the
> psi_trigger_replace() races fairly easy to fix - just use
> 
>         if (cmpxchg(trigger_ptr, NULL, new) != NULL) {
>                 ... free 'new', return -EBUSY ..
> 
> to install the new one, instead of
> 
>         rcu_assign_pointer(*trigger_ptr, new);
> 
> or something like that. No locking necessary.
> 
> But I assume people actually end up re-writing triggers, because
> people are perverse and have taken advantage of this completely broken
> API.
> 
>                Linus
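
For reference, the install-once rule quoted above can be modeled in
userspace with C11 atomics. This is only a sketch of the idea, not kernel
code: struct trigger, struct trigger_slot and the trigger_* helpers are
hypothetical stand-ins, and atomic_compare_exchange_strong() plays the
role of the kernel's cmpxchg().

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's trigger object. */
struct trigger {
    unsigned threshold_us;
    unsigned window_us;
};

/* One slot per open file; NULL means no trigger has been installed. */
struct trigger_slot {
    _Atomic(struct trigger *) ptr;
};

static struct trigger *trigger_new(unsigned threshold_us, unsigned window_us)
{
    struct trigger *t = malloc(sizeof(*t));
    if (t) {
        t->threshold_us = threshold_us;
        t->window_us = window_us;
    }
    return t;
}

/*
 * Install 'new' only if the slot is still empty. If a trigger is already
 * present, free 'new' and return -EBUSY, so an installed trigger is never
 * swapped out underneath a concurrent user.
 */
static int trigger_install(struct trigger_slot *slot, struct trigger *new)
{
    struct trigger *expected = NULL;

    assert(slot != NULL && new != NULL);
    if (!atomic_compare_exchange_strong(&slot->ptr, &expected, new)) {
        free(new);
        return -EBUSY;
    }
    return 0;
}

/* Close path: the only allowed replacement is with NULL. */
static void trigger_release(struct trigger_slot *slot)
{
    free(atomic_exchange(&slot->ptr, (struct trigger *)NULL));
}
```

Note that this sketch leaves out what the kernel side still has to get
right on release, namely making sure no poller is still holding a
reference to the trigger's wait queue when it is freed.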
