Date:   Mon, 10 Jan 2022 09:25:00 -0800
From:   Suren Baghdasaryan <surenb@...gle.com>
To:     Johannes Weiner <hannes@...xchg.org>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Eric Biggers <ebiggers@...nel.org>, Tejun Heo <tj@...nel.org>,
        Zefan Li <lizefan.x@...edance.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Ingo Molnar <mingo@...hat.com>,
        Hillf Danton <hdanton@...a.com>,
        syzbot <syzbot+cdb5dd11c97cc532efad@...kaller.appspotmail.com>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
        Linux-MM <linux-mm@...ck.org>
Subject: Re: psi_trigger_poll() is completely broken

On Mon, Jan 10, 2022 at 5:45 AM Johannes Weiner <hannes@...xchg.org> wrote:
>
> On Wed, Jan 05, 2022 at 11:13:30AM -0800, Linus Torvalds wrote:
> > On Wed, Jan 5, 2022 at 11:07 AM Linus Torvalds
> > <torvalds@...ux-foundation.org> wrote:
> > >
> > > Whoever came up with that stupid "replace existing trigger with a
> > > write()" model should feel bad. It's garbage, and it's actively buggy
> > > in multiple ways.
> >
> > What are the users? Can we make the rule for -EBUSY simply be that you
> > can _install_ a trigger, but you can't replace an existing one (except
> > with NULL, when you close it).
>
> Apologies for the delay, I'm traveling right now.
>
> The primary user of the poll interface is still Android userspace OOM
> killing. I'm CCing Suren who is the most familiar with this usecase.
>
> Suren, the way the refcounting is written right now assumes that
> poll_wait() is the actual blocking wait. That's not true, it just
> queues the waiter and saves &t->event_wait, and the *caller* of
> psi_trigger_poll() continues to interact with it afterwards.

Thanks for adding me, Johannes. I see where I made a mistake.
Terribly sorry for the trouble this caused. I do feel bad.

>
> If at all possible, I would also prefer the simplicity of one trigger
> setup per fd; if you need a new trigger, close the fd and open again.
>
> Can you please take a look if that is workable from the Android side?

Yes, one trigger per fd would work fine for Android. That's how we
intended to use it.
I'm still catching up on this email thread. Once I have digested it, I
will try to fix this with the one-trigger-per-fd approach.
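
For reference, here is a minimal userspace sketch of the one-trigger-per-fd
usage described above. It assumes the trigger format documented for psi
("<some|full> <stall us> <window us>", written including the trailing NUL)
and the usual /proc/pressure/memory path. The helper names
(psi_format_trigger, psi_open_trigger, psi_wait_event, psi_retarget) are
hypothetical and exist only for this illustration; this is not the Android
code.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Build "<kind> <stall_us> <window_us>"; returns its length, or -1 if it
 * does not fit in buf. */
static int psi_format_trigger(char *buf, size_t len, const char *kind,
                              unsigned stall_us, unsigned window_us)
{
    int n = snprintf(buf, len, "%s %u %u", kind, stall_us, window_us);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/* Open a pressure file and install exactly one trigger on the new fd.
 * Returns the fd, or -1 with errno set. */
static int psi_open_trigger(const char *path, const char *kind,
                            unsigned stall_us, unsigned window_us)
{
    char buf[64];
    int n = psi_format_trigger(buf, sizeof(buf), kind, stall_us, window_us);
    if (n < 0) {
        errno = EINVAL;
        return -1;
    }
    int fd = open(path, O_RDWR | O_NONBLOCK);
    if (fd < 0)
        return -1;
    /* Write the string including its terminating NUL byte. */
    if (write(fd, buf, (size_t)n + 1) < 0) {
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/* Wait for one trigger event: 1 = event, 0 = timeout, -1 = error. */
static int psi_wait_event(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLPRI };
    int r = poll(&pfd, 1, timeout_ms);
    if (r <= 0)
        return r;
    if (pfd.revents & POLLERR)
        return -1;
    return (pfd.revents & POLLPRI) ? 1 : 0;
}

/* Change thresholds by closing the old fd and opening a fresh one,
 * never by a second write() to the same fd. */
static int psi_retarget(int old_fd, const char *path, const char *kind,
                        unsigned stall_us, unsigned window_us)
{
    if (old_fd >= 0)
        close(old_fd);
    return psi_open_trigger(path, kind, stall_us, window_us);
}
```

A caller would do something like
psi_open_trigger("/proc/pressure/memory", "some", 150000, 1000000), loop on
psi_wait_event(), and call psi_retarget() whenever the thresholds need to
change. With that pattern the kernel never has to replace a trigger that a
poller may still be using.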

Regarding the issue of serializing concurrent writes in
cgroup_pressure_write() the way psi_write() does: doesn't of->mutex
inside kernfs_fop_write_iter() already serialize writes to the same
file? See
https://elixir.bootlin.com/linux/latest/source/fs/kernfs/file.c#L287

>
> (I'm going to follow up on the static branch issue Linus pointed out,
> later this week when I'm back home. I also think we should add Suren
> as additional psi maintainer since the polling code is a good chunk of
> the codebase and he shouldn't miss threads like these.)

That would help me avoid missing these emails and respond promptly.
Thanks,
Suren.

>
> > That would fix the poll() lifetime issue, and would make the
> > psi_trigger_replace() races fairly easy to fix - just use
> >
> >         if (cmpxchg(trigger_ptr, NULL, new) != NULL) {
> >                 ... free 'new', return -EBUSY ..
> >
> > to install the new one, instead of
> >
> >         rcu_assign_pointer(*trigger_ptr, new);
> >
> > or something like that. No locking necessary.
> >
> > But I assume people actually end up re-writing triggers, because
> > people are perverse and have taken advantage of this completely broken
> > API.
> >
> >                Linus
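
To make the install-only-if-empty idea quoted above concrete, here is a
small userspace analogue using C11 atomics in place of the kernel's
cmpxchg(). struct trigger, trigger_install() and trigger_clear() are
hypothetical stand-ins for illustration only. The sketch deliberately
ignores how the kernel must wait for concurrent readers (RCU, pollers)
before freeing an old trigger.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's trigger object. */
struct trigger {
    int threshold_us;
};

/* Install new_t only if the slot is currently empty. On failure the new
 * object is freed and -EBUSY is returned; the existing trigger is left
 * untouched, so nobody using it can see it replaced underneath them. */
static int trigger_install(struct trigger *_Atomic *slot,
                           struct trigger *new_t)
{
    struct trigger *expected = NULL;

    if (!atomic_compare_exchange_strong(slot, &expected, new_t)) {
        free(new_t);
        return -EBUSY;
    }
    return 0;
}

/* On close, empty the slot and hand the old trigger back to the caller,
 * who is responsible for freeing it. */
static struct trigger *trigger_clear(struct trigger *_Atomic *slot)
{
    return atomic_exchange(slot, (struct trigger *)NULL);
}
```

The compare-and-exchange is the only synchronization needed on the install
path: of two racing installers exactly one sees NULL and wins, and the
other gets -EBUSY without taking a lock.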
