linux-kernel - Re: [GIT PULL] gpio: fixes for v6.1-rc8

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAMRc=Me--X58j+tLn50osPRXSsJeqms8m=aYDcwMwktfZvFQow@mail.gmail.com>
Date:   Sun, 4 Dec 2022 21:46:50 +0100
From:   Bartosz Golaszewski <brgl@...ev.pl>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Linus Walleij <linus.walleij@...aro.org>,
        linux-gpio@...r.kernel.org, linux-kernel@...r.kernel.org,
        Bartosz Golaszewski <bartosz.golaszewski@...aro.org>
Subject: Re: [GIT PULL] gpio: fixes for v6.1-rc8 - take 2

On Sun, Dec 4, 2022 at 9:17 PM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> On Sat, Dec 3, 2022 at 7:05 AM Bartosz Golaszewski <brgl@...ev.pl> wrote:
> >
> > Here's a fixed PR from the GPIO subsystem for the next rc.
>
> No, this cannot be right.
>
> That last commit seems *very* dubious, and in particular all those
>
>         if (!down_read_trylock(&gdev->sem))
>                 return EPOLLHUP | EPOLLERR;
>
> are a sign that something is very very wrong there.
>
> Either the lock is necessary or it isn't, and "trylock" isn't the way
> to deal with it, with random failures if you cannot take the lock.
>
> If you are using "trylock" because the data structure might go away
> from under you, you have already lost, and the code is buggy.
>
> And if the data structure cannot go away from under  you,  you should
> do an unconditional lock, and then check "gdev->chip" for being NULL
> once you have gotten the lock (the same way you did in open()).
>

No, the data can't be removed with these locks in place. It's just to
avoid going into the callbacks if gpiochip_remove() is already in
progress (the only reason why trylock would fail).

> But a "trylock and return error if it failed" just means that now you
> are randomly returning errors to user space, which is entirely
> undebuggable and makes no sense.
>

Technically these are the same errors we'll return later if gdev->chip
is NULL but I get your point.

> Or, alternatively, the trylock succeeds - because it hits fully
> *after* gpiochip_remove() has finished, and now ->chip is NULL anyway,
> which is what you claim to protect against.
>
> End result: "trylock" can never be right in this kind of context.
>
> That "call_locked() helper might make sense more along the lines of
>
>         ret = -ENODEV;
>
>         down_read(&gdev->sem))
>         // Does the device still exist?
>         if (gdev->chip)
>                 ret = func(file, cmd, arg);
>         up_read(&gdev->sem);
>
>         return ret;
>

This is a good suggestion, thanks. And with it, the two patches can
get squashed into one for easier backporting.

> or similar. Not with that odd "try to lock, and if that fails, assume error".
>
> And again - if the trylock is there because 'gdev' itself might go
> away at any time and you can't afford to wait on the lock, then it's
> broken regardless (and the above suggestion won't help either)
>
> Anyway: the end result of this all is that I think this is a
> fundamental bug in the gpio layer, and rc7 (soon to be rc8) is too
> late to try these kinds of unfinished games.
>
> Fix it properly for 6.2, and make it back-portable, because I'm not
> pulling something like this right now.
>
>                Linus

Will do. I will still resend the PR with only the resource leak fixes.

Bartosz