[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMRc=McHowkYJBckM1eikcrBUoXXZN+OkozA-dNXZc1Zgd+Kfw@mail.gmail.com>
Date: Thu, 26 Jan 2023 10:11:07 +0100
From: Bartosz Golaszewski <brgl@...ev.pl>
To: Wolfram Sang <wsa@...nel.org>, Bartosz Golaszewski <brgl@...ev.pl>,
linux-i2c@...r.kernel.org, linux-kernel@...r.kernel.org,
Bartosz Golaszewski <bartosz.golaszewski@...aro.org>
Subject: Re: [PATCH v3] i2c: dev: don't allow user-space to deadlock the kernel
On Wed, Jan 25, 2023 at 11:12 PM Wolfram Sang <wsa@...nel.org> wrote:
>
>
> > So, this code handled all my stress-testing well so far. I'll try to
> > think of some more ideas until this evening, but likely I will apply it
> > later. Nonetheless, more review eyes are still welcome!
>
> Ah yes, I now recalled why I had the gut feeling that this solution is
> not complete. See this mail thread from 2015:
>
> https://lkml.iu.edu/hypermail/linux/kernel/1501.2/01700.html
>
> There are still drivers using i2c_del_adapter()+kfree(), so removing the
> completion could cause use-after-free there, or?
>
Ugh, what a mess... I was mostly focused on the character device side
of it but now I realized the true extent of the problem.
It's not like there are just *some* odd drivers that delete the
adapter struct at .remove() - it's literally all of them one way or
another.
It's all because the adapter struct really should be allocated by
i2c_add_adapter() and bus drivers should only really provide some
structure containing the adapter description for the subsystem the
lifetime of which would not affect the adapter itself. This way the
adapter (embedding struct device) would be freed by device type's
.release() like we do over in the GPIO subsystem. Instead the adapter
struct is allocated by drivers at .probe() meaning it will get dropped
at .remove().
I wonder how many more subsystems do that. No wonder people blame
devres for these user-space device node crashes...
I don't have a good solution. I've been thinking about it for an hour
and every solution requires sweeping changes across the entire
subsystem. Or else we'd introduce a parallel solution that would do
the right thing and wait in perpetuity until all drivers convert -
like with i2e probe_new() which is after all much simpler.
Anyway, that's all I've got. We probably need to drop this change and
live with what we have now.
Bart
Powered by blists - more mailing lists