Message-ID: <YPgGIncQxcD2frBY@kroah.com>
Date: Wed, 21 Jul 2021 13:33:54 +0200
From: Greg KH <gregkh@...uxfoundation.org>
To: Luis Chamberlain <mcgrof@...nel.org>
Cc: tj@...nel.org, shuah@...nel.org, akpm@...ux-foundation.org,
rafael@...nel.org, davem@...emloft.net, kuba@...nel.org,
ast@...nel.org, andriin@...com, daniel@...earbox.net,
atenart@...nel.org, alobakin@...me, weiwan@...gle.com,
ap420073@...il.com, jeyu@...nel.org, ngupta@...are.org,
sergey.senozhatsky.work@...il.com, minchan@...nel.org,
axboe@...nel.dk, mbenes@...e.com, jpoimboe@...hat.com,
tglx@...utronix.de, keescook@...omium.org, jikos@...nel.org,
rostedt@...dmis.org, peterz@...radead.org,
linux-block@...r.kernel.org, netdev@...r.kernel.org,
linux-kselftest@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 4/4] test_sysfs: demonstrate deadlock fix

On Sat, Jul 03, 2021 at 10:28:28AM -0700, Luis Chamberlain wrote:
> On Sat, Jul 03, 2021 at 06:49:46AM +0200, Greg KH wrote:
> > On Fri, Jul 02, 2021 at 05:46:32PM -0700, Luis Chamberlain wrote:
> > > +#define MODULE_DEVICE_ATTR_FUNC_STORE(_name) \
> > > +static ssize_t module_ ## _name ## _store(struct device *dev, \
> > > +                                   struct device_attribute *attr, \
> > > +                                   const char *buf, size_t len) \
> > > +{ \
> > > +        ssize_t __ret; \
> > > +        if (!try_module_get(THIS_MODULE)) \
> > > +                return -ENODEV; \
> > > +        __ret = _name ## _store(dev, attr, buf, len); \
> > > +        module_put(THIS_MODULE); \
> > > +        return __ret; \
> > > +}
> >
> > As I have pointed out before, doing try_module_get(THIS_MODULE) is racy
> > and should not be added back to the kernel tree. We got rid of many
> > instances of this "bad pattern" over the years, please do not encourage
> > it to be added back as others will somehow think that it is correct code.
>
> It is noted this is used in lieu of any agreed upon solution to
> *demonstrate* how this at least does fix it. In this case (and in the
> generic solution I also had suggested for kernfs a while ago), if the
> try fails, we give up. If it succeeds, we now know we can rely on the
> device pointer. If the refcount succeeds, can the module still not
> be present? Is try_module_get() racy in that way? In what way is it
> racy and where is this documented? Do we have a selftest to prove the
> race?
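
As I said in the other email where you tried to add this, think about
what happens if the module is removed _right before_ you make this call,
or a few instructions before that. The wrapper doing the
try_module_get(THIS_MODULE) is itself code in that module, so at that
point there is nothing left to run the check. The race is still there;
this fixes nothing, it only makes the window smaller.
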
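
To make the window concrete, here is a minimal userspace sketch of the
ordering problem. It is not kernel code: every fake_* name, the state
enum, and the result enum are hypothetical stand-ins made up for
illustration. The model assumes only that a try-get fails once removal
has begun, that removal is refused while references are held, and that
the wrapper's code goes away together with the module.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel API. */
enum mod_state { MOD_LIVE, MOD_GOING, MOD_UNLOADED };

struct fake_module {
	enum mod_state state;
	int refcnt;
};

/* Outcome of trying to run the wrapper that lives inside the module. */
enum call_result {
	CALL_OK,        /* reference taken, inner store ran, reference dropped */
	CALL_ENODEV,    /* try-get failed cleanly while removal was in progress */
	CALL_TEXT_GONE, /* the wrapper's own code was already freed */
};

/* Stand-in for try_module_get(): fails once removal has begun. */
static bool fake_try_module_get(struct fake_module *m)
{
	if (m->state != MOD_LIVE)
		return false;
	m->refcnt++;
	return true;
}

/* Stand-in for module_put(). */
static void fake_module_put(struct fake_module *m)
{
	assert(m->refcnt > 0);
	m->refcnt--;
}

/* Start removal; refused while anybody holds a reference. */
static bool fake_begin_unload(struct fake_module *m)
{
	if (m->state != MOD_LIVE || m->refcnt != 0)
		return false;
	m->state = MOD_GOING;
	return true;
}

/* Finish removal; from here on the module's code no longer exists. */
static void fake_finish_unload(struct fake_module *m)
{
	assert(m->state == MOD_GOING);
	m->state = MOD_UNLOADED;
}

/* Stand-in for the generated module_<name>_store() wrapper. */
static enum call_result fake_wrapper_store(struct fake_module *m)
{
	/*
	 * The wrapper is part of the module. If the module is already
	 * gone, the check below never gets a chance to run.
	 */
	if (m->state == MOD_UNLOADED)
		return CALL_TEXT_GONE;
	if (!fake_try_module_get(m))
		return CALL_ENODEV;
	/* The real _name ## _store() call would go here. */
	fake_module_put(m);
	return CALL_OK;
}
```

Calling fake_wrapper_store() on a live module gives CALL_OK, and calling
it between fake_begin_unload() and fake_finish_unload() gives
CALL_ENODEV, which is the case the patch handles. Calling it after
fake_finish_unload() gives CALL_TEXT_GONE; in a real kernel that would
presumably be a jump into freed module memory, not a tidy error code.
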
thanks,
greg k-h