lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 24 Jun 2021 13:09:03 +0200
From:   Greg KH <gregkh@...uxfoundation.org>
To:     Luis Chamberlain <mcgrof@...nel.org>
Cc:     rafael@...nel.org, davem@...emloft.net, kuba@...nel.org,
        ast@...nel.org, andriin@...com, daniel@...earbox.net,
        atenart@...nel.org, alobakin@...me, weiwan@...gle.com,
        ap420073@...il.com, jeyu@...nel.org, ngupta@...are.org,
        sergey.senozhatsky.work@...il.com, minchan@...nel.org,
        axboe@...nel.dk, mbenes@...e.com, jpoimboe@...hat.com,
        tglx@...utronix.de, keescook@...omium.org, jikos@...nel.org,
        rostedt@...dmis.org, peterz@...radead.org,
        linux-block@...r.kernel.org, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4] sysfs: fix kobject refcount to address races with
 kobject removal

On Wed, Jun 23, 2021 at 02:50:07PM -0700, Luis Chamberlain wrote:
> It's possible today to have a device attribute read or store
> race against device removal. This is known to happen as follows:
> 
> write system call -->
>   ksys_write () -->
>     vfs_write() -->
>       __vfs_write() -->
>         kernfs_fop_write_iter() -->
>           sysfs_kf_write() -->
>             dev_attr_store() -->
>               null reference
> 
> This happens because the dev_attr->store() callback can be
> removed prior to its call, after dev_attr_store() was initiated.
> The null dereference is possible because the sysfs ops can be
> removed on module removal, for instance, when device_del() is
> called, and a sysfs read / store is not doing any kobject reference
> bumps either. This allows a read/store call to initiate, a
> device_del() to kick off, and then the read/store call can be
> gone by the time to execute it.
> 
> The sysfs filesystem is not doing any kobject reference bumps during a
> read / store ops to prevent this.
> 
> To fix this in a simplified way, just bump the kobject reference when
> we create a directory and remove it on directory removal.
> 
> The big unfortunate eye-sore is addressing the manual kobject reference
> assumption on the networking code, which leads me to believe we should
> end up replacing that eventually with another sort of check.
> 
> Suggested-by: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
> Signed-off-by: Luis Chamberlain <mcgrof@...nel.org>
> ---
> 
> This v4 moves to fixing the race condition on dev_attr_store() and
> dev_attr_read() to sysfs by bumping the kobject reference count
> on directory creation / deletion as suggested by Greg.

This looks good.

It's late in the development cycle, I'll hold off on adding this to my
tree until 5.14-rc1 is out because of:

> Unfortunately at least the networking core has a manual refcount
> assumption, which needs to be adjusted to account for this change.
> This should also mean there is runtime for other kobjects which may
> not be explored yet which may need fixing as well. We may want to
> change the check to something else on the networking front, but its
> not clear to me yet what to use.

That's crazy what networking is doing here, hopefully no one else is.
If they are, let's shake it out in linux-next to find the problems which
is why a good "soak" there is a good idea.

thanks for making this change and sticking with it!

Oh, and with this change, does your modprobe/rmmod crazy test now work?

greg k-h

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ