lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <18735.14834.899120.164099@notabene.brown>
Date:	Fri, 28 Nov 2008 11:23:14 +1100
From:	Neil Brown <neilb@...e.de>
To:	Al Viro <viro@...IV.linux.org.uk>
Cc:	Tejun Heo <teheo@...e.de>, linux-kernel@...r.kernel.org,
	linux-raid@...r.kernel.org, Doug Ledford <dledford@...hat.com>,
	Greg KH <greg@...ah.com>, Jens Axboe <jens.axboe@...cle.com>
Subject: Re: [PATCH 1/2] md: make devices disappear when they are no longer
	needed.

On Monday November 24, viro@...IV.linux.org.uk wrote:
> 
> I really think that it's much saner than trying to change the lifetime
> rules for gendisk, etc.

I'm not sure there is anything wrong with the lifetime rules for
gendisk....

In my mind a big issue is:
   What should happen if "open" is racing with "destroy device" ?

For 'sd', if you open before the decision has been made to tear down
the data structures, the open succeeds but you start getting EIO on
any IO.
If you open after the decision, you get ENODEV (or ENXIO?).
If you wait a little longer and some other device gets hot-plugged,
then the open succeeds on a different physical device.

For 'md' the situation is quite different, due to unfortunate legacy
interface design.
An open should never fail with ENODEV.
It might return a device on which read/write always fails and only
various 'ioctls' work, or it might return a device which is fully
working.

So we need to close the window in which ENODEV can be returned.

I think the code I wrote is the best way to close that window.  If we
detect that we are in the window during open, wait for the window to
close and retry, by returning ERESTARTSYS.


I've thought more about the possibility of the disk_release method
which is used to tear down the md data structures but I don't think it
would really answer the need.

A key point in time is when blk_unregister_region is called to
remove the mapping from minor number to the gendisk.
At any time before this, new references can be taken by an open.
At any time after this, any open will be directed through ->probe
which can do any necessary interlocking with ->open.

Currently we need to call blk_unregister_region before the final
put_disk otherwise a new reference could be taken while that put_disk
is running.
blk_unregister_region is normally called by unlink_gendisk called by
del_gendisk.
So we need to call del_gendisk before the final call to put_disk.
And once we have called del_gendisk we may as well clean up the md
specific stuff there and not bother with a ->disk_release method.

So I'm having trouble seeing what is wrong with my approach (other
than the fact it implements an unfortunate choice of user-space
interface, which we agree is a legacy requirement) or how anything
else could provide a better solution.

I can appreciate that there might be other issues with gendisk lifetime
(such as the fact that get_disk takes a reference to the module but
put_disk doesn't) but I suspect they are largely orthogonal to my
current issue.

Finally, I suspect that there is a problem with my patch as it stands
with respect to module reference counting.  I'll have to explore to
be sure exactly what is happening, but I strongly suspect it is
wrong.  Thank you for highlighting that issue for me.

NeilBrown
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ