Date:	Mon, 4 May 2015 17:30:25 +1000
From:	NeilBrown <neilb@...e.de>
To:	Al Viro <viro@...IV.linux.org.uk>
Cc:	Christoph Hellwig <hch@...radead.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [RFC][PATCHSET] non-recursive link_path_walk() and reducing
 stack footprint

On Mon, 4 May 2015 06:11:29 +0100 Al Viro <viro@...IV.linux.org.uk> wrote:

> On Fri, Apr 24, 2015 at 02:42:03PM +0100, Al Viro wrote:
> 
> > That avoids this spin_lock() on each absolute symlink at the price of extra
> > 32 bits in struct nameidata.  It looks like doing on-demand reallocation
> > of nd->stack is the right way to go anyway, so the pressure on nameidata size
> > is going to be weaker and that might be the right way to go...
> 
> OK, on-demand reallocation is done.  What I have right now is
> 	* flat MAXSYMLINKS 40, no matter what kind of nesting there might
> be.
> 	* purely iterative link_path_walk().
> 	* no damn nameidata on stack for generic_readlink()
> 	* stack footprint of the entire thing independent from the nesting
> depth, and about on par with "no symlinks at all" case in mainline.
> 	* some massage towards RCU follow_link done (in the end of queue),
> but quite a bit more work remains.
> 
> What I've got so far is in vfs.git#link_path_walk; I'm not too happy about
> posting a 70-chunk mailbomb, but it really will need review and testing.
> It survives xfstests and LTP with no regressions, but it will need
> serious profiling, etc., along with RTFS.  I tried to keep it in reasonably
> small pieces, but there's a lot of them ;-/
> 
> FWIW, I've a bit more reorganization plotted out, but it's not far from
> where we need to be for RCU follow_link.  Some notes:
> 	* I don't believe we want to pass flags to ->follow_link() - it's
> much simpler to give the damn thing NULL for dentry in RCU case.  In *all*
> cases where we might have a chance to get the symlink body without blocking
> we can do that by inode alone.  We obviously want to pass dentry and inode
> separately (and in case of fast symlinks we don't hit the filesystem at
> all), but that's it - flags isn't needed.
> 	* terminate_walk() should do bulk put_link().  So should the
> failure cases of complete_walk().  _Success_ of complete_walk() should
> be careful about legitimizing links - it *can* be called with one link
> on stack, and be followed by access to link body.  Yes, really - do_last()
> in O_CREAT case.
> 	* do_last(), lookup_last() and mountpoint_last() ought to
> have put_link() done when called on non-empty stack (thus turning the loops
> into something like
>                 while ((err = lookup_last(nd)) > 0) {
>                         err = trailing_symlink(nd);
>                         if (err)
>                                 break;
>                 }
> _After_ the point where they don't need to look at the last component of
> name, obviously.
> 	* I think we should leave terminate_walk() to callers in failure
> cases of walk_component() and handle_dots(), as well as get_link().  Makes
> life simpler in callers, actually.  I'll play with that a bit more.
> 	* it might make sense to add the second flag to walk_component(),
> in addition to LOOKUP_FOLLOW, meaning "do put_link() once you are done looking
> at the name".  In principle, it promises simpler logic with unlazy_walk(),
> but that's one area I'm still not entirely sure about.  Will need to
> experiment a bit...
> 	* nd->seq clobbering will need to be dealt with, as discussed upthread.
> 	* I _really_ hate your "let's use the LSB of struct page * to tell
> if we need to kunmap()" approach.  It's too damn ugly to live.  _And_ it's
> trivial to avoid - all we need is to have (non-lazy) page_follow_link_light()
> and page_symlink() to remove __GFP_HIGHMEM from inode->i_mapping before
> ever asking to allocate pages there.  That'll suffice, and it makes sense
> regardless of RCU work - that kmap/kunmap with potential for minutes in
> between (while waiting for stuck NFS server in the middle of symlink traversal)
> is simply wrong.
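
For readers who have not seen the trick criticized in the last point, "use the LSB of struct page * to tell if we need to kunmap()" is ordinary pointer tagging: an aligned pointer always has a zero low bit, so that bit can carry a flag. The userspace sketch below only illustrates the general technique. The names page_stub, cookie_pack(), cookie_page() and cookie_needs_unmap() are hypothetical, and the exact encoding used in the patch under discussion is not shown in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct page; any type aligned to 2+ bytes works. */
struct page_stub {
    int dummy;
};

/* Low bit of the cookie: set when the holder must unmap the page later. */
#define COOKIE_NEEDS_UNMAP ((uintptr_t)1)

/* Fold the flag into the pointer's (always zero) least significant bit. */
void *cookie_pack(struct page_stub *page, bool needs_unmap)
{
    uintptr_t bits = (uintptr_t)page;

    /* The trick relies on alignment leaving the low bit free. */
    assert((bits & COOKIE_NEEDS_UNMAP) == 0);
    return (void *)(bits | (needs_unmap ? COOKIE_NEEDS_UNMAP : 0));
}

/* Recover the real pointer by masking the flag bit off again. */
struct page_stub *cookie_page(void *cookie)
{
    return (struct page_stub *)((uintptr_t)cookie & ~COOKIE_NEEDS_UNMAP);
}

/* Test the flag bit. */
bool cookie_needs_unmap(void *cookie)
{
    return ((uintptr_t)cookie & COOKIE_NEEDS_UNMAP) != 0;
}
```

Every consumer has to remember to mask the cookie before dereferencing it, which is part of what makes the scheme fragile. The alternative proposed in the quoted text sidesteps the flag entirely: if the mapping never allocates highmem pages, no kunmap() decision has to be recorded.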


Thanks!
I'll have another look and see about adding what is needed for RCU symlink
support.

NeilBrown
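
As a reading aid for the quoted summary, here is a rough userspace sketch of two of the listed properties: a link stack that starts out embedded in the walk state and is reallocated on demand, and a flat limit of 40 links per lookup regardless of nesting. It is not the code in vfs.git#link_path_walk. The type and function names (walk_state, saved_link, push_link() and so on), the embedded depth of 2 and the return codes are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAXSYMLINKS 40     /* flat limit on links followed in one lookup */
#define EMBEDDED_DEPTH 2   /* hypothetical size of the inline stack */

/* Hypothetical return codes; the kernel would use -ELOOP and -ENOMEM. */
enum { WALK_OK = 0, WALK_ELOOP = -1, WALK_ENOMEM = -2 };

struct saved_link {
    const char *rest;      /* unwalked remainder of the enclosing name */
};

struct walk_state {
    struct saved_link *stack;                    /* internal[] until it grows */
    struct saved_link internal[EMBEDDED_DEPTH];
    unsigned depth;                              /* entries in use */
    unsigned capacity;                           /* entries available */
    unsigned total_links;                        /* links followed so far */
};

void walk_init(struct walk_state *ws)
{
    ws->stack = ws->internal;
    ws->depth = 0;
    ws->capacity = EMBEDDED_DEPTH;
    ws->total_links = 0;
}

/* Save the rest of the current name before descending into a symlink body. */
int push_link(struct walk_state *ws, const char *rest)
{
    assert(ws->depth <= ws->capacity);

    /* The limit counts every link followed, nested or not. */
    if (ws->total_links >= MAXSYMLINKS)
        return WALK_ELOOP;

    /* Grow only when the current stack is full. */
    if (ws->depth == ws->capacity) {
        unsigned new_cap = ws->capacity * 2;
        struct saved_link *p = malloc(new_cap * sizeof(*p));

        if (!p)
            return WALK_ENOMEM;
        memcpy(p, ws->stack, ws->depth * sizeof(*p));
        if (ws->stack != ws->internal)
            free(ws->stack);
        ws->stack = p;
        ws->capacity = new_cap;
    }

    ws->stack[ws->depth++].rest = rest;
    ws->total_links++;
    return WALK_OK;
}

/* Resume the enclosing name once a symlink body is fully walked. */
const char *pop_link(struct walk_state *ws)
{
    return ws->depth ? ws->stack[--ws->depth].rest : NULL;
}

void walk_release(struct walk_state *ws)
{
    if (ws->stack != ws->internal)
        free(ws->stack);
    walk_init(ws);
}
```

Because the saved state lives in an explicit array rather than in recursive call frames, the C stack usage of such a walk does not depend on how deeply the links nest; only the heap allocation grows, and only for lookups that actually nest beyond the embedded entries.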
