Message-ID: <20080603130538.GA28946@ZenIV.linux.org.uk>
Date: Tue, 3 Jun 2008 14:05:39 +0100
From: Al Viro <viro@...IV.linux.org.uk>
To: Miklos Szeredi <miklos@...redi.hu>
Cc: mtk.manpages@...glemail.com, drepper@...hat.com,
akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
linux-man@...r.kernel.org, linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH] utimensat() non-conformances and fixes [v3]
On Tue, Jun 03, 2008 at 02:16:33PM +0200, Miklos Szeredi wrote:
> > > > > I'm not sure of the correct way to get the required nameidata (to do a
> > > > > vfs_permission() call) from the file descriptor. Can you give me a
> > > > > tip there?
> > > >
> > > > Could you point me at the right way of doing this?
> > >
> > > You don't need nameidata for this at all. Just call permission() with
> > > a NULL nameidata.
> > >
> > > Ugly API? Yes, will be cleaned up if we manage to find some common
> > > ground with the VFS maintainers.
> >
> > As soon as I'm done with sysctls...
>
> Can't you just do that independently (for now just put a
> d_find_alias() in there and be done with it)? If you fix every piece
> of horrid code that you come across, it'll never be done...

There's not much left to do, actually... FWIW, solution goes like this:
* introduce structure on the classes of sysctls
(currently - root and per-network-namespace). Namely "X is parent of Y",
with "if task T sees Y, it also sees X" as defining property.
* when adding a sysctl table, find a "parent" one. Which is to say,
find the deepest node on its stem that already is present in one of the
tables from our class or its ancestor classes. That table will be our
parent and that node in it - attachment point.
* delay freeing the table headers; have them refcounted and instead
of unconditionally freeing the sucker on unregistration just drop the refcount.
Now we can keep a pair (reference to header, pointer to ctl table entry)
as long as we hold refcount on header. It won't affect unregistration
in any way. And at any point we can try to acquire "active" (use) reference
to header. If that succeeds, we know that
+ unregistration hadn't been started
+ unregistration won't be finished until we unuse the sucker
+ table entry is alive and will stay alive until then.
So we can hold references to those puppies from inodes under /proc/sys
without blocking unregistration, etc.
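To make the two kinds of reference concrete, here is a rough single-threaded userspace sketch of the header lifetime described above. It is only a model under stated assumptions: struct table_header and every header_*() name are hypothetical and are not the kernel's API, there is no locking, and where real unregistration would wait for the active users to go away, this model lets the last header_unuse() complete it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical single-threaded model; none of these names are kernel API. */
struct table_header {
    int count;            /* lifetime references; memory is freed at 0 */
    int used;             /* active ("use") references */
    bool unregistering;   /* unregistration has been started */
    const char **entries; /* stands in for the ctl table; NULL once gone */
};

/* Registration: the registration itself owns the first reference. */
struct table_header *header_new(const char **entries)
{
    struct table_header *h = calloc(1, sizeof(*h));
    if (!h)
        return NULL;
    h->count = 1;
    h->entries = entries;
    return h;
}

/* Take a lifetime reference (e.g. from an inode under /proc/sys). */
void header_get(struct table_header *h)
{
    h->count++;
}

/* Drop a lifetime reference; returns true if the header was freed. */
bool header_put(struct table_header *h)
{
    assert(h->count > 0);
    if (--h->count > 0)
        return false;
    free(h);
    return true;
}

/* Unregistration is complete: entries go away, registration ref is dropped. */
static void finish_unregister(struct table_header *h)
{
    h->entries = NULL;
    header_put(h);
}

/* Try to take an active reference; fails once unregistration has started. */
bool header_use(struct table_header *h)
{
    if (h->unregistering)
        return false;
    h->used++;
    return true;
}

/* Drop an active reference; the last one lets a pending unregistration finish. */
void header_unuse(struct table_header *h)
{
    assert(h->used > 0);
    if (--h->used == 0 && h->unregistering)
        finish_unregister(h);
}

/* Start unregistration; it finishes at once if nobody is using the table. */
void header_unregister(struct table_header *h)
{
    h->unregistering = true;
    if (h->used == 0)
        finish_unregister(h);
}
```

A holder of a lifetime reference can keep its (header, entry) pair for as long as it likes; it only has to wrap each actual access to the entry in header_use()/header_unuse() and give up if header_use() fails.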
What's more, we can associate such a pair with each node in the sysctl tree.
For non-directories that's obvious. For directories, take the tree such
that directory belongs to tree \setminus parent of tree (that is, to the
nodes of the tree that are not already present in its parent).
That's pretty much it. Filesystem side is simple - we keep a pointer to
class of tree responsible for a node (see directly above) in dentry.
And ->d_compare() checks that class of candidate match should be visible
for task doing the lookup. ->lookup() tries finding an entry with requested
name in sysctl table (found by directory inode) and in case of miss it goes
through the list of tables attached at that node, searching in those that
ought to be visible to us.
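The visibility rule and the lookup order above can be sketched in userspace roughly as follows. Everything here is an assumption made for illustration: struct sysctl_class, struct table, class_visible() and dir_lookup() are hypothetical names, a task is modelled as belonging to one class and seeing that class plus its ancestors, and a table is reduced to a NULL-terminated list of names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a sysctl class; "parent" is NULL for the root class. */
struct sysctl_class {
    const struct sysctl_class *parent;
};

/* Hypothetical model of a sysctl table attached at some directory. */
struct table {
    const struct sysctl_class *cls; /* class this table belongs to */
    const char **names;             /* NULL-terminated entry names */
    const struct table *next;       /* next table attached at the same node */
};

/*
 * "If task T sees Y, it also sees X" whenever X is the parent of Y:
 * a class is visible if it is the task's class or one of its ancestors.
 */
bool class_visible(const struct sysctl_class *c,
                   const struct sysctl_class *task_cls)
{
    for (const struct sysctl_class *p = task_cls; p; p = p->parent)
        if (p == c)
            return true;
    return false;
}

/* Does this table contain an entry with the given name? */
bool table_has(const struct table *t, const char *name)
{
    for (size_t i = 0; t->names[i]; i++)
        if (strcmp(t->names[i], name) == 0)
            return true;
    return false;
}

/*
 * Look the name up in the directory's own table first; on a miss, walk
 * the tables attached at that directory, skipping those whose class the
 * task should not see. Returns the table holding the entry, or NULL.
 */
const struct table *dir_lookup(const struct table *dir,
                               const struct table *attached,
                               const char *name,
                               const struct sysctl_class *task_cls)
{
    assert(name != NULL);
    if (table_has(dir, name))
        return dir;
    for (const struct table *t = attached; t; t = t->next)
        if (class_visible(t->cls, task_cls) && table_has(t, name))
            return t;
    return NULL;
}
```

With a root class and two sibling per-namespace classes, a task in one namespace finds the root entries and its own namespace's entries, and gets a miss for names that exist only in the sibling's table.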
As the result, we have direct access to sysctl table entry right from inode,
maintain these references across lookups without going through the contortions
done by current code and we do *NOT* use the same dentry for flipping between
unrelated sysctl nodes with different visibility...