[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20101020173808.GA15056@fieldses.org>
Date: Wed, 20 Oct 2010 13:38:08 -0400
From: "J. Bruce Fields" <bfields@...ldses.org>
To: Al Viro <viro@...IV.linux.org.uk>
Cc: Matthew Wilcox <matthew@....cx>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Eric Paris <eparis@...hat.com>, linux-kernel@...r.kernel.org,
linux-security-module@...r.kernel.org,
linux-fsdevel@...r.kernel.org, hch@...radead.org, zohar@...ibm.com,
warthog9@...nel.org, david@...morbit.com, jmorris@...ei.org,
kyle@...artin.ca, hpa@...or.com, akpm@...ux-foundation.org,
mingo@...e.hu
Subject: Re: [PATCH 1/3] IMA: move read/write counters into struct inode
On Wed, Oct 20, 2010 at 04:15:04AM +0100, Al Viro wrote:
> On Tue, Oct 19, 2010 at 01:11:33PM -0600, Matthew Wilcox wrote:
> > Hm. Sounds like the same question that the file leases code needs
> > answered. The important difference is that the leases code can just
> > refuse to set a lease on inodes with multiple dentries.
> >
> > While my mind's on it ... Al, is this code even close to correct?
> >
> > if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0))
> > goto out;
> > if ((arg == F_WRLCK)
> > && ((atomic_read(&dentry->d_count) > 1)
> > || (atomic_read(&inode->i_count) > 1)))
> > goto out;
>
> No. This is complete junk; note that e.g. ls -lR will disrupt it, since
> lstat(2) will bump dentry refcount. The first part is more or less OK;
> the second makes no sense.
It may be nobody cares about false positives--if leases are something
provided as a caching optimization, then it's OK to deny them more often
than strictly necessary.
(In fact, if you're using the write lease to allow somebody else to
cache writes, then you want the stat to break the lease so it can get
can trigger a cache flush and get an updated mtime.)
Not that I'll claim those checks are correct.
> What is it trying to do? Note that the first part also doesn't make a lot
> of sense, since you could be acquiring a write reference *right* *now*,
> just as that check passes. And you could finish getting it before you get
> to do anything else in generic_setlease().
Yeah, that's a race.
The lease code is broken. On my list of problems:
- Leases are broken only by conflicting opens; but both nfsv4
delegations and (I'm told) Windows op locks actually require
that read leases be broken on any operation that changes file
metadata--including unlink, link, rename, chmod, and chown.
- The internal kernel api used for lease-breaking is inherently
racy as long as it consists of a single break_lease() call.
(We probably need deny_leases() and undeny_leases(), or
something.)
I'd be happy just to have read leases that worked correctly; write
leases (which, at least for the NFSv4 server's purposes, also need to be
broken on *access* to metadata) are worse.
I have patches that attempted to replace the current mechanism for
F_RDLCK leases with an i_leasecount modeled on i_writecount. (So
positive counts the number of leases held, negative counts operations
temporarily disabling leases.) They were never completely correct.
I also wasn't sure what sort of requirements I should meet to avoid
noticeable scalability problems (how much additional space in the inode
could I get away with? What about additional locks, or just writes to
inode fields, on read opens?).
--b.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists