Message-ID: <20080326033738.GX103491721@sgi.com>
Date: Wed, 26 Mar 2008 14:37:38 +1100
From: David Chinner <dgc@....com>
To: "Josef 'Jeff' Sipek" <jeffpc@...efsipek.net>
Cc: NeilBrown <neilb@...e.de>,
"J. Bruce Fields" <bfields@...ldses.org>, xfs@....sgi.com,
Adam Schrotenboer <adam@...00.com>,
Jesper Juhl <jesper.juhl@...il.com>,
Trond Myklebust <trond.myklebust@...app.com>,
linux-kernel@...r.kernel.org, linux-nfs@...r.kernel.org,
Thomas Daniel <tdaniel@...00.com>,
Frederic Revenu <frevenu@...00.com>,
Jeff Doan <jdoan@...00.com>
Subject: Re: [opensuse] nfs_update_inode: inode X mode changed, Y to Z
On Tue, Mar 25, 2008 at 06:13:21PM -0400, Josef 'Jeff' Sipek wrote:
> On Wed, Mar 26, 2008 at 08:38:22AM +1100, NeilBrown wrote:
> ...
> > However you still need to do something about the generation number. It
> > must be set to something.
.....
> > Even better would be to store that 'next generation number' in the
> > superblock so there would be even less risk of the 'random' generation
> > producing repeats.
> > This is what ext3 does. It doesn't dynamically allocate inodes,
> > but it doesn't want to pay the cost of reading an old inode from
> > storage just to see what the generation number is. So it has
> > a number in the superblock which is incremented on each inode allocation
> > and is used as the generation number.
>
> Something tells me that the SGI folks might not be all too happy with the
> in-sb number...
.....
> Perhaps a per-ag variable would be better,
/me goes back to the bug from last year about stable inode/gen numbers
for a HSM.
dgc> Right, except the last thing we want is yet more global state needing to
dgc> be updated in inode allocation. The best way to do this is a max generation
dgc> number per AG (held in the AGI) so that it can be updated at the same time
dgc> inodes are freed and not cause additional serialisation.
Which was soundly rejected by the HSM folk because it wraps at 4 billion
inode create/unlink cycles in an AG rather than per inode. The only thing
they were happy with was the old behaviour and so they now mount their
filesystems with ikeep. At that point the issue was dropped on the floor;
the NFS side of things apparently wasn't causing any problems so we didn't
consider it urgent to fix....
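
To make the per-AG proposal and the HSM objection concrete, here is a minimal
sketch of one reading of it. The structure and function names are hypothetical
and are not the real XFS AGI or inode code; it assumes the generation is bumped
when an inode is freed and that a newly initialised inode chunk is seeded from
the per-AG maximum.

```c
#include <stdint.h>

/* Hypothetical stand-ins, not the real XFS on-disk structures. */
struct ag_info {
    uint32_t max_gen;   /* highest generation seen in this AG */
};

struct disk_inode {
    uint32_t gen;       /* per-inode generation number */
};

/*
 * On free: fold the inode's next generation into the per-AG maximum.
 * The AG header is already being modified when an inode is freed, so
 * this adds no new global state to update.
 */
void ag_inode_free(struct ag_info *ag, const struct disk_inode *ip)
{
    uint32_t next = ip->gen + 1;    /* 32-bit arithmetic, wraps to 0 */

    if (next > ag->max_gen)
        ag->max_gen = next;
}

/* When a fresh inode chunk is initialised, seed from the AG maximum. */
void ag_inode_init(const struct ag_info *ag, struct disk_inode *ip)
{
    ip->gen = ag->max_gen;
}
```

Because every inode in the AG feeds the same 32-bit value, the counter advances
with create/unlink activity anywhere in the AG, not just on one inode, which is
the wrap behaviour the HSM folk objected to.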
Given this state of affairs (i.e. HSM using ikeep), I guess we can do
anything we want for the noikeep case. I'll cook up a patch that does
something similar to ext3 generation numbers for the initial seeding....
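
The patch itself is not part of this message. As a rough illustration of the
ext3-style idea Neil describes above (a filesystem-wide number that is
incremented on each inode allocation and used as the generation), a sketch
might look like the following. All names are hypothetical, and real kernel
code would need locking around the increment.

```c
#include <stdint.h>

/* Hypothetical filesystem-wide state; field names are illustrative only. */
struct sb_info {
    uint32_t next_generation;   /* assumed to be seeded once, elsewhere */
};

struct new_inode {
    uint32_t gen;
};

/*
 * Each allocation takes the current value and advances it, so no old
 * inode has to be read from storage to find a generation number.
 * A real implementation would serialise this update.
 */
void seed_generation(struct sb_info *sb, struct new_inode *ip)
{
    ip->gen = sb->next_generation++;
}
```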
> but I remember reading that parallelizing updates
> to some inode count variable (I forget which) in the superblock
> \cite{dchinner-ols2006} led to a rather big improvement.
That was for in memory counters not on disk, and the problem really was
free block counts rather than free inode counts. Yes, I converted the
inode counters at the same time, but that wasn't the limiting factor.
Updates to the on disk superblock, OTOH, are a limiting factor and
that is what the lazy superblock counter modifications solve....
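
For readers unfamiliar with the in-memory side, the general technique of
splitting one contended counter into per-CPU slots can be sketched as below.
This is a generic illustration only, not the XFS implementation; NR_SLOTS and
the function names are made up for the example.

```c
#include <stdint.h>

#define NR_SLOTS 4  /* stand-in for the number of CPUs */

struct sharded_counter {
    int64_t slot[NR_SLOTS];     /* each CPU modifies only its own slot */
};

/* Fast path: touch only the caller's slot, so CPUs do not contend. */
void counter_add(struct sharded_counter *c, unsigned cpu, int64_t delta)
{
    c->slot[cpu % NR_SLOTS] += delta;
}

/* Slow path: fold all slots together when an exact total is needed. */
int64_t counter_sum(const struct sharded_counter *c)
{
    int64_t total = 0;

    for (unsigned i = 0; i < NR_SLOTS; i++)
        total += c->slot[i];
    return total;
}
```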
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group