Message-ID: <20100102192337.GB30016@basil.fritz.box>
Date: Sat, 2 Jan 2010 20:23:37 +0100
From: Andi Kleen <andi@...stfloor.org>
To: Frederic Weisbecker <fweisbec@...il.com>
Cc: Andi Kleen <andi@...stfloor.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>,
Christian Kujau <lists@...dbynature.de>,
Alexander Beregalov <a.beregalov@...il.com>,
Chris Mason <chris.mason@...cle.com>,
Ingo Molnar <mingo@...e.hu>
Subject: Re: reiserfs broken in 2.6.32 was Re: [GIT PULL] reiserfs fixes
On Sat, Jan 02, 2010 at 08:02:15PM +0100, Frederic Weisbecker wrote:
> On Sat, Jan 02, 2010 at 06:43:12PM +0100, Andi Kleen wrote:
> > > I only have reiserfs partitions in my laptop and my testbox,
> > > nothing else. And that because I'm now maintaining it de facto.
> >
> > AFAIK it's widely used in SUSE installations. It was the default
> > for a long time.
> >
> > And right now as in 2.6.32 it's in a state of
> > "may randomly explode/deadlock". And no clear path out of it. Not good.
> >
> > I am very concerned about destabilizing a widely used file system
> > like this. This has the potential to really hurt users.
>
>
> I understand your worries. And I've been very cautious with that,
> waiting for three cycles before requesting an upstream merge. I did
> it because the isolated tree model did not scale anymore.
>
> Now that it's upstream, I get more testing and I expect that, in
> the end of this cycle, I get most of these issues reported and
> fixed.

Will you?

How many users' systems could it break by then?

>
> Serious users with serious data won't ship 2.6.33, they will ship
> a further stable version 2.6.33.x (if they haven't converted their
> filesystems already).
> And at this time, things should be 99% fixed.

That seems very risky. For some rarely used, obscure subsystems
that might work, but for a widely used file system that holds
people's $HOME?

I don't think seriously destabilizing that for a potentially long
time is a good idea. There's the potential to break
a lot of porcelain.

Probably you could take an ext3/ext4-like approach: start
with a "reiserfs3.5" copy, do the work there, and then
merge back once things work and have been reasonably verified
by code review.

> That's the theory. Fitting into this strict scheme brings performance
> regressions. The bkl is a spinlock, it disables preemption, it is
> relaxed on sleep, and doesn't have locking dependencies. Moreover
> it's not a lock but a simulation of a NO_PREEMPT UP flow, with all
> the fixup guardians that come with (fixup if we schedule, as
> scheduling brings races).
>
> From the conversion a mutex is born. Even though we have
> adaptive spinning, we don't match spinlock performance
> as it's not a pure optimized looping fast path, and it may
> actually just sleep.
Fix the adaptive spinlock then?
>
> The bkl is relaxed only when we sleep. Now simulating that with
> a mutex that gets explicitly relaxed is not the same thing as
> we need to relax the lock each time we _might_ sleep. It means
> we relax more and that brings performance regressions.
At least in the cases where the decision is in reiserfs code
directly you could predict it by using need_resched(), couldn't you?
That might not be 100% accurate, but good enough.
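
To make the need_resched() idea concrete, here is a rough userspace
sketch, not reiserfs code. Everything in it is an assumption made for
illustration: fs_lock stands in for the reiserfs write lock,
need_resched_stub() stands in for the kernel's need_resched(), and
cond_relax_fs_lock() is a hypothetical helper. The point is only that
the lock is dropped when a reschedule is actually pending, rather than
at every place that might sleep.

```c
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>

/* Hypothetical stand-in for the reiserfs write lock (a mutex after
 * the BKL conversion). */
static pthread_mutex_t fs_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for the kernel's need_resched(); userspace
 * has no such call, so a plain flag simulates it here. */
static bool resched_pending;

static bool need_resched_stub(void)
{
    return resched_pending;
}

/* Counts how often the lock was really dropped. */
static unsigned long relax_count;

/*
 * Must be called with fs_lock held. Drops the lock only if a
 * reschedule is pending, yields, then takes the lock again.
 * Returns true if the lock was dropped and reacquired.
 */
static bool cond_relax_fs_lock(void)
{
    if (!need_resched_stub())
        return false;            /* fast path: keep holding the lock */

    relax_count++;               /* updated while still holding the lock */
    pthread_mutex_unlock(&fs_lock);
    sched_yield();               /* let someone else run */
    resched_pending = false;     /* simulated: the scheduler clears the flag */
    pthread_mutex_lock(&fs_lock);
    return true;
}
```

A caller would take fs_lock, do its work, and call
cond_relax_fs_lock() at the points where the converted code currently
drops the lock unconditionally. As with the BKL, anything protected by
the lock has to be revalidated whenever the helper returns true.
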
> That said, if the general opinion is in favour of unmerging
> the bkl removal changes in reiserfs. Then please do.

To me it seems too aggressive at this point.

It would be different if it were just a case of fixing a few known
bugs, but if you're not even sure how many problems are left ...

Perhaps do the reiserfs35 variant?

> Just to express my point of view, as my primary goal is not
> to fix reiserfs but the kernel: If you are afraid of such
> changes, your kernel will just become mildewed by the time.

Better some mildew than a release that is seriously broken for
enough people (although I have my doubts that's the right metaphor
for the BKL anyway).

Having stable releases is an important part of
getting enough testers (we already have too few). And
if we start breaking their $HOMEs they might become
even fewer.

-Andi
--
ak@...ux.intel.com -- Speaking for myself only.