[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LFD.1.00.0801080902460.3148@woody.linux-foundation.org>
Date: Tue, 8 Jan 2008 09:11:48 -0800 (PST)
From: Linus Torvalds <torvalds@...ux-foundation.org>
To: Stefan Richter <stefanr@...6.in-berlin.de>
cc: Matthew Wilcox <matthew@....cx>, Valdis.Kletnieks@...edu,
Willy Tarreau <w@....eu>, Adrian Bunk <bunk@...nel.org>,
James Bottomley <James.Bottomley@...senPartnership.com>,
Ingo Molnar <mingo@...e.hu>,
Peter Osterlund <petero2@...ia.com>,
linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Al Viro <viro@....linux.org.uk>
Subject: Re: [patch] scsi: revert "[SCSI] Get rid of scsi_cmnd->done"
On Tue, 8 Jan 2008, Stefan Richter wrote:
> Matthew Wilcox wrote:
> > So you're saying that you can't find reliable ways to reproduce problems
> > on demand? Those are some of the lower quality bug reports,
>
> Or those are the more difficult problems.
Indeed. If it's some race condition, or dependent on memory pressure at
just the right time, or a use-after-free that corrupts some memory that
normally nobody will even notice (it's freed, after all, and not
necessarily re-allocated), reproduction can be really very hard.
It's happily not exactly *common*, but it's certainly not unheard of
either, when you need to run some specific workload for hours to trigger
the bug - and then when it doesn't happen, you have to ask yourself: "was
I just lunky, punk?"
Some of those things also go away magically between kernel versions or
subtly different configurations. A use-after-free problem might be obvious
in one config, but then another configuration might change the size of a
structure, and suddenly the two kmalloc's that used to be in the same slab
(and made the problem more visible) end up in different slabs, and now you
suddenly cannot reproduce it with that particular load at all any more!
These things *are* fairly rare (most bugs by _far_ are of the trivial
stupid kind), but some of those things can stay around for a long time,
and it can take months of different people reporting similar problems
until somebody finally puts two and two together and sees the pattern.
When we get a good bug-reporter that is willing and able to reproduce and
test kernels, that's wonderful, but when we get some "background noise" of
bad bug-reports, that's usually good too - even if it's good only in the
long run (ie sometimes we just have to accept that the bug-report didn't
contain enough information for us to really do anythign about it, and just
let it be - and hope that future events will clarify things)
Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists