Message-ID: <20100615070244.GD6727@basil.fritz.box>
Date:	Tue, 15 Jun 2010 09:02:45 +0200
From:	Andi Kleen <andi@...stfloor.org>
To:	Dave Chinner <david@...morbit.com>
Cc:	Andi Kleen <andi@...stfloor.org>, xfs@....sgi.com,
	akpm@...ux-foundation.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] [16/23] XFS: Fix gcc 4.6 set but not read and unused
	statement warnings

> We find out about corrupted filesystems all the time from users
> sending mail to the list. Even if we did panic by default on
> corruption events, kerneloops.org is *useless* for reporting them
> because finding out about a corruption is only the very first step
> of what is usually a long and involved process that requires user
> interaction to gather information necessary to find the cause of the
> corruption.

The idea behind kerneloops.org is that any single report can
always be a flake (broken memory, broken hardware, a flipped bit,
whatever).

An error becomes important and interesting when there are multiple
occurrences of it in the field.
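
As a rough illustration of that filtering idea (a sketch only, not how
kerneloops.org is actually implemented; the signature strings and the
function name recurring_signatures are made up for the example), one
could count identical report signatures and keep only those seen more
than once:

```python
from collections import Counter


def recurring_signatures(reports, min_count=2):
    """Return (signature, count) pairs seen at least min_count times,
    most frequent first. Single reports are treated as possible flakes."""
    counts = Counter(reports)
    return [(sig, n) for sig, n in counts.most_common() if n >= min_count]


# Hypothetical report signatures, invented purely for illustration.
reports = [
    "BUG at fs/example/a.c:10",
    "BUG at fs/example/b.c:20",
    "BUG at fs/example/a.c:10",
    "WARNING at fs/example/c.c:30",
    "BUG at fs/example/a.c:10",
]

print(recurring_signatures(reports))
```

With this input only the signature reported three times survives; the
two one-off reports are dropped as potential flakes.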
> 
> Besides, if we _really_ want the machine to panic on corruption,

BUG_ON does not normally panic.

> then we configure the machine specifically for it via setting the
> relevant corruption type bit in /proc/sys/fs/xfs/panic_mask. This is
> generally only used when a developer asks a user to set it to get
> kernel crash dumps triggered when a corruption event occurs so we
> can do remote, offline analysis of the failure.

Especially on desktop-class systems without ECC memory, that
means you'll spend at least some time on errors which are
simply bit flips.

> > That's standard Linux kernel development
> > practice.  Maybe XFS should catch up on that.
> 
> I find this really amusing because linux filesystems have, over the

This really has nothing to do with file systems; it's general
practice for everything (well, except XFS).

> last few years, implemented a simpler version of XFS's way of
> dealing with corruption events(*). Perhaps you should catch up
> with the state of the art before throwing rocks, Andi....

I suspect you miss quite a lot of valuable information from
your user base by not supporting kerneloops.org. On the other
hand, supporting it would likely also save you from spending
time on flakes.

That said, you don't need BUG_ON to support it (WARN etc. work
too); it's just the easiest way.

-Andi
-- 
ak@...ux.intel.com -- Speaking for myself only.