Message-ID: <20100922160705.GK26290@redhat.com>
Date:	Wed, 22 Sep 2010 12:07:05 -0400
From:	Don Zickus <dzickus@...hat.com>
To:	Andi Kleen <andi@...stfloor.org>
Cc:	Huang Ying <ying.huang@...el.com>, Ingo Molnar <mingo@...e.hu>,
	"H. Peter Anvin" <hpa@...or.com>, linux-kernel@...r.kernel.org
Subject: Re: [RFC 1/6] x86, NMI, Add symbol definition for NMI magic constants

On Wed, Sep 22, 2010 at 12:19:16AM +0200, Andi Kleen wrote:
> 
> >
> > I guess adding either another knob to override the hardware error option
> > or tying it in with the panic_on_unknown_error option might make me more
> > comfortable.  That way enterprise customers can always just enable it by
> > default and desktop users (for now) could have it off.
> 
> Anything that needs explicit enabling is a bad idea, that
> would lead to a lot of users running in "corrupt my data" mode.

I know.  But as I said earlier in my emails, I am trying to figure out how
to deal with the fallout from unknown NMIs turning into panics.  Today
people see unknown NMIs.  They may or may not be corrupting data.  They
usually file a bug.  Currently it is hard for me to diagnose the problem.
Usually the old 'upgrade your bios/firmware' does the trick.  Sometimes it
doesn't and people feel like their machines still run fine.  So they
ignore it (for good or for bad).

Turning unknown NMIs into panics would break their current setup without
much gain.  So I was trying to propose something temporary until we
could get a better infrastructure to isolate the source and provide better
info on what to do.

I agree with you that, in the long term, unknown NMIs should be panics.  I
just get nervous about doing that now from a support perspective.

> 
> The code currently uses the presence of a HEST error table
> to detect a server. HEST should be only available on servers.
> 
> On servers at least panic should be default.

Ok.  That's fine.  But then what?  What does a developer do with that
panic?  There's no useful info.  That is sorta my problem.  Then again I
do not know much about HEST.

Cheers,
Don
