Message-ID: <20210820144314.GA1622759@agluck-desk2.amr.corp.intel.com>
Date:   Fri, 20 Aug 2021 07:43:14 -0700
From:   "Luck, Tony" <tony.luck@...el.com>
To:     Borislav Petkov <bp@...en8.de>
Cc:     x86@...nel.org, linux-edac@...r.kernel.org,
        linux-kernel@...r.kernel.org,
        Sumanth Kamatala <skamatala@...iper.net>
Subject: Re: [PATCH] x86/mce/dev-mcelog: Call mce_register_decode_chain()
 much earlier

On Fri, Aug 20, 2021 at 02:28:45PM +0200, Borislav Petkov wrote:
> On Thu, Aug 19, 2021 at 03:44:52PM -0700, Tony Luck wrote:
> > which made sure that the logs were not lost completely by printing
> > to the console. But parsing console logs is error prone. Users
> > of /dev/mcelog should expect to find any early errors logged to
> > standard places.
> 
> Yes, and for that matter, *all* consumers which register on the decoding
> chain should get a chance to look at those records...
> 
> > Split the initialization code in dev-mcelog.c into:
> > 1) an "early" part that registers for mce notifications. Call this
> > directly from mcheck_init() because early_initcall() is still too late.
> > This allocation is too early for kzalloc() so use memblock_alloc().
> > 2) "late" part that registers the /dev/mcelog character device.
> 
> ... but this looks like a hack to me: why aren't we adding those early
> records to the gen_pool and kick the work to consume them *only* *after*
> all consumers have been registered properly and everything is up and
> running?

How can the kernel tell that all consumers have registered? Is there
some new kernel crystal ball functionality that can predict that an
EDAC driver module is going to be loaded at some point in the future
when user space is up and running? :-)

I think the best we could do would be to set a timer for some point
far enough out (one minute? two minutes?) to give modules a chance
to load. But this seems even more hacky ... I have no idea how much
time is enough. In this particular case we know that the system
crashed before ... maybe the file systems are going to need an
fsck(8) before modules are loaded?
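
To make the ordering problem concrete, here is a rough userspace
sketch (not kernel code, and not the patch). Every name in it
(log_early, register_consumer, drain_pending, the two sample
consumers) is hypothetical and only models the idea of queueing
early records and handing them to whoever is registered when the
queue is drained. The status value is arbitrary.

```c
/* Userspace model only; none of these names are kernel APIs. */
#include <stddef.h>

#define MAX_RECORDS   8
#define MAX_CONSUMERS 4

struct record {
    int bank;
    unsigned long long status;
};

typedef void (*consumer_fn)(const struct record *rec);

static struct record pending[MAX_RECORDS];
static size_t n_pending;
static consumer_fn chain[MAX_CONSUMERS];
static size_t n_consumers;

/* Hypothetical stand-ins for an early and a late consumer. */
int seen_by_mcelog;
int seen_by_edac;

void mcelog_consumer(const struct record *rec)
{
    (void)rec;
    seen_by_mcelog++;
}

void edac_consumer(const struct record *rec)
{
    (void)rec;
    seen_by_edac++;
}

/* Queue a record found early in boot, before any consumer exists. */
int log_early(int bank, unsigned long long status)
{
    if (n_pending == MAX_RECORDS)
        return -1;
    pending[n_pending].bank = bank;
    pending[n_pending].status = status;
    n_pending++;
    return 0;
}

/* Add a consumer to the chain; it only sees records drained later. */
int register_consumer(consumer_fn fn)
{
    if (n_consumers == MAX_CONSUMERS)
        return -1;
    chain[n_consumers++] = fn;
    return 0;
}

/*
 * Deliver every pending record to the consumers registered right
 * now, then empty the queue. Returns the number of deliveries.
 */
size_t drain_pending(void)
{
    size_t delivered = 0;

    for (size_t i = 0; i < n_pending; i++) {
        for (size_t j = 0; j < n_consumers; j++) {
            chain[j](&pending[i]);
            delivered++;
        }
    }
    n_pending = 0;
    return delivered;
}
```

If one record is logged, mcelog_consumer registers, the queue is
drained, and only then edac_consumer registers, the second consumer
never sees the record. Whatever triggers the drain has to guess when
registration is finished.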

-Tony
