Message-ID: <CA+8MBbK5J1qWqTZjC6nHsVbqk05t0yF1F7d-_0PQpvBQBXgO1w@mail.gmail.com>
Date:	Thu, 1 Nov 2012 10:25:23 -0700
From:	Tony Luck <tony.luck@...el.com>
To:	Mauro Carvalho Chehab <mchehab@...hat.com>
Cc:	Borislav Petkov <bp@...en8.de>,
	Linux Edac Mailing List <linux-edac@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [RFC EDAC/GHES] edac: lock module owner to avoid error report conflicts

On Thu, Nov 1, 2012 at 4:47 AM, Mauro Carvalho Chehab
<mchehab@...hat.com> wrote:
> Take a look at arch/x86/kernel/cpu/mcheck/mce-apei.c:
>
>         void apei_mce_report_mem_error(int corrected, struct cper_sec_mem_err *mem_err)
>         {
>                 struct mce m;
>
>                 /* Only corrected MC is reported */
>                 if (!corrected || !(mem_err->validation_bits &
>                                         CPER_MEM_VALID_PHYSICAL_ADDRESS))
>                         return;
>
>                 mce_setup(&m);
>                 m.bank = 1;
>                 /* Fake a memory read corrected error with unknown channel */
>                 m.status = MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_ADDRV | 0x9f;
>                 m.addr = mem_err->physical_addr;
>                 mce_log(&m);
>                 mce_notify_irq();
>         }
>
> Bank information there is fake; status is fake. Only addr is really filled
> there; it works only for corrected errors.

This went in like this to help out the Westmere-EX processors, which
didn't fill in MCi_ADDR for corrected errors. APEI could get the
address from some platform CSRs, and reporting it via /dev/mcelog
meant that predictive analysis in mcelog(8) would work on these machines.
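
For readers following the quoted function, here is a minimal sketch of what
the faked status word encodes. The bit positions are the IA32_MCi_STATUS
ones used by the kernel's MCI_STATUS_* macros; the helper names
(fake_corrected_status, mca_error_code, and so on) are made up for this
illustration and are not kernel functions. The low 16 bits follow the
memory controller error layout 000F 0000 1MMM CCCC, so 0x9f reads as
MMM = 001 (memory read) and CCCC = 1111 (channel not specified).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IA32_MCi_STATUS bit positions, matching the kernel's MCI_STATUS_* macros. */
#define MCI_STATUS_VAL   (1ULL << 63)  /* status register contents are valid */
#define MCI_STATUS_UC    (1ULL << 61)  /* error was uncorrected */
#define MCI_STATUS_EN    (1ULL << 60)  /* error reporting was enabled */
#define MCI_STATUS_ADDRV (1ULL << 58)  /* MCi_ADDR holds a valid address */

/* Hypothetical helper: the status value built in apei_mce_report_mem_error(). */
static uint64_t fake_corrected_status(void)
{
    return MCI_STATUS_VAL | MCI_STATUS_EN | MCI_STATUS_ADDRV | 0x9f;
}

/* The MCA error code lives in the low 16 bits of the status word. */
static unsigned mca_error_code(uint64_t status)
{
    return (unsigned)(status & 0xffff);
}

/* Memory controller errors: 000F 0000 1MMM CCCC (F, bit 12, is ignored). */
static bool is_memory_controller_error(unsigned code)
{
    return (code & 0xef80) == 0x0080;
}

/* MMM field: 0 generic, 1 memory read, 2 memory write, 3 addr/cmd, 4 scrub. */
static unsigned mem_transaction_type(unsigned code)
{
    return (code >> 4) & 0x7;
}

/* CCCC field: channel number, or 0xf when the channel is not specified. */
static unsigned mem_channel(unsigned code)
{
    return code & 0xf;
}

/* Sanity check: the fake record is a corrected memory read, channel unknown. */
void check_fake_status(void)
{
    uint64_t status = fake_corrected_status();
    unsigned code = mca_error_code(status);

    assert(!(status & MCI_STATUS_UC));
    assert(is_memory_controller_error(code));
    assert(mem_transaction_type(code) == 1);
    assert(mem_channel(code) == 0xf);
}
```

Calling check_fake_status() passes silently, which matches the comment in
the quoted code: everything except the address is a fixed, synthetic value.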

I don't think we can rip it out yet ... not until those machines are
shuffled off to recycle heaven.

But perhaps we should get smarter about which machines we enable
APEI on?  If we get everything we need from the machine check banks,
then the detour via the BIOS to report the same thing again isn't helpful.
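
One way to picture that check, purely as a sketch: look at corrected
errors the machine check banks have already logged and ask whether they
carry a valid address. The function name apei_detour_needed and the idea
of sampling bank status words are assumptions made for this illustration;
nothing like this exists in the kernel, and a real policy would more
likely key off the CPU model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* IA32_MCi_STATUS bit positions, matching the kernel's MCI_STATUS_* macros. */
#define MCI_STATUS_VAL   (1ULL << 63)  /* status register contents are valid */
#define MCI_STATUS_UC    (1ULL << 61)  /* error was uncorrected */
#define MCI_STATUS_ADDRV (1ULL << 58)  /* MCi_ADDR holds a valid address */

/*
 * Hypothetical policy: the APEI path only adds information when some valid
 * corrected error was logged by a bank without an address (the Westmere-EX
 * case). Uncorrected and invalid records are ignored.
 */
bool apei_detour_needed(const uint64_t *bank_status, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t s = bank_status[i];

        if (!(s & MCI_STATUS_VAL) || (s & MCI_STATUS_UC))
            continue;               /* not a valid corrected error */
        if (!(s & MCI_STATUS_ADDRV))
            return true;            /* bank gave no address: APEI helps */
    }
    return false;                   /* banks already report everything */
}
```

With this shape, a machine whose banks set MCI_STATUS_ADDRV on corrected
errors would skip the firmware report, while one that leaves it clear
would keep the current behaviour.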

-Tony