Message-ID: <1496869051.9288.1.camel@hpe.com>
Date:   Wed, 7 Jun 2017 20:57:57 +0000
From:   "Kani, Toshimitsu" <toshi.kani@....com>
To:     "dan.j.williams@...el.com" <dan.j.williams@...el.com>
CC:     "linux-acpi@...r.kernel.org" <linux-acpi@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-nvdimm@...ts.01.org" <linux-nvdimm@...ts.01.org>,
        "rjw@...ysocki.net" <rjw@...ysocki.net>,
        "vishal.l.verma@...el.com" <vishal.l.verma@...el.com>
Subject: Re: [PATCH] Add support of NVDIMM memory error notification in ACPI 6.2

On Wed, 2017-06-07 at 12:09 -0700, Dan Williams wrote:
> On Wed, Jun 7, 2017 at 11:49 AM, Toshi Kani <toshi.kani@....com>
> wrote:
 :
> > +
> > +static void acpi_nfit_uc_error_notify(struct device *dev,
> > acpi_handle handle)
> > +{
> > +       struct acpi_nfit_desc *acpi_desc = dev_get_drvdata(dev);
> > +
> > +       acpi_nfit_ars_rescan(acpi_desc);
> 
> I wonder if we should gate re-scanning with a similar:
> 
>     if (acpi_desc->scrub_mode == HW_ERROR_SCRUB_ON)
> 
> ...check that we do in the mce notification case? Maybe not since we
> don't get an indication of where the error is without a rescan.

I think the MCE case is different, since the MCE handler already knows
where the new poison location is and can update the badblocks
information for it.  Starting ARS there is an optional precaution.
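
To make the distinction concrete, here is a rough userspace model of the
two paths.  It is only a sketch, not kernel code: every mock_* name and
the struct layout are made up for illustration, and HW_ERROR_SCRUB_OFF
is assumed as the counterpart of the HW_ERROR_SCRUB_ON value quoted
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* HW_ERROR_SCRUB_ON is quoted in the thread; _OFF is an assumed counterpart. */
enum mock_scrub_mode { HW_ERROR_SCRUB_OFF, HW_ERROR_SCRUB_ON };

#define MOCK_MAX_BAD 16

/* Hypothetical stand-in for the driver state, not struct acpi_nfit_desc. */
struct mock_nfit_desc {
	enum mock_scrub_mode scrub_mode;
	uint64_t badblocks[MOCK_MAX_BAD];	/* known poison addresses */
	size_t nr_bad;
	unsigned int ars_rescans;		/* number of rescans requested */
};

/* Stand-in for requesting an ARS rescan; here it only counts requests. */
static void mock_ars_rescan(struct mock_nfit_desc *desc)
{
	desc->ars_rescans++;
}

/* Record one poison address; returns false when the mock table is full. */
static bool mock_add_badblock(struct mock_nfit_desc *desc, uint64_t addr)
{
	if (desc->nr_bad == MOCK_MAX_BAD)
		return false;
	desc->badblocks[desc->nr_bad++] = addr;
	return true;
}

/* MCE path: the address is already known, so the rescan is optional. */
static void mock_handle_mce(struct mock_nfit_desc *desc, uint64_t addr)
{
	assert(desc != NULL);
	mock_add_badblock(desc, addr);
	if (desc->scrub_mode == HW_ERROR_SCRUB_ON)
		mock_ars_rescan(desc);
}

/* Notification path: no address is reported, so a rescan is the only
 * way to learn where the error is. */
static void mock_handle_uc_error_notify(struct mock_nfit_desc *desc)
{
	assert(desc != NULL);
	mock_ars_rescan(desc);
}
```

In this model the MCE path updates the bad block list whether or not a
rescan follows, while the notification path has nothing to record until
the rescan completes, which is why gating it on scrub_mode would leave
the error location unknown.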

> However, at a minimum I think we need support for the new Start ARS
> flag ("If set to 1 the firmware shall return data from a previous
> scrub, if any, without starting a new scrub") and use that for this
> case.

That's an interesting idea.  But I wonder how users would know whether
it is OK to set this flag, since it relies on a BIOS implementation
that is not described in ACPI...

> Another thing that seems to be missing in both this and the mce case
> is a notification to userspace that something changed. We have calls
> to sysfs_notify_dirent() to notify scrub completion events and DIMM
> health status change events, I think we need a similar notifier
> mechanism for new un-correctable errors.

Good point.  I think this can be a badblocks population event, which
gets generated whenever badblocks information is updated, both at boot
time and at run time via this notification and MCE.
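
For reference, a userspace consumer of such an event would probably look
like any other sysfs_notify_dirent() listener: read the attribute, poll()
for POLLPRI, then seek back and re-read.  The sketch below assumes that
pattern.  The helper name is hypothetical, and the attribute path is left
to the caller because no such attribute exists yet.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/*
 * Wait for a change notification on a sysfs attribute, then re-read it.
 * Returns the number of bytes read into buf (NUL-terminated), 0 on
 * timeout, or -1 on error.  The helper name is hypothetical.
 */
static ssize_t wait_and_read_attr(const char *path, char *buf, size_t len,
				  int timeout_ms)
{
	struct pollfd pfd;
	ssize_t n;
	int ready;

	assert(buf != NULL && len > 0);

	pfd.fd = open(path, O_RDONLY);
	if (pfd.fd < 0)
		return -1;
	pfd.events = POLLPRI | POLLERR;
	pfd.revents = 0;

	/* Read once first so that poll() reports only later changes. */
	n = read(pfd.fd, buf, len - 1);
	if (n < 0)
		goto err;

	ready = poll(&pfd, 1, timeout_ms);
	if (ready <= 0) {
		/* 0 means timeout, -1 means poll() failed. */
		close(pfd.fd);
		return ready;
	}

	/* sysfs attributes must be re-read from offset 0 after an event. */
	if (lseek(pfd.fd, 0, SEEK_SET) < 0)
		goto err;
	n = read(pfd.fd, buf, len - 1);
	if (n < 0)
		goto err;
	buf[n] = '\0';
	close(pfd.fd);
	return n;
err:
	close(pfd.fd);
	return -1;
}
```

A monitoring tool could call this in a loop on whatever attribute ends
up carrying the event and refresh its view of the bad block list each
time it returns a positive count.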

Thanks,
-Toshi
