Message-ID: <20180717063619.GB1346@hori1.linux.bs1.fc.nec.co.jp>
Date:   Tue, 17 Jul 2018 06:36:19 +0000
From:   Naoya Horiguchi <n-horiguchi@...jp.nec.com>
To:     Dan Williams <dan.j.williams@...el.com>
CC:     "linux-nvdimm@...ts.01.org" <linux-nvdimm@...ts.01.org>,
        Jan Kara <jack@...e.cz>, Christoph Hellwig <hch@....de>,
        Jérôme Glisse <jglisse@...hat.com>,
        Matthew Wilcox <mawilcox@...rosoft.com>,
        Ross Zwisler <ross.zwisler@...ux.intel.com>,
        "linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v5 08/11] mm, memory_failure: Teach memory_failure()
 about dev_pagemap pages

On Fri, Jul 13, 2018 at 05:28:05PM -0700, Dan Williams wrote:
> On Fri, Jul 13, 2018 at 1:52 AM, Naoya Horiguchi
> <n-horiguchi@...jp.nec.com> wrote:
> > On Wed, Jul 04, 2018 at 02:41:06PM -0700, Dan Williams wrote:
...
> >> +
> >> +     /*
> >> +      * Use this flag as an indication that the dax page has been
> >> +      * remapped UC to prevent speculative consumption of poison.
> >> +      */
> >> +     SetPageHWPoison(page);
> >
> > The number of hwpoison pages is maintained by num_poisoned_pages,
> > so you can call num_poisoned_pages_inc()?
> 
> I don't think we want these pages accounted in num_poisoned_pages().
> We have the badblocks infrastructure in libnvdimm to track how many
> errors and where they are located, and since they can be repaired via
> driver actions I think we should track them separately.

OK.

> > Related to this, I'm interested in whether/how unpoison_page() works
> > on a hwpoisoned dev_pagemap page.
> 
> unpoison_page() is only triggered via freeing pages to the page
> allocator, and that never happens for dev_pagemap / ZONE_DEVICE pages.

Sorry, my comment was unclear.
I meant unpoison_memory() in mm/memory-failure.c, which is triggered
via debugfs:hwpoison/unpoison-pfn. The function begins like this:

  int unpoison_memory(unsigned long pfn)
  {
          struct page *page;
          struct page *p;
          int freeit = 0;
          static DEFINE_RATELIMIT_STATE(unpoison_rs, DEFAULT_RATELIMIT_INTERVAL,
                                          DEFAULT_RATELIMIT_BURST);

          if (!pfn_valid(pfn))
                  return -ENXIO;

          p = pfn_to_page(pfn);
          page = compound_head(p);

          if (!PageHWPoison(p)) {
                  unpoison_pr_info("Unpoison: Page was already unpoisoned %#lx\n",
                                   pfn, &unpoison_rs);
                  return 0;
          }
  ...

So I think we could add an is_zone_device_page() check at the beginning
of this function and call hwpoison_clear() (introduced in patch 13/13)
from there? Otherwise, calling compound_head() on a dev_pagemap page
might cause a critical issue such as a general protection fault.
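
To make the suggestion concrete, here is a minimal userspace mock of the
idea, not kernel code. Everything in it is a stand-in: struct page, the
mock_memmap array, and the helper bodies are assumptions for illustration,
and the hwpoison_clear() signature is hypothetical, so the real one in the
series may differ. It only shows the ordering: the ZONE_DEVICE check runs
before compound_head() is ever reached.

```c
/* Userspace mock only: types and helpers are stand-ins, not kernel code. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_NR_PAGES 4UL

struct page {
	bool hwpoison;     /* stand-in for the PG_hwpoison flag */
	bool zone_device;  /* stand-in for "page lives in ZONE_DEVICE" */
};

static struct page mock_memmap[MOCK_NR_PAGES];
static int compound_head_calls;  /* counts how often the regular path runs */

static bool pfn_valid(unsigned long pfn)
{
	return pfn < MOCK_NR_PAGES;
}

static struct page *pfn_to_page(unsigned long pfn)
{
	return &mock_memmap[pfn];
}

static bool is_zone_device_page(const struct page *page)
{
	return page->zone_device;
}

/* Hypothetical signature; the helper in the patch series may differ. */
static void hwpoison_clear(struct page *page)
{
	page->hwpoison = false;
}

/* Mock: every page is its own head; we only record that we were called. */
static struct page *compound_head(struct page *page)
{
	compound_head_calls++;
	return page;
}

static int unpoison_memory(unsigned long pfn)
{
	struct page *p, *page;

	if (!pfn_valid(pfn))
		return -ENXIO;

	p = pfn_to_page(pfn);

	/* Proposed early exit: handle dev_pagemap pages before compound_head(). */
	if (is_zone_device_page(p)) {
		hwpoison_clear(p);
		return 0;
	}

	page = compound_head(p);

	if (!p->hwpoison)
		return 0;  /* already unpoisoned */

	/* The rest of the regular path is elided; the mock just clears the flag. */
	page->hwpoison = false;
	return 0;
}
```

In this mock a ZONE_DEVICE page returns early with its poison flag cleared
and compound_head_calls left untouched, while a regular page still goes
through compound_head() as before.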

Thanks,
Naoya Horiguchi
