[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220322025528.GA188243@hori.linux.bs1.fc.nec.co.jp>
Date: Tue, 22 Mar 2022 02:55:30 +0000
From: HORIGUCHI NAOYA(堀口 直也)
<naoya.horiguchi@....com>
To: Yang Shi <shy828301@...il.com>
CC: Naoya Horiguchi <naoya.horiguchi@...ux.dev>,
Linux MM <linux-mm@...ck.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Mike Kravetz <mike.kravetz@...cle.com>,
Miaohe Lin <linmiaohe@...wei.com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v5] mm/hwpoison: fix race between hugetlb free/demotion
and memory_failure_hugetlb()
On Mon, Mar 21, 2022 at 05:46:48PM -0700, Yang Shi wrote:
> On Thu, Mar 17, 2022 at 10:16 PM Naoya Horiguchi
> <naoya.horiguchi@...ux.dev> wrote:
> >
> > From: Naoya Horiguchi <naoya.horiguchi@....com>
> >
> > There is a race condition between memory_failure_hugetlb() and hugetlb
> > free/demotion, which causes setting PageHWPoison flag on the wrong page.
> > The one simple result is that wrong processes can be killed, but another
> > (more serious) one is that the actual error is left unhandled, so no one
> > prevents later access to it, and that might lead to more serious results
> > like consuming corrupted data.
> >
> > Think about the below race window:
> >
> > CPU 1 CPU 2
> > memory_failure_hugetlb
> > struct page *head = compound_head(p);
> > hugetlb page might be freed to
> > buddy, or even changed to another
> > compound page.
> >
> > get_hwpoison_page -- page is not what we want now...
> >
> > The compound_head is called outside hugetlb_lock, so the head is not
> > reliable.
> >
> > So set PageHWPoison flag after passing prechecks. And to detect
> > potential violation, this patch also introduces a new action type
> > MF_MSG_DIFFERENT_PAGE_SIZE.
> >
> > Reported-by: Mike Kravetz <mike.kravetz@...cle.com>
> > Signed-off-by: Naoya Horiguchi <naoya.horiguchi@....com>
> > Signed-off-by: Miaohe Lin <linmiaohe@...wei.com>
> > Cc: <stable@...r.kernel.org>
...
> > @@ -1547,21 +1545,31 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
> > * If this happens just bail out.
> > */
> > if (!PageHuge(p) || compound_head(p) != head) {
> > + if (TestSetPageHWPoison(p))
> > + already_hwpoisoned = pfn;
> > + else
> > + num_poisoned_pages_inc();
> > action_result(pfn, MF_MSG_DIFFERENT_PAGE_SIZE, MF_IGNORED);
>
> The commit log says "this patch also introduces a new action type
> MF_MSG_DIFFERENT_PAGE_SIZE", but it is not defined in the patch and it
> is called here. Did I miss something?
Sorry, you're right. MF_MSG_DIFFERENT_PAGE_SIZE is defined in the separate
patch in mmotm, and disappeared when rebasing (not intended).
I think of rebasing this to mainline again to apply cleanly to -stable,
expecting it to applied before other recent hwpoison patches.
Thanks,
Naoya Horiguchi
Powered by blists - more mailing lists