Date:   Wed, 9 Mar 2022 15:59:55 -0800
From:   Mike Kravetz <mike.kravetz@...cle.com>
To:     Yang Shi <shy828301@...il.com>,
        Naoya Horiguchi <naoya.horiguchi@...ux.dev>
Cc:     Linux MM <linux-mm@...ck.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Miaohe Lin <linmiaohe@...wei.com>,
        Naoya Horiguchi <naoya.horiguchi@....com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v1] mm/hwpoison: set PageHWPoison after taking page lock
 in memory_failure_hugetlb()

On 3/9/22 13:55, Yang Shi wrote:
> On Wed, Mar 9, 2022 at 1:15 AM Naoya Horiguchi
> <naoya.horiguchi@...ux.dev> wrote:
>>
>> From: Naoya Horiguchi <naoya.horiguchi@....com>
>>
>> There is a race condition between memory_failure_hugetlb() and hugetlb
>> free/demotion, which causes setting PageHWPoison flag on the wrong page
>> (which was a hugetlb when memory_failure() was called, but was removed
>> or demoted when memory_failure_hugetlb() is called).  This results in
>> killing wrong processes.  So set PageHWPoison flag with holding page lock,
>>
>> Signed-off-by: Naoya Horiguchi <naoya.horiguchi@....com>
>> ---
>>  mm/memory-failure.c | 27 ++++++++++++---------------
>>  1 file changed, 12 insertions(+), 15 deletions(-)
>>
>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>> index ac6492e36978..fe25eee8f9d6 100644
>> --- a/mm/memory-failure.c
>> +++ b/mm/memory-failure.c
>> @@ -1494,24 +1494,11 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
>>         int res;
>>         unsigned long page_flags;
>>
>> -       if (TestSetPageHWPoison(head)) {
>> -               pr_err("Memory failure: %#lx: already hardware poisoned\n",
>> -                      pfn);
>> -               res = -EHWPOISON;
>> -               if (flags & MF_ACTION_REQUIRED)
>> -                       res = kill_accessing_process(current, page_to_pfn(head), flags);
>> -               return res;
>> -       }
>> -
>> -       num_poisoned_pages_inc();
>> -
>>         if (!(flags & MF_COUNT_INCREASED)) {
>>                 res = get_hwpoison_page(p, flags);
> 
> I'm not an expert of hugetlb, I may be wrong. I'm wondering how this
> could solve the race? Is the below race still possible?
> 
> __get_hwpoison_page()
>   head = compound_head(page)
> 
> hugetlb demotion (1G --> 2M)
>   get_hwpoison_huge_page(head, &hugetlb);
> 
> 
> Then the head may point to a 2M page, but the hwpoisoned subpage is
> not in that 2M range?

That is correct.

It is also possible that __free_pages(page, huge_page_order(h)) could have
been called during this window.  So IIUC, head would have an increased ref
count and the pages would be freed to buddy when the memory error code drops
the ref.  At that time, head would be marked as poisoned, and head could be a
different page from the one that actually has the error.

Today, neither an increased ref count nor the page lock will prevent hugetlb
page demotion or an attempt to free the page to buddy.

There is already a patch in Andrew's tree to not demote hugetlb pages marked
with poison.  This at least makes the demote code perform the same check as
the allocation code.  The race which started this discussion has been there
for a while.  Demotion opened another window, but that is now closed.

IMO, it would be better to take a step back and look at the overall design
and decide how to proceed.  There is also an effort underway to provide double
mapping of hugetlb pages, and one of the target use cases is memory error
handling.  This effort is in the very early stages, but it will certainly
require setting poison on the (sub-)page with the actual error rather than on
the head page.  Perhaps something like what is done for THP today.  There is
nothing to address yet, but I just wanted to note that there will be future
changes in this area.
-- 
Mike Kravetz
