lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <26945734-ac7e-f71e-dbfa-0b0f0fdaff32@oracle.com>
Date:   Mon, 23 Oct 2017 11:20:02 -0700
From:   Mike Kravetz <mike.kravetz@...cle.com>
To:     Naoya Horiguchi <n-horiguchi@...jp.nec.com>
Cc:     "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Michal Hocko <mhocko@...nel.org>,
        Aneesh Kumar <aneesh.kumar@...ux.vnet.ibm.com>,
        Anshuman Khandual <khandual@...ux.vnet.ibm.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: Re: [PATCH 1/1] mm:hugetlbfs: Fix hwpoison reserve accounting

On 10/23/2017 12:32 AM, Naoya Horiguchi wrote:
> On Fri, Oct 20, 2017 at 10:49:46AM -0700, Mike Kravetz wrote:
>> On 10/19/2017 07:30 PM, Naoya Horiguchi wrote:
>>> On Thu, Oct 19, 2017 at 04:00:07PM -0700, Mike Kravetz wrote:
>>>
>>> Thank you for addressing this. The patch itself looks good to me, but
>>> the reported issue (negative reserve count) doesn't reproduce in my trial
>>> with v4.14-rc5, so could you share the exact procedure for this issue?
>>
>> Sure, but first one question on your test scenario below.
>>
>>>
>>> When error handler runs over a huge page, the reserve count is incremented
>>> so I'm not sure why the reserve count goes negative.
>>
>> I'm not sure I follow.  What specific code is incrementing the reserve
>> count?
> 
> The call path is like below:
> 
>   hugetlbfs_error_remove_page
>     hugetlb_fix_reserve_counts
>       hugepage_subpool_get_pages(spool, 1)
>         hugetlb_acct_memory(h, 1);
>           gather_surplus_pages
>             h->resv_huge_pages += delta;
> 

Ah OK.  This is a result of call to hugetlb_fix_reserve_counts which
I believe is incorrect in most instances, and is unlikely to happen 
with my patch.

>>
>> Remove the file (rm /var/opt/oracle/hugepool/foo)
>> -------------------------------------------------
>> HugePages_Total:       1
>> HugePages_Free:        0
>> HugePages_Rsvd:    18446744073709551615
>> HugePages_Surp:        0
>> Hugepagesize:       2048 kB
>>
>> I am still confused about how your test maintains a reserve count after
>> poisoning.  It may be a good idea for you to test my patch with your
>> test scenario as I can not recreate here.
> 
> Interestingly, I found that this reproduces if all hugetlb pages are
> reserved when poisoning.
> Your testing meets the condition, and mine doesn't.
> 
> In gather_surplus_pages() we determine whether we extend hugetlb pool
> with surplus pages like below:
> 
>     needed = (h->resv_huge_pages + delta) - h->free_huge_pages;
>     if (needed <= 0) {
>             h->resv_huge_pages += delta;
>             return 0;
>     }
>     ...
> 
> needed is 1 if h->resv_huge_pages == h->free_huge_pages, and then
> the reserve count gets inconsistent.
> I confirmed that your patch fixes the issue, so I'm OK with it.

Thanks.  That now makes sense to me.

hugetlb_fix_reserve_counts (which results in gather_surplus_pages being
called), is only designed to be called in the extremely rare cases when
we have free'ed a huge page but are unable to free the reservation entry.

Just curious, when the hugetlb_fix_reserve_counts call was added to
hugetlbfs_error_remove_page, was the intention to preserve the original
reservation?  I remember thinking hard about that for the hole punch
case and came to the conclusion that it was easier and less error prone
to remove the reservation as well.  That will also happen in the error
case with the patch I provided.

-- 
Mike Kravetz

> 
> Acked-by: Naoya Horiguchi <n-horiguchi@...jp.nec.com>
> 
> Thanks,
> Naoya Horiguchi
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ