Message-ID: <a455c6b1-b9ef-39ab-879e-80e13fd13c10@oracle.com>
Date:   Fri, 20 May 2022 16:31:31 -0700
From:   Mike Kravetz <mike.kravetz@...cle.com>
To:     John Hubbard <jhubbard@...dia.com>,
        Minchan Kim <minchan@...nel.org>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        syzbot <syzbot+acf65ca584991f3cc447@...kaller.appspotmail.com>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        llvm@...ts.linux.dev, nathan@...nel.org, ndesaulniers@...gle.com,
        syzkaller-bugs@...glegroups.com, trix@...hat.com,
        Matthew Wilcox <willy@...radead.org>,
        Stephen Rothwell <sfr@...b.auug.org.au>,
        David Hildenbrand <david@...hat.com>
Subject: Re: [syzbot] WARNING in follow_hugetlb_page

On 5/20/22 15:56, John Hubbard wrote:
> On 5/20/22 15:19, Minchan Kim wrote:
>> Memory offlining would be an issue, so we shouldn't allow pinning of any
>> pages in the *movable zone*.
>>
>> Isn't alloc_contig_range just best effort? Then, it wouldn't be a big
>> problem to allow pinning in those areas. What matters is whether the
>> target range of alloc_contig_range is backed by CMA or by the movable
>> zone, and what the use cases are.
>>
>> IOW, pinning in the movable zone should never be allowed. But in the CMA
>> case, if pages are used as normal process memory instead of hugeTLB, we
>> shouldn't allow longterm pinning, since someone can claim that memory
>> suddenly. However, we are fine to allow longterm pinning if the CMA memory
>> is already claimed and mapped to userspace (the hugeTLB case, IIUC).
>>
> 
> From Mike's comments and yours, plus a rather quick reading of some
> CMA-related code in mm/hugetlb.c (free_gigantic_page(),
> alloc_gigantic_page()), the following seems true:
> 
> a) hugetlbfs can allocate pages *from* CMA, via cma_alloc()
> 
> b) while hugetlbfs is using those CMA-allocated pages, it is debatable
> whether those pages should be allowed to be long term pinned. That's
> because there are two cases:
> 
>     Case 1: pages are longterm pinned, then released, all while
>             owned by hugetlbfs. No problem.
> 
>     Case 2: pages are longterm pinned, but then hugetlbfs releases the
>             pages entirely (via unmounting hugetlbfs, I presume). In
>             this case, we now have CMA pages that are long-term pinned,
>             and that's the state we want to avoid.

I do not think case 2 can happen.  A hugetlb page can only be changed back
to 'normal' (buddy) pages when ref count goes to zero.
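
To illustrate the ref count argument, here is a minimal userspace sketch.
It is not kernel code: struct model_hpage, model_pin(), model_put() and
model_try_dissolve() are hypothetical names made up for this example.  It
only models the rule stated above, that a long term pin holds a reference
and the page cannot go back to buddy while any reference remains.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a hugetlb page; not a kernel type. */
struct model_hpage {
	int refcount;     /* every user, including a long term pin, holds one */
	bool is_hugetlb;  /* false once the page is handed back to buddy */
};

/* A long term pin takes a reference on the page. */
static void model_pin(struct model_hpage *p)
{
	assert(p->is_hugetlb);
	p->refcount++;
}

/* Drop one reference (an unpin, or the filesystem letting go). */
static void model_put(struct model_hpage *p)
{
	assert(p->refcount > 0);
	p->refcount--;
}

/*
 * Turning the page back into 'normal' (buddy) pages is only possible
 * once nobody holds a reference any more.
 */
static bool model_try_dissolve(struct model_hpage *p)
{
	if (p->refcount != 0)
		return false;
	p->is_hugetlb = false;
	return true;
}
```

In this model, a page that is still pinned after hugetlbfs drops its own
reference simply stays a hugetlb page until the pin is released.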

It should also be noted that hugetlb code sets up the CMA area from which
hugetlb pages can be allocated.  This area is never unreserved/freed.

I do not think there is a reason to disallow long term pinning of hugetlb
pages allocated from THE hugetlb CMA area.

But, I wonder if it is possible for hugetlb pages to be allocated from
another (non-hugetlb) area.  For example, if someone sets up a huge CMA area
and hugetlb allocations spill over into that area.  If this is possible
(still need to research), then we would not want to long term pin such
hugetlb pages.  We can check this in the hugetlb code to determine if
long term pinning is allowed.  
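
As a rough sketch of the policy discussed so far (again a userspace model,
not kernel code), the decision could look like the following.  The enum,
struct and function names are hypothetical and exist only for this
example.  Whether hugetlb allocations can actually spill into another CMA
area is still the open question noted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of where a page's memory came from. */
enum page_origin {
	ORIGIN_NORMAL,        /* ordinary, non-movable memory */
	ORIGIN_MOVABLE_ZONE,  /* movable zone */
	ORIGIN_HUGETLB_CMA,   /* the CMA area set up by hugetlb itself */
	ORIGIN_OTHER_CMA,     /* any other CMA area */
};

/* Hypothetical model page; not the kernel's struct page. */
struct model_page {
	enum page_origin origin;
	bool is_hugetlb;
};

/* Model of "may this page be long term pinned?" as argued in this thread. */
static bool model_longterm_pinnable(const struct model_page *page)
{
	assert(page != NULL);

	switch (page->origin) {
	case ORIGIN_NORMAL:
		return true;
	case ORIGIN_MOVABLE_ZONE:
		/* Would block memory offline, so never allowed. */
		return false;
	case ORIGIN_HUGETLB_CMA:
		/* The area is never unreserved; fine for hugetlb pages. */
		return page->is_hugetlb;
	case ORIGIN_OTHER_CMA:
		/* Someone else may claim this range later. */
		return false;
	}
	return false;
}
```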

> 
> The reason it is debatable is that hugetlbfs is intended to be used
> long term, itself. The expected use cases do not normally include a
> lot of short term mounting and unmounting.
> 
> And whichever way that debate goes, we need to allow it to be
> fixable, by not tying "is pinnable" to "using gup/pup". The caller
> has the context that is needed to make that policy decision, but
> gup/pup does not.
> 
> At this point, I think it's time to fix up the problems and restore
> previous behavior, by choosing Case 1 behavior for now. And also
> lifting the is_pinnable_page() checks up a level, as noted in my
> other thread.  I can do that, unless someone sees a flaw in the
> reasoning.

Go for it.

-- 
Mike Kravetz
