lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190806111210.7xpmjsd4hq54vuml@box>
Date:   Tue, 6 Aug 2019 14:12:10 +0300
From:   "Kirill A. Shutemov" <kirill@...temov.name>
To:     William Kucharski <william.kucharski@...cle.com>
Cc:     LKML <linux-kernel@...r.kernel.org>, linux-mm@...ck.org,
        linux-fsdevel@...r.kernel.org,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Song Liu <songliubraving@...com>,
        Bob Kasten <robert.a.kasten@...el.com>,
        Mike Kravetz <mike.kravetz@...cle.com>,
        Chad Mynhier <chad.mynhier@...cle.com>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Johannes Weiner <jweiner@...com>,
        Matthew Wilcox <willy@...radead.org>
Subject: Re: [PATCH v3 2/2] mm,thp: Add experimental config option
 RO_EXEC_FILEMAP_HUGE_FAULT_THP

On Mon, Aug 05, 2019 at 09:56:45AM -0600, William Kucharski wrote:
> >> I don't really care if the start of the VMA is suitable, just whether I can map
> >> the current faulting page with a THP. As far as I know, there's nothing wrong
> >> with mapping all the pages before the VMA hits a properly aligned bound with
> >> PAGESIZE pages and then aligned chunks in the middle with THP.
> > 
> > You cannot map any paged as huge into wrongly aligned VMA.
> > 
> > THP's ->index must be aligned to HPAGE_PMD_NR, so if the combination VMA's
> > ->vm_start and ->vm_pgoff doesn't allow for this, you must fallback to
> > mapping the page with PTEs. I don't see it handled properly here.
> 
> It was my assumption that if say a VMA started at an address say one page
> before a large page alignment, you could map that page with a PAGESIZE
> page but if VMA size allowed, there was a fault on the next page, and
> VMA size allowed, you could map that next range with a large page, taking
> taking the approach of mapping chunks of the VMA with the largest page
> possible.
> 
> Is it that the start of the VMA must always align or that the entire VMA
> must be properly aligned and a multiple of the PMD size (so you either map
> with all large pages or none)?

IIUC, you are missing ->vm_pgoff from the picture. The newly allocated
page must land into page cache aligned on HPAGE_PMD_NR boundary. In other
word you cannout have huge page with ->index, let say, 1.

VMA is only suitable for at least one file-THP page if:

 - (vma->vm_start >> PAGE_SHIFT) % (HPAGE_PMD_NR - 1) is equal to
    vma->vm_pgoff % (HPAGE_PMD_NR - 1)

    This guarantees right alignment in the backing page cache.

 - *and* vma->vm_end - round_up(vma->vm_start, HPAGE_PMD_SIZE) is equal or
   greater than HPAGE_PMD_SIZE.

Does it make sense?

> 
> >> This is the page that content was just read to; readpage() will unlock the page
> >> when it is done with I/O, but the page needs to be locked before it's inserted
> >> into the page cache.
> > 
> > Then you must to lock the page properly with lock_page().
> > 
> > __SetPageLocked() is fine for just allocated pages that was not exposed
> > anywhere. After ->readpage() it's not the case and it's not safe to use
> > __SetPageLocked() for them.
> 
> In the current code, it's assumed it is not exposed, because a single read
> of a large page that does no readahead before the page is inserted into the
> cache means there are no external users of the page.

You've exposed the page to the filesystem once you call ->readpage().
It *may* track the page somehow after the call.

-- 
 Kirill A. Shutemov

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ