Date:	Sun, 14 Aug 2016 01:20:12 -0600
From:	Andreas Dilger <adilger@...ger.ca>
To:	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Cc:	Theodore Ts'o <tytso@....edu>,
	Andreas Dilger <adilger.kernel@...ger.ca>,
	Jan Kara <jack@...e.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Alexander Viro <viro@...iv.linux.org.uk>,
	Hugh Dickins <hughd@...gle.com>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Dave Hansen <dave.hansen@...el.com>,
	Vlastimil Babka <vbabka@...e.cz>,
	Matthew Wilcox <willy@...radead.org>,
	Ross Zwisler <ross.zwisler@...ux.intel.com>,
	linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	linux-block@...r.kernel.org
Subject: Re: [PATCHv2, 00/41] ext4: support of huge pages

On Aug 12, 2016, at 12:37 PM, Kirill A. Shutemov <kirill.shutemov@...ux.intel.com> wrote:
> 
> Here's a stabilized version of my patchset, which is intended to bring
> huge pages to ext4.
> 
> The basics are the same as with tmpfs[1] which is in Linus' tree now and
> ext4 built on top of it. The main difference is that we need to handle
> read out from and write-back to backing storage.
> 
> Head page links buffers for whole huge page. Dirty/writeback tracking
> happens on per-hugepage level.
> 
> We read out the whole huge page at once. This required bumping
> BIO_MAX_PAGES to not less than HPAGE_PMD_NR. I defined BIO_MAX_PAGES to
> HPAGE_PMD_NR if huge pagecache is enabled.
> 
> On split_huge_page() we need to free buffers before splitting the page.
> Page buffers take an additional pin on the page and can be a vector to mess
> with the page during split. We want to avoid this.
> If try_to_free_buffers() fails, split_huge_page() would return -EBUSY.
> 
> Readahead doesn't play with huge pages well: 128k max readahead window,
> assumption on page size, PageReadahead() to track hit/miss.  I've got it
> to allocate huge pages, but it doesn't provide any readahead as such.
> I don't know how to do this right. It's not clear at this point if we
> really need readahead with huge pages. I guess it's good enough for now.
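The mismatch between the readahead window and a huge page can be made concrete with a little arithmetic. This is only a sketch: the 128k window comes from the text above, while the 4k base page and 2MB huge page sizes are assumptions (typical for x86-64) and are not stated in the cover letter.

```python
KIB = 1024

READAHEAD_MAX = 128 * KIB      # max readahead window, from the cover letter
BASE_PAGE = 4 * KIB            # assumption: 4k base pages
HUGE_PAGE = 2 * 1024 * KIB     # assumption: 2MB PMD-sized huge page

# How many base pages one full readahead window covers.
pages_per_window = READAHEAD_MAX // BASE_PAGE

# How many full readahead windows fit inside a single huge page.
windows_per_huge_page = HUGE_PAGE // READAHEAD_MAX

# Fraction of one huge page covered by the largest possible window.
window_fraction = READAHEAD_MAX / HUGE_PAGE

print(pages_per_window, windows_per_huge_page, window_fraction)
```

Under these assumptions the largest window is a small fraction of one huge page, so a mechanism tuned around that window size cannot read ahead even one additional huge page.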

Typically read-ahead is a loss if you are able to get large allocations on
disk, since you can get at least seek_rate * chunk_size throughput from the
disks even with random IO at that size.  With 1MB allocations and 7200 RPM
drives this works out to be about 150MB/s, which is close to the throughput
of these drives already.
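
A minimal sketch of that estimate, for illustration only. The figure of 150 seeks per second is an assumption: it is simply what the 150MB/s number implies for 1MB chunks, not a measured value, and the function name is hypothetical.

```python
MB = 1_000_000  # decimal megabyte, matching the rough MB/s figure above


def min_random_throughput(seeks_per_sec, chunk_bytes):
    """Lower bound on random-IO throughput in bytes/s: seek_rate * chunk_size."""
    return seeks_per_sec * chunk_bytes


# Assumption: ~150 seeks/s, the rate implied by 150MB/s at 1MB per chunk.
SEEKS_PER_SEC = 150

large_chunk = min_random_throughput(SEEKS_PER_SEC, 1 * MB)
small_chunk = min_random_throughput(SEEKS_PER_SEC, 4096)

print(large_chunk / MB)   # MB/s with 1MB chunks
print(small_chunk / MB)   # MB/s with 4k chunks, for contrast
```

With the same seek rate, 4k random reads come out well under 1MB/s, which is the regime where read-ahead pays off; at 1MB per seek the bound is already near the drive's streaming rate.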

Cheers, Andreas

> Shadow entries are ignored on allocation -- a recently evicted page is not
> promoted to the active list. Not sure if the current workingset logic is
> adequate for huge pages. On eviction, we split the huge page and set up 4k
> shadow entries as usual.
> 
> Unlike tmpfs, ext4 makes use of tags in radix-tree. The approach I used
> for tmpfs -- 512 entries in radix-tree per-hugepages -- doesn't work well
> if we want to have a coherent view of tags. So the first 8 patches of the
> patchset convert tmpfs to use multi-order entries in the radix-tree.
> The same infrastructure is used for ext4.
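
The tag-coherency problem described above can be illustrated with a toy model. This is not the kernel radix-tree API: both classes and their method names are hypothetical, and the only number taken from the text is the 512 entries per huge page.

```python
ENTRIES_PER_HUGE_PAGE = 512  # from the cover letter: 512 entries per huge page


class PerSubpageTags:
    """Toy model: one slot, and one dirty tag, per 4k subpage."""

    def __init__(self):
        self.dirty = [False] * ENTRIES_PER_HUGE_PAGE

    def set_dirty(self, index):
        # Only the slot that was touched gets tagged.
        self.dirty[index] = True

    def is_coherent(self):
        # Coherent only if every slot agrees on the tag value.
        return len(set(self.dirty)) == 1


class MultiOrderTag:
    """Toy model: a single entry, and a single tag, for the whole huge page."""

    def __init__(self):
        self.dirty = False

    def set_dirty(self, index):
        # The index is irrelevant: one entry covers the whole range.
        self.dirty = True

    def is_coherent(self):
        # A single tag cannot disagree with itself.
        return True


per_subpage = PerSubpageTags()
per_subpage.set_dirty(7)

multi_order = MultiOrderTag()
multi_order.set_dirty(7)

print(per_subpage.is_coherent(), multi_order.is_coherent())
```

In the per-subpage model, keeping the tags coherent would mean updating all 512 slots on every tag change; the single-entry model avoids that by construction.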
> 
> Encryption doesn't handle huge pages yet. To avoid regressions we just
> disable huge pages for the inode if it has EXT4_INODE_ENCRYPT.
> 
> With this version I don't see any xfstests regressions with huge pages enabled.
> Patch with new configurations for xfstests-bld is below.
> 
> Tested with 4k, 1k, encryption and bigalloc. All with and without
> huge=always. I think it's reasonable coverage.
> 
> The patchset is also in git:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/kas/linux.git hugeext4/v2
> 
> Please review and consider applying.
> 
> [1] http://lkml.kernel.org/r/1465222029-45942-1-git-send-email-kirill.shutemov@linux.intel.com


