Message-ID: <20160814124009.GA16848@node.shutemov.name>
Date:	Sun, 14 Aug 2016 15:40:09 +0300
From:	"Kirill A. Shutemov" <kirill@...temov.name>
To:	Andreas Dilger <adilger@...ger.ca>
Cc:	"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
	Theodore Ts'o <tytso@....edu>,
	Andreas Dilger <adilger.kernel@...ger.ca>,
	Jan Kara <jack@...e.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Alexander Viro <viro@...iv.linux.org.uk>,
	Hugh Dickins <hughd@...gle.com>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Dave Hansen <dave.hansen@...el.com>,
	Vlastimil Babka <vbabka@...e.cz>,
	Matthew Wilcox <willy@...radead.org>,
	Ross Zwisler <ross.zwisler@...ux.intel.com>,
	linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	linux-block@...r.kernel.org
Subject: Re: [PATCHv2, 00/41] ext4: support of huge pages

On Sun, Aug 14, 2016 at 01:20:12AM -0600, Andreas Dilger wrote:
> On Aug 12, 2016, at 12:37 PM, Kirill A. Shutemov <kirill.shutemov@...ux.intel.com> wrote:
> > 
> > Here's a stabilized version of my patchset, which is intended to bring
> > huge pages to ext4.
> > 
> > The basics are the same as with tmpfs[1] which is in Linus' tree now and
> > ext4 built on top of it. The main difference is that we need to handle
> > read out from and write-back to backing storage.
> > 
> > Head page links buffers for whole huge page. Dirty/writeback tracking
> > happens on per-hugepage level.
> > 
> > We read out whole huge page at once. It required bumping BIO_MAX_PAGES to
> > not less than HPAGE_PMD_NR. I defined BIO_MAX_PAGES to HPAGE_PMD_NR if
> > huge pagecache enabled.
> > 
> > On split_huge_page() we need to free buffers before splitting the page.
> > Page buffers take an additional pin on the page and can be a vector to
> > mess with the page during split. We want to avoid this.
> > If try_to_free_buffers() fails, split_huge_page() would return -EBUSY.
> > 
> > Readahead doesn't play with huge pages well: 128k max readahead window,
> > assumption on page size, PageReadahead() to track hit/miss.  I've got it
> > to allocate huge pages, but it doesn't provide any readahead as such.
> > I don't know how to do this right. It's not clear at this point if we
> > really need readahead with huge pages. I guess it's good enough for now.
> 
> Typically read-ahead is a loss if you are able to get large allocations on
> disk, since you can get at least seek_rate * chunk_size throughput from the
> disks even with random IO at that size.  With 1MB allocations and 7200
> RPM drives this works out to be about 150MB/s, which is close to the
> throughput of these drives already.

I'm worried not so much about throughput as about latency spikes once we
cross huge page boundaries. We can get a cache miss where we would have had
a hit with small pages.
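
A rough back-of-the-envelope sketch of the two numbers discussed above: the
quoted seek_rate * chunk_size throughput estimate, and the transfer cost of a
single cache miss that has to read out a whole huge page. The 150 seeks/s
figure is only what the quoted 1MB / 150MB/s estimate implies; the 2MB huge
page size (x86-64 PMD) and the 4k small page size are assumptions, as are the
function names. Seek time itself is left out.

```python
# Illustrative arithmetic only; every figure below is an assumption,
# not a measurement.
KIB = 1024
MIB = 1024 * KIB


def random_io_throughput(seeks_per_sec, chunk_bytes):
    """Throughput estimate for random IO: one chunk transferred per seek."""
    return seeks_per_sec * chunk_bytes


def transfer_time_ms(read_bytes, bytes_per_sec):
    """Time to transfer read_bytes at a given rate, ignoring seek time."""
    return 1000.0 * read_bytes / bytes_per_sec


# Assumed: ~150 random seeks/s for a 7200 RPM drive, 1 MiB chunks.
rate = random_io_throughput(150, 1 * MIB)

# Assumed page sizes: 4 KiB small page, 2 MiB huge page (x86-64 PMD).
small_miss_ms = transfer_time_ms(4 * KIB, rate)
huge_miss_ms = transfer_time_ms(2 * MIB, rate)

print(f"throughput: {rate / MIB:.0f} MiB/s")
print(f"4 KiB miss: {small_miss_ms:.3f} ms of transfer")
print(f"2 MiB miss: {huge_miss_ms:.3f} ms of transfer")
```

Under these assumptions a miss on a huge page costs on the order of ten
milliseconds of transfer alone, where a small-page miss costs a small
fraction of a millisecond.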

-- 
 Kirill A. Shutemov
