linux-kernel - Re: Linux 3.19-rc3


Message-ID: <20150113102850.GA16524@e104818-lin.cambridge.arm.com>
Date:	Tue, 13 Jan 2015 10:28:51 +0000
From:	Catalin Marinas <catalin.marinas@....com>
To:	Rik van Riel <riel@...hat.com>
Cc:	David Lang <david@...g.hm>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	"Kirill A. Shutemov" <kirill@...temov.name>,
	Mark Langsdorf <mlangsdo@...hat.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	"linux-arm-kernel@...ts.infradead.org" 
	<linux-arm-kernel@...ts.infradead.org>
Subject: Re: Linux 3.19-rc3

On Tue, Jan 13, 2015 at 03:33:12AM +0000, Rik van Riel wrote:
> On 01/09/2015 09:51 PM, David Lang wrote:
> > On Fri, 9 Jan 2015, Linus Torvalds wrote:
> > 
> >> Big pages are a bad bad bad idea. They work fine for databases,
> >> and that's pretty much just about it. I'm sure there are some
> >> other loads, but they are few and far between.
> > 
> > what about a dedicated virtualization host (where your workload is
> > a handful of virtual machines), would the file cache issue still
> > be overwhelming, even though it's the virtual machines accessing
> > things?
> 
> You would still have page cache inside the guest.
> 
> Using large pages in the host, and small pages in the guest
> would not give you the TLB benefits, and that is assuming
> that different page sizes in host and guest even work...

This works on ARM. The TLB caching the full VA->PA translation would
indeed stick to the guest page size as that's the input. But, depending
on the TLB implementation, it may also cache the guest PA -> real PA
translation (a TLB with the guest/Intermediate PA as input; ARMv8 also
introduces TLB invalidation ops that take such IPA as input). A miss in
the stage 1 (guest) TLB would be cheaper if it hits in the stage 2 TLB,
especially when it needs to look up the stage 2 for each level in the
stage 1 table.

But when it doesn't hit in any of the stages, it's still beneficial to
have a smaller number of levels at stage 2 (host) and that's what 64KB
pages bring on ARM. If you use the maximum 4 levels in both host and
guest, a TLB miss in the guest requires 24 memory accesses to populate
it (each guest page table level entry needs a stage 2 look-up). In
practice, you may get some locality but I think the guest page table
access pattern can get quite sparse. In addition, stage 2 entries are
less volatile than stage 1 entries, as they are per VM rather than per
process.
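
As a rough illustration of where the 24 comes from, here is a minimal
sketch of the worst-case arithmetic, assuming no TLB or walk-cache hits
at either stage. The function name is made up for this example, and the
2- and 3-level stage 2 cases are an assumption about what a 64KB host
configuration would use (it depends on the IPA size), not something
measured:

```python
def nested_walk_accesses(stage1_levels: int, stage2_levels: int) -> int:
    """Memory accesses for a full two-stage walk with no TLB hits.

    Hypothetical helper for illustration only; it models the worst case
    where nothing is cached at either stage.
    """
    # Each stage 1 table entry lives at a guest (intermediate) physical
    # address, so reading it costs a full stage 2 walk plus the read of
    # the entry itself.
    per_stage1_level = stage2_levels + 1
    # The final IPA produced by stage 1 needs one more stage 2 walk.
    return stage1_levels * per_stage1_level + stage2_levels


if __name__ == "__main__":
    # 4 stage 1 levels in the guest; vary the number of host levels.
    for s2 in (4, 3, 2):
        print(f"stage 1 levels=4, stage 2 levels={s2}: "
              f"{nested_walk_accesses(4, s2)} accesses")
```

With 4 levels at both stages this gives 4 * (4 + 1) + 4 = 24. Dropping
stage 2 to 3 or 2 levels would bring the worst case down to 19 or 14
accesses under the same assumptions.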

> Using large pages in the guests gets you back to the wasted
> memory, except you are now wasting memory in a situation where
> you have less memory available in each guest. Density is a real
> consideration for virtualization.

I agree. I think guests should stick to 4KB pages (well, unless all they
need to do is mmap large database files).

-- 
Catalin