linux-kernel - Re: Linux 3.19-rc3

Date:	Mon, 12 Jan 2015 12:18:15 +0000
From:	Catalin Marinas <catalin.marinas@....com>
To:	Arnd Bergmann <arnd@...db.de>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	"linux-arm-kernel@...ts.infradead.org" 
	<linux-arm-kernel@...ts.infradead.org>,
	"Kirill A. Shutemov" <kirill@...temov.name>,
	Mark Langsdorf <mlangsdo@...hat.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: Linux 3.19-rc3

On Sat, Jan 10, 2015 at 09:36:13PM +0000, Arnd Bergmann wrote:
> On Saturday 10 January 2015 13:00:27 Linus Torvalds wrote:
> > > IIRC, AIX works great with 64k pages, but only because of two
> > > reasons that don't apply on Linux:
> > 
> > .. there's a few other ones:
> > 
> >  (c) nobody really runs AIX on desktops. It's very much a DB load
> > environment, with historically some HPC.
> > 
> >  (d) the powerpc TLB fill/buildup/teardown costs are horrible, so on
> > AIX the cost of lots of small pages is much higher too.
> 
> I think (d) applies to ARM as well, since it has no hardware
> dirty/referenced bit tracking and requires the OS to mark the
> pages as invalid/readonly until the first access. ARMv8.1
> has a fix for that, but it's optional and we haven't seen any
> implementations yet.

Do you happen to have any data on how significantly non-hardware
dirty/access bits impact the performance? I think it may affect the user
process start-up time a bit, but at run-time it shouldn't be that bad.
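
To illustrate why I'd expect the cost to be mostly at start-up, here is
a toy model (not kernel code) of software-managed access/dirty bits. The
function name and the access-trace format are made up for illustration,
and it assumes a write to a not-yet-accessed page is marked both young
and dirty in a single fault:

```python
# Toy model of software access/dirty tracking; not kernel code.
# Hypothetical trace format: (page_number, is_write) tuples.

def count_tracking_faults(accesses):
    """Count the faults taken only to set the young/dirty state."""
    young = set()   # pages already marked accessed
    dirty = set()   # pages already marked dirty
    faults = 0
    for page, is_write in accesses:
        if page not in young:
            faults += 1          # first touch: fault to mark the pte young
            young.add(page)
            if is_write:
                dirty.add(page)  # assumption: same fault marks it dirty
        elif is_write and page not in dirty:
            faults += 1          # first write: fault to mark the pte dirty
            dirty.add(page)
    return faults

# Read then write four pages once, then repeat the same pattern 100 times.
startup = [(p, False) for p in range(4)] + [(p, True) for p in range(4)]
print(count_tracking_faults(startup))        # faults during warm-up
print(count_tracking_faults(startup * 100))  # no extra faults afterwards
```

In this model each page takes at most two tracking faults, however many
times it is touched later, until the bits are cleared again (e.g. by
page aging or writeback, which the model ignores).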

If it is that significant, we could optimise it further in the arch
code. For example, make a fast exception path where we need to mark the
pte dirty. This would be handled by arch code without even calling
handle_pte_fault().

> > so I feel pretty confident in saying it won't happen. It's just too
> > much of a bother, for little to no actual upside. It's likely a much
> > better approach to try to instead use THP for anonymous mappings.
> 
> arm64 already supports 2MB transparent hugepages. I guess it
> wouldn't be too hard to change it so that an existing hugepage
> on an anonymous mapping that gets split up into 4KB pages gets
> split along 64KB boundaries with the contiguous mapping bit set.
> 
> Having full support for multiple hugepage sizes (64KB, 2MB and 32MB
> in case of ARM64 with 4KB PAGE_SIZE) would be even better and
> probably negate any benefits of 64KB PAGE_SIZE, but requires more
> changes to common mm code.
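
For reference, the three sizes quoted above follow from the 4KB granule
geometry. A small sketch of the arithmetic, assuming 8-byte table
entries and a contiguous hint that spans 16 entries (the constant and
key names are illustrative only):

```python
# Hugepage sizes available with a 4KB granule (arithmetic sketch only).
PAGE_SIZE = 4 * 1024
PTRS_PER_TABLE = PAGE_SIZE // 8   # 512 eight-byte entries per table
CONT_ENTRIES = 16                 # entries covered by the contiguous hint

sizes = {
    "contiguous ptes": CONT_ENTRIES * PAGE_SIZE,
    "pmd block": PTRS_PER_TABLE * PAGE_SIZE,
    "contiguous pmds": CONT_ENTRIES * PTRS_PER_TABLE * PAGE_SIZE,
}

for name, size in sizes.items():
    print(f"{name}: {size // 1024} KB")
```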

As I replied to your other email, I don't think that's simple for the
transparent huge pages case.

The main advantage I see with 64KB pages is not the reduced TLB pressure
but the number of levels of page tables. Take the AMD Seattle board for
example: with 4KB pages you need 4 levels, but 64KB pages need only 2
levels (42-bit VA). Larger TLBs and improved walk caches (caching VA ->
pmd entry translation rather than all the way to pte/PA) make things
better, but you still have the warm-up time for any fork/new process as
they don't share the same TLB entries.
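
The level counts come from simple arithmetic, sketched below. The helper
name is made up, entries are assumed to be 8 bytes, and the 48-bit VA
used for the 4KB case is my assumption (only the 42-bit figure for 64KB
is stated above):

```python
# Number of page table levels needed to cover a given VA width.

def page_table_levels(va_bits, page_shift, entry_shift=3):
    """Each level resolves (page_shift - entry_shift) bits of the VA."""
    bits_per_level = page_shift - entry_shift   # 9 for 4KB, 13 for 64KB
    bits_to_resolve = va_bits - page_shift      # VA bits above page offset
    return -(-bits_to_resolve // bits_per_level)  # ceiling division

print(page_table_levels(48, 12))  # 4KB pages, assumed 48-bit VA
print(page_table_levels(42, 16))  # 64KB pages, 42-bit VA
```

So every TLB miss that also misses the walk cache costs up to four
memory accesses with 4KB pages versus two with 64KB pages.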

But as Linus said already, the trade-off with the memory wastage
is highly dependent on the targeted load.

-- 
Catalin