Message-ID: <20091029103658.GJ9640@random.random>
Date:	Thu, 29 Oct 2009 11:36:58 +0100
From:	Andrea Arcangeli <aarcange@...hat.com>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Andi Kleen <andi@...stfloor.org>, linux-mm@...ck.org,
	Marcelo Tosatti <mtosatti@...hat.com>,
	Adam Litke <agl@...ibm.com>, Avi Kivity <avi@...hat.com>,
	Izik Eidus <ieidus@...hat.com>,
	Hugh Dickins <hugh.dickins@...cali.co.uk>,
	Nick Piggin <npiggin@...e.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel@...r.kernel.org
Subject: Re: RFC: Transparent Hugepage support

Hello Ingo, Andi, everyone,

On Thu, Oct 29, 2009 at 10:43:44AM +0100, Ingo Molnar wrote:
> 
> * Andi Kleen <andi@...stfloor.org> wrote:
> 
> > > 1GB pages can't be handled by this code, and clearly it's not 
> > > practical to hope 1G pages to materialize in the buddy (even if we
> > 
> > That seems shortsighted. You do this because 2MB pages give you x% 
> > performance advantage, but then it's likely that 1GB pages will give 
> > another y% improvement and why should people stop at the smaller 
> > improvement?
> > 
> > Ignoring the gigantic pages now would just mean that this would need 
> > to be revised later again or that users still need to use hacks like 
> > libhugetlbfs.
> 
> I've read the patch and have read through this discussion and you are 
> missing the big point that it's best to do such things gradually - one 
> step at a time.
> 
> Just like we went from 2 level pagetables to 3 level pagetables, then to 
> 4 level pagetables - and we might go to 5 level pagetables in the 
> future. We didn't go from 2 level pagetables to 5 level pagetables in 
> one go, despite predictions clearly pointing out the exponentially 
> increasing need for RAM.

I totally agree with your assessment.

> So your obsession with 1GB pages is misguided. If indeed transparent 
> largepages give us real benefits we can extend it to do transparent 
> gbpages as well - should we ever want to. There's nothing 'shortsighted' 
> about being gradual - the change is already ambitious enough as-is, and 
> brings very clear benefits to a difficult, decade-old problem no other 
> person was able to address.
> 
> In fact introducing transparent 2MBpages makes 1GB pages support 
> _easier_ to merge: as at that point we'll already have a (finally..) 
> successful hugetlb facility happily used by an increasing range of 
> applications.

Agreed.

> Hugetlbfs's big problem was always that it wasn't transparent and hence 
> wasn't gradual for applications. It was an opt-in and constituted an 
> interface/ABI change - that is always a big barrier to app adoption.
> 
> So i give Andrea's patch a very big thumbs up - i hope it gets reviewed 
> in fine detail and added to -mm ASAP. Our lack of decent, automatic 
> hugepage support is sticking out like a sore thumb and is hurting us in 
> high-performance setups. If largepage support within Linux has a chance, 
> this might be the way to do it.

Thanks a lot for your review!

> A small comment regarding the patch itself: i think it could be 
> simplified further by eliminating CONFIG_TRANSPARENT_HUGEPAGE and by 
> making it a natural feature of hugepage support. If the code is correct 
> i cannot see any scenario under which i wouldnt want a hugepage enabled 
> kernel i'm booting to not have transparent hugepage support as well.

The two reasons why I added a config option are:

1) because it was easy enough: gcc is smart enough to eliminate the
external calls, so I didn't need to add ifdefs, with the exception of
returning 0 from pmd_trans_huge and pmd_trans_frozen when the option is
off. I only had to make the exports of huge_memory.c visible
unconditionally so the build doesn't warn; after that I don't need to
build and link huge_memory.o.

2) to avoid breaking the build of archs that don't implement
pmd_trans_huge and that may never be able to take advantage of it.

But we could move CONFIG_TRANSPARENT_HUGEPAGE to an arch define forced
to Y on x86-64 and N on power.
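
To illustrate reason 1, here is a minimal standalone sketch of the
pattern, not the code from the patch. Everything prefixed demo_ or DEMO_
is hypothetical, and writing pmd_trans_huge as a macro is an assumption
(the mail only says it returns 0 when the option is off). The prototype
stays visible unconditionally, the test collapses to a constant 0, and
the compiler is relied upon to discard the branch, so the function that
would live in huge_memory.o never has to be built or linked.

```c
/* Hypothetical stand-in for CONFIG_TRANSPARENT_HUGEPAGE (0 = compiled out). */
#ifndef DEMO_TRANSPARENT_HUGEPAGE
#define DEMO_TRANSPARENT_HUGEPAGE 0
#endif

typedef unsigned long demo_pmd_t;   /* illustrative, not the kernel's pmd_t */
#define DEMO_PMD_HUGE 0x80UL        /* illustrative "huge" flag bit */

/*
 * Declared unconditionally, so callers compile without warnings whether
 * or not the feature is built.
 */
int demo_huge_fault(demo_pmd_t pmd);

#if DEMO_TRANSPARENT_HUGEPAGE
#define pmd_trans_huge(pmd) (((pmd) & DEMO_PMD_HUGE) != 0)
/* Stands in for code that would live in huge_memory.o. */
int demo_huge_fault(demo_pmd_t pmd) { (void)pmd; return 2; }
#else
/* The one place that needs an ifdef: the test becomes a constant 0. */
#define pmd_trans_huge(pmd) 0
/* No definition of demo_huge_fault() here: nothing is built or linked. */
#endif

/* Caller with no ifdefs: returns 2 on the huge path, 1 on the regular one. */
int demo_handle_fault(demo_pmd_t pmd)
{
	if (pmd_trans_huge(pmd))
		return demo_huge_fault(pmd);  /* dead code when the test is 0 */
	return 1;
}
```

With the default setting, demo_handle_fault() always takes the regular
path and the object links without demo_huge_fault() being defined.
Building with -DDEMO_TRANSPARENT_HUGEPAGE=1 enables the huge path for
entries that have the flag set.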