linux-kernel - Re: [PATCH v3 00/14] mm, hugetlb: remove a hugetlb_instantiation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <5339B6F4.9000809@intel.com>
Date:	Mon, 31 Mar 2014 11:41:56 -0700
From:	Dave Hansen <dave.hansen@...el.com>
To:	Davidlohr Bueso <davidlohr@...com>
CC:	Joonsoo Kim <iamjoonsoo.kim@....com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Rik van Riel <riel@...hat.com>, Mel Gorman <mgorman@...e.de>,
	Michal Hocko <mhocko@...e.cz>,
	"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Hugh Dickins <hughd@...gle.com>,
	Davidlohr Bueso <davidlohr.bueso@...com>,
	David Gibson <david@...son.dropbear.id.au>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, Joonsoo Kim <js1304@...il.com>,
	Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
	Hillf Danton <dhillf@...il.com>
Subject: Re: [PATCH v3 00/14] mm, hugetlb: remove a hugetlb_instantiation_mutex

On 03/31/2014 10:26 AM, Davidlohr Bueso wrote:
> On Mon, 2014-03-31 at 09:27 -0700, Dave Hansen wrote:
>> On 12/17/2013 10:53 PM, Joonsoo Kim wrote:
>>> * NOTE for v3
>>> - Updating patchset is so late because of other works, not issue from
>>> this patchset.
>>
>> I've got some folks with a couple TB of RAM seeing long startup times
>> with $LARGE_DATABASE_PRODUCT.  It looks to be contention on
>> hugetlb_instantiation_mutex because everyone is trying to zero hugepages
>> under that lock in parallel.  Just removing the lock sped things up
>> quite a bit.
> 
> Welcome to my world. Regarding the instantiation mutex, it is addressed,
> see commit c999c05ff595 in -next. 

Cool stuff.  That does seem to fix my parallel-fault hugetlbfs
microbenchmark.  I'll recommend that the $DATABASE folks check it as well.

> As for the clear page overhead, I brought this up in lsfmm last week,
> proposing some daemon to clear pages when we have idle cpu... but didn't
> get much positive feedback. Basically (i) not worth the additional
> complexity and (ii) can trigger different application startup times,
> which seems to be something negative. I do have a patch that implements
> huge_clear_page with non-temporal hinting but I didn't see much
> difference on my environment, would you want to give it a try?

I'd just be happy to see it happen outside of the locks.  As it stands
now, I have 1 CPU zeroing a huge page, and 159 sitting there sleeping
waiting for it to release the hugetlb_instantiation_mutex.  That's just
nonsense.  I don't think making them non-temporal will fundamentally
help that.  We need them parallelized.  According to ftrace, a
hugetlb_fault() takes ~700us.  Literally 99% of that is zeroing the page.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/