Date:   Thu, 25 Jan 2018 11:41:03 -0800
From:   Nitin Gupta <nitin.m.gupta@...cle.com>
To:     Zi Yan <zi.yan@...rutgers.edu>
Cc:     Michal Hocko <mhocko@...nel.org>,
        Nitin Gupta <nitingupta910@...il.com>,
        steven.sistare@...cle.com,
        Andrew Morton <akpm@...ux-foundation.org>,
        Ingo Molnar <mingo@...nel.org>, Mel Gorman <mgorman@...e.de>,
        Nadav Amit <namit@...are.com>,
        Minchan Kim <minchan@...nel.org>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Vegard Nossum <vegard.nossum@...cle.com>,
        "Levin, Alexander" <alexander.levin@...izon.com>,
        Mike Rapoport <rppt@...ux.vnet.ibm.com>,
        Hillf Danton <hillf.zj@...baba-inc.com>,
        Shaohua Li <shli@...com>,
        Anshuman Khandual <khandual@...ux.vnet.ibm.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        David Rientjes <rientjes@...gle.com>,
        Rik van Riel <riel@...hat.com>, Jan Kara <jack@...e.cz>,
        Dave Jiang <dave.jiang@...el.com>,
        Jérôme Glisse <jglisse@...hat.com>,
        Matthew Wilcox <willy@...ux.intel.com>,
        Ross Zwisler <ross.zwisler@...ux.intel.com>,
        Hugh Dickins <hughd@...gle.com>, Tobin C Harding <me@...in.cc>,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v2] mm: Reduce memory bloat with THP



On 01/24/2018 04:47 PM, Zi Yan wrote:
>>>> With this change, whenever an application issues MADV_DONTNEED on a
>>>> memory region, the region is marked as "space-efficient". For such
>>>> regions, a hugepage is not immediately allocated on first write.
>>> Kirill didn't like it in the previous version and I do not like this
>>> either. You are adding a very subtle side effect which might be completely
>>> unexpected. Consider a userspace memory allocator which uses MADV_DONTNEED
>>> to free up unused memory. Now you have put it out of THP usage
>>> basically.
>>>
>> Userspace may want a region to be considered by khugepaged while opting
>> out of hugepage allocation on first touch. Asking userspace memory
>> allocators to track and reclaim unused parts of a THP-allocated
>> hugepage does not seem right, as the kernel can use simple userspace
>> hints to avoid allocating extra memory in the first place.
>>
>> I agree that this patch is adding a subtle side-effect which may take
>> some applications by surprise. However, I often see the opposite too:
>> for many workloads, disabling THP is the first advice given, as this
>> aggressive allocation of hugepages on first touch is unexpected and too
>> wasteful. For example:
>>
>> 1) Disabling THP for TokuDB (Storage engine for MySQL, MariaDB)
>> http://www.chriscalender.com/disabling-transparent-hugepages-for-tokudb/
>>
>> 2) Disable THP on MongoDB
>> https://docs.mongodb.com/manual/tutorial/transparent-huge-pages/
>>
>> 3) Disable THP for Couchbase Server
>> https://blog.couchbase.com/often-overlooked-linux-os-tweaks/
>>
>> 4) Redis
>> http://antirez.com/news/84
>>
>>
>>> If the memory is really scarce then we have MADV_NOHUGEPAGE.
>>>
>> It's not really about memory scarcity but a more efficient use of it.
>> Applications may want hugepage benefits without requiring any changes to
>> app code which is what THP is supposed to provide, while still avoiding
>> memory bloat.
>>
> I read these links and find that there are mainly two complaints:
> 1. THP causes latency spikes, because direct compaction slows down THP allocation,
> 2. THP bloats memory footprint when jemalloc uses MADV_DONTNEED to return memory ranges smaller than
>    THP size and fails because of THP.
>
> The first complaint is not related to this patch.

I'm trying to address many different THP issues and memory bloat is
first among them.

> For second one, at least with recent kernels, MADV_DONTNEED splits THPs and returns the memory range you
> specified in madvise(). Am I missing anything?
>

Yes, MADV_DONTNEED splits THPs and releases the requested range, but this
does not solve the issue of the aggressive alloc-hugepage-on-first-touch
policy of THP=madvise on MADV_HUGEPAGE regions. Sure, some workloads may
prefer that policy, but for applications that don't, this patch gives them
an option to hint to the kernel to go for gradual hugepage promotion via
khugepaged only (and not on first touch).
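
For reference, the userspace side of this discussion comes down to two
madvise() calls. The sketch below is only an illustration and is not taken
from the patch; map_thp_region() and release_range() are hypothetical helper
names. Under the proposed change, the MADV_DONTNEED call would additionally
act as the "space-efficient" hint; on existing kernels it only drops the
pages in the given range.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for MAP_ANONYMOUS and MADV_HUGEPAGE */
#endif
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map an anonymous private region and opt it in to THP.
 * The MADV_HUGEPAGE hint is advisory and returns an error on kernels built
 * without THP support, so its result is deliberately ignored here. */
void *map_thp_region(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    (void)madvise(p, len, MADV_HUGEPAGE);
    return p;
}

/* Hypothetical helper: hand a page-aligned sub-range back to the kernel.
 * For private anonymous memory, the next access to the range sees
 * zero-filled pages. Returns 0 on success, -1 on error (errno set). */
int release_range(void *addr, size_t len)
{
    return madvise(addr, len, MADV_DONTNEED);
}
```

With THP=madvise, the first write into a region set up this way may fault in
a whole hugepage, which is the first-touch behaviour being discussed.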

It's not good if an application has to track which parts of its (implicitly
allocated) hugepages are in use and which sub-parts are free so that it can
issue MADV_DONTNEED calls on them. This approach really does not make THP
"transparent" and requires a lot of mm tracking code in userspace.
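
To make the tracking burden concrete, here is a rough sketch of the minimum
bookkeeping an allocator would need per hugepage-sized chunk. Everything in
it is an assumption for illustration: the struct and function names are
hypothetical, and the sizes assume 4 KiB base pages with 2 MiB hugepages
(as on x86-64).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for MADV_DONTNEED under strict -std modes */
#endif
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

#define SUBPAGE_SIZE  4096u                           /* assumed base page size */
#define HUGEPAGE_SIZE (2u << 20)                      /* assumed THP size */
#define SUBPAGES      (HUGEPAGE_SIZE / SUBPAGE_SIZE)  /* 512 subpages */

/* Hypothetical per-chunk state an allocator would have to maintain. */
struct thp_chunk {
    unsigned char *base;     /* start of a hugepage-sized, page-aligned range */
    bool in_use[SUBPAGES];   /* which subpages currently hold live data */
    size_t used;             /* number of subpages marked in use */
};

/* Record that subpage idx now holds live data. */
void chunk_mark_used(struct thp_chunk *c, size_t idx)
{
    if (idx < SUBPAGES && !c->in_use[idx]) {
        c->in_use[idx] = true;
        c->used++;
    }
}

/* Mark subpage idx free and return its memory to the kernel.
 * Returns 0 on success or if there was nothing to release, -1 on error. */
int chunk_release(struct thp_chunk *c, size_t idx)
{
    if (idx >= SUBPAGES || !c->in_use[idx])
        return 0;
    c->in_use[idx] = false;
    c->used--;
    return madvise(c->base + idx * SUBPAGE_SIZE, SUBPAGE_SIZE, MADV_DONTNEED);
}
```

A real allocator would also need to coalesce adjacent free subpages and decide
when a chunk is worth splitting at all, which adds further state on top of this.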

Nitin
