Date: Mon, 11 Mar 2024 09:55:44 +0000
From: Ryan Roberts <ryan.roberts@....com>
To: Barry Song <21cnbao@...il.com>
Cc: David Hildenbrand <david@...hat.com>, Lance Yang <ioworker0@...il.com>,
 Vishal Moola <vishal.moola@...il.com>, akpm@...ux-foundation.org,
 zokeefe@...gle.com, shy828301@...il.com, mhocko@...e.com,
 fengwei.yin@...el.com, xiehuan09@...il.com, wangkefeng.wang@...wei.com,
 songmuchun@...edance.com, peterx@...hat.com, minchan@...nel.org,
 linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/1] mm/madvise: enhance lazyfreeing with mTHP in
 madvise_free

[...]

>>>>> We don't want reclamation overhead later, and we want the memory to be
>>>>> immediately available to others.
>>>>
>>>> But by that logic, you also don't want to leave the large folio partially mapped
>>>> all the way until the last subpage is CoWed. Surely you would want to reclaim it
>>>> when you reach partial map status?
>>>
>>> To some extent, I agree. But then we will have too many copies. The last
>>> subpage is small, and a safe place to copy instead.
>>>
>>> We actually had to tune userspace to decrease partial mapping, as too much
>>> partial mapping both unfolded CONT-PTE and wasted too much memory. If a
>>> VMA had too much partial mapping, we disabled mTHP on this VMA.
>>
>> I actually had a whacky idea around introducing selectable page size ABI
>> per-process that might help here. I know Android is doing work to make the
>> system 16K page compatible. You could run most of the system processes with 16K
>> ABI on top of 4K kernel. Then those processes don't even have the ability to
>> madvise/munmap/mprotect/mremap anything less than 16K alignment so that acts as
>> an anti-fragmentation mechanism while allowing non-16K capable processes to run
>> side-by-side. Just a passing thought...
> 
> Right, this project faces a challenge in supporting legacy
> 4KiB-aligned applications, but I don't see any issue with running
> 16KiB-aligned applications on a kernel whose page size is 4KiB.

Yes, agreed that a 16K-aligned (or 64K-aligned) app will work without issue on a
4K kernel, but it will also use getpagesize() and know what the page size is.
I'm suggesting you could actually run these apps on a 4K kernel but with a 16K
ABI and potentially get close to the native 16K performance out of them. It's
just a thought though - I don't have any data that actually shows this is better
than just running on a 4K kernel with a 4K ABI, and using 16K or 64K mTHP
opportunistically.
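
To make the alignment rule concrete, here is a minimal userspace sketch of the
check that a per-process 16K ABI would imply for madvise/munmap/mprotect/mremap
ranges. Nothing like this exists in the kernel today; ABI_PAGE_SIZE and
abi_allows() are hypothetical names used purely for illustration, and
mmap.PAGESIZE stands in for what getpagesize() reports.

```python
import mmap

# Page size reported by the running kernel (what getpagesize() returns).
KERNEL_PAGE_SIZE = mmap.PAGESIZE

# Hypothetical per-process ABI page size; an assumption, not an existing knob.
ABI_PAGE_SIZE = 16 * 1024


def is_aligned(value, granule):
    """Return True if value is a multiple of granule."""
    return value % granule == 0


def abi_allows(addr, length, abi_page_size=ABI_PAGE_SIZE):
    """Return True if the range [addr, addr + length) would be accepted under
    the hypothetical ABI, i.e. both start and length are ABI-page aligned."""
    return (
        length > 0
        and is_aligned(addr, abi_page_size)
        and is_aligned(length, abi_page_size)
    )


if __name__ == "__main__":
    print("kernel page size:", KERNEL_PAGE_SIZE)
    # A 16K-aligned, 16K-long range passes the check.
    print(abi_allows(0x10000, 0x4000))
    # A 4K-long range is rejected, so it could not partially unmap a 16K block.
    print(abi_allows(0x10000, 0x1000))
```

Any range that passes this check is also 4K-aligned, which is why a 16K-aligned
app already runs on a 4K kernel. The difference with the ABI idea is that
sub-16K operations would be refused rather than merely unused.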

