linux-kernel - Re: [RFC PATCH v1 0/5] Alternative mTHP swap allocator improvements

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <f9f105d9-77ba-427c-9958-92710f70716b@arm.com>
Date: Wed, 19 Jun 2024 10:17:08 +0100
From: Ryan Roberts <ryan.roberts@....com>
To: "Huang, Ying" <ying.huang@...el.com>,
 Andrew Morton <akpm@...ux-foundation.org>
Cc: Chris Li <chrisl@...nel.org>, Kairui Song <kasong@...cent.com>,
 Kalesh Singh <kaleshsingh@...gle.com>, Barry Song <baohua@...nel.org>,
 Hugh Dickins <hughd@...gle.com>, David Hildenbrand <david@...hat.com>,
 linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [RFC PATCH v1 0/5] Alternative mTHP swap allocator improvements

On 19/06/2024 08:19, Huang, Ying wrote:
> Hi, Ryan,
> 
> Ryan Roberts <ryan.roberts@....com> writes:
> 
>> Hi All,
>>
>> Chris has been doing great work at [1] to clean up my mess in the mTHP swap
>> entry allocator.
> 
> I don't think the original behavior is something like mess.  It's just
> the first step in the correct direction.  It's straightforward and
> obviously correctly.  Then, we can optimize it step by step with data to
> justify the increased complexity.

OK, perhaps I was over-egging it by calling it a "mess". What you're describing
was my initial opinion too, but I saw Andrew complaining that we shouldn't be
merging a feature if it doesn't work. This series fixes the problem in a minimal
way - if you ignore the last patch, which is really is just a performance
optimization and could be dropped.

If we can ultimately get Chris's series to 0% fallback like this one, and
everyone is happy with the current state for v6.10, then agreed - let's
concentrate on Chris's series for v6.11.

Thanks,
Ryan

> 
>> But Barry posted a test program and results at [2] showing that
>> even with Chris's changes, there are still some fallbacks (around 5% - 25% in
>> some cases). I was interested in why that might be and ended up putting this PoC
>> patch set together to try to get a better understanding. This series ends up
>> achieving 0% fallback, even with small folios ("-s") enabled. I haven't done
>> much testing beyond that (yet) but thought it was worth posting on the strength
>> of that result alone.
>>
>> At a high level this works in a similar way to Chris's series; it marks a
>> cluster as being for a particular order and if a new cluster cannot be allocated
>> then it scans through the existing non-full clusters. But it does it by scanning
>> through the clusters rather than assembling them into a list. Cluster flags are
>> used to mark clusters that have been scanned and are known not to have enough
>> contiguous space, so the efficiency should be similar in practice.
>>
>> Because its not based around a linked list, there is less churn and I'm
>> wondering if this is perhaps easier to review and potentially even get into
>> v6.10-rcX to fix up what's already there, rather than having to wait until v6.11
>> for Chris's series? I know Chris has a larger roadmap of improvements, so at
>> best I see this as a tactical fix that will ultimately be superseeded by Chris's
>> work.
> 
> I don't think we need any mTHP swap entry allocation optimization to go
> into v6.10-rcX.  There's no functionality or performance regression.
> Per my understanding, we merge optimization when it's ready.
> 
> Hi, Andrew,
> 
> Please correct me if you don't agree.
> 
> [snip]
> 
> --
> Best Regards,
> Huang, Ying