linux-kernel - Re: [RFC PATCH] mm: Promote slow memory in advance to improve performance


Message-ID: <31367b55-d3a4-3b2b-8d5d-86b8dfce7383@linux.alibaba.com>
Date:   Tue, 23 Nov 2021 21:27:10 +0800
From:   Baolin Wang <baolin.wang@...ux.alibaba.com>
To:     "Huang, Ying" <ying.huang@...el.com>
Cc:     akpm@...ux-foundation.org, dave.hansen@...ux.intel.com,
        ziy@...dia.com, shy828301@...il.com,
        zhongjiang-ali@...ux.alibaba.com, xlpang@...ux.alibaba.com,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        Mel Gorman <mgorman@...hsingularity.net>
Subject: Re: [RFC PATCH] mm: Promote slow memory in advance to improve
 performance



On 2021/11/23 10:53, Huang, Ying wrote:
> Baolin Wang <baolin.wang@...ux.alibaba.com> writes:
> 
>> Some workloads that access a set of data entities follow data locality,
>> also known as locality of reference: data is likely to be accessed soon
>> after some nearby data has been accessed.
>>
>> Systems with different memory types rely on NUMA balancing to promote
>> hot pages from slow memory to fast memory to improve performance. For
>> workloads with such locality, we can therefore promote several sequential
>> pages from slow memory at one time to improve performance.
>>
>> Testing with mysql shows about a 5% performance improvement, as below.
>>
>> Machine: 16 CPUs, 64G DRAM, 256G AEP
>>
>> sysbench /usr/share/sysbench/tests/include/oltp_legacy/oltp.lua
>> --mysql-user=root --mysql-password=root --oltp-test-mode=complex
>> --oltp-tables-count=65 --oltp-table-size=5000000 --threads=20 --time=600
>> --report-interval=10
>>
>> No proactive promotion:
>> transactions
>> 2259245 (3765.37 per sec.)
>> 2312605 (3854.31 per sec.)
>> 2325907 (3876.47 per sec.)
>>
>> Proactive promotion bytes=16384:
>> transactions
>> 2419023 (4031.66 per sec.)
>> 2451903 (4086.47 per sec.)
>> 2441941 (4068.68 per sec.)
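
As a quick check of the figures quoted above, the per-second rates of the three runs in each configuration can be averaged and compared. The following is only a small arithmetic sketch; the helper names mean_tps() and gain_percent() are made up for illustration and do not come from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Transactions per second from the three runs quoted above. */
static const double baseline_tps[] = { 3765.37, 3854.31, 3876.47 };
static const double promoted_tps[] = { 4031.66, 4086.47, 4068.68 };

/* Arithmetic mean of n samples (hypothetical helper). */
static double mean_tps(const double *runs, size_t n)
{
	double sum = 0.0;

	assert(n > 0);
	for (size_t i = 0; i < n; i++)
		sum += runs[i];
	return sum / (double)n;
}

/* Relative gain of 'improved' over 'baseline', in percent. */
static double gain_percent(double baseline, double improved)
{
	assert(baseline > 0.0);
	return (improved - baseline) / baseline * 100.0;
}
```

Averaging gives roughly 3832 transactions per second without proactive promotion and roughly 4062 with it, a gain of about 6%, which is in line with the "about 5%" stated above.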
> 
> This is a kind of readahead that promotes pages before we know they are
> hot. It can definitely benefit performance if we predict correctly, but
> may hurt if we predict wrongly.

Right.
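
To make the readahead analogy concrete, here is a minimal userspace sketch of how a promotion window around a faulting address could be chosen. It is not the code from the patch: the aligned-window policy, the helper promo_window_for(), the struct, and the 4 KiB page size are all assumptions for illustration, and the VMA bounds are assumed to be page aligned.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumed 4 KiB base pages */

struct promo_window {
	uint64_t start;		/* first byte of the window */
	uint64_t end;		/* one past the last byte */
	unsigned long nr_pages;	/* candidate pages in [start, end) */
};

/*
 * Hypothetical helper: pick the window_bytes-aligned range containing
 * fault_addr and clamp it to the VMA, so that neighbours of the hot page
 * become promotion candidates together with it.
 */
static struct promo_window promo_window_for(uint64_t fault_addr,
					     uint64_t window_bytes,
					     uint64_t vma_start,
					     uint64_t vma_end)
{
	struct promo_window w;

	/* The window must be a power of two and at least one page. */
	assert(window_bytes >= SKETCH_PAGE_SIZE);
	assert((window_bytes & (window_bytes - 1)) == 0);
	assert(vma_start <= fault_addr && fault_addr < vma_end);

	w.start = fault_addr & ~(window_bytes - 1);
	w.end = w.start + window_bytes;
	if (w.start < vma_start)
		w.start = vma_start;
	if (w.end > vma_end)
		w.end = vma_end;
	w.nr_pages = (unsigned long)((w.end - w.start) / SKETCH_PAGE_SIZE);
	return w;
}
```

With the 16384-byte setting used in the test above and 4 KiB pages, each hinting fault would cover up to four pages; if only the faulting page turns out to be hot, the other three migrations are wasted work, which is the misprediction cost mentioned above.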

> 
> Is it possible to add a self-adaptive algorithm, like the one in
> readahead, to adjust the fault-around window dynamically?  A system-level
> knob may not be sufficient to fit all workloads running on the system.

That's a good point. I also thought about it, but have only implemented a
simple approach for now. OK, I will try to implement a more flexible
approach that adjusts the fault-around window dynamically and measure the
performance. Thanks for your input.
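
One possible shape for such a dynamic adjustment is sketched below. This is neither the kernel's readahead algorithm nor a planned implementation, only a simple grow-on-hit, shrink-on-miss heuristic in the same spirit. The struct, the function adaptive_window_update(), the feedback counters, and the 50% threshold are all assumptions for illustration.

```c
#include <assert.h>

/* Hypothetical per-task (or per-VMA) state for the promotion window. */
struct adaptive_window {
	unsigned int nr_pages;	/* current window size, in pages */
	unsigned int min_pages;	/* lower bound, e.g. 1 (no lookahead) */
	unsigned int max_pages;	/* upper bound for the window */
};

/*
 * Feed back the outcome of earlier speculative promotions:
 * 'promoted' pages were migrated ahead of time, and 'used' of them were
 * actually accessed afterwards.  Double the window when at least half of
 * the predictions paid off, otherwise halve it.
 */
static void adaptive_window_update(struct adaptive_window *aw,
				   unsigned int promoted,
				   unsigned int used)
{
	assert(used <= promoted);
	assert(aw->min_pages >= 1 && aw->min_pages <= aw->max_pages);

	if (promoted == 0)
		return;		/* no feedback yet, keep the window */

	if (used * 2 >= promoted) {
		if (aw->nr_pages * 2 <= aw->max_pages)
			aw->nr_pages *= 2;
		else
			aw->nr_pages = aw->max_pages;
	} else {
		if (aw->nr_pages / 2 >= aw->min_pages)
			aw->nr_pages /= 2;
		else
			aw->nr_pages = aw->min_pages;
	}
}
```

Because the state lives in a small per-task or per-VMA structure rather than in a single system-wide knob, workloads with good locality could grow their window while others shrink back to single-page promotion. How to measure 'used' cheaply is left open here.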