Message-ID: <31367b55-d3a4-3b2b-8d5d-86b8dfce7383@linux.alibaba.com>
Date: Tue, 23 Nov 2021 21:27:10 +0800
From: Baolin Wang <baolin.wang@...ux.alibaba.com>
To: "Huang, Ying" <ying.huang@...el.com>
Cc: akpm@...ux-foundation.org, dave.hansen@...ux.intel.com,
ziy@...dia.com, shy828301@...il.com,
zhongjiang-ali@...ux.alibaba.com, xlpang@...ux.alibaba.com,
linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Mel Gorman <mgorman@...hsingularity.net>
Subject: Re: [RFC PATCH] mm: Promote slow memory in advance to improve
performance
On 2021/11/23 10:53, Huang, Ying wrote:
> Baolin Wang <baolin.wang@...ux.alibaba.com> writes:
>
>> Some workloads that access a set of data entities follow data locality,
>> also known as locality of reference: data is likely to be accessed soon
>> after some nearby data has been accessed.
>>
>> Systems with different memory types rely on NUMA balancing to promote
>> hot pages from slow memory to fast memory to improve performance. For
>> workloads with data locality, we can therefore promote several sequential
>> pages from slow memory at one time to improve performance.
>>
>> Testing with mysql shows about a 5% performance improvement, as below.
>>
>> Machine: 16 CPUs, 64G DRAM, 256G AEP
>>
>> sysbench /usr/share/sysbench/tests/include/oltp_legacy/oltp.lua
>> --mysql-user=root --mysql-password=root --oltp-test-mode=complex
>> --oltp-tables-count=65 --oltp-table-size=5000000 --threads=20 --time=600
>> --report-interval=10
>>
>> No proactive promotion:
>> transactions
>> 2259245 (3765.37 per sec.)
>> 2312605 (3854.31 per sec.)
>> 2325907 (3876.47 per sec.)
>>
>> Proactive promotion bytes=16384:
>> transactions
>> 2419023 (4031.66 per sec.)
>> 2451903 (4086.47 per sec.)
>> 2441941 (4068.68 per sec.)
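
For reference, the per-second rates quoted above can be averaged to check
the size of the improvement. This is only a sketch in Python; the helper
name improvement_percent is made up for illustration, and the numbers are
copied from the two result sets above. Over the three runs the mean
throughput gain comes out at roughly 6%.

```python
from statistics import mean

# Transactions per second, copied from the sysbench results quoted above.
baseline = [3765.37, 3854.31, 3876.47]    # no proactive promotion
proactive = [4031.66, 4086.47, 4068.68]   # proactive promotion bytes=16384


def improvement_percent(before, after):
    """Relative change of the mean of `after` over the mean of `before`, in percent."""
    return (mean(after) / mean(before) - 1.0) * 100.0


if __name__ == "__main__":
    print(f"mean baseline:  {mean(baseline):.2f} tps")
    print(f"mean proactive: {mean(proactive):.2f} tps")
    print(f"improvement:    {improvement_percent(baseline, proactive):.2f} %")
```
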
>
> This is a kind of readahead: it promotes the page before we know it's hot.
> It can definitely benefit the performance if we predict correctly, but
> may hurt if we predict wrongly.
Right.
>
> Is it possible to add a self-adaptive algorithm, like the one in
> readahead, to adjust the fault-around window dynamically? A system-level
> knob may not be sufficient to fit all workloads running on the system.
That's a good point. I also thought about it, but have only implemented a
simple approach so far. I will try to implement a more flexible approach
that adjusts the fault-around window dynamically, and measure the
performance. Thanks for your input.
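
To make the idea concrete, here is a rough user-space model in Python of
what a self-adjusting window could look like. It is not kernel code and not
what the RFC patch does. The class name PromoteWindow, the 0.75/0.25
thresholds, the 16-page cap and the per-period feedback counters are all
assumptions for illustration; the starting size of 4 pages corresponds to
the bytes=16384 setting tested above, assuming 4 KiB pages. The window grows
when most proactively promoted pages are accessed afterwards and shrinks
when most are not, similar in spirit to how readahead scales its window.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed 4 KiB base pages


@dataclass
class PromoteWindow:
    """Hypothetical promotion window, counted in pages (illustration only)."""
    pages: int = 16384 // PAGE_SIZE  # start at the tested 16384 bytes (4 pages)
    min_pages: int = 1               # always promote at least the faulting page
    max_pages: int = 16              # assumed upper bound

    def update(self, promoted: int, reused: int) -> int:
        """Adjust the window from one sampling period's feedback.

        promoted: pages promoted ahead of any access during the period
        reused:   how many of those pages were accessed afterwards
        Returns the new window size in pages.
        """
        if promoted == 0:
            return self.pages  # no feedback this period, keep the window
        hit_ratio = reused / promoted
        if hit_ratio >= 0.75:
            # Prediction mostly right: double the window, up to the cap.
            self.pages = min(self.pages * 2, self.max_pages)
        elif hit_ratio < 0.25:
            # Prediction mostly wrong: halve the window, down to the floor.
            self.pages = max(self.pages // 2, self.min_pages)
        return self.pages


if __name__ == "__main__":
    window = PromoteWindow()
    # (promoted, reused) pairs for four made-up sampling periods
    for promoted, reused in [(4, 4), (8, 7), (16, 2), (16, 1)]:
        print(promoted, reused, "->", window.update(promoted, reused), "pages")
```

Keeping this state per task or per VMA rather than system-wide would let
workloads with poor locality fall back to single-page promotion while
sequential workloads keep a larger window. Where the reuse feedback would
come from in the kernel is left open here.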