Message-ID: <87ilwjbn1j.fsf@yhuang6-desk2.ccr.corp.intel.com>
Date:   Tue, 23 Nov 2021 10:53:12 +0800
From:   "Huang, Ying" <ying.huang@...el.com>
To:     Baolin Wang <baolin.wang@...ux.alibaba.com>
Cc:     <akpm@...ux-foundation.org>, <dave.hansen@...ux.intel.com>,
        <ziy@...dia.com>, <shy828301@...il.com>,
        <zhongjiang-ali@...ux.alibaba.com>, <xlpang@...ux.alibaba.com>,
        <linux-mm@...ck.org>, <linux-kernel@...r.kernel.org>,
        Mel Gorman <mgorman@...hsingularity.net>
Subject: Re: [RFC PATCH] mm: Promote slow memory in advance to improve
 performance

Baolin Wang <baolin.wang@...ux.alibaba.com> writes:

> Some workloads access a set of data entities with data locality, also
> known as locality of reference: once some data has been accessed, nearby
> data is likely to be accessed soon after.
>
> Systems with different memory types rely on NUMA balancing to promote hot
> pages from slow memory to fast memory to improve performance. So, for
> workloads with such locality, we can promote several sequential pages
> from slow memory at one time to improve performance.
>
> Testing with MySQL shows about a 5% performance improvement, as below.
>
> Machine: 16 CPUs, 64G DRAM, 256G AEP
>
> sysbench /usr/share/sysbench/tests/include/oltp_legacy/oltp.lua
> --mysql-user=root --mysql-password=root --oltp-test-mode=complex
> --oltp-tables-count=65 --oltp-table-size=5000000 --threads=20 --time=600
> --report-interval=10
>
> No proactive promotion:
> transactions
> 2259245 (3765.37 per sec.)
> 2312605 (3854.31 per sec.)
> 2325907 (3876.47 per sec.)
>
> Proactive promotion bytes=16384:
> transactions
> 2419023 (4031.66 per sec.)
> 2451903 (4086.47 per sec.)
> 2441941 (4068.68 per sec.)
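As a quick check on the figures quoted above, the per-second rates of the three runs in each configuration can be averaged and compared. This is only a sketch: the variable and function names are made up for illustration, and the only data used are the six rates from the report.

```python
from statistics import mean

# Transactions per second, copied from the three runs quoted above.
baseline_tps = [3765.37, 3854.31, 3876.47]   # no proactive promotion
promoted_tps = [4031.66, 4086.47, 4068.68]   # proactive promotion bytes=16384


def relative_gain(before, after):
    """Return the percentage change of the mean of `after` over the mean of `before`."""
    return (mean(after) - mean(before)) / mean(before) * 100.0


gain = relative_gain(baseline_tps, promoted_tps)

if __name__ == "__main__":
    print(f"mean baseline: {mean(baseline_tps):.2f} tps")
    print(f"mean promoted: {mean(promoted_tps):.2f} tps")
    print(f"relative gain: {gain:.1f}%")
```

With these six samples the mean gain comes out at about 6%, in line with the rough 5% figure in the description. Three runs per configuration is a small sample, so the number is indicative at best.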

This is a kind of readahead: it promotes a page before we know it's hot.
It can definitely benefit performance if we predict correctly, but it
may hurt if we predict wrongly.

Is it possible to add a self-adaptive algorithm, like the one in
readahead, that adjusts the fault-around window dynamically?  A
system-level knob may not be sufficient to fit all the workloads running
on a system.
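
To make the suggestion concrete, here is one possible shape for such a self-adaptive window. It is a toy user-space model, not kernel code and not what the patch does: the class name, the thresholds, the doubling and halving policy, and the idea of counting how many proactively promoted pages were later accessed are all assumptions made for illustration. The starting value of 4 pages corresponds to the 16384 bytes used in the test only if pages are 4 KiB.

```python
from dataclasses import dataclass


@dataclass
class PromotionWindow:
    """Hypothetical promotion window, counted in pages."""
    pages: int = 4            # current window (16384 bytes if pages are 4 KiB)
    min_pages: int = 1        # never shrink below a single page
    max_pages: int = 64       # assumed upper bound, not taken from the patch
    grow_at: float = 0.75     # hit ratio at or above which the window doubles
    shrink_at: float = 0.25   # hit ratio at or below which the window halves

    def update(self, promoted: int, accessed: int) -> int:
        """Adjust the window from one sampling period's feedback.

        promoted: pages promoted ahead of time during the period
        accessed: how many of those pages were accessed afterwards
        """
        if promoted <= 0:
            return self.pages  # no feedback, keep the current window
        hit_ratio = accessed / promoted
        if hit_ratio >= self.grow_at:
            # Predictions were mostly right: promote more aggressively.
            self.pages = min(self.pages * 2, self.max_pages)
        elif hit_ratio <= self.shrink_at:
            # Predictions were mostly wrong: back off.
            self.pages = max(self.pages // 2, self.min_pages)
        return self.pages


if __name__ == "__main__":
    window = PromotionWindow()
    for promoted, accessed in [(4, 4), (8, 7), (16, 2), (8, 4)]:
        print(promoted, accessed, "->", window.update(promoted, accessed))
```

Keeping one such window per workload (per process or per VMA, say) rather than one global value is what would let different workloads on the same system settle on different window sizes.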

Best Regards,
Huang, Ying

> Suggested-by: Xunlei Pang <xlpang@...ux.alibaba.com>
> Signed-off-by: Baolin Wang <baolin.wang@...ux.alibaba.com>
> ---
> Note: This patch is based on "NUMA balancing: optimize memory placement
> for memory tiering system" [1] from Huang Ying.
>
> [1] https://lore.kernel.org/lkml/87bl2gsnrd.fsf@yhuang6-desk2.ccr.corp.intel.com/T/

[snip]
