Message-ID: <88FCC7AA-FAAA-4B87-B382-50BD54B2886B@nvidia.com>
Date: Wed, 09 Jun 2021 14:30:18 -0400
From: Zi Yan <ziy@...dia.com>
To: Mel Gorman <mgorman@...hsingularity.net>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Dave Hansen <dave.hansen@...ux.intel.com>,
Vlastimil Babka <vbabka@...e.cz>,
Michal Hocko <mhocko@...nel.org>,
Jesper Dangaard Brouer <brouer@...hat.com>,
LKML <linux-kernel@...r.kernel.org>,
Linux-MM <linux-mm@...ck.org>
Subject: Re: [PATCH 2/2] mm/page_alloc: Allow high-order pages to be stored on the per-cpu lists
On 3 Jun 2021, at 10:22, Mel Gorman wrote:
> The per-cpu page allocator (PCP) only stores order-0 pages. This means
> that all THP and "cheap" high-order allocations, including SLUB's, contend
> on the zone->lock. This patch extends the PCP allocator to store THP and
> "cheap" high-order pages. Note that struct per_cpu_pages increases in
> size to 256 bytes (4 cache lines) on x86-64.
>
> Note that this is not necessarily a universal performance win because of
> how it is implemented. High-order pages can cause pcp->high to be exceeded
> prematurely for lower orders; for example, a large number of THP pages
> being freed could push order-0 pages off the PCP lists. Hence, much
> depends on the allocation/free pattern as observed by a single CPU to
> determine if caching helps or hurts a particular workload.
>
> That said, basic performance testing passed. The following is a netperf
> UDP_STREAM test which hits the relevant patches as some of the network
> allocations are high-order.
>
> netperf-udp
> 5.13.0-rc2 5.13.0-rc2
> mm-pcpburst-v3r4 mm-pcphighorder-v1r7
> Hmean send-64 261.46 ( 0.00%) 266.30 * 1.85%*
> Hmean send-128 516.35 ( 0.00%) 536.78 * 3.96%*
> Hmean send-256 1014.13 ( 0.00%) 1034.63 * 2.02%*
> Hmean send-1024 3907.65 ( 0.00%) 4046.11 * 3.54%*
> Hmean send-2048 7492.93 ( 0.00%) 7754.85 * 3.50%*
> Hmean send-3312 11410.04 ( 0.00%) 11772.32 * 3.18%*
> Hmean send-4096 13521.95 ( 0.00%) 13912.34 * 2.89%*
> Hmean send-8192 21660.50 ( 0.00%) 22730.72 * 4.94%*
> Hmean send-16384 31902.32 ( 0.00%) 32637.50 * 2.30%*
>
> From a functional point of view, a patch like this is necessary to
> make bulk allocation of high-order pages work with similar performance
> to order-0 bulk allocations. The bulk allocator is not updated in this
> series as it would have to be determined by bulk allocation users how
> they want to track the order of pages allocated with the bulk allocator.
>
> Signed-off-by: Mel Gorman <mgorman@...hsingularity.net>
> Acked-by: Vlastimil Babka <vbabka@...e.cz>
> ---
> include/linux/mmzone.h | 20 +++++-
> mm/internal.h | 2 +-
> mm/page_alloc.c | 159 +++++++++++++++++++++++++++++------------
> mm/swap.c | 2 +-
> 4 files changed, 135 insertions(+), 48 deletions(-)
>
Hi Mel,
I am not able to boot my QEMU VM with v5.13-rc5-mmotm-2021-06-07-18-33.
git bisect points to this patch. The VM got stuck at “Booting from ROM…”.
My kernel config is attached and my qemu command is:
qemu-system-x86_64 -kernel ~/repos/linux-1gb-thp/arch/x86/boot/bzImage \
-drive file=~/qemu-image/vm.qcow2,if=virtio \
-append "nokaslr root=/dev/vda1 rw console=ttyS0 " \
-pidfile vm.pid \
-netdev user,id=mynet0,hostfwd=tcp::11022-:22 \
-device virtio-net-pci,netdev=mynet0 \
-m 16g -smp 6 -cpu host -enable-kvm -nographic \
-machine hmat=on -object memory-backend-ram,size=8g,id=m0 \
-object memory-backend-ram,size=8g,id=m1 \
-numa node,memdev=m0,nodeid=0 -numa node,memdev=m1,nodeid=1
The attached config has THP disabled. The VM cannot boot with THP enabled,
either.
--
Best Regards,
Yan, Zi
[Attachment: ".config", text/plain, 134717 bytes]
[Attachment: "signature.asc", application/pgp-signature, 855 bytes]