Date:   Thu, 29 Jun 2023 09:41:19 +0800
From:   Yicong Yang <yangyicong@...wei.com>
To:     Barry Song <21cnbao@...il.com>, Yajun Deng <yajun.deng@...ux.dev>,
        Yicong Yang <yangyicong@...ilicon.com>,
        Tian Tao <tiantao6@...ilicon.com>
CC:     <v-songbaohua@...o.com>, Christoph Hellwig <hch@....de>,
        <corbet@....net>, <catalin.marinas@....com>, <will@...nel.org>,
        <m.szyprowski@...sung.com>, <robin.murphy@....com>,
        <paulmck@...nel.org>, <bp@...e.de>, <peterz@...radead.org>,
        <rdunlap@...radead.org>, <kim.phillips@....com>,
        <rostedt@...dmis.org>, <thunder.leizhen@...wei.com>,
        <ardb@...nel.org>, <bhe@...hat.com>, <anshuman.khandual@....com>,
        <song.bao.hua@...ilicon.com>, <linux-doc@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>,
        <linux-arm-kernel@...ts.infradead.org>, <iommu@...ts.linux.dev>,
        Petr Tesařík <petr@...arici.cz>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH] dma-contiguous: support per-numa CMA for all
 architectures

On 2023/6/26 13:32, Barry Song wrote:
> On Sun, Jun 25, 2023 at 7:30 PM Yajun Deng <yajun.deng@...ux.dev> wrote:
>>
>> June 24, 2023 8:40 AM, "Andrew Morton" <akpm@...ux-foundation.org> wrote:
>>
>>> On Mon, 15 May 2023 13:38:21 +0200 Petr Tesařík <petr@...arici.cz> wrote:
>>>
>>>> On Mon, 15 May 2023 11:23:27 +0000
>>>> "Yajun Deng" <yajun.deng@...ux.dev> wrote:
>>>>
>>>> May 15, 2023 5:49 PM, "Christoph Hellwig" <hch@....de> wrote:
>>>
>>> This looks fine to me. Can you please work with Barry to make sure
>>> the slight different place of the initcall doesn't break anything
>>> for his setup? I doubt it would, but I'd rather have a Tested-by:
>>> tag.
>>>> Barry's email is no longer in use. I can't reach him.
>>>>
>>>> Which one? I would hope that his Gmail account is still valid:
>>>>
>>>> Barry Song <21cnbao@...il.com>
>>>
>>> Maybe his kernel.org address works...
>>>
>>> I have this patch stuck in limbo for 6.4. I guess I'll carry it over
>>> into the next -rc cycle, see what happens.
>>>
>>> fwiw, it has been in -next for six weeks, no known issues.
>>
>> Hi, Barry, The slight different place of the initcall, does break anything?
> 
> i don't see a fundamental difference as anyway it is still after
> arch_numa_init()
> which is really what we depend on.
> 
> and i did a test on qemu with the command line:
> qemu-system-aarch64 -M virt,gic-version=3 -nographic \
>  -smp cpus=8 \
>  -numa node,cpus=0-1,nodeid=0 \
>  -numa node,cpus=2-3,nodeid=1 \
>  -numa node,cpus=4-5,nodeid=2 \
>  -numa node,cpus=6-7,nodeid=3 \
>  -numa dist,src=0,dst=1,val=12 \
>  -numa dist,src=0,dst=2,val=20 \
>  -numa dist,src=0,dst=3,val=22 \
>  -numa dist,src=1,dst=2,val=22 \
>  -numa dist,src=2,dst=3,val=12 \
>  -numa dist,src=1,dst=3,val=24 \
>  -m 4096M -cpu cortex-a57 -kernel arch/arm64/boot/Image \
>  -nographic -append "cma_pernuma=32M root=/dev/vda2  rw ip=dhcp
> sched_debug irqchip.gicv3_pseudo_nmi=1" \
>  -drive if=none,file=extra/ubuntu16.04-arm64.img,id=hd0 -device
> virtio-blk-device,drive=hd0 \
>  -net nic -net user,hostfwd=tcp::2222-:22
> 
> and in system, i can see all cma areas are correctly reserved:
> ~# dmesg | grep cma
> [    0.000000] cma: cma_declare_contiguous_nid(size
> 0x0000000002000000, base 0x0000000000000000, limit 0x0000000000000000
> alignment 0x0000000000000000)
> [    0.000000] cma: Reserved 32 MiB at 0x000000007ce00000
> [    0.000000] cma: dma_pernuma_cma_reserve: reserved 32 MiB on node 0
> [    0.000000] cma: cma_declare_contiguous_nid(size
> 0x0000000002000000, base 0x0000000000000000, limit 0x0000000000000000
> alignment 0x0000000000000000)
> [    0.000000] cma: Reserved 32 MiB at 0x00000000bce00000
> [    0.000000] cma: dma_pernuma_cma_reserve: reserved 32 MiB on node 1
> [    0.000000] cma: cma_declare_contiguous_nid(size
> 0x0000000002000000, base 0x0000000000000000, limit 0x0000000000000000
> alignment 0x0000000000000000)
> [    0.000000] cma: Reserved 32 MiB at 0x00000000fce00000
> [    0.000000] cma: dma_pernuma_cma_reserve: reserved 32 MiB on node 2
> [    0.000000] cma: cma_declare_contiguous_nid(size
> 0x0000000002000000, base 0x0000000000000000, limit 0x0000000000000000
> alignment 0x0000000000000000)
> [    0.000000] cma: Reserved 32 MiB at 0x0000000100000000
> [    0.000000] cma: dma_pernuma_cma_reserve: reserved 32 MiB on node 3
> [    0.000000] cma: dma_contiguous_reserve(limit 100000000)
> [    0.000000] cma: dma_contiguous_reserve: reserving 32 MiB for global area
> [    0.000000] cma: cma_declare_contiguous_nid(size
> 0x0000000002000000, base 0x0000000000000000, limit 0x0000000100000000
> alignment 0x0000000000000000)
> [    0.000000] cma: Reserved 32 MiB at 0x00000000fae00000
> [    0.000000] Kernel command line: cma_pernuma=32M root=/dev/vda2  rw
> ip=dhcp sched_debug irqchip.gicv3_pseudo_nmi=1
> [    0.000000] Memory: 3848784K/4194304K available (16128K kernel
> code, 4152K rwdata, 10244K rodata, 8512K init, 612K bss, 181680K
> reserved, 163840K cma-reserved)
> [    0.175309] cma: cma_alloc(cma (____ptrval____), count 128, align 7)
> [    0.179264] cma: cma_alloc(): returned (____ptrval____)
> [    0.179869] cma: cma_alloc(cma (____ptrval____), count 128, align 7)
> [    0.180027] cma: cma_alloc(): returned (____ptrval____)
> [    0.180187] cma: cma_alloc(cma (____ptrval____), count 128, align 7)
> [    0.180374] cma: cma_alloc(): returned (____ptrval____)
> 
> so my feeling is that this patch is fine. but I would prefer Yicong
> and Tiantao who have a real numa machine
> and we can get some real device drivers to call dma APIs to allocate
> memory from pernuma cma on arm64
> even though it is 99.9% OK.
> 

Tested on our 4-node NUMA arm64 server, on a kernel based on mainline
commit 1ef6663a587b. This patch works well, so:

Tested-by: Yicong Yang <yangyicong@...ilicon.com>

For pernuma cma reservation:
[    0.000000] cma: cma_declare_contiguous_nid(size 0x0000000040000000, base 0x0000000000000000, limit 0x0000000000000000 alignment 0x0000000000000000)
[    0.000000] cma: Reserved 1024 MiB at 0x0000002081800000
[    0.000000] cma: dma_pernuma_cma_reserve: reserved 1024 MiB on node 0
[    0.000000] cma: cma_declare_contiguous_nid(size 0x0000000040000000, base 0x0000000000000000, limit 0x0000000000000000 alignment 0x0000000000000000)
[    0.000000] cma: Reserved 1024 MiB at 0x0000004000000000
[    0.000000] cma: dma_pernuma_cma_reserve: reserved 1024 MiB on node 1
[    0.000000] cma: cma_declare_contiguous_nid(size 0x0000000040000000, base 0x0000000000000000, limit 0x0000000000000000 alignment 0x0000000000000000)
[    0.000000] cma: Reserved 1024 MiB at 0x0000202000000000
[    0.000000] cma: dma_pernuma_cma_reserve: reserved 1024 MiB on node 2
[    0.000000] cma: cma_declare_contiguous_nid(size 0x0000000040000000, base 0x0000000000000000, limit 0x0000000000000000 alignment 0x0000000000000000)
[    0.000000] cma: Reserved 1024 MiB at 0x0000204000000000
[    0.000000] cma: dma_pernuma_cma_reserve: reserved 1024 MiB on node 3
[    0.000000] cma: dma_contiguous_reserve(limit 100000000)
[    0.000000] cma: dma_contiguous_reserve: reserving 384 MiB for global area
[    0.000000] cma: cma_declare_contiguous_nid(size 0x0000000018000000, base 0x0000000000000000, limit 0x0000000100000000 alignment 0x0000000000000000)
[    0.000000] cma: Reserved 384 MiB at 0x0000000068000000
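
For anyone who wants to check a boot log like the one above without reading
it by eye, here is a minimal sketch that pulls the per-node sizes out of the
"dma_pernuma_cma_reserve: reserved N MiB on node K" lines. The function name
pernuma_reservations is hypothetical, and the regex assumes the message
format shown in this log; it is not part of the patch.

```python
import re

# Assumes the per-node summary line looks exactly like the dmesg output above.
PERNUMA_RE = re.compile(
    r"dma_pernuma_cma_reserve: reserved (\d+) MiB on node (\d+)"
)


def pernuma_reservations(dmesg_text):
    """Return {node_id: reserved_MiB} parsed from dmesg text."""
    reservations = {}
    for match in PERNUMA_RE.finditer(dmesg_text):
        size_mib = int(match.group(1))
        node_id = int(match.group(2))
        reservations[node_id] = size_mib
    return reservations
```

Feeding it the captured dmesg text and comparing the keys against the
expected node IDs is enough to spot a node that was skipped.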

For allocation from pernuma cma, no failures were recorded:
[root@...alhost cma]# pwd
/sys/kernel/mm/cma
[root@...alhost cma]# ls
pernuma0  pernuma1  pernuma2  pernuma3  reserved
[root@...alhost cma]# cat pernuma*/alloc_pages_fail
0
0
0
0
[root@...alhost cma]# cat pernuma*/alloc_pages_success
2144
0
2132
0
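
The same check can be scripted. This is only a sketch, assuming the CMA
sysfs counters are exposed under /sys/kernel/mm/cma as in the listing above;
read_cma_counters and any_failures are hypothetical helper names.

```python
from pathlib import Path


def read_cma_counters(base="/sys/kernel/mm/cma"):
    """Return {area: (alloc_pages_success, alloc_pages_fail)} for pernuma areas."""
    counters = {}
    # Only the pernuma* areas are read; the global "reserved" area is skipped.
    for area in sorted(Path(base).glob("pernuma*")):
        success = int((area / "alloc_pages_success").read_text())
        fail = int((area / "alloc_pages_fail").read_text())
        counters[area.name] = (success, fail)
    return counters


def any_failures(counters):
    """True if any pernuma area recorded a failed allocation."""
    return any(fail > 0 for _success, fail in counters.values())
```

The base directory is a parameter so the helper can be pointed at a copy of
the sysfs tree when testing off the target machine.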

Thanks.
