Message-ID: <87a554kgo4.fsf@oracle.com>
Date: Tue, 15 Jul 2025 19:34:03 -0700
From: Ankur Arora <ankur.a.arora@...cle.com>
To: Namhyung Kim <namhyung@...nel.org>
Cc: Ankur Arora <ankur.a.arora@...cle.com>, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, x86@...nel.org, akpm@...ux-foundation.org,
        david@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
        hpa@...or.com, mingo@...hat.com, mjguzik@...il.com, luto@...nel.org,
        peterz@...radead.org, acme@...nel.org, tglx@...utronix.de,
        willy@...radead.org, raghavendra.kt@....com,
        boris.ostrovsky@...cle.com, konrad.wilk@...cle.com
Subject: Re: [PATCH v5 07/14] perf bench mem: Allow chunking on a memory region


Namhyung Kim <namhyung@...nel.org> writes:

> On Wed, Jul 09, 2025 at 05:59:19PM -0700, Ankur Arora wrote:
>> There can be a significant gap in memset/memcpy performance depending
>> on the size of the region being operated on.
>>
>> With chunk-size=4kb:
>>
>>   $ echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
>>
>>   $ perf bench mem memset -p 4kb -k 4kb -s 4gb -l 10 -f x86-64-stosq
>>   # Running 'mem/memset' benchmark:
>>   # function 'x86-64-stosq' (movsq-based memset() in arch/x86/lib/memset_64.S)
>>   # Copying 4gb bytes ...
>>
>>       13.011655 GB/sec
>>
>> With chunk-size=1gb:
>>
>>   $ echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
>>
>>   $ perf bench mem memset -p 4kb -k 1gb -s 4gb -l 10 -f x86-64-stosq
>>   # Running 'mem/memset' benchmark:
>>   # function 'x86-64-stosq' (movsq-based memset() in arch/x86/lib/memset_64.S)
>>   # Copying 4gb bytes ...
>>
>>       21.936355 GB/sec
>>
>> So, allow the user to specify the chunk-size.
>>
>> The default value is identical to the total size of the region, which
>> preserves current behaviour.
>>
>> Signed-off-by: Ankur Arora <ankur.a.arora@...cle.com>
>
> Again, please update the documentation.  With that,
>
> Reviewed-by: Namhyung Kim <namhyung@...nel.org>
>
Thanks! Will do.

--
ankur
