

Message-ID: <ca46435a-f67a-4d85-bf8d-b5d3289b6185@redhat.com>
Date: Fri, 3 May 2024 10:30:02 -0400
From: Waiman Long <longman@...hat.com>
To: Jesper Dangaard Brouer <hawk@...nel.org>, tj@...nel.org,
 hannes@...xchg.org, lizefan.x@...edance.com, cgroups@...r.kernel.org,
 yosryahmed@...gle.com
Cc: netdev@...r.kernel.org, linux-mm@...ck.org, shakeel.butt@...ux.dev,
 kernel-team@...udflare.com, Arnaldo Carvalho de Melo <acme@...nel.org>,
 Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Subject: Re: [PATCH v1] cgroup/rstat: add cgroup_rstat_cpu_lock helpers and
 tracepoints


On 5/3/24 10:00, Jesper Dangaard Brouer wrote:
>> I may have mistakenly thought the lock hold time referred to just the
>> cpu_lock. Your reported times here are about the cgroup_rstat_lock,
>> right? If so, the numbers make sense to me.
>>
>
> True, my reported numbers here are about the cgroup_rstat_lock.
> Glad to hear, we are more aligned then 🙂
>
> Given I just got some prod machines online with this patch's
> cgroup_rstat_cpu_lock tracepoints, I can give you some early results
> about the hold time for the cgroup_rstat_cpu_lock.
>
> From this one-liner bpftrace command:
>
>   sudo bpftrace -e '
>          tracepoint:cgroup:cgroup_rstat_cpu_lock_contended {
>            @start[tid]=nsecs; @cnt[probe]=count()}
>          tracepoint:cgroup:cgroup_rstat_cpu_locked {
>            $now=nsecs;
>            if (args->contended) {
>              @wait_per_cpu_ns=hist($now-@start[tid]);
>              delete(@start[tid]);}
>            @cnt[probe]=count(); @locked[tid]=$now}
>          tracepoint:cgroup:cgroup_rstat_cpu_unlock {
>            $now=nsecs;
>            @locked_per_cpu_ns=hist($now-@locked[tid]);
>            delete(@locked[tid]);
>            @cnt[probe]=count()}
>          interval:s:1 {time("%H:%M:%S "); print(@wait_per_cpu_ns);
>            print(@locked_per_cpu_ns); print(@cnt); clear(@cnt);}'
>
> Results from one 1-second period:
>
> 13:39:55 @wait_per_cpu_ns:
> [512, 1K)              3 |      |
> [1K, 2K)              12 |@      |
> [2K, 4K)             390 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> [4K, 8K)              70 |@@@@@@@@@      |
> [8K, 16K)             24 |@@@      |
> [16K, 32K)           183 |@@@@@@@@@@@@@@@@@@@@@@@@      |
> [32K, 64K)            11 |@      |
>
> @locked_per_cpu_ns:
> [256, 512)         75592 |@      |
> [512, 1K)        2537357 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
> [1K, 2K)          528615 |@@@@@@@@@@      |
> [2K, 4K)          168519 |@@@      |
> [4K, 8K)          162039 |@@@      |
> [8K, 16K)         100730 |@@      |
> [16K, 32K)         42276 |      |
> [32K, 64K)          1423 |      |
> [64K, 128K)           89 |      |
>
>  @cnt[tracepoint:cgroup:cgroup_rstat_cpu_lock_contended]: 3 /sec
>  @cnt[tracepoint:cgroup:cgroup_rstat_cpu_unlock]: 3200  /sec
>  @cnt[tracepoint:cgroup:cgroup_rstat_cpu_locked]: 3200  /sec
>
>
> So, we see that the "flush-code-path" per-CPU hold time
> (@locked_per_cpu_ns) isn't exceeding 128 usec.
>
> My latency requirement, to avoid RX-queue overflow with 1024 slots
> running at 25 Gbit/s, is 27.6 usec with small packets and 500 usec
> (0.5 ms) with MTU-size packets.  This is very close to my latency
> requirements.
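
A rough check of the latency budget quoted above, as a sketch only. It assumes standard Ethernet framing: 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter, inter-frame gap), 64-byte minimum frames, and 1518-byte frames for a 1500-byte MTU. The function name and constants are illustrative and do not come from the patch.

```python
# Sketch: time for an RX ring to fill when frames arrive back to back at line rate.
LINK_BPS = 25e9      # 25 Gbit/s, from the text
RING_SLOTS = 1024    # RX-queue slots, from the text
WIRE_OVERHEAD = 20   # assumption: 7 B preamble + 1 B SFD + 12 B inter-frame gap


def ring_fill_time_us(frame_bytes, slots=RING_SLOTS, link_bps=LINK_BPS):
    """Microseconds until `slots` back-to-back frames have arrived."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD) * 8
    return slots * bits_on_wire / link_bps * 1e6


small = ring_fill_time_us(64)    # assumption: minimum Ethernet frame, FCS included
large = ring_fill_time_us(1518)  # assumption: 1500 B MTU + 14 B header + 4 B FCS

print(f"64 B frames:   {small:.1f} usec")
print(f"1518 B frames: {large:.1f} usec")
```

Under these assumptions the small-packet budget comes out at roughly 27.5 usec and the MTU-size budget at roughly 504 usec, in line with the 27.6 usec and 500 usec figures quoted.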

Thanks for sharing the data.

This is more aligned with what I would have expected. Still, a maximum
of up to 128 usec is on the high side. I remember that during my latency
testing, when I worked on the cpu_lock latency patch, it was in the
two-digit range. Perhaps there are other sources of noise, or the update
list is really long. Anyway, it may be a bit hard to reach the 27.6 usec
target for small packets.
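
To put the tail in perspective, here is a small sketch that sums the @locked_per_cpu_ns bucket counts quoted above. It assumes bpftrace's power-of-two bucket labels (1K = 1024 ns); the helper name is illustrative. Because the 27.6 usec target falls inside the [16K, 32K) bucket, the histogram can only bound the share of holds that exceed it.

```python
# Sketch: share of per-CPU lock holds at or above a bucket boundary,
# using the @locked_per_cpu_ns counts quoted in this thread.
K = 1024  # assumption: bpftrace hist() labels use powers of two

LOCKED_PER_CPU_NS = [
    # (lower bound in ns, count)
    (256, 75592),
    (512, 2537357),
    (1 * K, 528615),
    (2 * K, 168519),
    (4 * K, 162039),
    (8 * K, 100730),
    (16 * K, 42276),
    (32 * K, 1423),
    (64 * K, 89),
]


def tail_share(buckets, threshold_ns):
    """Fraction of samples in buckets whose lower bound is >= threshold_ns."""
    total = sum(count for _, count in buckets)
    tail = sum(count for lower, count in buckets if lower >= threshold_ns)
    return tail / total


for threshold in (16 * K, 32 * K, 64 * K):
    share = tail_share(LOCKED_PER_CPU_NS, threshold)
    print(f">= {threshold // K}K ns: {share:.4%}")
```

With these counts, about 1.2% of holds land at or above 16K ns and about 0.04% at or above 32K ns, so the share exceeding 27.6 usec lies somewhere between those two bounds.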

Cheers,
Longman