Message-ID: <CANn89iLSVPwjEi2ZFUCqUXV05LqY3Jx5qbQCEiayo_s-UZHcAw@mail.gmail.com>
Date:   Tue, 22 Jan 2019 14:40:30 -0800
From:   Eric Dumazet <edumazet@...gle.com>
To:     Tejun Heo <tj@...nel.org>
Cc:     Vlad Buslov <vladbu@...lanox.com>, Dennis Zhou <dennis@...nel.org>,
        Linux Kernel Network Developers <netdev@...r.kernel.org>,
        Yevgeny Kliteynik <kliteyn@...lanox.com>,
        Yossef Efraim <yossefe@...lanox.com>,
        Maor Gottlieb <maorg@...lanox.com>
Subject: Re: tc filter insertion rate degradation

On Tue, Jan 22, 2019 at 1:18 PM Tejun Heo <tj@...nel.org> wrote:
>
> Hello,
>

> Percpu storage is expensive and cache line sharing tends to be less of
> a problem (cuz they're per-cpu), so it is useful to support custom
> alignments for tighter packing.
>


We have BPF percpu maps whose values are pairs of 8-byte counters (a
packets counter and a bytes counter), with millions of slots.

We update the pair for every packet sent by the hosts.
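
For concreteness, a minimal sketch of such a map (libbpf-style
definition; the map name, key choice and section names here are
hypothetical, not our actual code):

#include <linux/bpf.h>
#include <linux/pkt_cls.h>
#include <bpf/bpf_helpers.h>

struct pair {
	__u64 packets;
	__u64 bytes;
};

struct {
	__uint(type, BPF_MAP_TYPE_PERCPU_HASH);
	__uint(max_entries, 1 << 21);		/* millions of slots */
	__type(key, __u32);
	__type(value, struct pair);
} counters SEC(".maps");

SEC("classifier")
int count_packet(struct __sk_buff *skb)
{
	__u32 key = skb->ifindex;	/* hypothetical key choice */
	struct pair *p;

	/* On a percpu map, lookup returns this CPU's copy, so plain
	 * increments are safe here. */
	p = bpf_map_lookup_elem(&counters, &key);
	if (p) {
		p->packets++;
		p->bytes += skb->len;
	}
	return TC_ACT_OK;
}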

BPF uses an alignment of 8 (which cannot be changed or tuned, at least
at all the call sites in kernel/bpf/hashtab.c).
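
If memory serves, the percpu value allocation there looks roughly
like the first line below; the rest is only a sketch of how the
alignment could instead track the value size, not a tested patch:

	/* today: alignment hard-coded to 8 */
	pptr = __alloc_percpu_gfp(size, 8, GFP_ATOMIC | __GFP_NOWARN);

	/* sketch: round the alignment up to the value size, capped at
	 * a cache line, so a 16-byte pair never straddles a line */
	size_t align = min_t(size_t, roundup_pow_of_two(size),
			     SMP_CACHE_BYTES);

	pptr = __alloc_percpu_gfp(size, align, GFP_ATOMIC | __GFP_NOWARN);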

If we are lucky, each pair fits entirely within a single cache line.
But when we are not lucky, 25% of the pairs straddle a cache-line
boundary, reducing performance under DDoS.
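
A quick way to see the straddle condition (assuming 64-byte cache
lines and 16-byte pairs; the helper name is made up):

	/* with 8-byte alignment, a pair starting 56 bytes into a
	 * 64-byte cache line spills its second counter into the
	 * next line */
	static inline bool pair_straddles_line(unsigned long addr)
	{
		return ((addr & 63) + 16) > 64;
	}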

Using a nicer alignment in our case does not consume more RAM, and we
did not notice any extra cost in the per-cpu allocations because we
keep them in the slow path (the control path).
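
Concretely, something like this is what I have in mind (a sketch,
assuming a 16-byte pair; __alloc_percpu() takes the alignment
directly):

	struct pair {
		u64 packets;
		u64 bytes;
	} __aligned(16);

	/* 16-byte size + 16-byte alignment: pairs pack back to back
	 * (no padding, so no extra RAM) and, since 16 divides 64, a
	 * pair can never cross a cache-line boundary */
	struct pair __percpu *p = __alloc_percpu(sizeof(struct pair),
						 __alignof__(struct pair));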
