netdev - Re: [PATCH bpf-next v2 00/13] bpfilter


Message-ID: <a4039e82-9184-45bf-6aee-e663766d655a@mojatatu.com>
Date:   Mon, 30 Aug 2021 21:56:18 -0400
From:   Jamal Hadi Salim <jhs@...atatu.com>
To:     Dmitrii Banshchikov <me@...que.spb.ru>, bpf@...r.kernel.org
Cc:     ast@...nel.org, davem@...emloft.net, daniel@...earbox.net,
        andrii@...nel.org, kafai@...com, songliubraving@...com, yhs@...com,
        john.fastabend@...il.com, kpsingh@...nel.org,
        netdev@...r.kernel.org, rdna@...com
Subject: Re: [PATCH bpf-next v2 00/13] bpfilter

On 2021-08-29 2:35 p.m., Dmitrii Banshchikov wrote:

[..]

> And here are some performance tests.
> 
> The environment consists of two machines (sender and receiver)
> connected with 10Gbps link via switch.  The sender uses DPDK to
> simulate QUIC packets (89 bytes long) from random IP. The switch
> measures the generated traffic to be about 7066377568 bits/sec,
> 9706553 packets/sec.
> 
> The receiver is a 2 socket 2680v3 + HT and uses either iptables,
> nft or bpfilter to filter out UDP traffic.
> 
> Two tests were made. Two rulesets (default policy was to ACCEPT)
> were used in each test:
> 
> ```
> iptables -A INPUT -p udp -m udp --dport 1500 -j DROP
> ```
> and
> ```
> iptables -A INPUT -s 1.1.1.1/32 -p udp -m udp --dport 1000 -j DROP
> iptables -A INPUT -s 2.2.2.2/32 -p udp -m udp --dport 2000 -j DROP
> ...
> iptables -A INPUT -s 31.31.31.31/32 -p udp -m udp --dport 31000 -j DROP
> iptables -A INPUT -p udp -m udp --dport 1500 -j DROP
> ```
> 
> The first test measures performance of the receiver via stress-ng
> [3] in bogo-ops. The upper bound (no firewall and no
> traffic) for bogo-ops is 8148-8210. The lower bound
> (traffic but no firewall) is 6567-6643.
> The stress-ng command used: stress-ng -t60 -c48 --metrics-brief.
> 
> The second test measures the number of dropped packets. The
> receiver keeps only 1 CPU online and disables all
> others (maxcpus=1 and number of cores per socket set to 1 in
> BIOS). The number of dropped packets is collected via
> iptables-legacy -nvL, iptables -nvL and bpftool map dump id.
> 
> Test 1: bogo-ops (the more the better)
>              iptables            nft        bpfilter
>    1 rule:  6474-6554      6483-6515       7996-8008
>  32 rules:  6374-6433      5761-5804       7997-8042
> 
> 
> Test 2: number of dropped packets (the more the better)
>              iptables            nft         bpfilter
>    1 rule:  234M-241M           220M            900M+
>  32 rules:  186M-196M        97M-98M            900M+
> 
> 
> Please let me know if you see a gap in the testing environment.

General perf testing will depend on the nature of the use case
you are trying to target.
What is the nature of the app? Is it just receiving packets and
counting? Does it exemplify something real in your
network, or is it purely benchmarking? Both are valid.
What else can it do (e.g., are you interested in latency accounting)?
What I have seen in practice for iptables deployments is a default
drop and, in general, an accept list. Per-ruleset IP address aggregation
is typically achieved via ipset. So your mileage may vary...
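
To make the deployment pattern above concrete, here is a minimal sketch
of a default-drop INPUT chain with an ipset-backed accept list. The set
name and the prefixes are made-up placeholders, and the function only
prints the commands so they can be reviewed before being run as root.
It leaves out the loopback and established-connection rules a real
host would normally need.

```shell
#!/bin/sh
# Print (do not run) the commands for a default-drop INPUT chain whose
# accepted sources are aggregated in one ipset rather than one rule per
# address. The set name and prefixes passed in are hypothetical.
emit_accept_list() {
    set_name="$1"
    shift
    echo "ipset create $set_name hash:net"
    for prefix in "$@"; do
        echo "ipset add $set_name $prefix"
    done
    # A single rule matches every member of the set.
    echo "iptables -A INPUT -m set --match-set $set_name src -j ACCEPT"
    # Everything not accepted above falls through to the default policy.
    # Caution: applying this over a remote session can lock you out.
    echo "iptables -P INPUT DROP"
}

emit_accept_list allowed_src 192.0.2.0/24 198.51.100.0/24
```

With this layout the chain length stays at one rule no matter how many
prefixes are added to the set, which is why results from a long linear
ruleset may not carry over to such deployments.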

Having said that:
Our testing[1] approach typically targets a worst-case scenario,
i.e., we structure the rulesets such that all of the
linear rulesets will be iterated and we eventually hit the default
ruleset.
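
As a rough illustration of that worst-case structure (not the exact
setup from [1]), the sketch below prints N linear rules in the same
format as the quoted rulesets. It assumes the test traffic never uses
source addresses from 198.51.100.0/24, so no rule matches and every
packet is compared against all N rules before reaching the default
policy. The address range and port are placeholder choices.

```shell
#!/bin/sh
# Print N rules that the test traffic is assumed never to match.
# 198.51.100.0/24 (TEST-NET-2) and port 9 are hypothetical placeholders;
# the sender must not use that source range for the assumption to hold.
emit_worst_case_rules() {
    n="$1"
    case "$n" in
        ''|*[!0-9]*) echo "rule count must be a number" >&2; return 1 ;;
    esac
    if [ "$n" -lt 1 ] || [ "$n" -gt 254 ]; then
        echo "rule count must be between 1 and 254" >&2
        return 1
    fi
    i=1
    while [ "$i" -le "$n" ]; do
        echo "iptables -A INPUT -s 198.51.100.$i/32 -p udp -m udp --dport 9 -j DROP"
        i=$((i + 1))
    done
}

emit_worst_case_rules 32
```
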
We also try to reduce variability in the results. A lot of small
things could affect your reproducibility, so we try to avoid them.
For example, from what you described:
You are sending from a random IP - that means each packet will hit
a random ruleset (for the case of 32 rulesets). And some rules will
likely be hit more often than others. The likelihood of reproducing the
same results for multiple runs gets lower as you increase the number
of rulesets.
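
One possible way to keep randomized sources while making runs
comparable (a suggestion for illustration, not something proposed in
this thread) is to generate the source list once from a fixed seed and
replay it on every run. The sketch assumes the traffic generator can
take a list of source addresses; the function name and seed are
hypothetical, and the sequence is only expected to repeat with the
same awk implementation.

```shell
#!/bin/sh
# Print COUNT pseudo-random IPv4 source addresses derived from SEED.
# The same seed yields the same list on the same awk implementation,
# so every run can be fed an identical mix of sources.
gen_sources() {
    seed="$1"
    count="$2"
    awk -v seed="$seed" -v count="$count" 'BEGIN {
        srand(seed)
        for (n = 0; n < count; n++) {
            # First octet is kept in 1..223 to avoid 0.x and multicast.
            a = int(rand() * 223) + 1
            b = int(rand() * 256)
            c = int(rand() * 256)
            d = int(rand() * 256)
            printf "%d.%d.%d.%d\n", a, b, c, d
        }
    }'
}

gen_sources 2021 5
```
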
From a collection perspective:
Looking at the nature of the CPU utilization is important:
softirq vs. system calls vs. user app.
Your test workload seems to be very specific to ingress host.
So in reality you are more constrained by kernel->user syscalls
(which will be hidden if you are mostly dropping in the kernel
as opposed to letting packets go to user space).
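
A minimal sketch of how that split could be collected, assuming the
aggregate "cpu" line of /proc/stat is sampled before and after a run.
The function name is hypothetical; the field order (user, nice, system,
idle, iowait, irq, softirq, steal) follows proc(5). The sample numbers
in the demo are synthetic, not measurements.

```shell
#!/bin/sh
# Read two "cpu" lines in /proc/stat format (before and after a run)
# from stdin and print the share of user, system, softirq and idle time
# spent between the two snapshots.
cpu_breakdown() {
    awk '$1 == "cpu" {
        seen++
        if (seen == 1) {
            for (i = 2; i <= 9; i++) before[i] = $i
            next
        }
        total = 0
        for (i = 2; i <= 9; i++) {
            delta[i] = $i - before[i]
            total += delta[i]
        }
        if (total <= 0) exit 1
        printf "user=%.1f%% system=%.1f%% softirq=%.1f%% idle=%.1f%%\n",
               100 * (delta[2] + delta[3]) / total, 100 * delta[4] / total,
               100 * delta[8] / total, 100 * delta[5] / total
        exit 0
    }'
}

# Demo with synthetic counters. On a live system, something like:
#   { grep '^cpu ' /proc/stat; sleep 60; grep '^cpu ' /proc/stat; } | cpu_breakdown
printf 'cpu 100 0 100 700 0 0 100 0 0 0\ncpu 200 0 300 1000 0 0 500 0 0 0\n' | cpu_breakdown
```

A high softirq share with little user time would be consistent with
packets being dropped in the kernel before they reach the application.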

Something is not clear from your email:
You seem to indicate that no traffic was running in test 1.
If so, why would 32 rulesets give different results than 1?

cheers,
jamal

[1] https://netdevconf.info/0x15/session.html?Linux-ACL-Performance-Analysis