Message-ID: <5f49527acaf5d_3ca6d208e3@john-XPS-13-9370.notmuch>
Date: Fri, 28 Aug 2020 11:52:42 -0700
From: John Fastabend <john.fastabend@...il.com>
To: Lukas Wunner <lukas@...ner.de>,
Pablo Neira Ayuso <pablo@...filter.org>,
Jozsef Kadlecsik <kadlec@...filter.org>,
Florian Westphal <fw@...len.de>
Cc: netfilter-devel@...r.kernel.org, coreteam@...filter.org,
netdev@...r.kernel.org, Daniel Borkmann <daniel@...earbox.net>,
Alexei Starovoitov <ast@...nel.org>,
Eric Dumazet <edumazet@...gle.com>,
Thomas Graf <tgraf@...g.ch>, Laura Garcia <nevola@...il.com>,
David Miller <davem@...emloft.net>
Subject: RE: [PATCH nf-next v3 3/3] netfilter: Introduce egress hook

Lukas Wunner wrote:
> Commit e687ad60af09 ("netfilter: add netfilter ingress hook after
> handle_ing() under unique static key") introduced the ability to
> classify packets on ingress.
>
> Support the same on egress. This allows filtering locally generated
> traffic such as DHCP, or outbound AF_PACKETs in general. It will also
> allow introducing in-kernel NAT64 and NAT46. A patch for nftables to
> hook up egress rules from user space has been submitted separately.
>
> Position the hook immediately before a packet is handed to traffic
> control and then sent out on an interface, thereby mirroring the ingress
> order. This order allows marking packets in the netfilter egress hook
> and subsequently using the mark in tc. Another benefit of this order is
> consistency with a lot of existing documentation which says that egress
> tc is performed after netfilter hooks.
>
> To avoid a performance degradation in the default case (with neither
> netfilter nor traffic control used), Daniel Borkmann suggests "a single
> static_key which wraps an empty function call entry which can then be
> patched by the kernel at runtime. Inside that trampoline we can still
> keep the ordering [between netfilter and traffic control] intact":
>
> https://lore.kernel.org/netdev/20200318123315.GI979@breakpoint.cc/
>
> To this end, introduce nf_sch_egress() which is dynamically patched into
> __dev_queue_xmit(), contingent on egress_needed_key. Inside that
> function, nf_egress() and sch_handle_egress() are called, each contingent
> on its own separate static_key.
>
> nf_sch_egress() is declared noinline per Florian Westphal's suggestion.
> This change alone causes a speedup if neither netfilter nor traffic
> control is used, apparently because it reduces instruction cache
> pressure. The same effect was previously observed by Eric Dumazet for
> the ingress path:
>
> https://lore.kernel.org/netdev/1431387038.566.47.camel@edumazet-glaptop2.roam.corp.google.com/
>
> Overall, performance improves with this commit if neither netfilter nor
> traffic control is used. However it degrades a little if only traffic
> control is used, due to the "noinline", the additional outer static key
> and the added netfilter code:
>
> * Before: 4730418pps 2270Mb/sec (2270600640bps)
> * After: 4759206pps 2284Mb/sec (2284418880bps)
These baseline numbers seem low to me.
>
> * Before + tc: 4063912pps 1950Mb/sec (1950677760bps)
> * After + tc: 4007728pps 1923Mb/sec (1923709440bps)
>
> * After + nft: 3714546pps 1782Mb/sec (1782982080bps)
>
> Measured on a bare-metal Core i7-3615QM.
OK, I have some server-class systems here that I would like to rerun
these benchmarks on, to be sure we don't have any performance
regressions on that side.

I'll try to get to it ASAP, but it will likely be Monday morning
before I do. I assume that is no problem, seeing as we are only
at rc2.

Thanks.
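
To make a rerun easier to compare against, here is a rough sketch that
derives the relative changes from the numbers quoted above. The
60-byte packet size is an assumption: it is not stated in the commit
message, but pps * 60 * 8 reproduces every quoted bps value exactly.
The dictionary keys and helper names are made up for illustration.

```python
# Packets-per-second figures copied from the quoted commit message.
PPS = {
    "before": 4730418,
    "after": 4759206,
    "before_tc": 4063912,
    "after_tc": 4007728,
    "after_nft": 3714546,
}

# Assumption: 60 bytes per packet, inferred from the quoted bps values
# (pps * 60 * 8 matches each of them exactly).
PACKET_BYTES = 60


def bits_per_second(pps, packet_bytes=PACKET_BYTES):
    """Convert a packet rate to a bit rate for a fixed packet size."""
    return pps * packet_bytes * 8


def percent_change(old, new):
    """Relative change from old to new, in percent (negative = slower)."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    print("no nf, no tc: %+.2f%%" % percent_change(PPS["before"], PPS["after"]))
    print("tc only:      %+.2f%%" % percent_change(PPS["before_tc"], PPS["after_tc"]))
    print("nft vs after: %+.2f%%" % percent_change(PPS["after"], PPS["after_nft"]))
```

By that arithmetic the default case gains roughly 0.6%, the tc-only
case loses roughly 1.4%, and enabling the nft rule costs about 22%
relative to the patched baseline.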
>
> Commands to perform a measurement:
> ip link add dev foo type dummy
> ip link set dev foo up
> modprobe pktgen
> echo "add_device foo" > /proc/net/pktgen/kpktgend_3
> samples/pktgen/pktgen_bench_xmit_mode_queue_xmit.sh -i foo -n 400000000 -m "11:11:11:11:11:11" -d 1.1.1.1
That's a single thread, correct? Multiple threads would be the -t
option, if I recall correctly. I think we should also try with many
threads to see if that makes a difference. I guess probably not, but
let's see.
>
> Commands to enable egress traffic control:
> tc qdisc add dev foo clsact
> tc filter add dev foo egress bpf da bytecode '1,6 0 0 0,'
>
> Commands to enable egress netfilter:
> nft add table netdev t
> nft add chain netdev t co \{ type filter hook egress device foo priority 0 \; \}
> nft add rule netdev t co ip daddr 4.3.2.1/32 drop
>
I'll give the above a try.