netdev - Re: [RFC PATCHv2 bpf-next 1/2] xdp: add a new helper for dev map multicast support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <874kstmlhz.fsf@toke.dk>
Date:   Wed, 06 May 2020 12:00:08 +0200
From:   Toke Høiland-Jørgensen <toke@...hat.com>
To:     Hangbin Liu <liuhangbin@...il.com>
Cc:     bpf@...r.kernel.org, netdev@...r.kernel.org,
        Jiri Benc <jbenc@...hat.com>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        Eelco Chaudron <echaudro@...hat.com>, ast@...nel.org,
        Daniel Borkmann <daniel@...earbox.net>,
        Lorenzo Bianconi <lorenzo.bianconi@...hat.com>
Subject: Re: [RFC PATCHv2 bpf-next 1/2] xdp: add a new helper for dev map multicast support

Hangbin Liu <liuhangbin@...il.com> writes:

> Hi Toke,
>
> Thanks for your review, please see replies below.
>
> On Fri, Apr 24, 2020 at 04:34:49PM +0200, Toke Høiland-Jørgensen wrote:
>> >
>> > The general data path is kept in net/core/filter.c. The native data
>> > path is in kernel/bpf/devmap.c so we can use direct calls to
>> > get better performace.
>> 
>> Got any performance numbers? :)
>
> No, I haven't test the performance. Do you have any suggestions about how
> to test it? I'd like to try forwarding pkts to 10+ ports. But I don't know
> how to test the throughput. I don't think netperf or iperf supports
> this.

What I usually do when benchmarking XDP_REDIRECT is to just use pktgen
(samples/pktgen in the kernel source tree) on another machine,
specifically, like this:

./pktgen_sample03_burst_single_flow.sh  -i enp1s0f1 -d 10.70.2.2 -m ec:0d:9a:db:11:35 -t 4  -s 64

(adjust iface, IP and MAC address to your system, of course). That'll
flood the target machine with small UDP packets. On that machine, I then
run the 'xdp_redirect_map' program from samples/bpf. The bpf program
used by that sample will update an internal counter for every packet,
and the userspace prints it out, which gives you the performance (in
PPS). So just modifying that sample to using your new multicast helper
(and comparing it to regular REDIRECT to a single device) would be a
first approximation of a performance test.

[...]

>> > +	devmap_get_next_key(map, NULL, &key);
>> > +
>> > +	for (;;) {
>> 
>> I wonder if we should require DEVMAP_HASH maps to be indexed by ifindex
>> to avoid the loop?
>
> I guess it's not easy to force user to index the map by ifindex.

Well, the way to 'force the user' is just to assume that this is the
case, and if the map is filled in wrong, things just won't work ;)

>> > +	xdpf = convert_to_xdp_frame(xdp);
>> > +	if (unlikely(!xdpf))
>> > +		return -EOVERFLOW;
>> 
>> You do a clone for each map entry below, so I think you end up leaking
>> this initial xdpf? Also, you'll end up with one clone more than
>> necessary - redirecting to two interfaces should only require 1 clone,
>> you're doing 2.
>
> We don't know which is the latest one. So we need to keep the initial
> for clone. Is it enough to call xdp_release_frame() after the for
> loop?

You could do something like:

bool first = true;
for (;;) {

[...]

           if (!first) {
   		nxdpf = xdpf_clone(xdpf);
   		if (unlikely(!nxdpf))
   			return -ENOMEM;
   		bq_enqueue(dev, nxdpf, dev_rx);
           } else {
   		bq_enqueue(dev, xdpf, dev_rx);
   		first = false;
           }
}

/* didn't find anywhere to forward to, free buf */
if (first)
   xdp_return_frame_rx_napi(xdpf);



[...]

>> This duplication bugs me; maybe we should try to consolidate the generic
>> and native XDP code paths?
>
> Yes, I have tried to combine these two functions together. But one is generic
> code path and another is XDP code patch. One use skb_clone and another
> use xdpf_clone(). There are also some extra checks for XDP code. So maybe
> we'd better just keep it as it is.

Yeah, guess it may not be as simple as I'd like it to be ;)
Let's keep it this way for now at least; we can always consolidate in a
separate patch series.

-Toke