Date:   Wed, 13 May 2020 11:48:24 -0600
From:   David Ahern <>
To:     Lorenz Bauer <>, bpf <>,
        Networking <>,
        Martynas Pumputis <>,
        kernel-team <>
Subject: Re: "Forwarding" from TC classifier

On 5/13/20 10:40 AM, Lorenz Bauer wrote:
> We've recently open sourced a key component of our L4 load balancer:
> cls_redirect [1].
> In the commit description, I call out the following caveat:
>     cls_redirect relies on receiving encapsulated packets directly
>     from a router. This is because we don't have access to the
>     neighbour tables from BPF, yet.

Can you explain more about this limitation? Why does access to neighbor
tables solve the problem?

> The code in question lives in forward_to_next_hop() [2], and does the following:
> 1. Swap source and destination MAC of the packet
> 2. Update source and destination IP address
> 3. Transmit the packet via bpf_redirect(skb->ifindex, 0)
> Really, I'd like to get rid of step 1, and instead rely on the network
> stack to switch or route the packet for me. The bpf_fib_lookup helper is
> very close to what I need. I've hacked around a bit, and come up with the
> following replacement for step 1:
>     switch (bpf_fib_lookup(skb, &fib, sizeof(fib), 0)) {
>     case BPF_FIB_LKUP_RET_SUCCESS:
>         /* There is a cached neighbour, bpf_redirect without going through the stack. */
>         return bpf_redirect(...);
>     case BPF_FIB_LKUP_RET_NO_NEIGH:
>         /* We have no information about this target. Let the stack handle it. */
>         return TC_ACT_OK;
>         return TC_ACT_SHOT;
>     default:
>         return TC_ACT_SHOT;
>     }
> I have a couple of questions:
> First, I think I can get BPF_FIB_LKUP_RET_NO_NEIGH if the packet needs
> to be routed, but there is no neighbour entry for the default gateway.
> Is that correct?


> Second, is it possible to originate the packet from the local machine,
> instead of keeping the original source address when passing the packet
> to the stack on NO_NEIGH?

Network address or MAC address? Swapping the network address is not a
usual part of routing a packet, so I presume you mean MAC, but just making
sure. Swapping MAC addresses should be done for all routed packets.
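
To illustrate the MAC rewrite that applies to every routed packet, here is
a minimal sketch in plain C. The names struct eth_hdr, struct next_hop and
rewrite_l2_for_forwarding() are hypothetical and exist only for this
example; in a BPF program the two addresses would typically come from the
smac and dmac fields that bpf_fib_lookup fills in on success.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical stand-in for an Ethernet header (not the kernel's struct ethhdr). */
struct eth_hdr {
	uint8_t  dst[MAC_LEN];
	uint8_t  src[MAC_LEN];
	uint16_t ethertype;
};

/* Hypothetical stand-in for the L2 part of a successful FIB lookup result. */
struct next_hop {
	uint8_t smac[MAC_LEN];	/* MAC of the egress interface */
	uint8_t dmac[MAC_LEN];	/* MAC of the next hop (neighbour entry) */
};

/*
 * Rewrite the L2 header the way a router does when forwarding:
 * destination becomes the next hop, source becomes the egress device.
 * The network (IP) addresses are left untouched.
 */
static void rewrite_l2_for_forwarding(struct eth_hdr *eth,
				      const struct next_hop *nh)
{
	assert(eth != NULL && nh != NULL);
	memcpy(eth->dst, nh->dmac, MAC_LEN);
	memcpy(eth->src, nh->smac, MAC_LEN);
}
```

When the next hop happens to be the router the packet arrived from, on the
same interface, this rewrite ends up equivalent to swapping the two MAC
addresses in place.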

> This is what I get with my current approach:
>   IP (tos 0x0, ttl 64, id 25769, offset 0, flags [DF], proto UDP (17), length 124)
> > [bad udp cksum 0x14d3 -> 0x3c0d!] UDP, length 96
>   IP (tos 0x0, ttl 63, id 25769, offset 0, flags [DF], proto UDP (17), length 124)
> > [no cksum] UDP, length 96
>   IP (tos 0x0, ttl 64, id 51342, offset 0, flags [none], proto ICMP (1), length 84)
> > ICMP echo reply, id 33779, seq 0, length 64
> The first and second packet are using our custom GUE header, they
> contain an ICMP echo request. Packet three contains the answer to the
> request. As you can see, the second packet keeps the source
> address instead of using
> Third, what effect does BPF_FIB_LOOKUP_OUTPUT have? Seems like I should set it,
> but I get somewhat sensible results without it as well. Same for LOOKUP_DIRECT.

BPF_FIB_LOOKUP_OUTPUT affects the flow parameters passed to the FIB lookup:
        if (flags & BPF_FIB_LOOKUP_OUTPUT) {
                fl4.flowi4_iif = 1;
                fl4.flowi4_oif = params->ifindex;
        } else {
                fl4.flowi4_iif = params->ifindex;
                fl4.flowi4_oif = 0;
        }

Which of iif / oif is set can influence the FIB lookup result - e.g., FIB
rules directing the lookup to a table or requiring the lookup result to
use the specified device.

Usually, 'output' is for locally generated traffic headed out. XDP
programs run on ingress, so they see packets from an Rx perspective and
do the lookup to answer 'is this forwarded or locally delivered?'.
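
To make the two cases concrete, here is a small stand-alone model of the
snippet above in plain C. struct flow_key, select_flow_ifaces() and the
SKETCH_* names are made up for this example; the flag values are assumed
to mirror those in the UAPI header linux/bpf.h, and this is only a sketch
of the selection logic, not the kernel code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror BPF_FIB_LOOKUP_DIRECT / BPF_FIB_LOOKUP_OUTPUT. */
#define SKETCH_FIB_LOOKUP_DIRECT	(1U << 0)
#define SKETCH_FIB_LOOKUP_OUTPUT	(1U << 1)

/* The literal 1 in the kernel snippet is the loopback ifindex. */
#define SKETCH_LOOPBACK_IFINDEX		1

/* Hypothetical, reduced stand-in for the iif/oif part of a flow key. */
struct flow_key {
	int iif;	/* input interface used for the lookup */
	int oif;	/* output interface constraint, 0 = none */
};

/*
 * OUTPUT set:   treat the packet as locally generated (iif = loopback)
 *               and constrain the result to the given device.
 * OUTPUT unset: treat the packet as received on the given device and
 *               leave the egress device up to the FIB.
 */
static struct flow_key select_flow_ifaces(uint32_t flags, int ifindex)
{
	struct flow_key fl;

	assert(ifindex > 0);
	if (flags & SKETCH_FIB_LOOKUP_OUTPUT) {
		fl.iif = SKETCH_LOOPBACK_IFINDEX;
		fl.oif = ifindex;
	} else {
		fl.iif = ifindex;
		fl.oif = 0;
	}
	return fl;
}
```

With ifindex 7, for example, the model yields iif=1/oif=7 when the output
flag is set and iif=7/oif=0 otherwise, which is what FIB rules matching on
iif or oif would then see.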

BPF_FIB_LOOKUP_DIRECT is really only useful for complex FIB setups -
e.g., VRF. It means skip the FIB rules and go directly to the table
associated with the device.
