Message-ID: <20201114090347.2e7c1457@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date: Sat, 14 Nov 2020 09:03:47 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: Tobias Waldekranz <tobias@...dekranz.com>
Cc: Pablo Neira Ayuso <pablo@...filter.org>,
netfilter-devel@...r.kernel.org, davem@...emloft.net,
netdev@...r.kernel.org, razor@...ckwall.org, jeremy@...zel.net
Subject: Re: [PATCH net-next,v3 0/9] netfilter: flowtable bridge and vlan enhancements
On Sat, 14 Nov 2020 15:00:03 +0100 Tobias Waldekranz wrote:
> On Sat, Nov 14, 2020 at 12:59, Pablo Neira Ayuso <pablo@...filter.org> wrote:
> > If any of the flowtable devices goes down or is removed, its entries
> > are removed from the flowtable. This means packets of existing flows
> > are pushed back up to the classic bridge / forwarding path to
> > re-evaluate the fast path.
> >
> > For each new flow, the fast path is selected freshly, so it uses the
> > up-to-date FDB to select a new bridge port.
> >
> > Existing flows still follow the old path. The same happens with FIB
> > currently.
> >
> > It should be possible to explore purging entries in the flowtable that
> > are stale due to changes in the topology (either in FDB or FIB).
> >
> > What scenario do you have specifically in mind? Something like VM
> > migrates from one bridge port to another?
Indeed: two VMs, A and B, talking to each other. A is _outside_ the
system (reachable via eth0), B is inside (veth1). When A moves inside
and gets its own veth, neither B's veth1 nor eth0 will change state,
so the cache wouldn't get flushed, right?
> This should work in the case when the bridge ports are normal NICs or
> switchdev ports, right?
>
> In that case, relying on link state is brittle as you can easily have a
> switch or a media converter between the bridge and the end-station:
>
>     br0               br0
>    /   \             /   \
>  eth0   eth1       eth0   eth1
>   /       \   =>    /       \
> [sw0]   [sw1]     [sw0]   [sw1]
>   /                          \
>  A                            A
>
> In a scenario like this, A has clearly moved. But neither eth0 nor eth1
> has seen any changes in link state.
>
> This particular example is a bit contrived. But this is essentially what
> happens in redundant topologies when reconfigurations occur (e.g. STP).
>
> These protocols will typically signal reconfigurations to all bridges
> though, so as long as the affected flows are flushed at the same time as
> the FDB it should work.
>
> Interesting stuff!
Agreed, could be interesting for all NAT/conntrack setups, not just VMs.