Message-ID: <aGAWMLjQhKPvKx2R@pop-os.localdomain>
Date: Sat, 28 Jun 2025 09:20:00 -0700
From: Cong Wang <xiyou.wangcong@...il.com>
To: Aaron Conole <aconole@...hat.com>
Cc: dev@...nvswitch.org, netdev@...r.kernel.org,
Andrew Lunn <andrew+netdev@...n.ch>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Eelco Chaudron <echaudro@...hat.com>,
Ilya Maximets <i.maximets@....org>,
Adrián Moreno <amorenoz@...hat.com>,
Mike Pattrick <mpattric@...hat.com>,
Florian Westphal <fw@...len.de>,
John Fastabend <john.fastabend@...il.com>,
Jakub Sitnicki <jakub@...udflare.com>, Joe Stringer <joe@....org>
Subject: Re: [RFC] net: openvswitch: Inroduce a light-weight socket map
concept.
On Fri, Jun 27, 2025 at 05:00:54PM -0400, Aaron Conole wrote:
> The Open vSwitch module allows a user to implement a flow-based
> layer 2 virtual switch. This is quite useful to model packet
> movement analogous to programmable physical layer 2 switches.
> But the openvswitch module doesn't always strictly operate at
> layer 2, since it implements higher layer concerns, like
> fragmentation reassembly, connection tracking, TTL
> manipulations, etc. Rightly so, it isn't *strictly* a layer
> 2 virtual forwarding function.
>
> Other virtual forwarding technologies allow for additional
> concepts that 'break' this strict layer separation beyond
> what the openvswitch module provides. The most handy one for
> openvswitch to start looking at is the concept of the socket
> map, from eBPF. This is very useful for TCP connections,
> since in many cases we will do container<->container
> communication (although this can be generalized for the
> phy->container case).
>
> This patch provides two different implementations of actions
> that can be used to construct the same kind of socket map
> capability within the openvswitch module. There are additional
> ways of supporting this concept that I've discussed offline,
> but want to bring it all up for discussion on the mailing list.
> This way, "spirited debate" can occur before I spend too much
> time implementing specific userspace support for an approach
> that may not be acceptable. I did 'port' these from
> implementations that I had done some preliminary testing with,
> but there are no guarantees that what is included actually works well.
>
> All of these are implemented using raw access to
> the TCP socket. This isn't ideal, and a proper
> implementation would reuse the psock infrastructure - but
> I wanted to get something that we can all at least poke (fun)
> at rather than just being purely theoretical. Some of the
> validation that we may need (for example re-writing the
> packet's headers) has been omitted to hopefully make the
> implementations a bit easier to parse. The idea would be
> to validate these in the validate_and_copy routines.
Maybe it is time to introduce eBPF to openvswitch, so that it can
share, for example, socket maps with other layers?
Thanks.