[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAG=2xmOxYAeQ2g9M4bk_sbEYcW2XC7846FUu0CTaGQ7+jp0nsg@mail.gmail.com>
Date: Thu, 27 Jun 2024 06:48:35 -0700
From: Adrián Moreno <amorenoz@...hat.com>
To: Aaron Conole <aconole@...hat.com>
Cc: Ilya Maximets <i.maximets@....org>, Eelco Chaudron <echaudro@...hat.com>, netdev@...r.kernel.org,
horms@...nel.org, dev@...nvswitch.org,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Donald Hunter <donald.hunter@...il.com>, Pravin B Shelar <pshelar@....org>
Subject: Re: [PATCH net-next v5 05/10] net: openvswitch: add emit_sample action
On Thu, Jun 27, 2024 at 09:30:08AM GMT, Aaron Conole wrote:
> Ilya Maximets <i.maximets@....org> writes:
>
> > On 6/27/24 12:15, Adrián Moreno wrote:
> >> On Thu, Jun 27, 2024 at 11:31:41AM GMT, Eelco Chaudron wrote:
> >>>
> >>>
> >>> On 27 Jun 2024, at 11:23, Ilya Maximets wrote:
> >>>
> >>>> On 6/27/24 11:14, Eelco Chaudron wrote:
> >>>>>
> >>>>>
> >>>>> On 27 Jun 2024, at 10:36, Ilya Maximets wrote:
> >>>>>
> >>>>>> On 6/27/24 09:52, Adrián Moreno wrote:
> >>>>>>> On Thu, Jun 27, 2024 at 09:06:46AM GMT, Eelco Chaudron wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 26 Jun 2024, at 22:34, Adrián Moreno wrote:
> >>>>>>>>
> >>>>>>>>> On Wed, Jun 26, 2024 at 04:28:17PM GMT, Eelco Chaudron wrote:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On 25 Jun 2024, at 22:51, Adrian Moreno wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Add support for a new action: emit_sample.
> >>>>>>>>>>>
> >>>>>>>>>>> This action accepts a u32 group id and a variable-length cookie and uses
> >>>>>>>>>>> the psample multicast group to make the packet available for
> >>>>>>>>>>> observability.
> >>>>>>>>>>>
> >>>>>>>>>>> The maximum length of the user-defined cookie is set to 16, same as
> >>>>>>>>>>> tc_cookie, to discourage using cookies that will not be offloadable.
> >>>>>>>>>>
> >>>>>>>>>> I’ll add the same comment as I had in the user space part, and that
> >>>>>>>>>> is that I feel from an OVS perspective this action should be called
> >>>>>>>>>> emit_local() instead of emit_sample() to make it Datapath independent.
> >>>>>>>>>> Or quoting the earlier comment:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> “I’ll start the discussion again on the naming. The name "emit_sample()"
> >>>>>>>>>> does not seem appropriate. This function's primary role is to copy the
> >>>>>>>>>> packet and send it to a local collector, which varies depending on the
> >>>>>>>>>> datapath. For the kernel datapath, this collector is psample, while for
> >>>>>>>>>> userspace, it will likely be some kind of probe. This action is distinct
> >>>>>>>>>> from the sample() action by design; it is a standalone action that can
> >>>>>>>>>> be combined with others.
> >>>>>>>>>>
> >>>>>>>>>> Furthermore, the action itself does not involve taking a sample; it
> >>>>>>>>>> consistently pushes the packet to the local collector. Therefore, I
> >>>>>>>>>> suggest renaming "emit_sample()" to "emit_local()". This same goes for
> >>>>>>>>>> all the derivative ATTR naming.”
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> This is a blurry semantic area.
> >>>>>>>>> IMO, "sample" is the act of extracting (potentially a piece of)
> >>>>>>>>> someting, in this case, a packet. It is common to only take some packets
> >>>>>>>>> as samples, so this action usually comes with some kind of "rate", but
> >>>>>>>>> even if the rate is 1, it's still sampling in this context.
> >>>>>>>>>
> >>>>>>>>> OTOH, OVS kernel design tries to be super-modular and define small
> >>>>>>>>> combinable actions, so the rate or probability generation is done with
> >>>>>>>>> another action which is (IMHO unfortunately) named "sample".
> >>>>>>>>>
> >>>>>>>>> With that interpretation of the term it would actually make more sense
> >>>>>>>>> to rename "sample" to something like "random" (of course I'm not
> >>>>>>>>> suggestion we do it). "sample" without any nested action that actually
> >>>>>>>>> sends the packet somewhere is not sampling, it's just doing something or
> >>>>>>>>> not based on a probability. Where as "emit_sample" is sampling even if
> >>>>>>>>> it's not nested inside a "sample".
> >>>>>>>>
> >>>>>>>> You're assuming we are extracting a packet for sampling, but this function
> >>>>>>>> can be used for various other purposes. For instance, it could handle the
> >>>>>>>> packet outside of the OVS pipeline through an eBPF program (so we are not
> >>>>>>>> taking a sample, but continue packet processing outside of the OVS
> >>>>>>>> pipeline). Calling it emit_sampling() in such cases could be very
> >>>>>>>> confusing.
> >>>>>>
> >>>>>> We can't change the implementation of the action once it is part of uAPI.
> >>>>>> We have to document where users can find these packets and we can't just
> >>>>>> change the destination later.
> >>>>>
> >>>>> I'm not suggesting we change the uAPI implementation, but we could use the
> >>>>> emit_xxx() action with an eBPF probe on the action to perform other tasks.
> >>>>> This is just an example.
> >>>>
> >>>> Yeah, but as Adrian said below, you could do that with any action and
> >>>> this doesn't change the semantics of the action itself.
> >>>
> >>> Well this was just an example, what if we have some other need for getting
> >>> a packet to userspace through emit_local() other than sampling? The
> >>> emit_sample() action naming in this case makes no sense.
> >>>
> >>>>>>> Well, I guess that would be clearly abusing the action. You could say
> >>>>>>> that of anything really. You could hook into skb_consume and continue
> >>>>>>> processing the skb but that doesn't change the intended behavior of the
> >>>>>>> drop action.
> >>>>>>>
> >>>>>>> The intended behavior of the action is sampling, as is the intended
> >>>>>>> behavior of "psample".
> >>>>>>
> >>>>>> The original OVS_ACTION_ATTR_SAMPLE "Probabilitically executes actions",
> >>>>>> that is it takes some packets from the whole packet stream and executes
> >>>>>> actions of them. Without tying this to observability purposes the name
> >>>>>> makes sense as the first definition of the word is "to take a representative
> >>>>>> part or a single item from a larger whole or group".
> >>>>>>
> >>>>>> Now, our new action doesn't have this particular semantic in a way that
> >>>>>> it doesn't take a part of a whole packet stream but rather using the
> >>>>>> part already taken. However, it is directly tied to the parent
> >>>>>> OVS_ACTION_ATTR_SAMPLE action, since it reports probability of that parent
> >>>>>> action. If there is no parent, then probability is assumed to be 100%,
> >>>>>> but that's just a corner case. The name of a psample module has the
> >>>>>> same semantics in its name, it doesn't sample on it's own, but it is
> >>>>>> assuming that sampling was performed as it relays the rate of it.
> >>>>>>
> >>>>>> And since we're directly tied here with both OVS_ACTION_ATTR_SAMPLE and
> >>>>>> the psample module, the emit_sample() name makes sense to me.
> >>>>>
> >>>>> This is the part I don't like. emit_sample() should be treated as a
> >>>>> standalone action. While it may have potential dependencies on
> >>>>> OVS_ACTION_ATTR_SAMPLE, it should also be perfectly fine to use it
> >>>>> independently.
> >>>>
> >>>> It is fine to use it, we just assume implicit 100% sampling.
> >>>
> >>> Agreed, but the name does not make sense ;) I do not think we
> >>> currently have any actions that explicitly depend on each other
> >>> (there might be attributes carried over) and I want to keep it
> >>> as such.
> >>>
> >>>>>>>>> Having said that, I don't have a super strong favor for "emit_sample". I'm
> >>>>>>>>> OK with "emit_local" or "emit_packet" or even just "emit".
> >>>>>>
> >>>>>> The 'local' or 'packet' variants are not descriptive enough on what we're
> >>>>>> trying to achieve and do not explain why the probability is attached to
> >>>>>> the action, i.e. do not explain the link between this action and the
> >>>>>> OVS_ACTION_ATTR_SAMPLE.
> >>>>>>
> >>>>>> emit_Psample() would be overly specific, I agree, but making the name too
> >>>>>> generic will also make it hard to add new actions. If we use some overly
> >>>>>> broad term for this one, we may have to deal with overlapping semantics in
> >>>>>> the future.
> >>>>>>
> >>>>>>>>> I don't think any term will fully satisfy everyone so I hope we can find
> >>>>>>>>> a reasonable compromise.
> >>>>>>>>
> >>>>>>>> My preference would be emit_local() as we hand it off to some local
> >>>>>>>> datapath entity.
> >>>>>>
> >>>>>> What is "local datapath entity" ? psample module is not part of OVS datapath.
> >>>>>> And what is "local" ? OpenFlow has the OFPP_LOCAL port that is represented
> >>>>>> by a bridge port on a datapath level, that will be another source of confusion
> >>>>>> as it can be interpreted as sending a packet via a local bridge port.
> >>>>>
> >>>>> I guess I hinted at a local exit point in the specific netdev/netlink datapath,
> >>>>> where exit is to the local host. So maybe we should call it emit_localhost?
> >>>>
> >>>> For me sending to localhost means sending to a loopback interface or otherwise
> >>>> sending the packet to the host networking stack. And we're not doing that.
> >>>
> >>> That might be confusing too... Maybe emit_external()?
> >>
> >> "External" was the word I used for the original userspace RFC. The
> >> rationale being: We're sending the packet to something external from OVS
> >> (datapath or userspace). Compared with IPFIX-based observability which
> >> where the sample is first processed ("internally") by ovs-vswitchd.
> >>
> >> In userspace it kept the sampling/observability meaning because it was
> >> part of the Flow_Sample_Collector_Set which is intrinsically an
> >> observability thing.
> >>
> >> However, in the datapath we loose that meaning and could be confused
> >> with some external packet-processing entity. How about "external_observe"
> >> or something that somehow keeps that meaning?
> >
> > This semantics conversation doesn't seem productive as we're going in circles
> > around what we already discussed what feels like at least three separate times
> > on this and ovs-dev lists.
>
> +1
>
> > I'd say if we can't agree on OVS_ACTION_ATTR_EMIT_SAMPLE, then just call
> > it OVS_ACTION_ATTR_SEND_TO_PSAMPLE. Simple, describes exactly what it does.
> > And if we ever want to have "local" sampling for OVS userspace datapath,
> > we can create a userspace-only datapath action for it and call it in a way
> > that describes what it does, e.g. OVS_ACTION_ATTR_SEND_TO_USDT or whatever.
> > Unlike key attributes, we can relatively safely create userspace-only actions
> > without consequences for kernel uAPI. In fact, we have a few such actions.
> > And we can choose which one to use based on which one is supported by the
> > current datapath.
>
> I'm okay with the emit_sample or with send_to_psample. There are
> probably hundreds of colors to paint this shed, tbh. We could argue
> that it could even be an extension to userspace() instead of a separate
> action, or that we could have a generic socket_write(type=psample) type
> of action. But in the end, I don't have a strong feeling either way,
> whether it's:
>
> OVS_ACTION_ATTR_EMIT_SAMPLE / emit_sample()
> OVS_ACTION_ATTR_SEND_TO_PSAMPLE / psample() or emit_psample()
> OVS_ACTION_ATTR_EMIT_EXTERNAL / emit_external()
>
> There aren't really too many differences in them, and it wouldn't bother
> me in any case. I guess a XXX?psample() action does end up being the
> clearest since it has 'psample' right in the name and then we can know
> right away what it is doing.
>
The original purpose of the name was to have the same action for both
userspace and kernel so that, name aside, the semantics
(ACT_XXXX(group=10,cookie=0x123)) remains the same. If we break that, we
risk having userspace and kernel actions differ in ways that makes it
difficult to unify at the xlate/OpenFlow/OVSDB layers.
But if we can enforce that somehow I guess it's OK.
Thanks.
Adrián
Powered by blists - more mailing lists