[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <f7ttthdxhdn.fsf@redhat.com>
Date: Fri, 28 Jun 2024 10:37:56 -0400
From: Aaron Conole <aconole@...hat.com>
To: Adrián Moreno <amorenoz@...hat.com>
Cc: Ilya Maximets <i.maximets@....org>, Eelco Chaudron
<echaudro@...hat.com>, netdev@...r.kernel.org, horms@...nel.org,
dev@...nvswitch.org, "David S. Miller" <davem@...emloft.net>, Eric
Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo
Abeni <pabeni@...hat.com>, Donald Hunter <donald.hunter@...il.com>,
Pravin B Shelar <pshelar@....org>
Subject: Re: [PATCH net-next v5 05/10] net: openvswitch: add emit_sample action
Adrián Moreno <amorenoz@...hat.com> writes:
> On Thu, Jun 27, 2024 at 09:30:08AM GMT, Aaron Conole wrote:
>> Ilya Maximets <i.maximets@....org> writes:
>>
>> > On 6/27/24 12:15, Adrián Moreno wrote:
>> >> On Thu, Jun 27, 2024 at 11:31:41AM GMT, Eelco Chaudron wrote:
>> >>>
>> >>>
>> >>> On 27 Jun 2024, at 11:23, Ilya Maximets wrote:
>> >>>
>> >>>> On 6/27/24 11:14, Eelco Chaudron wrote:
>> >>>>>
>> >>>>>
>> >>>>> On 27 Jun 2024, at 10:36, Ilya Maximets wrote:
>> >>>>>
>> >>>>>> On 6/27/24 09:52, Adrián Moreno wrote:
>> >>>>>>> On Thu, Jun 27, 2024 at 09:06:46AM GMT, Eelco Chaudron wrote:
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On 26 Jun 2024, at 22:34, Adrián Moreno wrote:
>> >>>>>>>>
>> >>>>>>>>> On Wed, Jun 26, 2024 at 04:28:17PM GMT, Eelco Chaudron wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> On 25 Jun 2024, at 22:51, Adrian Moreno wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>> Add support for a new action: emit_sample.
>> >>>>>>>>>>>
>> >>>>>>>>>>> This action accepts a u32 group id and a variable-length
>> >>>>>>>>>>> cookie and uses
>> >>>>>>>>>>> the psample multicast group to make the packet available for
>> >>>>>>>>>>> observability.
>> >>>>>>>>>>>
>> >>>>>>>>>>> The maximum length of the user-defined cookie is set to
>> >>>>>>>>>>> 16, same as
>> >>>>>>>>>>> tc_cookie, to discourage using cookies that will not be
>> >>>>>>>>>>> offloadable.
>> >>>>>>>>>>
>> >>>>>>>>>> I’ll add the same comment as I had in the user space part, and that
>> >>>>>>>>>> is that I feel from an OVS perspective this action should be called
>> >>>>>>>>>> emit_local() instead of emit_sample() to make it Datapath
>> >>>>>>>>>> independent.
>> >>>>>>>>>> Or quoting the earlier comment:
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> “I’ll start the discussion again on the naming. The name
>> >>>>>>>>>> "emit_sample()"
>> >>>>>>>>>> does not seem appropriate. This function's primary role
>> >>>>>>>>>> is to copy the
>> >>>>>>>>>> packet and send it to a local collector, which varies
>> >>>>>>>>>> depending on the
>> >>>>>>>>>> datapath. For the kernel datapath, this collector is
>> >>>>>>>>>> psample, while for
>> >>>>>>>>>> userspace, it will likely be some kind of probe. This
>> >>>>>>>>>> action is distinct
>> >>>>>>>>>> from the sample() action by design; it is a standalone
>> >>>>>>>>>> action that can
>> >>>>>>>>>> be combined with others.
>> >>>>>>>>>>
>> >>>>>>>>>> Furthermore, the action itself does not involve taking a sample; it
>> >>>>>>>>>> consistently pushes the packet to the local collector. Therefore, I
>> >>>>>>>>>> suggest renaming "emit_sample()" to "emit_local()". This
>> >>>>>>>>>> same goes for
>> >>>>>>>>>> all the derivative ATTR naming.”
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> This is a blurry semantic area.
>> >>>>>>>>> IMO, "sample" is the act of extracting (potentially a piece of)
>> >>>>>>>>> someting, in this case, a packet. It is common to only
>> >>>>>>>>> take some packets
>> >>>>>>>>> as samples, so this action usually comes with some kind of
>> >>>>>>>>> "rate", but
>> >>>>>>>>> even if the rate is 1, it's still sampling in this context.
>> >>>>>>>>>
>> >>>>>>>>> OTOH, OVS kernel design tries to be super-modular and define small
>> >>>>>>>>> combinable actions, so the rate or probability generation
>> >>>>>>>>> is done with
>> >>>>>>>>> another action which is (IMHO unfortunately) named "sample".
>> >>>>>>>>>
>> >>>>>>>>> With that interpretation of the term it would actually
>> >>>>>>>>> make more sense
>> >>>>>>>>> to rename "sample" to something like "random" (of course I'm not
>> >>>>>>>>> suggestion we do it). "sample" without any nested action
>> >>>>>>>>> that actually
>> >>>>>>>>> sends the packet somewhere is not sampling, it's just
>> >>>>>>>>> doing something or
>> >>>>>>>>> not based on a probability. Where as "emit_sample" is
>> >>>>>>>>> sampling even if
>> >>>>>>>>> it's not nested inside a "sample".
>> >>>>>>>>
>> >>>>>>>> You're assuming we are extracting a packet for sampling,
>> >>>>>>>> but this function
>> >>>>>>>> can be used for various other purposes. For instance, it
>> >>>>>>>> could handle the
>> >>>>>>>> packet outside of the OVS pipeline through an eBPF program
>> >>>>>>>> (so we are not
>> >>>>>>>> taking a sample, but continue packet processing outside of the OVS
>> >>>>>>>> pipeline). Calling it emit_sampling() in such cases could be very
>> >>>>>>>> confusing.
>> >>>>>>
>> >>>>>> We can't change the implementation of the action once it is
>> >>>>>> part of uAPI.
>> >>>>>> We have to document where users can find these packets and we
>> >>>>>> can't just
>> >>>>>> change the destination later.
>> >>>>>
>> >>>>> I'm not suggesting we change the uAPI implementation, but we
>> >>>>> could use the
>> >>>>> emit_xxx() action with an eBPF probe on the action to perform
>> >>>>> other tasks.
>> >>>>> This is just an example.
>> >>>>
>> >>>> Yeah, but as Adrian said below, you could do that with any action and
>> >>>> this doesn't change the semantics of the action itself.
>> >>>
>> >>> Well this was just an example, what if we have some other need for getting
>> >>> a packet to userspace through emit_local() other than sampling? The
>> >>> emit_sample() action naming in this case makes no sense.
>> >>>
>> >>>>>>> Well, I guess that would be clearly abusing the action. You could say
>> >>>>>>> that of anything really. You could hook into skb_consume and continue
>> >>>>>>> processing the skb but that doesn't change the intended
>> >>>>>>> behavior of the
>> >>>>>>> drop action.
>> >>>>>>>
>> >>>>>>> The intended behavior of the action is sampling, as is the intended
>> >>>>>>> behavior of "psample".
>> >>>>>>
>> >>>>>> The original OVS_ACTION_ATTR_SAMPLE "Probabilitically
>> >>>>>> executes actions",
>> >>>>>> that is it takes some packets from the whole packet stream and executes
>> >>>>>> actions of them. Without tying this to observability purposes the name
>> >>>>>> makes sense as the first definition of the word is "to take a
>> >>>>>> representative
>> >>>>>> part or a single item from a larger whole or group".
>> >>>>>>
>> >>>>>> Now, our new action doesn't have this particular semantic in a way that
>> >>>>>> it doesn't take a part of a whole packet stream but rather using the
>> >>>>>> part already taken. However, it is directly tied to the parent
>> >>>>>> OVS_ACTION_ATTR_SAMPLE action, since it reports probability
>> >>>>>> of that parent
>> >>>>>> action. If there is no parent, then probability is assumed to be 100%,
>> >>>>>> but that's just a corner case. The name of a psample module has the
>> >>>>>> same semantics in its name, it doesn't sample on it's own, but it is
>> >>>>>> assuming that sampling was performed as it relays the rate of it.
>> >>>>>>
>> >>>>>> And since we're directly tied here with both OVS_ACTION_ATTR_SAMPLE and
>> >>>>>> the psample module, the emit_sample() name makes sense to me.
>> >>>>>
>> >>>>> This is the part I don't like. emit_sample() should be treated as a
>> >>>>> standalone action. While it may have potential dependencies on
>> >>>>> OVS_ACTION_ATTR_SAMPLE, it should also be perfectly fine to use it
>> >>>>> independently.
>> >>>>
>> >>>> It is fine to use it, we just assume implicit 100% sampling.
>> >>>
>> >>> Agreed, but the name does not make sense ;) I do not think we
>> >>> currently have any actions that explicitly depend on each other
>> >>> (there might be attributes carried over) and I want to keep it
>> >>> as such.
>> >>>
>> >>>>>>>>> Having said that, I don't have a super strong favor for
>> >>>>>>>>> "emit_sample". I'm
>> >>>>>>>>> OK with "emit_local" or "emit_packet" or even just "emit".
>> >>>>>>
>> >>>>>> The 'local' or 'packet' variants are not descriptive enough
>> >>>>>> on what we're
>> >>>>>> trying to achieve and do not explain why the probability is attached to
>> >>>>>> the action, i.e. do not explain the link between this action and the
>> >>>>>> OVS_ACTION_ATTR_SAMPLE.
>> >>>>>>
>> >>>>>> emit_Psample() would be overly specific, I agree, but making
>> >>>>>> the name too
>> >>>>>> generic will also make it hard to add new actions. If we use
>> >>>>>> some overly
>> >>>>>> broad term for this one, we may have to deal with overlapping
>> >>>>>> semantics in
>> >>>>>> the future.
>> >>>>>>
>> >>>>>>>>> I don't think any term will fully satisfy everyone so I
>> >>>>>>>>> hope we can find
>> >>>>>>>>> a reasonable compromise.
>> >>>>>>>>
>> >>>>>>>> My preference would be emit_local() as we hand it off to some local
>> >>>>>>>> datapath entity.
>> >>>>>>
>> >>>>>> What is "local datapath entity" ? psample module is not part
>> >>>>>> of OVS datapath.
>> >>>>>> And what is "local" ? OpenFlow has the OFPP_LOCAL port that
>> >>>>>> is represented
>> >>>>>> by a bridge port on a datapath level, that will be another
>> >>>>>> source of confusion
>> >>>>>> as it can be interpreted as sending a packet via a local bridge port.
>> >>>>>
>> >>>>> I guess I hinted at a local exit point in the specific
>> >>>>> netdev/netlink datapath,
>> >>>>> where exit is to the local host. So maybe we should call it
>> >>>>> emit_localhost?
>> >>>>
>> >>>> For me sending to localhost means sending to a loopback
>> >>>> interface or otherwise
>> >>>> sending the packet to the host networking stack. And we're not
>> >>>> doing that.
>> >>>
>> >>> That might be confusing too... Maybe emit_external()?
>> >>
>> >> "External" was the word I used for the original userspace RFC. The
>> >> rationale being: We're sending the packet to something external from OVS
>> >> (datapath or userspace). Compared with IPFIX-based observability which
>> >> where the sample is first processed ("internally") by ovs-vswitchd.
>> >>
>> >> In userspace it kept the sampling/observability meaning because it was
>> >> part of the Flow_Sample_Collector_Set which is intrinsically an
>> >> observability thing.
>> >>
>> >> However, in the datapath we loose that meaning and could be confused
>> >> with some external packet-processing entity. How about "external_observe"
>> >> or something that somehow keeps that meaning?
>> >
>> > This semantics conversation doesn't seem productive as we're going
>> > in circles
>> > around what we already discussed what feels like at least three
>> > separate times
>> > on this and ovs-dev lists.
>>
>> +1
>>
>> > I'd say if we can't agree on OVS_ACTION_ATTR_EMIT_SAMPLE, then just call
>> > it OVS_ACTION_ATTR_SEND_TO_PSAMPLE. Simple, describes exactly what it does.
>> > And if we ever want to have "local" sampling for OVS userspace datapath,
>> > we can create a userspace-only datapath action for it and call it in a way
>> > that describes what it does, e.g. OVS_ACTION_ATTR_SEND_TO_USDT or whatever.
>> > Unlike key attributes, we can relatively safely create
>> > userspace-only actions
>> > without consequences for kernel uAPI. In fact, we have a few such actions.
>> > And we can choose which one to use based on which one is supported by the
>> > current datapath.
>>
>> I'm okay with the emit_sample or with send_to_psample. There are
>> probably hundreds of colors to paint this shed, tbh. We could argue
>> that it could even be an extension to userspace() instead of a separate
>> action, or that we could have a generic socket_write(type=psample) type
>> of action. But in the end, I don't have a strong feeling either way,
>> whether it's:
>>
>> OVS_ACTION_ATTR_EMIT_SAMPLE / emit_sample()
>> OVS_ACTION_ATTR_SEND_TO_PSAMPLE / psample() or emit_psample()
>> OVS_ACTION_ATTR_EMIT_EXTERNAL / emit_external()
>>
>> There aren't really too many differences in them, and it wouldn't bother
>> me in any case. I guess a XXX?psample() action does end up being the
>> clearest since it has 'psample' right in the name and then we can know
>> right away what it is doing.
>>
>
> The original purpose of the name was to have the same action for both
> userspace and kernel so that, name aside, the semantics
> (ACT_XXXX(group=10,cookie=0x123)) remains the same. If we break that, we
> risk having userspace and kernel actions differ in ways that makes it
> difficult to unify at the xlate/OpenFlow/OVSDB layers.
>
> But if we can enforce that somehow I guess it's OK.
I think it's less important for that. We do have actions that exist
which are datapath specific, so there is precedent.
> Thanks.
> Adrián
Powered by blists - more mailing lists