[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <959012cfd753586b81ff60b37301247849eb274c.camel@sipsolutions.net>
Date: Fri, 19 Aug 2022 18:57:01 +0200
From: Johannes Berg <johannes@...solutions.net>
To: Jakub Kicinski <kuba@...nel.org>
Cc: davem@...emloft.net, netdev@...r.kernel.org, corbet@....net,
stephen@...workplumber.org, sdf@...gle.com, ecree.xilinx@...il.com,
benjamin.poirier@...il.com, idosch@...sch.org,
f.fainelli@...il.com, jiri@...nulli.us, dsahern@...nel.org,
fw@...len.de, linux-doc@...r.kernel.org, jhs@...atatu.com,
tgraf@...g.ch, jacob.e.keller@...el.com, svinota.saveliev@...il.com
Subject: Re: [PATCH net-next 2/2] docs: netlink: basic introduction to
Netlink
On Fri, 2022-08-19 at 09:20 -0700, Jakub Kicinski wrote:
> > >
> > > +Headers in netlink must be aligned to 4 bytes from the start of the message,
> >
> > s/Headers/Attribute headers/ perhaps?
>
> Theoretically I think we also align what I called "fixed metadata
> headers", practically all of those are multiple of 4 :S
But they're not really aligned, are they? Hmm. Well I guess practically
it doesn't matter. I just read this and wasn't really sure what the
mention of "[h]eaders" was referring to in this context.
> > > +hence the extra ``\0\0`` at the end of the message.
> >
> > And I think technically for the _last_ attribute it wouldn't be needed?
>
> True, it's not strictly necessary AFAIU. Should I mention it
> or would that be over-complicating things?
>
> I believe that kernel will accept both forms (without tripping
> the trailing data warning), and both the kernel and mnl will pad
> out the last attr.
Yeah, probably not worth mentioning it.
I think what threw me off was the explicit mention of "at the end of the
message" - perhaps just say "after the family name attribute"?
> > > +``NLMSGERR_ATTR_MSG`` carries a message in English describing
> > > +the encountered problem. These messages are far more detailed
> > > +than what can be expressed thru standard UNIX error codes.
> >
> > "through"?
>
> How much do you care? Maybe Jon has guidelines?
Hah, not much really.
> > > +Querying family information is useful in rare cases when user space needs
> >
> > debatable if that's "rare", but yeah, today it's not done much :)
>
> Some of the text is written with the implicit goal of comforting
> the newcomer ;)
:-)
In this document it just feels like saying it _should_ be rare, but I'm
not sure it should? We ignore it a lot in practice, but maybe we should
be doing it more?
> > I think this could be written a bit better - we call this thing a "port
> > ID" internally now, and yes, it might default to a process ID (more
> > specifically task group ID) ... but it feels like this could explain
> > bind vs. autobind etc. a bit more? And IMHO it should focus less on the
> > process ID/PID than saying "port ID" with a (historical) default of
> > using the PID/TGID.
>
> I'll rewrite. The only use I'm aware of is OvS upcalls, are there more?
In nl80211 we have quite a few "unicast an event message to a specific
portid" uses, e.g. if userspace subscribes to certain action frames, the
frame notification for it would be unicast to the subscribed socket, or
the TX status response after a frame was transmitted, etc. etc.
> Practically speaking for a person trying to make a ethtool, FOU,
> devlink etc. call to the kernel this is 100% irrelevant.
Fair point, depends on what you're using and what programming model that
has.
> > > +Strict checking
> > > +---------------
> > > +
> > > +The ``NETLINK_GET_STRICT_CHK`` socket option enables strict input checking
> > > +in ``NETLINK_ROUTE``. It was needed because historically kernel did not
> > > +validate the fields of structures it didn't process. This made it impossible
> > > +to start using those fields later without risking regressions in applications
> > > +which initialized them incorrectly or not at all.
> > > +
> > > +``NETLINK_GET_STRICT_CHK`` declares that the application is initializing
> > > +all fields correctly. It also opts into validating that message does not
> > > +contain trailing data and requests that kernel rejects attributes with
> > > +type higher than largest attribute type known to the kernel.
> > > +
> > > +``NETLINK_GET_STRICT_CHK`` is not used outside of ``NETLINK_ROUTE``.
> >
> > However, there are also more generally strict checks in policy
> > validation ... maybe a discussion of all that would be worthwhile?
>
> Yeah :( It's too much to describe to a newcomer, I figured. I refer
> those who care to the enum field in the next section. We'd need a full
> table of families and attrs which start strict(er) validation.. bah. Too
> much technical debt.
Yes ... Also sometimes attributes in a dump request are checked,
sometimes not, sometimes just ignored, etc. Certainly not worth handling
here though.
> > - maybe not the appropriate place here, but maybe some best practices
> > for handling attributes, such as the multi-attribute array thing we
> > discussed in the other thread?
>
> Right, this doc is meant for the user rather than kernel dev.
>
Makes sense. The user anyway will have to look at how their specific
family does things, and do it accordingly.
> I'm planning to write a separate doc for the kernel dev.
Nice, looking forward to that! :)
> I started writing this one as guide for a person who would like to write
> a YAML NL library for their fav user space language but has no prior
> knowledge of netlink and does not know where to start.
Makes sense.
> > - maybe more userspace recommendations such as using different sockets
> > for multicast listeners and requests, because otherwise it gets
> > tricky to wait for the ACK of a request since you have to handle
> > notifications that happen meanwhile?
>
> Hm, good point. I should add a section on multicast and make it part
> of that.
True that, multicast more generally is something to know about.
> > - maybe some mention of the fact that sometimes we now bind kernel
> > object or state lifetime to a socket, e.g. in wireless you can
> > connect and if your userspace crashes/closes the socket, the
> > connection is automatically torn down (because you can't handle the
> > things needed anymore)
>
> 😍 Can you point me to the code? (probably too advanced for this doc
> but the idea seems super useful!)
Look at the uses of NL80211_ATTR_SOCKET_OWNER, e.g. you can
* create a virtual interface and have it disappear if you close the
socket (_nl80211_new_interface)
* AP stopped if you close the socket (nl80211_start_ap)
* some regulatory stuff reset (nl80211_req_set_reg)
* background ("scheduled") scan stopped (nl80211_start_sched_scan)
* connection torn down (nl80211_associate)
* etc.
The actual teardown handling is in nl80211_netlink_notify().
I guess I can agree though it doesn't really belong here - again
something specific to the operations you're doing.
> > - maybe something about message sizes? we've had lots of trouble with
> > that in nl80211, but tbh I'm not really sure what we should say about
> > it other than making sure you use large enough buffers ...
>
> Yes :S What's the error reported when the buffer is too small?
> recv() = -1, errno = EMSGSIZE? Does the message get discarded
> or can it be re-read? I don't have practical experience with
> that one.
Ugh, I repressed all those memories ... I don't remember now, I guess
I'd have to try it. Also it doesn't just apply to normal stuff but also
multicast, and that can be even trickier.
johannes
Powered by blists - more mailing lists