Date:   Fri, 19 Aug 2022 10:54:51 -0700
From:   Jakub Kicinski <kuba@...nel.org>
To:     Johannes Berg <johannes@...solutions.net>
Cc:     davem@...emloft.net, netdev@...r.kernel.org, corbet@....net,
        stephen@...workplumber.org, sdf@...gle.com, ecree.xilinx@...il.com,
        benjamin.poirier@...il.com, idosch@...sch.org,
        f.fainelli@...il.com, jiri@...nulli.us, dsahern@...nel.org,
        fw@...len.de, linux-doc@...r.kernel.org, jhs@...atatu.com,
        tgraf@...g.ch, jacob.e.keller@...el.com, svinota.saveliev@...il.com
Subject: Re: [PATCH net-next 2/2] docs: netlink: basic introduction to
 Netlink

On Fri, 19 Aug 2022 18:57:01 +0200 Johannes Berg wrote:
> > Theoretically I think we also align what I called "fixed metadata
> > headers", practically all of those are multiple of 4 :S  
> 
> But they're not really aligned, are they? Hmm. Well I guess practically
> it doesn't matter. I just read this and wasn't really sure what the
> mention of "[h]eaders" was referring to in this context.

Aligned in what sense? Absolute address? My understanding
was that every layer of fixed headers should round itself
off with NLMSG_ALIGN().

But you're right, I think it will be easier to understand
if I say "attribute", and it's practically equivalent.

> > True, it's not strictly necessary AFAIU. Should I mention it 
> > or would that be over-complicating things?
> > 
> > I believe that kernel will accept both forms (without tripping 
> > the trailing data warning), and both the kernel and mnl will pad 
> > out the last attr.  
> 
> Yeah, probably not worth mentioning it.
> 
> I think what threw me off was the explicit mention of "at the end of the
> message" - perhaps just say "after the family name attribute"?

Good point, I replaced with "after ``CTRL_ATTR_FAMILY_NAME``".

> > Some of the text is written with the implicit goal of comforting 
> > the newcomer ;)  
> 
> :-)
> 
> In this document it just feels like saying it _should_ be rare, but I'm
> not sure it should? We ignore it a lot in practice, but maybe we should
> be doing it more?

OK, I'll drop the "rare".

> > I'll rewrite. The only use I'm aware of is OvS upcalls, are there more?  
> 
> In nl80211 we have quite a few "unicast an event message to a specific
> portid" uses, e.g. if userspace subscribes to certain action frames, the
> frame notification for it would be unicast to the subscribed socket, or
> the TX status response after a frame was transmitted, etc. etc.

Interesting! So there is a "please subscribe me" netlink message
in addition to NETLINK_ADD_MEMBERSHIP? Does the port ID get passed
explicitly, or does the kernel take it from the socket by itself? Or I
guess you may not use the port ID at all if you're hooking up to
socket destruction.

> > Practically speaking for a person trying to make a ethtool, FOU,
> > devlink etc. call to the kernel this is 100% irrelevant.  
> 
> Fair point, depends on what you're using and what programming model that
> has.

Right, my thinking was that the programming model of the family is
opaque to a person implementing the YAML netlink plumbing.

> > Hm, good point. I should add a section on multicast and make it part 
> > of that.  
> 
> True that, multicast more generally is something to know about.

FWIW this is what I typed:

Multicast notifications
-----------------------

One of the strengths of Netlink is the ability to send event notifications
to user space. This is a unidirectional form of communication (kernel ->
user) and does not involve any control messages like ``NLMSG_ERROR`` or
``NLMSG_DONE``.

For example, the Generic Netlink family itself defines a set of multicast
notifications about registered families. When a new family is added, the
sockets subscribed to the notifications will get the following message::

  struct nlmsghdr:
    __u32 nlmsg_len:	136
    __u16 nlmsg_type:	GENL_ID_CTRL
    __u16 nlmsg_flags:	0
    __u32 nlmsg_seq:	0
    __u32 nlmsg_pid:	0

  struct genlmsghdr:
    __u8 cmd:		CTRL_CMD_NEWFAMILY
    __u8 version:	2
    __u16 reserved:	0

  struct nlattr:
    __u16 nla_len:	10
    __u16 nla_type:	CTRL_ATTR_FAMILY_NAME
    char data: 		test1\0

  (padding:)
    data:		\0\0

  struct nlattr:
    __u16 nla_len:	6
    __u16 nla_type:	CTRL_ATTR_FAMILY_ID
    __u16: 		123  /* The Family ID we are after */

  (padding:)
    char data:		\0\0

  struct nlattr:
    __u16 nla_len:	9
    __u16 nla_type:	CTRL_ATTR_FAMILY_VERSION
    __u16: 		1

  /* ... etc, more attributes will follow. */

The notification contains the same information as the response to the
``CTRL_CMD_GETFAMILY`` request. It is most common for "new object"
notifications to contain exactly the same data as the respective ``GET``.
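
The ``nla_len`` and padding values in the dump above follow from the 4-byte
attribute header and 4-byte attribute alignment. A minimal sketch of that
arithmetic is below; the ``SKETCH_*`` macros and ``sketch_*`` helpers are
hypothetical local stand-ins for illustration, not names from the uAPI headers.

```c
/*
 * Assumed constants: struct nlattr is 4 bytes (two __u16 fields) and
 * attributes are aligned to 4 bytes. These mirror, but are not, the
 * uAPI definitions.
 */
#define SKETCH_NLA_ALIGNTO 4u
#define SKETCH_NLA_HDRLEN  4u

/* Value stored in nla_len: header plus payload, padding NOT included. */
static unsigned int sketch_nla_len(unsigned int payload_len)
{
	return SKETCH_NLA_HDRLEN + payload_len;
}

/* Number of zero bytes that follow an attribute with the given nla_len. */
static unsigned int sketch_nla_padding(unsigned int nla_len)
{
	unsigned int aligned = (nla_len + SKETCH_NLA_ALIGNTO - 1) &
			       ~(SKETCH_NLA_ALIGNTO - 1);

	return aligned - nla_len;
}
```

For the family name ``test1\0`` (6 bytes of payload) this gives an ``nla_len``
of 10 followed by 2 bytes of padding, and for the 2-byte Family ID an
``nla_len`` of 6 followed by 2 bytes of padding, matching the dump.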

The Netlink headers of the notification are mostly 0 and irrelevant.
The :c:member:`nlmsghdr.nlmsg_seq` may be either zero or a monotonically
increasing notification sequence number maintained by the family.

To receive notifications the user socket must subscribe to the relevant
notification group. Much like the Family ID, the Group ID for a given
multicast group is dynamic and can be found inside the Family information.
The ``CTRL_ATTR_MCAST_GROUPS`` attribute contains nests with names
(``CTRL_ATTR_MCAST_GRP_NAME``) and IDs (``CTRL_ATTR_MCAST_GRP_ID``) of
the family's groups.

Once the Group ID is known, a setsockopt() call adds the socket to the group:

.. code-block:: c

  unsigned int group_id;

  /* .. find the group ID... */

  setsockopt(fd, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP,
             &group_id, sizeof(group_id));

The socket will now receive notifications. It is recommended to use
separate sockets for receiving notifications and sending requests
to the kernel. The asynchronous nature of notifications means that
they may get mixed in with the responses, making the parsing much
harder.
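
As a rough illustration of the separate-socket recommendation, the sketch
below opens a Generic Netlink socket that is used only for notifications and
joins one group on it. The helper name ``open_notification_socket`` is
hypothetical, and the sketch assumes the Group ID has already been looked up
on another socket.

```c
#include <sys/socket.h>
#include <unistd.h>
#include <linux/netlink.h>

#ifndef SOL_NETLINK
#define SOL_NETLINK 270	/* fallback for C libraries that do not define it */
#endif

/*
 * Hypothetical helper: open a socket dedicated to notifications and
 * subscribe it to group_id. Returns the fd, or -1 on any failure.
 */
static int open_notification_socket(unsigned int group_id)
{
	/* nl_pid left at 0 so the kernel picks the port ID on bind() */
	struct sockaddr_nl addr = { .nl_family = AF_NETLINK };
	int fd;

	fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
	if (fd < 0)
		return -1;

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    setsockopt(fd, SOL_NETLINK, NETLINK_ADD_MEMBERSHIP,
		       &group_id, sizeof(group_id)) < 0) {
		close(fd);
		return -1;
	}

	return fd;
}
```

Requests and their responses would then stay on a second socket, so that
whatever arrives on this fd can be treated as a notification.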

> > 😍 Can you point me to the code? (probably too advanced for this doc
> > but the idea seems super useful!)  
> 
> Look at the uses of NL80211_ATTR_SOCKET_OWNER, e.g. you can
>  * create a virtual interface and have it disappear if you close the
>    socket (_nl80211_new_interface)
>  * AP stopped if you close the socket (nl80211_start_ap)
>  * some regulatory stuff reset (nl80211_req_set_reg)
>  * background ("scheduled") scan stopped (nl80211_start_sched_scan)
>  * connection torn down (nl80211_associate)
>  * etc.
> 
> The actual teardown handling is in nl80211_netlink_notify().
> 
> I guess I can agree though it doesn't really belong here - again
> something specific to the operations you're doing.

I didn't know about the notifier chain, super useful!

> > Yes :S What's the error reported when the buffer is too small?
> > recv() = -1, errno = EMSGSIZE? Does the message get discarded 
> > or can it be re-read? I don't have practical experience with
> > that one.  
> 
> Ugh, I repressed all those memories ... I don't remember now, I guess
> I'd have to try it. Also it doesn't just apply to normal stuff but also
> multicast, and that can be even trickier.

No worries, let me try myself. Annoyingly I have this doc on a different
branch than my netlink code, that's why I was being lazy :)
