[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20181008110412.43o5qgaaqvsf2znw@brauner.io>
Date: Mon, 8 Oct 2018 13:04:13 +0200
From: Christian Brauner <christian@...uner.io>
To: David Ahern <dsahern@...nel.org>
Cc: netdev@...r.kernel.org, davem@...emloft.net, jbenc@...hat.com,
stephen@...workplumber.org, David Ahern <dsahern@...il.com>
Subject: Re: [PATCH v2 net-next 00/23] rtnetlink: Add support for rigid
checking of data in dump request
On Sun, Oct 07, 2018 at 08:16:21PM -0700, David Ahern wrote:
> From: David Ahern <dsahern@...il.com>
>
> There are many use cases where a user wants to influence what is
> returned in a dump for some rtnetlink command: one is wanting data
> for a different namespace than the one the request is received and
> another is limiting the amount of data returned in the dump to a
> specific set of interest to userspace, reducing the cpu overhead of
> both kernel and userspace. Unfortunately, the kernel has historically
> not been strict with checking for the proper header or checking the
> values passed in the header. This lenient implementation has allowed
> iproute2 and other packages to pass any struct or data in the dump
> request as long as the family is the first byte. For example, ifinfomsg
> struct is used by iproute2 for all generic dump requests - links,
> addresses, routes and rules when it is really only valid for link
> requests.
>
> There is 1 is example where the kernel deals with the wrong struct: link
> dumps after VF support was added. Older iproute2 was sending rtgenmsg as
> the header instead of ifinfomsg so a patch was added to try and detect
> old userspace vs new:
> e5eca6d41f53 ("rtnetlink: fix userspace API breakage for iproute2 < v3.9.0")
>
> The latest example is Christian's patch set wanting to return addresses for
> a target namespace. It guesses the header struct is an ifaddrmsg and if it
> guesses wrong a netlink warning is generated in the kernel log on every
> address dump which is unacceptable.
>
> Another example where the kernel is a bit lenient is route dumps: iproute2
> can send either a request with either ifinfomsg or a rtmsg as the header
> struct, yet the kernel always treats the header as an rtmsg (see
> inet_dump_fib and rtm_flags check). The header inconsistency impacts the
> ability to add kernel side filters for route dumps - a necessary feature
> for scale setups with 100k+ routes.
>
> How to resolve the problem of not breaking old userspace yet be able to
> move forward with new features such as kernel side filtering which are
> crucial for efficient operation at high scale?
>
> This patch set addresses the problem by adding a new socket flag,
> NETLINK_DUMP_STRICT_CHK, that userspace can use with setsockopt to
> request strict checking of headers and attributes on dump requests and
> hence unlock the ability to use kernel side filters as they are added.
>
> Kernel side, the dump handlers are updated to verify the message contains
> at least the expected header struct:
> RTM_GETLINK: ifinfomsg
> RTM_GETADDR: ifaddrmsg
> RTM_GETMULTICAST: ifaddrmsg
> RTM_GETANYCAST: ifaddrmsg
> RTM_GETADDRLABEL: ifaddrlblmsg
> RTM_GETROUTE: rtmsg
> RTM_GETSTATS: if_stats_msg
> RTM_GETNEIGH: ndmsg
> RTM_GETNEIGHTBL: ndtmsg
> RTM_GETNSID: rtgenmsg
> RTM_GETRULE: fib_rule_hdr
> RTM_GETNETCONF: netconfmsg
> RTM_GETMDB: br_port_msg
>
> And then every field in the header struct should be 0 with the exception
> of the family. There are a few exceptions to this rule where the kernel
> already influences the data returned by values in the struct. Next the
> message should not contain attributes unless the kernel implements
> filtering for it. Any unexpected data causes the dump to fail with EINVAL.
> If the new flag is honored by the kernel and the dump contents adjusted
> by any data passed in the request, the dump handler can set the
> NLM_F_DUMP_FILTERED flag in the netlink message header.
>
> For old userspace on new kernel there is no impact as all checks are
> wrapped in a check on the new strict flag. For new userspace on old
> kernel, the data in the headers and any appended attributes are
> silently ignored though the setsockopt failing is the clue to userspace
> the feature is not supported. New userspace on new kernel gets the
> requested data dump.
>
> iproute2 patches can be found here:
> https://github.com/dsahern/iproute2 dump-enhancements
>
> Major changes since v1
> - inner header is supposed to be 4-bytes aligned. So for dumps that
> should not have attributes appended changed the check to use:
> if (nlmsg_attrlen(nlh, sizeof(hdr)))
> Only impacts patches with headers that are not multiples of 4-bytes
> (rtgenmsg, netconfmsg), but applied the change to all patches not
> calling nlmsg_parse for consistency.
>
> - Added nlmsg_parse_strict and nla_parse_strict for tighter control on
> attribute parsing. There should be no unknown attribute types or extra
> bytes.
>
> - Moved validation to a helper in most cases
>
> Changes since rfc-v2
> - dropped the NLM_F_DUMP_FILTERED flag from target nsid dumps per
> Jiri's objections
> - changed the opt-in uapi from a netlink message flag to a socket
> flag. setsockopt provides an api for userspace to definitively
> know if the kernel supports strict checking on dumps.
> - re-ordered patches to peel off the extack on dumps if needed to
> keep this set size within limits
> - misc cleanups in patches based on testing
>
> David Ahern (23):
> netlink: Pass extack to dump handlers
> netlink: Add extack message to nlmsg_parse for invalid header length
> net: Add extack to nlmsg_parse
> netlink: Add strict version of nlmsg_parse and nla_parse
> net/ipv6: Refactor address dump to push inet6_fill_args to
> in6_dump_addrs
> netlink: Add new socket option to enable strict checking on dumps
> net/ipv4: Update inet_dump_ifaddr for strict data checking
> net/ipv6: Update inet6_dump_addr for strict data checking
> rtnetlink: Update rtnl_dump_ifinfo for strict data checking
> rtnetlink: Update rtnl_bridge_getlink for strict data checking
> rtnetlink: Update rtnl_stats_dump for strict data checking
> rtnetlink: Update inet6_dump_ifinfo for strict data checking
> rtnetlink: Update ipmr_rtm_dumplink for strict data checking
> rtnetlink: Update fib dumps for strict data checking
> net/neighbor: Update neigh_dump_info for strict data checking
> net/neighbor: Update neightbl_dump_info for strict data checking
> net/namespace: Update rtnl_net_dumpid for strict data checking
> net/fib_rules: Update fib_nl_dumprule for strict data checking
> net/ipv6: Update ip6addrlbl_dump for strict data checking
> net: Update netconf dump handlers for strict data checking
> net/bridge: Update br_mdb_dump for strict data checking
> rtnetlink: Move input checking for rtnl_fdb_dump to helper
> rtnetlink: Update rtnl_fdb_dump for strict data checking
At this point it's all nits so it's got my ACK but keener eyes than mine
might see other issues.
Acked-by: Christian Brauner <christian@...uner.io>
>
> include/linux/netlink.h | 2 +
> include/net/ip_fib.h | 2 +
> include/net/netlink.h | 21 ++-
> include/uapi/linux/netlink.h | 1 +
> lib/nlattr.c | 48 +++++--
> net/bridge/br_mdb.c | 30 ++++
> net/core/devlink.c | 2 +-
> net/core/fib_rules.c | 36 ++++-
> net/core/neighbour.c | 119 ++++++++++++---
> net/core/net_namespace.c | 6 +
> net/core/rtnetlink.c | 318 ++++++++++++++++++++++++++++++++---------
> net/ipv4/devinet.c | 101 ++++++++++---
> net/ipv4/fib_frontend.c | 42 +++++-
> net/ipv4/ipmr.c | 39 +++++
> net/ipv6/addrconf.c | 177 ++++++++++++++++++-----
> net/ipv6/addrlabel.c | 34 ++++-
> net/ipv6/ip6_fib.c | 8 ++
> net/ipv6/ip6mr.c | 9 ++
> net/ipv6/route.c | 2 +-
> net/mpls/af_mpls.c | 28 +++-
> net/netfilter/ipvs/ip_vs_ctl.c | 2 +-
> net/netlink/af_netlink.c | 33 ++++-
> net/netlink/af_netlink.h | 1 +
> net/sched/act_api.c | 2 +-
> net/sched/cls_api.c | 6 +-
> net/sched/sch_api.c | 2 +-
> net/xfrm/xfrm_user.c | 2 +-
> 27 files changed, 908 insertions(+), 165 deletions(-)
>
> --
> 2.11.0
>
Powered by blists - more mailing lists