Message-Id: <20181126113857.29270-1-fw@strlen.de>
Date: Mon, 26 Nov 2018 12:38:54 +0100
From: Florian Westphal <fw@...len.de>
To: <netdev@...r.kernel.org>
Subject: [RFC PATCH 0/3] sk_buff: add skb extension infrastructure
The (out-of-tree) Multipath-TCP implementation needs a significant amount
of extra space in the skb control buffer.
Increasing skb->cb[] size in mainline is a non-starter for memory and
performance reasons (e.g. an increase in cb size also moves several
frequently-accessed fields to other cache lines).
One approach that might work for MPTCP is to extend skb_shared_info instead
of sk_buff. However, this comes with other drawbacks, e.g. it either
needs special skb allocation to make sure there is enough space for such
'extended shinfo' at the end of data buffer (which makes this only useable
for tx path) or increased size of skb_shared_info.
This adds an extension infrastructure for sk_buff instead:
1. extension memory is released when the sk_buff is freed.
2. data is shared after cloning an skb.
3. adding an extension to an skb will COW the extension
buffer if needed.
This is also how xfrm and bridge_nf extra data (skb->sp, skb->nf_bridge)
are handled.
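To make the three rules above concrete, here is a minimal userspace
sketch of the lifecycle. It is not the code from this series: toy_skb,
ext_area, the toy_* helpers and EXT_DATA_SIZE are hypothetical names
made up for illustration, the refcount is a plain integer (a real
implementation would need an atomic one), and a single flat data area
stands in for whatever per-extension layout the real code uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define EXT_DATA_SIZE 64 /* hypothetical payload size */

/* Hypothetical refcounted extension area, shared between clones. */
struct ext_area {
	unsigned int refcnt; /* not atomic: illustration only */
	unsigned char data[EXT_DATA_SIZE];
};

/* Toy stand-in for sk_buff; only the two new members are modelled. */
struct toy_skb {
	uint8_t active_extensions;   /* bitmask of enabled extensions */
	struct ext_area *extensions; /* only meaningful if the mask is != 0 */
};

enum toy_ext_id { TOY_EXT_BRIDGE_NF, TOY_EXT_NUM };

/* Rule 3: adding an extension makes the area private (COW) if shared. */
static struct ext_area *toy_ext_add(struct toy_skb *skb, enum toy_ext_id id)
{
	struct ext_area *old = skb->active_extensions ? skb->extensions : NULL;
	struct ext_area *area;

	assert(id < TOY_EXT_NUM);

	if (old && old->refcnt == 1) {
		area = old; /* sole owner, safe to write in place */
	} else {
		area = calloc(1, sizeof(*area));
		if (!area)
			return NULL;
		area->refcnt = 1;
		if (old) {
			/* shared with a clone: copy, then drop our reference */
			memcpy(area->data, old->data, sizeof(area->data));
			old->refcnt--;
		}
	}

	skb->extensions = area;
	skb->active_extensions |= (uint8_t)(1u << id);
	return area;
}

/* Rule 2: a clone shares the extension area and takes a reference. */
static void toy_skb_clone(struct toy_skb *dst, const struct toy_skb *src)
{
	*dst = *src;
	if (src->active_extensions)
		src->extensions->refcnt++;
}

/* Rule 1: freeing the skb drops the reference; last user frees the area. */
static void toy_skb_free(struct toy_skb *skb)
{
	if (skb->active_extensions && --skb->extensions->refcnt == 0)
		free(skb->extensions);
	skb->active_extensions = 0;
	skb->extensions = NULL;
}
```

In this sketch, calling toy_ext_add() on a clone leaves the original's
area untouched and hands the clone a private copy, which is the
behaviour rule 3 describes.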
In the future, protocols that need to store more than 48 bytes in skb->cb[]
could add a 'SKB_EXT_EXTRA_CB' or similar to allocate extra space.
Two new members are added to sk_buff:
1. 'active_extensions' byte (filling a hole), telling which extensions
have been enabled for this skb.
2. extension pointer, located at the end of the sk_buff.
If the active_extensions byte is 0, the pointer value is undefined.
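Because the pointer is undefined while no extension is active, any
accessor has to test the bitmask before touching it. A rough sketch of
such a guard follows; demo_skb, demo_ext_exist() and demo_ext_find()
are hypothetical names, not the helpers added by this series, and
returning the base pointer is a simplification of whatever
per-extension lookup the real code performs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical extension ids; one bit each in active_extensions. */
enum demo_ext_id { DEMO_EXT_BRIDGE_NF, DEMO_EXT_EXTRA_CB, DEMO_EXT_NUM };

/* Toy stand-in for the two new sk_buff members. */
struct demo_skb {
	uint8_t active_extensions; /* bit N set: extension N is enabled */
	void *extensions;          /* undefined while the mask is 0 */
};

/* Nonzero if extension 'id' is enabled on this skb. */
static int demo_ext_exist(const struct demo_skb *skb, enum demo_ext_id id)
{
	return (skb->active_extensions & (1u << id)) != 0;
}

/* Only hand out the pointer after the bitmask says it is valid. */
static void *demo_ext_find(const struct demo_skb *skb, enum demo_ext_id id)
{
	return demo_ext_exist(skb, id) ? skb->extensions : NULL;
}
```

With this shape, a stale or uninitialised pointer value is never
returned to the caller as long as the byte is kept accurate.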
Last patch converts nf_bridge to use the extension infrastructure:
The 'nf_bridge' pointer is removed, i.e. sk_buff size remains the same.
Extra code added to skb clone and free paths (to deal with
refcount/free of extension area) replaces the existing code that
deals with skb->nf_bridge.
Conversion of skb->sp (ipsec/xfrm secpath) to an skb extension could be
done as a followup, but I'm reluctant to work on this before there is
agreement that this is the right direction.
Comments welcome.
include/linux/netfilter_bridge.h     |  33 +++++---
include/linux/skbuff.h               | 142 +++++++++++++++++++++++++++++------
include/net/netfilter/br_netfilter.h |  14 ---
net/Kconfig                          |   4
net/bridge/br_netfilter_hooks.c      |  39 +++------
net/bridge/br_netfilter_ipv6.c       |   4
net/core/skbuff.c                    | 134 ++++++++++++++++++++++++++++++++-
net/ipv4/ip_output.c                 |   1
net/ipv4/netfilter/nf_reject_ipv4.c  |   6 -
net/ipv6/ip6_output.c                |   1
net/ipv6/netfilter/nf_reject_ipv6.c  |  10 +-
net/netfilter/nf_log_common.c        |  20 ++--
net/netfilter/nf_queue.c             |  50 ++++++++----
net/netfilter/nfnetlink_queue.c      |  23 ++---
net/netfilter/xt_physdev.c           |   2
15 files changed, 368 insertions(+), 115 deletions(-)