[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <55F2AC29.3030209@iogearbox.net>
Date: Fri, 11 Sep 2015 12:25:45 +0200
From: Daniel Borkmann <daniel@...earbox.net>
To: David Miller <davem@...emloft.net>
CC: chamaken@...il.com, fw@...len.de, netdev@...r.kernel.org
Subject: Re: [PATCH net] netlink, mmap: transform mmap skb into full skb on
taps
On 09/11/2015 07:11 AM, David Miller wrote:
...
> Looking more deeply into this, I think we have the same exact problem
> with netlink skbs that use vmalloc memory at skb->head.
Yes, agreed, the test in the patch covered those as well via:
if (netlink_skb_is_mmaped(skb) || is_vmalloc_addr(skb->head))
> We have a special hack, but only when netlink_skb_clone() is used,
> to preserve the destructor. But these skbs can escape anywhere
> and be eventually cloned as we have seen with the mmap stuff.
Yes, it looks like they are currently only used from user space to
kernel space. I saw that 3a36515f7294 ("netlink: fix splat in skb_clone
with large messages") fixed a couple of these in upper layers with
regards to large skbs, so there's a chance that this can be overseen
rather easily as well in other places and then only come to light in
cases where we allocate more than NLMSG_GOODSIZE, so we don't actually
use the alloc_skb() path. :/ So I like your idea below!
> I'm wondering if we should do something more precise to fix this,
> and in a way that handles both the mmap and vmalloc cases.
Perhaps it might also be useful if the kernel would one day want to
use netlink_alloc_large_skb() for answers back to user space, or in
places where we use network skbs (haven't looked into it with regards
to this).
Some more comments below:
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 2738d35..77b804c 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -584,8 +584,9 @@ struct sk_buff {
> fclone:2,
> peeked:1,
> head_frag:1,
> - xmit_more:1;
> - /* one bit hole */
> + xmit_more:1,
> + clone_preserves_destructor;
( Nit: maybe as clone_preserves_destructor:1 )
> +
> kmemcheck_bitfield_end(flags1);
>
> /* fields enclosed in headers_start/headers_end are copied
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index dad4dd3..4a7b8e3 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -825,7 +825,8 @@ static struct sk_buff *__skb_clone(struct sk_buff *n, struct sk_buff *skb)
> n->hdr_len = skb->nohdr ? skb_headroom(skb) : skb->hdr_len;
> n->cloned = 1;
> n->nohdr = 0;
> - n->destructor = NULL;
> + if (!skb->clone_preserves_destructor)
> + n->destructor = NULL;
I think we also need here:
else
C(destructor);
> C(tail);
> C(end);
> C(head);
[...]
> diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
> index 7f86d3b..214f1a1 100644
> --- a/net/netlink/af_netlink.c
> +++ b/net/netlink/af_netlink.c
> @@ -719,6 +719,7 @@ static void netlink_ring_setup_skb(struct sk_buff *skb, struct sock *sk,
> skb->end = skb->tail + size;
> skb->len = 0;
>
> + skb->clone_preserves_destructor = 1;
> skb->destructor = netlink_skb_destructor;
> NETLINK_CB(skb).flags |= NETLINK_SKB_MMAPED;
> NETLINK_CB(skb).sk = sk;
> @@ -854,6 +855,14 @@ static void netlink_ring_set_copied(struct sock *sk, struct sk_buff *skb)
> #define netlink_mmap_sendmsg(sk, msg, dst_portid, dst_group, scm) 0
> #endif /* CONFIG_NETLINK_MMAP */
One more thing that came to my mind. For netlink mmap skbs, the
skb->clone_preserves_destructor is actually not enough.
Already calling into skb_clone() is an issue itself, as the data area is
user space buffer, and skb_clone() as well as skb_copy() access skb_shinfo()
area. :/ So in that regard netlink mmap skbs are even further restrictive
on what we can do than netlink large skbs.
Best,
Daniel
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists