netdev - Re: [PATCHv1 net-next 0/5] netlink: mmap: kernel panic and some issues

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <55CDC51D.1060204@iogearbox.net>
Date:	Fri, 14 Aug 2015 12:38:21 +0200
From:	Daniel Borkmann <daniel@...earbox.net>
To:	Ken-ichirou MATSUZAWA <chamaken@...il.com>,
	David Miller <davem@...emloft.net>
CC:	netdev@...r.kernel.org, fw@...len.de
Subject: Re: [PATCHv1 net-next 0/5] netlink: mmap: kernel panic and some issues

On 08/14/2015 12:01 PM, Daniel Borkmann wrote:
> On 08/14/2015 10:58 AM, Ken-ichirou MATSUZAWA wrote:
>>   Hi,
>>
>> Thank you for taking your time.
>> Please let me explain these with code samples on gist.
>> I can not describe and arrange it well, sorry.
>>
>>      normal socket nflog sample:
>>      https://gist.github.com/chamaken/dc0f80c14862e8061c06/raw/2d6da8fff31ef61af77e68713fdb1d71978746a6/nflog.c
>>
>> set iptables
>>
>>      iptables -A INPUT -p icmp --icmp-type echo-request \
>>          -j NFLOG --nflog-group 2 --nflog-threshold 4
>>
>> monitor nlmon (like netsniff-ng), run this sample and
>> ping -i 0.2 -c 10 from another hosts. This sample only shows receive
>> size and nlmsg_type. Same things can be done with rx mmaped socket.
>>
>>      rx only mmaped nflog sample:
>>      https://gist.github.com/chamaken/dc0f80c14862e8061c06/raw/2d6da8fff31ef61af77e68713fdb1d71978746a6/rxring-nflog.c
>>
>> This sample gets a panic if monitoring nlmon.
>>
>>      panic message:
>>      https://gist.github.com/chamaken/dc0f80c14862e8061c06/raw/2d6da8fff31ef61af77e68713fdb1d71978746a6/mmaped_netlink_panic
>>
>> I think it's because of accessing a skb_shared_info when releasing
>> skb, although mmaped netlink skb does not have a skb_shared_info. I
>> tried to fix this at patch 1 and 2 by introducing helper function
>> which will not access a skb_shared_info.
>>
>> And I think nm_status should be set to UNUSED when releasing it so
>> also tried to fix it patch 3.
>
> Ok, I'm trying to understand the issue: you are saying that whenever
> there's an skb_clone on an netlink mmaped skb, we have the situation
> that skb->data, skb->head etc points to the mmaped user space buffer
> slot, and thus _must_ have no shared info.
>
> Currently, what happens is that the shared info accesses whatever
> memory is there in the mmaped region. So when you already do an
> skb_clone() you should already get into trouble right there f.e. when
> we test for orphaning frags etc (if at the right offset in the mmap
> buffer, the tx_flags member would contain a SKBTX_DEV_ZEROCOPY bit).

Ken-ichirou, have you observed this issue only in relation to nlmon?
Haven't checked yet if there are any upper layer netlink consumers that
would call for some reason into skb_clone() as well. I am thinking that
if taps are indeed the only ones affected, it might probably not be
worth adding that much complexity for a fix itself, but to keep it simple
instead. I don't know if there are any real users of netlink mmap, but
if it would really be needed, we could think about it on a net-next basis?
It seems you have some other, separate fixes in your series, so you might
want to submit them separately against the net tree, instead?

  include/linux/netlink.h  |  4 ++++
  net/netlink/af_netlink.c | 12 +++++++-----
  2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/include/linux/netlink.h b/include/linux/netlink.h
index 9120edb..42cdcd8 100644
--- a/include/linux/netlink.h
+++ b/include/linux/netlink.h
@@ -35,6 +35,10 @@ struct netlink_skb_parms {
  #define NETLINK_CB(skb)		(*(struct netlink_skb_parms*)&((skb)->cb))
  #define NETLINK_CREDS(skb)	(&NETLINK_CB((skb)).creds)

+static inline bool netlink_skb_is_mmaped(const struct sk_buff *skb)
+{
+	return NETLINK_CB(skb).flags & NETLINK_SKB_MMAPED;
+}

  extern void netlink_table_grab(void);
  extern void netlink_table_ungrab(void);
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 67d2104..4307446 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -238,6 +238,13 @@ static void __netlink_deliver_tap(struct sk_buff *skb)

  static void netlink_deliver_tap(struct sk_buff *skb)
  {
+	/* Netlink mmaped skbs must not access shared info, and thus
+	 * are not allowed to be cloned. For now, just don't allow
+	 * them to get inspected by taps.
+	 */
+	if (netlink_skb_is_mmaped(skb))
+		return;
+
  	rcu_read_lock();

  	if (unlikely(!list_empty(&netlink_tap_all)))
@@ -278,11 +285,6 @@ static void netlink_rcv_wake(struct sock *sk)
  }

  #ifdef CONFIG_NETLINK_MMAP
-static bool netlink_skb_is_mmaped(const struct sk_buff *skb)
-{
-	return NETLINK_CB(skb).flags & NETLINK_SKB_MMAPED;
-}
-
  static bool netlink_rx_is_mmaped(struct sock *sk)
  {
  	return nlk_sk(sk)->rx_ring.pg_vec != NULL;
-- 
1.9.3




>> ----
>>
>> With both tx/rx mmaped,
>>
>>      both tx/rx mmaped nflog sample:
>>      https://gist.github.com/chamaken/dc0f80c14862e8061c06/raw/2d6da8fff31ef61af77e68713fdb1d71978746a6/ring-nflog.c
>>
>> This sample will not work, since msg->msg_iter.type in
>> netlink_sendmsg() is set to 1 (WRITE) when this sample calls
>> sendto(). patch 4 fix this by accepting it.
>>
>> ----
>>
>> After applying patch 1 and 2, rx only sample can work but it behaves
>> differ from normal one. patch 5 may fix this.
>>
>> And it also works well with my another code which set frame
>> nm_status to SKIP and passes it to worker threads and the worker
>> threads set status to UNUSED, even though ring becomes full.
>>
>> That my another code may set UNUSED status in random, not
>> sequensially, so that it seems I need to check whole ring.
>>
>> Thanks,
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@...r.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@...r.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html