netdev - Re: [PATCH net-next 2/4] ip/options: explicitly provide net ns to __ip_options

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1504645381.3841.6.camel@redhat.com>
Date:   Tue, 05 Sep 2017 23:03:01 +0200
From:   Paolo Abeni <pabeni@...hat.com>
To:     Eric Dumazet <eric.dumazet@...il.com>,
        "David S. Miller" <davem@...emloft.net>
Cc:     netdev@...r.kernel.org, Eric Dumazet <edumazet@...gle.com>,
        Hannes Frederic Sowa <hannes@...essinduktion.org>
Subject: Re: [PATCH net-next 2/4] ip/options: explicitly provide net ns to
 __ip_options_echo()

On Tue, 2017-09-05 at 10:18 -0700, Eric Dumazet wrote:
> On Thu, 2017-08-03 at 18:07 +0200, Paolo Abeni wrote:
> > __ip_options_echo() uses the current network namespace, and
> > currently retrives it via skb->dst->dev.
> > 
> > This commit adds an explicit 'net' argument to __ip_options_echo()
> > and update all the call sites to provide it, usually via a simpler
> > sock_net().
> > 
> > After this change, __ip_options_echo() no more needs to access
> > skb->dst and we can drop a couple of hack to preserve such
> > info in the rx path.
> > 
> > Signed-off-by: Paolo Abeni <pabeni@...hat.com>
> > ---
> 
> David, Paolo
> 
> This commit (91ed1e666a4ea2e260452a7d7d311ac5ae852cba "ip/options:
> explicitly provide net ns to __ip_options_echo()")
> 
> needs to be backported to linux-4.13 stable version to avoid these kind
> of crashes [1]
> 
> This is because of MSG_PEEK operation, hitting skb_consume_udp() while
> skb is still in receive queue.
> 
> Next read() finding again the skb then can see a NULL skb->dst
> 
> Thanks !
> 
> [1]
> general protection fault: 0000 [#1] SMP KASAN
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in:
> CPU: 0 PID: 3017 Comm: syzkaller446772 Not tainted 4.13.0+ #68
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> task: ffff8801cd0a4380 task.stack: ffff8801cc498000
> RIP: 0010:__ip_options_echo+0xea8/0x1430 net/ipv4/ip_options.c:143
> RSP: 0018:ffff8801cc49f628 EFLAGS: 00010246
> RAX: dffffc0000000000 RBX: ffff8801cc49f928 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000004
> RBP: ffff8801cc49f6b8 R08: ffff8801cc49f936 R09: ffffed0039893f28
> R10: 0000000000000003 R11: ffffed0039893f27 R12: ffff8801cc49f918
> R13: ffff8801ccbcf36c R14: 000000000000000f R15: 0000000000000018
> FS:  0000000000979880(0000) GS:ffff8801db200000(0000)
> knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000200c0ff0 CR3: 00000001cc4ed000 CR4: 00000000001406f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  ip_options_echo include/net/ip.h:574 [inline]
>  ip_cmsg_recv_retopts net/ipv4/ip_sockglue.c:91 [inline]
>  ip_cmsg_recv_offset+0xa17/0x1280 net/ipv4/ip_sockglue.c:207
>  udp_recvmsg+0xe0b/0x1260 net/ipv4/udp.c:1641
>  inet_recvmsg+0x14c/0x5f0 net/ipv4/af_inet.c:793
>  sock_recvmsg_nosec net/socket.c:792 [inline]
>  sock_recvmsg+0xc9/0x110 net/socket.c:799
>  SYSC_recvfrom+0x2dc/0x570 net/socket.c:1788
>  SyS_recvfrom+0x40/0x50 net/socket.c:1760
>  entry_SYSCALL_64_fastpath+0x1f/0xbe
> RIP: 0033:0x444c89
> RSP: 002b:00007ffd80c788e8 EFLAGS: 00000286 ORIG_RAX: 000000000000002d
> RAX: ffffffffffffffda RBX: ffffffffffffffff RCX: 0000000000444c89
> RDX: 0000000000000000 RSI: 0000000020bc0000 RDI: 0000000000000004
> RBP: 0000000000000082 R08: 00000000200c0ff0 R09: 0000000000000010
> R10: 0000000000000000 R11: 0000000000000286 R12: 0000000000402390
> R13: 0000000000402420 R14: 0000000000000000 R15: 0000000000000000
> Code: f6 c1 01 0f 85 a5 01 00 00 48 89 4d b8 e8 31 e9 6b fd 48 8b 4d b8
> 48 b8 00 00 00 00 00 fc ff df 48 83 e1 fe 48 89 ca 48 c1 ea 03 <80> 3c
> 02 00 0f 85 41 02 00 00 48 8b 09 48 b8 00 00 00 00 00 fc 
> RIP: __ip_options_echo+0xea8/0x1430 net/ipv4/ip_options.c:143 RSP:
> ffff8801cc49f628
> ---[ end trace b30d95b284222843 ]---
> Kernel panic - not syncing: Fatal exception

Thank you Eric for the report! 

Darn me, I seriously messed-up with the stateless consume.

I think we can have similar issues pith ipsec/secpath and MSG_PEEK,
even if with less catastropthic outcome.

What about the following, which should cover both cases? (only compile
tested, I'll test it tomorrow morning my time)
---
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index d67a8182e5eb..63df75ae70ee 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -885,7 +885,7 @@ void kfree_skb(struct sk_buff *skb);
 void kfree_skb_list(struct sk_buff *segs);
 void skb_tx_error(struct sk_buff *skb);
 void consume_skb(struct sk_buff *skb);
-void consume_stateless_skb(struct sk_buff *skb);
+void __consume_stateless_skb(struct sk_buff *skb);
 void  __kfree_skb(struct sk_buff *skb);
 extern struct kmem_cache *skbuff_head_cache;
 
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index e07556606284..f2411a8744d7 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -753,14 +753,11 @@ EXPORT_SYMBOL(consume_skb);
  *	consume_stateless_skb - free an skbuff, assuming it is stateless
  *	@skb: buffer to free
  *
- *	Works like consume_skb(), but this variant assumes that all the head
- *	states have been already dropped.
+ *	Alike consume_skb(), but this variant assumes that all the head
+ *	states have been already dropped and usage count is one
  */
-void consume_stateless_skb(struct sk_buff *skb)
+void __consume_stateless_skb(struct sk_buff *skb)
 {
-	if (!skb_unref(skb))
-		return;
-
 	trace_consume_skb(skb);
 	if (likely(skb->head))
 		skb_release_data(skb);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 62344804baae..979e4d8526ba 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1386,12 +1386,15 @@ void skb_consume_udp(struct sock *sk, struct sk_buff *skb, int len)
 		unlock_sock_fast(sk, slow);
 	}
 
+	if (!skb_unref(skb))
+		return;
+
 	/* In the more common cases we cleared the head states previously,
 	 * see __udp_queue_rcv_skb().
 	 */
 	if (unlikely(udp_skb_has_head_state(skb)))
 		skb_release_head_state(skb);
-	consume_stateless_skb(skb);
+	__consume_stateless_skb(skb);
 }
 EXPORT_SYMBOL_GPL(skb_consume_udp);