lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6d194ac2-15e8-76d7-31d0-b4c4eed68d5a@gmail.com>
Date:   Thu, 20 Sep 2018 06:29:57 -0700
From:   Eric Dumazet <eric.dumazet@...il.com>
To:     Sean Tranchetti <stranche@...eaurora.org>, netdev@...r.kernel.org,
        steffen.klassert@...unet.com
Subject: Re: [PATCH net] af_key: free SKBs under RCU protection



On 09/19/2018 05:18 PM, Sean Tranchetti wrote:
> pfkey_broadcast() can make calls to pfkey_broadcast_one() which
> will clone or copy the passed in SKB. This new SKB will be assigned
> the sock_rfree() function as its destructor, which requires that
> the socket reference the SKB contains is valid when the SKB is freed.
> 
> If this SKB is ever passed to pfkey_broadcast() again by some other
> function (such as pkfey_dump() or pfkey_promisc) it will then be
> freed there. However, since this free occurs outside of RCU protection,
> it is possible that userspace could close the socket and trigger
> pfkey_release() to free the socket before sock_rfree() can run, creating
> the following race condition:
> 
> 1: An SKB belonging to the pfkey socket is passed to pfkey_broadcast().
>    It performs the broadcast to any other sockets, and calls
>    rcu_read_unlock(), but does not yet reach kfree_skb().
> 2: Userspace closes the socket, triggering pfkey_realse(). Since no one
>    holds the RCU lock, synchronize_rcu() returns and it is allowed to
>    continue. It calls sock_put() to free the socket.
> 3: pfkey_broadcast() now calls kfree_skb() on the original SKB it was
>    passed, triggering a call to sock_rfree(). This function now accesses
>    the freed struct sock * via skb->sk, and attempts to update invalid
>    memory.
> 
> By ensuring that the pfkey_broadcast() also frees the SKBs while it holds
> the RCU lock, we can ensure that the socket will remain valid when the SKB
> is freed, avoiding crashes like the following:
> 
> Unable to handle kernel paging request at virtual address 6b6b6b6b6b6c4b
> [006b6b6b6b6b6c4b] address between user and kernel address ranges
> Internal error: Oops: 96000004 [#1] PREEMPT SMP
> task: fffffff78f65b380 task.stack: ffffff8049a88000
> pc : sock_rfree+0x38/0x6c
> lr : skb_release_head_state+0x6c/0xcc
> Process repro (pid: 7117, stack limit = 0xffffff8049a88000)
> Call trace:
> 	sock_rfree+0x38/0x6c
> 	skb_release_head_state+0x6c/0xcc
> 	skb_release_all+0x1c/0x38
> 	__kfree_skb+0x1c/0x30
> 	kfree_skb+0xd0/0xf4
> 	pfkey_broadcast+0x14c/0x18c
> 	pfkey_sendmsg+0x1d8/0x408
> 	sock_sendmsg+0x44/0x60
> 	___sys_sendmsg+0x1d0/0x2a8
> 	__sys_sendmsg+0x64/0xb4
> 	SyS_sendmsg+0x34/0x4c
> 	el0_svc_naked+0x34/0x38
> Kernel panic - not syncing: Fatal exception
> 
> Signed-off-by: Sean Tranchetti <stranche@...eaurora.org>
> ---
>  net/key/af_key.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/key/af_key.c b/net/key/af_key.c
> index 9d61266..dd257c7 100644
> --- a/net/key/af_key.c
> +++ b/net/key/af_key.c
> @@ -275,13 +275,13 @@ static int pfkey_broadcast(struct sk_buff *skb, gfp_t allocation,
>  		if ((broadcast_flags & BROADCAST_REGISTERED) && err)
>  			err = err2;
>  	}
> -	rcu_read_unlock();
>  
>  	if (one_sk != NULL)
>  		err = pfkey_broadcast_one(skb, &skb2, allocation, one_sk);
>  
>  	kfree_skb(skb2);
>  	kfree_skb(skb);
> +	rcu_read_unlock();
>  	return err;
>  }
>  
> 

I do not believe the changelog or the patch makes sense.

Having skb still referencing a socket prevents this socket being released.

If you think about it, what would prevent the freeing happening 
_before_ the rcu_read_lock() in pfkey_broadcast() ?

Maybe the correct fix is that pfkey_broadcast_one() should ensure the socket is still valid.

I would suggest something like :

diff --git a/net/key/af_key.c b/net/key/af_key.c
index 9d61266526e767770d9a1ce184ac8cdd59de309a..5ce309d020dda5e46e4426c4a639bfb551e2260d 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -201,7 +201,9 @@ static int pfkey_broadcast_one(struct sk_buff *skb, struct sk_buff **skb2,
 {
        int err = -ENOBUFS;
 
-       sock_hold(sk);
+       if (!refcount_inc_not_zero(&sk->sk_refcnt))
+               return -ENOENT;
+
        if (*skb2 == NULL) {
                if (refcount_read(&skb->users) != 1) {
                        *skb2 = skb_clone(skb, allocation);



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ