Message-ID: <27f87dd8-f6e4-b2b0-2b3a-9378fddf147f@virtuozzo.com>
Date:   Thu, 2 Sep 2021 11:31:59 +0300
From:   Vasily Averin <vvs@...tuozzo.com>
To:     Eric Dumazet <eric.dumazet@...il.com>,
        Christoph Paasch <christoph.paasch@...il.com>,
        "David S. Miller" <davem@...emloft.net>
Cc:     Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
        David Ahern <dsahern@...nel.org>,
        Jakub Kicinski <kuba@...nel.org>,
        netdev <netdev@...r.kernel.org>, linux-kernel@...r.kernel.org,
        kernel@...nvz.org, Alexey Kuznetsov <kuznet@....inr.ac.ru>,
        Julian Wiedmann <jwi@...ux.ibm.com>
Subject: Re: [PATCH net-next v4] skb_expand_head() adjust skb->truesize
 incorrectly

On 9/2/21 10:33 AM, Vasily Averin wrote:
> On 9/2/21 10:13 AM, Vasily Averin wrote:
>> On 9/2/21 7:48 AM, Eric Dumazet wrote:
>>> On 9/1/21 9:32 PM, Eric Dumazet wrote:
>>>> I think you missed netem case, in particular
>>>> skb_orphan_partial() which I already pointed out.
>>>>
>>>> You can setup a stack of virtual devices (tunnels),
>>>> with a qdisc on them, before ip6_xmit() is finally called...
>>>>
>>>> Socket might have been closed already.
>>>>
>>>> To test your patch, you could force a skb_orphan_partial() at the beginning
>>>> of skb_expand_head() (extending code coverage)
>>>
>>> To clarify :
>>>
>>> It is ok to 'downgrade' an skb->destructor having a ref on sk->sk_wmem_alloc to
>>> something owning a ref on sk->refcnt.
>>>
>>> But the opposite operation (ref on sk->sk_refcnt -->  ref on sk->sk_wmem_alloc) is not safe.
>>
>> Could you please explain in more detail, since I still have a completely opposite point of view?
>>
>> Every sk referenced by an skb has sk_wmem_alloc > 0.
>> It is set to 1 in sk_alloc() and decremented right before the last __sk_free(),
>> inside sk_free(), sock_wfree() and __sock_wfree().
>>
>> So it is safe to adjust skb->sk->sk_wmem_alloc,
>> because a live skb keeps a reference to a live sk, and the latter keeps sk_wmem_alloc > 0.
>>
>> So any destructor that uses sk->sk_refcnt will already have sk_wmem_alloc > 0,
>> because the last sock_put() calls sk_free().
>>
>> However, now I'm not sure about the reverse direction.
>> skb_set_owner_w() checks !sk_fullsock(sk) and calls sock_hold(sk);
>> If sk->sk_refcnt can be 0 here (i.e. after the old destructor runs inside skb_orphan())
>> -- it can trigger the problem you pointed out:
>> "refcount_add() will trigger a warning (panic under KASAN)".
>>
>> Could you please explain where I'm wrong?
> 
> To clarify:
> I agree it is unsafe to call on a live skb:

I explained the problem badly in my previous letter, so let me try once again:

I'm talking about this piece of code:
+	} else if (sk && skb->destructor != sock_edemux) {
+		delta = osize - skb_end_offset(skb);
+		if (!is_skb_wmem(skb))
+			skb_set_owner_w(skb, sk);
+		skb->truesize += delta;
+		if (sk_fullsock(sk))
+			refcount_add(delta, &sk->sk_wmem_alloc);
 	}

It is called on a live, already expanded skb, and it is incorrect for two reasons:

a) if the old destructor uses a ref on sk->sk_wmem_alloc,
   it can decrease it to 0 and release sk.
b) if the old destructor uses a ref on sk->sk_refcnt and !sk_fullsock(sk),
   the old destructor can drop the last reference and release sk.

We can work around the release of sk by moving
refcount_add(delta, &sk->sk_wmem_alloc) before skb_set_owner_w():

        } else if (sk && skb->destructor != sock_edemux) {
                delta = osize - skb_end_offset(skb);
                refcount_add(delta, &sk->sk_wmem_alloc);
                if (!is_skb_wmem(skb))
                        skb_set_owner_w(skb, sk);
                skb->truesize += delta;
#ifdef CONFIG_INET
                if (!sk_fullsock(sk))
                        /* refcount_dec() takes no count argument; subtract delta instead */
                        WARN_ON(refcount_sub_and_test(delta, &sk->sk_wmem_alloc));
#endif
        }
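
To illustrate why the order matters for case a), here is a minimal user-space
sketch. It is not kernel code: struct toy_sock and all toy_* helpers are
hypothetical names made up for this example, and the model assumes the socket
has already been closed, so the only thing keeping the counter above zero is
the charge held by the skb.

```c
#include <stdbool.h>

/* Toy user-space model, not kernel code. All names are hypothetical. */
struct toy_sock {
	int wmem_alloc;	/* stands in for sk->sk_wmem_alloc */
	bool freed;	/* set once the counter drops to zero */
};

/* Models the old destructor dropping its charge, in the spirit of sock_wfree(). */
void toy_uncharge(struct toy_sock *sk, int n)
{
	sk->wmem_alloc -= n;
	if (sk->wmem_alloc == 0)
		sk->freed = true;
}

/* Models refcount_add(): refuses to touch a socket that was already freed. */
bool toy_charge(struct toy_sock *sk, int n)
{
	if (sk->freed)
		return false;
	sk->wmem_alloc += n;
	return true;
}

/* Order from the patch: the old destructor runs first, delta is added after. */
bool toy_release_then_charge(int truesize, int delta)
{
	struct toy_sock sk = { .wmem_alloc = truesize, .freed = false };

	toy_uncharge(&sk, truesize);	/* counter hits 0, socket is released */
	return toy_charge(&sk, delta);	/* too late */
}

/* Reordered variant: delta is added before the old destructor runs. */
bool toy_charge_then_release(int truesize, int delta)
{
	struct toy_sock sk = { .wmem_alloc = truesize, .freed = false };
	bool ok = toy_charge(&sk, delta);

	toy_uncharge(&sk, truesize);	/* counter stays at delta > 0 */
	return ok && !sk.freed;
}
```

In this model toy_release_then_charge() fails because the counter reaches zero
before delta is added, while toy_charge_then_release() keeps the socket alive
as long as delta is positive.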

However, it does not resolve b) completely:
 
void skb_set_owner_w(struct sk_buff *skb, struct sock *sk)
{
        skb_orphan(skb); <<< old destructor releases last sk->refcnt ...
        skb->sk = sk;
...
        if (unlikely(!sk_fullsock(sk))) {
                skb->destructor = sock_edemux;
                sock_hold(sk);   <<<< ...and it triggers a warning/panic
                return;
        }       
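
The same sequence can be modelled in user space. This is only a sketch under
simplifying assumptions: struct toy_sk and the toy_* helpers are hypothetical
names, and the model only tracks the two counters. It shows that adding delta
to the sk_wmem_alloc stand-in in advance does nothing for the sk_refcnt
stand-in when the old destructor holds the last reference.

```c
#include <stdbool.h>

/* Toy user-space model, not kernel code. All names are hypothetical. */
struct toy_sk {
	int refcnt;	/* stands in for sk->sk_refcnt */
	int wmem_alloc;	/* stands in for sk->sk_wmem_alloc */
};

/* Models the old destructor dropping its sk_refcnt reference. */
void toy_put(struct toy_sk *sk)
{
	sk->refcnt--;
}

/*
 * Models sock_hold(): incrementing a counter that is already zero is what
 * the kernel's refcount_t code warns about, so report failure instead.
 */
bool toy_hold(struct toy_sk *sk)
{
	if (sk->refcnt == 0)
		return false;
	sk->refcnt++;
	return true;
}

/*
 * Models the !sk_fullsock(sk) path: delta is added to wmem_alloc in advance,
 * then the old destructor runs, then a new reference is taken.
 */
bool toy_owner_switch(int initial_refs, int delta)
{
	struct toy_sk sk = { .refcnt = initial_refs, .wmem_alloc = 1 };

	sk.wmem_alloc += delta;	/* the advance charge from the workaround */
	toy_put(&sk);		/* old destructor, as called via skb_orphan() */
	return toy_hold(&sk);	/* fails if that was the last reference */
}
```

With a single initial reference toy_owner_switch() fails regardless of delta;
with two or more references it succeeds.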

Thank you,
	Vasily Averin
