netdev - Re: [PATCH 0/3] net: time stamping fixes


Message-ID: <1319030503.8416.11.camel@edumazet-laptop>
Date:	Wed, 19 Oct 2011 15:21:43 +0200
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Johannes Berg <johannes@...solutions.net>
Cc:	Richard Cochran <richardcochran@...il.com>,
	David Miller <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: [PATCH 0/3] net: time stamping fixes

On Wednesday, 19 October 2011 at 14:58 +0200, Johannes Berg wrote:
> On Wed, 2011-10-19 at 14:38 +0200, Eric Dumazet wrote:
> > On Wednesday, 19 October 2011 at 13:50 +0200, Richard Cochran wrote:
> > > On Wed, Oct 19, 2011 at 07:15:36AM +0200, Johannes Berg wrote:
> > > > The only thing I'm not completely sure about is whether or not it is
> > > > permissible to sock_hold() at that point. I'm probably just missing
> > > > something, but: if sk_free() was called before hard_start_xmit() which
> > > > will call skb_clone_tx_timestamp(), can we really call sock_hold()?
> > > > 
> > 
> > This is not possible, or something is really broken. We specifically
> > don't skb_orphan(skb) if we know tx timestamping is enabled for this skb.
> 
> Why can't sk_free() have been called? I'm not thinking of sock_wfree()
> which can't have been called -- so the socket surely still exists
> because skb->truesize is still accounted to it -- but what says
> sk_refcnt hasn't reached 0 yet?
> 
> > /*
> >  * Try to orphan skb early, right before transmission by the device.
> >  * We cannot orphan skb if tx timestamp is requested or the sk-reference
> >  * is needed on driver level for other reasons, e.g. see net/can/raw.c
> >  */
> > static inline void skb_orphan_try(struct sk_buff *skb)
> > {
> >         struct sock *sk = skb->sk;
> > 
> >         if (sk && !skb_shinfo(skb)->tx_flags) {
> >                 /* skb_tx_hash() wont be able to get sk.
> >                  * We copy sk_hash into skb->rxhash
> >                  */
> >                 if (!skb->rxhash)
> >                         skb->rxhash = sk->sk_hash;
> >                 skb_orphan(skb);
> >         }
> > }
> 
> Right.
> 
> > I don't really understand what the concern is, since sk_free() doesn't
> > care about sk_refcnt at all, only about sk_wmem_alloc.
> 
> Right.
> 
> > void sk_free(struct sock *sk)
> [snip]
> 
> > If one skb is in flight, and still linked to a socket, then this socket
> > cannot disappear, because this skb->truesize was accounted into
> > sk->sk_wmem_alloc
> 
> This is undoubtedly true, I'm not disputing this.
> 
> > Of course, this point is valid only as long as the skb has not been
> > orphaned.
> > 
> > sk_refcnt can be 0, if the user closed the socket, but the socket won't
> > disappear as long as sk_wmem_alloc is not 0.
> 
> Not disputing this either. But you said sk_refcnt can be 0, so why can't
> the following happen:
> 
> /* skb; skb->sk = sk; skb->destructor = sock_wfree; */
> 
> /* skb is on qdisc, some time passes */
> 
> sk_free(sk); /* user closed socket,
>                 sk->sk_refcnt reaches 0,
>                 sk->sk_wmem_alloc == skb->truesize,
>                 __sk_free not called, socket still lives,
>                 but no more +1 in sk_wmem_alloc */
> 
> /* some more time passes */
> 
> /* ethernet hard_start_xmit calls skb_clone_tx_timestamp() */
> skb2 = skb_clone(skb);
> skb2->sk = skb->sk;
> sock_hold(skb->sk);
> 
> /* ethernet TX completion calls skb_free(skb) */
> skb_free(skb):
>   sock_wfree(skb); /* sk_wmem_alloc reaches 0,
>                       __sk_free called DESPITE sk_refcnt > 0 */
> 
> /* later, in skb_complete_tx_timestamp() */
> sock_put(sk);	/* KABOOM */
> 
> 
> I just want to understand why this can't happen :-)

You have answered your own question :)

Hmm, oh well: sk_refcnt is not, and should not be, changed when a
transmit packet is duplicated. sk_wmem_alloc should be changed instead,
paired exactly with skb->truesize.
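
To make the two accounting schemes concrete, here is a small userspace
model of the scenario above. It is only a sketch, not kernel code: every
toy_* name is hypothetical, plain ints stand in for the atomic counters,
and the clone's truesize of 64 is an arbitrary value. It assumes, as
described in this thread, that sk_wmem_alloc starts with a +1 bias which
sk_free() drops, and that sock_wfree() destroys the socket once
sk_wmem_alloc reaches 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a socket; all names here are hypothetical. */
struct toy_sock {
    int refcnt;      /* stands in for sk_refcnt */
    int wmem_alloc;  /* stands in for sk_wmem_alloc, biased by +1 */
    bool freed;      /* set when the modelled __sk_free() has run */
};

struct toy_skb {
    struct toy_sock *sk;
    int truesize;
};

static void toy_sock_init(struct toy_sock *sk)
{
    sk->refcnt = 1;
    sk->wmem_alloc = 1;  /* the "+1" mentioned in the scenario */
    sk->freed = false;
}

/* Models __sk_free(): must run at most once per socket. */
static void toy_sock_destroy(struct toy_sock *sk)
{
    assert(!sk->freed);
    sk->freed = true;
}

/* Models sk_free(): drops the +1 bias, ignores refcnt. */
static void toy_sk_free(struct toy_sock *sk)
{
    if (--sk->wmem_alloc == 0)
        toy_sock_destroy(sk);
}

/* Models sock_put(): last reference calls sk_free(). */
static void toy_sock_put(struct toy_sock *sk)
{
    if (--sk->refcnt == 0)
        toy_sk_free(sk);
}

/* Models skb_set_owner_w(): charge truesize to the socket. */
static void toy_set_owner_w(struct toy_skb *skb, struct toy_sock *sk, int truesize)
{
    skb->sk = sk;
    skb->truesize = truesize;
    sk->wmem_alloc += truesize;
}

/* Models sock_wfree(): uncharge, destroy the socket on reaching 0. */
static void toy_wfree(struct toy_skb *skb)
{
    struct toy_sock *sk = skb->sk;

    sk->wmem_alloc -= skb->truesize;
    if (sk->wmem_alloc == 0)
        toy_sock_destroy(sk);
    skb->sk = NULL;
}

/*
 * Replays the scenario. Returns true if the socket was destroyed while
 * the timestamp clone still pointed at it.
 */
static bool toy_run(bool charge_clone)
{
    struct toy_sock sk;
    struct toy_skb skb, clone;
    bool destroyed_early;

    toy_sock_init(&sk);
    toy_set_owner_w(&skb, &sk, 256);  /* skb sits on the qdisc */
    toy_sock_put(&sk);                /* user closes: refcnt 0, bias gone */

    if (charge_clone) {
        toy_set_owner_w(&clone, &sk, 64);  /* pair clone with wmem_alloc */
    } else {
        clone.sk = &sk;                    /* sock_hold() style reference */
        clone.truesize = 0;
        sk.refcnt++;
    }

    toy_wfree(&skb);                  /* TX completion frees the original */
    destroyed_early = sk.freed;

    if (charge_clone) {
        toy_wfree(&clone);            /* timestamp delivered, clone freed */
        assert(sk.freed);             /* socket goes away only now */
    }
    return destroyed_early;
}
```

In this model, toy_run(false) reports that the socket is destroyed while
the clone still references it, which is the KABOOM case. toy_run(true)
keeps the socket alive until the clone itself is uncharged, which is how
I read the suggestion to pair the clone with sk_wmem_alloc.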


