[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1319027881.3103.27.camel@edumazet-laptop>
Date: Wed, 19 Oct 2011 14:38:01 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: Richard Cochran <richardcochran@...il.com>
Cc: Johannes Berg <johannes@...solutions.net>,
David Miller <davem@...emloft.net>, netdev@...r.kernel.org
Subject: Re: [PATCH 0/3] net: time stamping fixes
Le mercredi 19 octobre 2011 à 13:50 +0200, Richard Cochran a écrit :
> On Wed, Oct 19, 2011 at 07:15:36AM +0200, Johannes Berg wrote:
> > The only thing I'm not completely sure about is whether or not it is
> > permissible to sock_hold() at that point. I'm probably just missing
> > something, but: if sk_free() was called before hard_start_xmit() which
> > will call skb_clone_tx_timestamp(), can we really call sock_hold()?
> >
This is not possible, or something is really broken. We specifically
dont skb_orphan(skb) if we know tx timestamping is enabled for this skb.
/*
* Try to orphan skb early, right before transmission by the device.
* We cannot orphan skb if tx timestamp is requested or the sk-reference
* is needed on driver level for other reasons, e.g. see net/can/raw.c
*/
static inline void skb_orphan_try(struct sk_buff *skb)
{
struct sock *sk = skb->sk;
if (sk && !skb_shinfo(skb)->tx_flags) {
/* skb_tx_hash() wont be able to get sk.
* We copy sk_hash into skb->rxhash
*/
if (!skb->rxhash)
skb->rxhash = sk->sk_hash;
skb_orphan(skb);
}
}
> > The reason I ask is that sock_wfree() doesn't check sk_refcnt, so if it
> > is possible for sk_free() to have been called before hard_start_xmit(),
> > maybe because the packet was stuck on the qdisc for a while, the socket
> > won't be released (sk_free checks sk_wmem_alloc) but the sk_wfree() when
> > the original skb is freed will actually free the socket, invalidating
> > the clone's sk pointer *even though* we called sock_hold() right after
> > making the clone.
> >
> > So what guarantees that sk_refcnt is still non-zero when we make the
> > clone?
>
> In the non-qdisc path, the kernel is in a send() call, so the initial
> reference taken in socket() is held.
>
> I really don't know the qdisc code, whether it is somehow holding the
> skb->sk indirectly or not.
>
> Eric? David?
I dont really understand what's the concern, since sk_free() doesnt care
at all about sk_refcnt, but sk_wmem_alloc.
void sk_free(struct sock *sk)
{
/*
* We subtract one from sk_wmem_alloc and can know if
* some packets are still in some tx queue.
* If not null, sock_wfree() will call __sk_free(sk) later
*/
if (atomic_dec_and_test(&sk->sk_wmem_alloc))
__sk_free(sk);
}
EXPORT_SYMBOL(sk_free);
If one skb is in flight, and still linked to a socket, then this socket
cannot disappear, because this skb->truesize was accounted into
sk->sk_wmem_alloc
Of course, this point is valid as long as skb had not been orphaned.
sk_refcnt can be 0, if user closed the socket, but socket wont disappear
as long as sk_wmem_alloc is not 0.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Powered by blists - more mailing lists