Message-ID: <d1c2719f0805121522g4767f585h4b33790318f44264@mail.gmail.com>
Date: Mon, 12 May 2008 15:22:55 -0700
From: "Jerry Chu" <hkchu@...gle.com>
To: "David Miller" <davem@...emloft.net>
Cc: netdev@...r.kernel.org
Subject: Re: Socket buffer sizes with autotuning

I did a quick prototype based on your idea of adding an "in_flight"
field to skb_shared_info to track how many clones are in flight in the
host. I tested it quickly and it doesn't work. After some thought it was
obvious why: what the TCP stack needs is a count of how many in-flight
pkts are in the host, but your proposed patch increments "in_flight" once,
on the first __skb_clone() to be sent to the driver, and decrements it
TWICE, once for each of the clones to be freed.
I did a quick hack to make it work for my limited test case, but I haven't
figured out an acceptable (non-hack) solution.
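
The mismatch above can be sketched in plain userspace C. This is only a
toy model, not kernel code: struct shinfo_model, struct skb_model and all
function names are hypothetical stand-ins, and the "balanced" variant is
just one conceivable way to pair the operations, not the hack mentioned
above.

```c
#include <assert.h>

/* Hypothetical stand-in for the counter proposed for skb_shared_info. */
struct shinfo_model {
    int in_flight;              /* pkts believed to still be in the host */
};

/* Hypothetical stand-in for an skb that shares one shinfo_model. */
struct skb_model {
    struct shinfo_model *sh;
    int sent_to_driver;         /* 1 only for the clone given to the driver */
};

/* Scheme as described: +1 on the first clone, -1 on every free. */
int buggy_one_packet(void)
{
    struct shinfo_model sh = { 0 };

    sh.in_flight += 1;          /* first __skb_clone() sent to the driver */
    sh.in_flight -= 1;          /* driver frees its clone */
    sh.in_flight -= 1;          /* TCP frees the original once acked */
    return sh.in_flight;        /* ends below zero */
}

/* Assumed variant: only the copy handed to the driver is accounted for. */
void model_free(struct skb_model *skb)
{
    if (skb->sent_to_driver)
        skb->sh->in_flight -= 1;
}

int balanced_one_packet(void)
{
    struct shinfo_model sh = { 0 };
    struct skb_model orig = { &sh, 0 };
    struct skb_model clone = { &sh, 1 };

    sh.in_flight += 1;          /* clone handed to the driver */
    model_free(&clone);         /* driver completion */
    model_free(&orig);          /* TCP frees after the ACK */
    return sh.in_flight;
}
```

In this model one packet leaves the counter at -1 under the
increment-once/decrement-twice scheme, and at 0 when only the free of the
driver's copy is counted.
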
Continuing testing, I discovered that the problem I described below, where
"in_flight" may point to a tp that has already been freed, cannot be
addressed by zapping skb_shinfo(skb)->in_flight in sock_wfree(). The
reason is that pkts may be acked and freed by TCP before the driver frees
its clone copy (e.g., due to driver lazy reclaim). When that happens
the "host_inflight" accounting gets messed up.
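
A toy userspace model of that ordering problem, under my simplified
reading of it: zapping the pointer when TCP frees the pkt means a later
driver free has nothing left to decrement. struct tp_model, struct
shared_model and the function names are hypothetical, and the real
sock_wfree() path is more involved than this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-connection state holding the counter. */
struct tp_model {
    int host_inflight;
};

/* Hypothetical shared info: back-pointer used to find the counter. */
struct shared_model {
    struct tp_model *in_flight;
};

void model_send(struct tp_model *tp, struct shared_model *sh)
{
    sh->in_flight = tp;         /* remember whom to credit on free */
    tp->host_inflight++;        /* clone handed to the driver */
}

/* What the zap in sock_wfree() is assumed to do when TCP frees the pkt. */
void model_zap(struct shared_model *sh)
{
    sh->in_flight = NULL;
}

/* Driver frees its clone; it can only decrement through the pointer. */
void model_driver_free(struct shared_model *sh)
{
    if (sh->in_flight)
        sh->in_flight->host_inflight--;
}

/* ACK arrives and TCP frees before the driver reclaims its clone. */
int count_after_lazy_reclaim(void)
{
    struct tp_model tp = { 0 };
    struct shared_model sh = { NULL };

    model_send(&tp, &sh);
    model_zap(&sh);             /* TCP side frees first */
    model_driver_free(&sh);     /* decrement is silently lost */
    return tp.host_inflight;
}

/* Driver reclaims promptly, before TCP frees the pkt. */
int count_after_prompt_reclaim(void)
{
    struct tp_model tp = { 0 };
    struct shared_model sh = { NULL };

    model_send(&tp, &sh);
    model_driver_free(&sh);
    model_zap(&sh);
    return tp.host_inflight;
}
```

With prompt reclaim the count returns to 0; with lazy reclaim it stays
stuck at 1 for every pkt freed in that order, so the total drifts upward.
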
Jerry
On Wed, May 7, 2008 at 8:33 PM, Jerry Chu <hkchu@...gle.com> wrote:
> There seems to be quite a bit of complexity plus one additional pointer
> field per skb_shared_info to make skb better track when a pkt leaves
> the host. Now I wonder if it's really a better solution than my original,
> simply checking dataref==1 approach which, although not bullet proof,
> may be "good enough" for all practical purposes?
>
> Jerry
>
>
>
> On Wed, May 7, 2008 at 6:43 PM, David Miller <davem@...emloft.net> wrote:
> > From: "Jerry Chu" <hkchu@...gle.com>
> > Date: Wed, 7 May 2008 18:37:01 -0700
> >
> >
> > > Ok, will give it a try. First I'll fix your patch to
> > > atomic_add()/atomic_sub() by skb_shinfo(skb)->gso_segs rather than
> > > always 1, in order for GSO/TSO to work.
> >
> > That might not work. gso_segs can change over time as retransmit
> > packets get split up due to SACKs etc. It needs to be audited,
> > at the very least.
> >
> >
> > > One problem came to mind - it seems possible for __kfree_skb() to
> > > access skb_shinfo(skb)->in_flight whose tp has been freed, since only
> > > the original skbs on TCP's rexmit list have the owner set and the
> > > socket held. One solution is for TCP to zap the
> > > skb_shinfo(skb)->in_flight field when it's ready to free the skb.
> > > I can hack sock_wfree() to do this, but I don't know how to do it right.
> >
> > There will be references to the socket, so this should be ok.
> >
> > If it isn't we can adjust the count and zap the pointer in
> > skb_orphan().
> >
>