Message-ID: <d1c2719f0805121522g4767f585h4b33790318f44264@mail.gmail.com>
Date:	Mon, 12 May 2008 15:22:55 -0700
From:	"Jerry Chu" <hkchu@...gle.com>
To:	"David Miller" <davem@...emloft.net>
Cc:	netdev@...r.kernel.org
Subject: Re: Socket buffer sizes with autotuning

I did a quick prototype based on your idea of adding an "in_flight"
field to skb_shared_info to track how many clones are in flight in the
host. A quick test showed that it doesn't work, and after some thought
the reason was obvious: what the TCP stack needs to track is how
many in-flight pkts are in the host, but your proposed patch increments
"in_flight" once, on the 1st __skb_clone() to be sent to the driver, and
decrements "in_flight" TWICE, once for each of the clones to be freed.
I did a quick hack to make it work for my limited test case, but I haven't
figured out an acceptable (non-hack) solution.
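
To make the imbalance concrete, here is a toy userspace model of the
accounting described above. It is only a sketch, not the actual patch:
the struct and function names (conn_model, shared_model, model_*) are
made up for illustration, and plain ints stand in for the kernel's
atomic counters.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-connection (tp) state. */
struct conn_model {
    int host_inflight;              /* pkts believed to still be in the host */
};

/* Hypothetical stand-in for skb_shared_info, shared by all clones. */
struct shared_model {
    struct conn_model *in_flight;   /* back-pointer used for accounting */
    int dataref;                    /* number of skbs sharing the data */
};

/* Models the clone handed to the driver: ONE increment per packet. */
static void model_clone_for_xmit(struct shared_model *sh)
{
    sh->dataref++;
    if (sh->in_flight)
        sh->in_flight->host_inflight++;
}

/* Models freeing any skb that shares the data: one decrement per free. */
static void model_free(struct shared_model *sh)
{
    assert(sh->dataref > 0);
    sh->dataref--;
    if (sh->in_flight)
        sh->in_flight->host_inflight--;
}

/* Walks one packet through transmit, driver free, and TCP free. */
int model_one_packet_lifetime(void)
{
    struct conn_model tp = { 0 };
    struct shared_model sh = { &tp, 1 };  /* original on the rexmit queue */

    model_clone_for_xmit(&sh);  /* +1: clone goes to the driver */
    model_free(&sh);            /* -1: driver frees its clone */
    model_free(&sh);            /* -1: TCP frees the original after the ACK */

    return tp.host_inflight;    /* ends below zero: one inc, two decs */
}
```

In this model model_one_packet_lifetime() returns -1, so the counter
drifts down by one for every packet sent.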

Continuing to test, I discovered that the problem I described below, where
"in_flight" may point to a tp that has already been freed, cannot be
addressed by zapping skb_shinfo(skb)->in_flight in sock_wfree(). The
reason is that pkts may be acked and freed by TCP before the driver frees
its clone (e.g., due to lazy reclaim in the driver...). When that happens
the "host_inflight" accounting gets messed up.
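
One possible reading of that ordering problem, again as a toy userspace
model rather than real kernel code. All names here (conn_sketch,
shared_sketch, sketch_*) are hypothetical, and the model assumes the
counter is decremented only when the driver frees its clone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-connection (tp) state. */
struct conn_sketch {
    int host_inflight;
};

/* Hypothetical stand-in for skb_shared_info. */
struct shared_sketch {
    struct conn_sketch *in_flight;
};

/* Clone handed to the driver: count the pkt as being in the host. */
static void sketch_xmit(struct shared_sketch *sh)
{
    if (sh->in_flight)
        sh->in_flight->host_inflight++;
}

/* TCP frees the original after the ACK and zaps the back-pointer. */
static void sketch_tcp_free(struct shared_sketch *sh)
{
    sh->in_flight = NULL;
}

/* Driver reclaims its clone: it can only decrement through a live pointer. */
static void sketch_driver_free(struct shared_sketch *sh)
{
    if (sh->in_flight)
        sh->in_flight->host_inflight--;
}

/* Driver frees first: the counter returns to zero. */
int sketch_driver_first(void)
{
    struct conn_sketch tp = { 0 };
    struct shared_sketch sh = { &tp };

    sketch_xmit(&sh);
    assert(tp.host_inflight == 1);
    sketch_driver_free(&sh);
    sketch_tcp_free(&sh);
    return tp.host_inflight;
}

/* TCP frees first (lazy driver reclaim): the decrement is lost. */
int sketch_tcp_first(void)
{
    struct conn_sketch tp = { 0 };
    struct shared_sketch sh = { &tp };

    sketch_xmit(&sh);
    assert(tp.host_inflight == 1);
    sketch_tcp_free(&sh);       /* pointer zapped before the driver is done */
    sketch_driver_free(&sh);    /* sees NULL, cannot adjust the count */
    return tp.host_inflight;
}
```

Under these assumptions sketch_driver_first() returns 0 while
sketch_tcp_first() returns 1, i.e. the count stays inflated whenever the
driver reclaims late.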

Jerry

On Wed, May 7, 2008 at 8:33 PM, Jerry Chu <hkchu@...gle.com> wrote:
> There seems to be quite a bit of complexity plus one additional pointer
>  field per skb_shared_info to make skb better track when a pkt leaves
>  the host. Now I wonder if it's really a better solution than my original,
>  simply checking dataref==1 approach which, although not bullet proof,
>  may be "good enough" for all practical purposes?
>
>  Jerry
>
>
>
>  On Wed, May 7, 2008 at 6:43 PM, David Miller <davem@...emloft.net> wrote:
>  > From: "Jerry Chu" <hkchu@...gle.com>
>  >  Date: Wed, 7 May 2008 18:37:01 -0700
>  >
>  >
>  >  > Ok, will give it a try. First I'll fix your patch to do
>  >  > atomic_add()/atomic_sub() by skb_shinfo(skb)->gso_segs rather than
>  >  > always 1, in order for GSO/TSO to work.
>  >
>  >  That might not work.  gso_segs can change over time as retransmit
>  >  packets get split up due to SACKs etc.  it needs to be audited,
>  >  at the very least.
>  >
>  >
>  >  > One problem came to mind - it seems possible for __kfree_skb() to
>  >  > access skb_shinfo(skb)->in_flight whose tp has been freed up, since only the
>  >  > original skb's on TCP's rexmit list have the owner set and socket held.
>  >  > One solution is for TCP to zap the skb_shinfo(skb)->in_flight field when
>  >  > it's ready to free up the skb.
>  >  > I can hack sock_wfree() to do this, but I don't know how to do it right.
>  >
>  >  There will be references to the socket, so this should be ok.
>  >
>  >  If it isn't we can adjust the count and zap the pointer in
>  >  skb_orphan().
>  >
>