Message-ID: <1318834974.2500.61.camel@edumazet-laptop>
Date: Mon, 17 Oct 2011 09:02:54 +0200
From: Eric Dumazet <eric.dumazet@...il.com>
To: David Miller <davem@...emloft.net>
Cc: rick.jones2@...com, netdev@...r.kernel.org
Subject: Re: [PATCH net-next] tcp: reduce memory needs of out of order queue

On Sunday, 16 October 2011 at 20:53 -0400, David Miller wrote:

> So perhaps the best solution is to divorce truesize from such driver
> and device details? If there is one calculation, then TCP need only
> be concerned with one case.
>
> Look at how confusing and useless tcp_adv_win_scale ends up being for
> this problem.
>
> Therefore I'll make the mostly-serious proposal that truesize be
> something like "initial_real_total_data + sizeof(metadata)"
>
> So if a device receives a 512 byte packet, it's:
>
>	512 + sizeof(metadata)
>

That would probably OOM in a stress situation, with thousands of sockets.

> It still provides the necessary protection that truesize is meant to
> provide, yet sanitizes all of the receive and send buffer overhead
> handling.
>
> TCP should be absolutely, and completely, impervious to details like
> how buffering needs to be done for some random wireless card. Just
> the mere fact that using a larger buffer in a driver ruins TCP
> performance indicates a serious design failure.
>

I don't think it's a design failure.

It's the same problem when computing the TCP window given the rcvspace
(memory we allow to be consumed for the socket) based on the MSS:

If the sender uses only 1-byte frames, then the receiver hits the memory
limit and performance drops.
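
The 1-byte-frame case can be made concrete with a small user-space model. This is only a sketch under stated assumptions: `ASSUMED_OVERHEAD` is a made-up fixed per-frame cost standing in for sizeof(metadata), and both helper names are hypothetical, not kernel functions. It charges every frame `payload + overhead` and counts how much payload fits under a memory limit.

```c
#include <stddef.h>

/* Assumption: a fixed per-frame overhead in bytes. The real value depends
 * on the kernel version, the architecture and the driver. */
#define ASSUMED_OVERHEAD 512u

/* Memory charged for one frame carrying `payload` bytes, under the
 * simplified model truesize = payload + overhead. */
static size_t frame_charge(size_t payload)
{
	return payload + ASSUMED_OVERHEAD;
}

/* Payload bytes that can be queued before the per-frame charges
 * would exceed `limit` bytes of socket memory. */
static size_t payload_before_limit(size_t limit, size_t payload)
{
	size_t frames = limit / frame_charge(payload);

	return frames * payload;
}
```

With these assumed numbers and an 87380-byte limit, `payload_before_limit(87380, 1460)` gives 64240 bytes of useful data, while `payload_before_limit(87380, 1)` gives only 170: the limit is reached long before a window sized for MSS frames has been filled.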

Right now our tcp-window tuning really assumes too much: a perfect MSS skb
using _exactly_ MSS + sizeof(metadata), while we already know that the real
slab cost is higher:

__roundup_pow_of_two(MSS + sizeof(struct skb_shared_info)) +
SKB_DATA_ALIGN(sizeof(struct sk_buff))

and now with paged frag devices:

PAGE_SIZE + SKB_DATA_ALIGN(sizeof(struct sk_buff))

We assume the sender behaves correctly and drivers don't use 64KB pages to
store a single 72-byte frame.

I would say the first thing the TCP stack must respect is the memory limits
that the admin set for it. That's what skb->truesize is for.

# cat /proc/sys/net/ipv4/tcp_rmem
4096	87380	4127616

In this case, we allow up to 4 Mbytes of receiver memory per session.
Not 20 or 30 Mbytes...

We must translate this to a TCP window, suitable for current hardware.
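
To get a feel for the two cost expressions above, here is a rough user-space model of them. It is a sketch, not kernel code: the structure sizes, cache-line size and page size are all assumed constants (the real ones vary by kernel version and architecture), and `roundup_pow2()`, `data_align()`, `slab_cost()`, `paged_cost()` and `payload_for_budget()` are hypothetical stand-ins written for this illustration.

```c
#include <stddef.h>

/* Assumptions: illustrative sizes only, not taken from any real build. */
#define ASSUMED_CACHE_LINE   64u    /* alignment used by SKB_DATA_ALIGN */
#define ASSUMED_SKB_SIZE     232u   /* stand-in for sizeof(struct sk_buff) */
#define ASSUMED_SHINFO_SIZE  320u   /* stand-in for sizeof(struct skb_shared_info) */
#define ASSUMED_PAGE_SIZE    4096u

/* Smallest power of two >= n (n > 0), like __roundup_pow_of_two(). */
static size_t roundup_pow2(size_t n)
{
	size_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/* Round n up to a multiple of the assumed cache-line size. */
static size_t data_align(size_t n)
{
	return (n + ASSUMED_CACHE_LINE - 1) & ~(size_t)(ASSUMED_CACHE_LINE - 1);
}

/* Slab-backed skb: rounded-up data area plus the aligned sk_buff. */
static size_t slab_cost(size_t mss)
{
	return roundup_pow2(mss + ASSUMED_SHINFO_SIZE) + data_align(ASSUMED_SKB_SIZE);
}

/* Paged-frag device: one full page plus the aligned sk_buff. */
static size_t paged_cost(void)
{
	return ASSUMED_PAGE_SIZE + data_align(ASSUMED_SKB_SIZE);
}

/* Payload bytes that fit in `budget` bytes of memory when every
 * MSS-sized segment is charged `cost` bytes. */
static size_t payload_for_budget(size_t budget, size_t cost, size_t mss)
{
	return (budget / cost) * mss;
}
```

Under these assumed sizes, an MSS of 1460 costs 2304 bytes per slab-backed skb and 4352 bytes per paged skb, so the 4127616-byte tcp_rmem maximum holds roughly 2.6 Mbytes of payload in the first case and roughly 1.4 Mbytes in the second. The window advertised to the sender has to be derived from figures like these rather than from the raw memory limit.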