Open Source and information security mailing list archives
Date: Wed, 09 Nov 2011 18:49:44 +0100
From: Eric Dumazet <eric.dumazet@...il.com>
To: Ian Campbell <Ian.Campbell@...rix.com>
Cc: David Miller <davem@...emloft.net>,
	Jesse Brandeburg <jesse.brandeburg@...el.com>,
	netdev@...r.kernel.org
Subject: Re: [PATCH 0/4] skb paged fragment destructors

On Wednesday, 09 November 2011 at 15:01 +0000, Ian Campbell wrote:
> The following series makes use of the skb fragment API (which is in 3.2)
> to add a per-paged-fragment destructor callback. This can be used by
> creators of skbs who are interested in the lifecycle of the pages
> included in that skb after they have handed it off to the network stack.
> I think these have all been posted before, but have been backed up
> behind the skb fragment API.
>
> The mail at [0] contains some more background and rationale, but
> basically the completed series will allow entities which inject pages
> into the networking stack to receive a notification when the stack has
> really finished with those pages (i.e. including retransmissions,
> clones, pull-ups etc.) and not just when the original skb is finished
> with. This is beneficial to many subsystems which wish to inject pages
> into the network stack without giving up full ownership of those
> pages' lifecycle. It implements something broadly along the lines of
> what was described in [1].
>
> I have also included a patch to the RPC subsystem which uses this API to
> fix the bug which I describe at [2].
>
> I presented this work at LPC in September and there was a
> question/concern raised (by Jesse Brandeburg IIRC) regarding the
> overhead of adding this extra field per fragment. If I understand
> correctly, there have been performance regressions in the past with
> allocations outgrowing one allocation size bucket and therefore using
> the next.
> The change in data structure size resulting from this series is:
>
>                                          BEFORE  AFTER
> AMD64: sizeof(struct skb_frag_struct)  = 16      24
>        sizeof(struct skb_shared_info)  = 344     488

That's a real problem, because 488 is so big (it's even rounded up to
512 bytes).

Now, on x86, a half page (2048 bytes) won't be big enough to contain a
typical frame (MTU=1500):

	NET_SKB_PAD (64) + 1500 + 14 + 512 > 2048

Even if we don't round 488 up to 512 (i.e. don't cache-align
skb_shared_info), we still have a problem:

	NET_SKB_PAD (64) + 1500 + 14 + 488 > 2048

Why not use a low order bit to mark 'page' as being a pointer to:

struct skb_frag_page_desc {
	struct page *p;
	atomic_t ref;
	int (*destroy)(void *data);
	/* void *data; */ /* no need, see container_of() */
};

struct skb_frag_struct {
	struct {
		union {
			struct page *p;                     /* low order bit not set */
			struct skb_frag_page_desc *skbpage; /* low order bit set */
		};
	} page;
	...