netdev - Re: [PATCH 05/10] net: move destructor_arg to the front of sk

Message-ID: <4F85B1EA.9000600@intel.com>
Date:	Wed, 11 Apr 2012 09:31:38 -0700
From:	Alexander Duyck <alexander.h.duyck@...el.com>
To:	Ian Campbell <Ian.Campbell@...rix.com>
CC:	Eric Dumazet <eric.dumazet@...il.com>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	David Miller <davem@...emloft.net>,
	"Michael S. Tsirkin" <mst@...hat.com>,
	"Wei Liu (Intern)" <wei.liu2@...rix.com>,
	"xen-devel@...ts.xen.org" <xen-devel@...ts.xen.org>
Subject: Re: [PATCH 05/10] net: move destructor_arg to the front of sk_buff.

On 04/11/2012 01:00 AM, Ian Campbell wrote:
> On Tue, 2012-04-10 at 20:15 +0100, Alexander Duyck wrote:
>> On 04/10/2012 11:41 AM, Eric Dumazet wrote:
>>> On Tue, 2012-04-10 at 11:33 -0700, Alexander Duyck wrote:
>>>
>>>> Have you checked this for 32 bit as well as 64?  Based on my math your
>>>> next patch will still mess up the memset on 32 bit with the structure
>>>> being split somewhere just in front of hwtstamps.
>>>>
>>>> Why not just take frags and move it to the start of the structure?  It
>>>> is already an unknown value because it can be either 16 or 17 depending
>>>> on the value of PAGE_SIZE, and since you are making changes to frags the
>>>> changes wouldn't impact the alignment of the other values later on since
>>>> you are aligning the end of the structure.  That way you would be
>>>> guaranteed that all of the fields that will be memset would be in the
>>>> last 64 bytes.
>>>>
>>> Now when a fragmented packet is copied in pskb_expand_head(), you access
>>> two separate zones of memory to copy the shinfo. But it's supposed to be
>>> a slow path.
>>>
>>> Problem with this is that the offsets of often used fields will be big
>>> (instead of being < 127) and code will be bigger on x86.
>> Actually now that I think about it my concerns go much further than the
>> memset.  I'm convinced that this is going to cause a pretty significant
>> performance regression on multiple drivers, especially on non-x86_64
>> architectures.  What we have right now on most platforms is a
>> skb_shared_info structure in which everything up to and including frag 0
>> is all in one cache line.  This gives us pretty good performance for igb
>> and ixgbe, since our common case when jumbo frames are not enabled is
>> to split the head and place the data in a page.
> With all the changes in this series it is still possible to fit a
> maximum standard MTU frame and the shinfo on the same 4K page while also
> having the skb_shared_info up to and including frags[0] aligned to the
> same 64 byte cache line.
>
> The only exception is destructor_arg on 64 bit which is on the preceding
> cache line but that is not a field used in any hot path.
The problem I have is that this is only true on x86_64.  The work needed
to guarantee it on any other architecture hasn't been done.

Instead of just setting things up and hoping nr_frags comes out cache
aligned, why not take steps to guarantee it?  You could place and size
the structure based on:

    SKB_DATA_ALIGN(sizeof(struct skb_shared_info) -
                   offsetof(struct skb_shared_info, nr_frags)) +
        offsetof(struct skb_shared_info, nr_frags)

That way you would have your alignment still guaranteed based off of the
end of the structure, but anything placed before nr_frags would be
placed on the end of the previous cache line.
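
A minimal userspace sketch of that sizing rule follows.  It is not the
kernel code: struct shinfo_sketch, CACHE_BYTES, DATA_ALIGN,
shinfo_reserved() and shinfo_start() are hypothetical stand-ins, the
cache line is assumed to be 64 bytes, and the buffer end is assumed to
be cache aligned.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache line size; the kernel would use SMP_CACHE_BYTES. */
#define CACHE_BYTES ((size_t)64)
/* Round x up to a multiple of the assumed cache line size. */
#define DATA_ALIGN(x) (((x) + (CACHE_BYTES - 1)) & ~(CACHE_BYTES - 1))

/* Hypothetical stand-in for skb_shared_info: one cold field in front,
 * then the hot fields starting at nr_frags. */
struct shinfo_sketch {
	void *destructor_arg;		/* cold, sits before nr_frags */
	unsigned char nr_frags;		/* first hot field */
	unsigned short gso_size;
	struct {
		void *page;
		unsigned int page_offset;
		unsigned int size;
	} frags[17];
};

#define HOT_OFFSET offsetof(struct shinfo_sketch, nr_frags)

/* Bytes to reserve at the end of the buffer: only the part from
 * nr_frags onward is rounded up to whole cache lines. */
static size_t shinfo_reserved(void)
{
	return DATA_ALIGN(sizeof(struct shinfo_sketch) - HOT_OFFSET) +
	       HOT_OFFSET;
}

/* Offset of the structure inside a buffer that ends at buf_end. */
static size_t shinfo_start(size_t buf_end)
{
	assert(buf_end % CACHE_BYTES == 0);	/* assumption stated above */
	return buf_end - shinfo_reserved();
}
```

With these definitions shinfo_start(buf_end) + HOT_OFFSET is always a
multiple of CACHE_BYTES, whatever the pointer size, so nr_frags starts a
cache line and destructor_arg lands at the tail of the previous one.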

>> However the change being recommended here only resolves the issue for one
>> specific architecture, and that is what I don't agree with.  What we
>> need is a solution that also works for 64K pages or 32 bit pointers and
>> I am fairly certain this current solution does not.
> I think it does work for 32 bit pointers. What issue do you see with
> 64K pages?
>
> Ian.
With 64K pages the MAX_SKB_FRAGS value drops from 17 to 16.  That will
undoubtedly mess up the alignment.
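
To make the 17 versus 16 arithmetic concrete, here is a small sketch.
The formula in max_skb_frags() follows my recollection of how
MAX_SKB_FRAGS was defined in kernels of this period, so treat it as an
assumption and check include/linux/skbuff.h; struct frag_sketch and
frags_bytes() are hypothetical and only illustrate the size change.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed rule: enough frags to cover 64K of data plus one, but never
 * fewer than 16.  Written as a function so the page size is a
 * parameter instead of a compile-time constant. */
static unsigned long max_skb_frags(unsigned long page_size)
{
	unsigned long n;

	assert(page_size != 0);
	n = 65536 / page_size + 1;
	return n < 16 ? 16 : n;
}

/* Hypothetical frag descriptor; the real skb_frag_t layout may differ. */
struct frag_sketch {
	void *page;
	unsigned int page_offset;
	unsigned int size;
};

/* Bytes taken by the frags array for a given page size. */
static size_t frags_bytes(unsigned long page_size)
{
	return max_skb_frags(page_size) * sizeof(struct frag_sketch);
}
```

Under these assumptions max_skb_frags(4096) is 17 and
max_skb_frags(65536) is 16, so the array shrinks by one descriptor.
Unless a descriptor happens to be a whole cache line, every field
positioned relative to the other end of the array shifts by a partial
line.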

Thanks,

Alex