lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4a682251-3b40-b16a-8999-69acb36634f3@huawei.com>
Date:   Fri, 17 Sep 2021 17:17:08 +0800
From:   Yunsheng Lin <linyunsheng@...wei.com>
To:     Ilias Apalodimas <ilias.apalodimas@...aro.org>
CC:     Jesper Dangaard Brouer <jbrouer@...hat.com>, <brouer@...hat.com>,
        Alexander Duyck <alexander.duyck@...il.com>,
        <davem@...emloft.net>, <kuba@...nel.org>, <netdev@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, <linuxarm@...neuler.org>,
        <hawk@...nel.org>, <jonathan.lemon@...il.com>, <alobakin@...me>,
        <willemb@...gle.com>, <cong.wang@...edance.com>,
        <pabeni@...hat.com>, <haokexin@...il.com>, <nogikh@...gle.com>,
        <elver@...gle.com>, <memxor@...il.com>, <edumazet@...gle.com>,
        <dsahern@...il.com>
Subject: Re: [PATCH net-next v2 3/3] skbuff: keep track of pp page when
 __skb_frag_ref() is called

On 2021/9/17 14:38, Ilias Apalodimas wrote:
> Hi Yunsheng,
> 
> [...]
>>>>>>>> I am not sure "pp_recycle_bit was introduced to make the checking faster" is a
>>>>>>>> valid. The size of "struct page" is only about 9 words(36/72 bytes), which is
>>>>>>>> mostly to be in the same cache line, and both standard path and recycle path have
>>>>>>>> been touching the "struct page", so it seems the overhead for checking signature
>>>>>>>> seems minimal.
>>>>>>>>
>>>>>>>> I agree that we need to be cautious and measure potential regression on the
>>>>>>>> standard path.
>>>>>>>
>>>>>>> well pp_recycle is on the same cache line boundary with the head_frag we
>>>>>>> need to decide on recycling. After that we start checking page signatures
>>>>>>> etc,  which means the default release path remains mostly unaffected.  
>>>>>>>
>>>>>>> I guess what you are saying here, is that 'struct page' is going to be
>>>>>>> accessed eventually by the default network path,  so there won't be any 
>>>>>>> noticeable performance hit?  What about the other usecases we have
>>>>>>
>>>>>> Yes.
>>>>>
>>>>> In that case you'd need to call virt_to_head_page() early though, get it
>>>>> and then compare the signature.   I guess that's avoidable by using 
>>>>> frag->bv_page for the fragments?
>>>>
>>>> If a page of a skb frag is from page pool, It seems frag->bv_page is
>>>> always point to head_page of a compound page, so the calling of
>>>> virt_to_head_page() does not seems necessary.
>>>>
>>>
>>> I was mostly referring to the skb head here and how would you trigger the
>>> recycling path. 
>>>
>>> I think we are talking about different things here.  
>>> One idea is to use the last bit of frag->bv_page to identify fragments
>>> allocated from page_pool, which is done today with the signature.
>>>
>>> The signature however exists in the head page so my question was, can we rid
>>> of that without having a performance penalty?
>>
>> As both skb frag and head page is eventually operated on the head page
>> of a compound page(if it is a compound page) for normal case too, maybe
>> we can refactor the code to get the head page of a compound page before
>> the signature checking without doing a second virt_to_head_page() or
>> compound_head() call?
> 
> Yea that's doable, but my concern is different here.  If we do that the
> standard network stack, even for drivers that don't use page_pool,  will
> have to do a virt_to_head_page() -> check signature, to decide if it has to
> try recycling the packet.  That's the performance part I am worried about,
> since it happens for every packet. 

Yes, there is theoretically performance penalty for virt_to_head_page() or
compound_head(), will do more test if we decide to go with the signature
checking.

> 
>>
>>>
>>> IOW in skb_free_head() an we replace:
>>>
>>> if (skb_pp_recycle(skb, head)) 
>>> with
>>> if (page->pp_magic & ~0x3UL) == PP_SIGNATURE)
>>> and get rid of the 'bool recycle' argument in __skb_frag_unref()?
>>
>> For the frag page of a skb, it seems ok to get rid of the 'bool recycle'
>> argument in __skb_frag_unref(), as __skb_frag_unref() and __skb_frag_ref()
>> is symmetrically called to put/get a page.
>>
>> For the head page of a skb, we might need to make sure the head page
>> passed to __build_skb_around() meet below condition:
>> do pp_frag_count incrementing instead of _refcount incrementing when
>> the head page is not newly allocated and it is from page pool.
>> It seems hard to audit that?
> 
> Yea that seems a bit weird at least to me and I am not sure, it's the only
> place we'll have to go and do that.

Yes, That is why I avoid changing the behavior of a head page for a skb.
In other word, maybe we should not track if head page for a skb is pp page
or not when the page'_refcount is incremented during network stack journey,
just treat it as normal page?

> 
>>
>>
>>>
>>>> bit 0 of frag->bv_page is different way of indicatior for a pp page,
>>>> it is better we do not confuse with the page signature way. Using
>>>> a bit 0 may give us a free word in 'struct page' if we manage to
>>>> use skb->pp_recycle to indicate a head page of the skb uniquely, meaning
>>>> page->pp_magic can be used for future feature.
>>>>
>>>>
>>>>>
>>>>>>
>>>>>>> for pp_recycle right now?  __skb_frag_unref() in skb_shift() or
>>>>>>> skb_try_coalesce() (the latter can probably be removed tbh).
>>>>>>
>>>>>> If we decide to go with accurate indicator of a pp page, we just need
>>>>>> to make sure network stack use __skb_frag_unref() and __skb_frag_ref()
>>>>>> to put and get a page frag, the indicator checking need only done in
>>>>>> __skb_frag_unref() and __skb_frag_ref(), so the skb_shift() and
>>>>>> skb_try_coalesce() should be fine too.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Another way is to use the bit 0 of frag->bv_page ptr to indicate if a frag
>>>>>>>> page is from page pool.
>>>>>>>
>>>>>>> Instead of the 'struct page' signature?  And the pp_recycle bit will
>>>>>>> continue to exist?  
>>>>>>
>>>>>> pp_recycle bit might only exist or is only used for the head page for the skb.
>>>>>> The bit 0 of frag->bv_page ptr can be used to indicate a frag page uniquely.
>>>>>> Doing a memcpying of shinfo or "*fragto = *fragfrom" automatically pass the
>>>>>> indicator to the new shinfo before doing a __skb_frag_ref(), and __skb_frag_ref()
>>>>>> will increment the _refcount or pp_frag_count according to the bit 0 of
>>>>>> frag->bv_page.
>>>>>>
>>>>>> By the way, I also prototype the above idea, and it seems to work well too.
>>>>>>
>>>>>
>>>>> As long as no one else touches this, it's just another way of identifying a
>>>>> page_pool allocated page.  But are we gaining by that?  Not using
>>>>> virt_to_head_page() as stated above? But in that case you still need to
>>>>> keep pp_recycle around. 
>>>>
>>>> No, we do not need the pp_recycle, as long as the we make sure __skb_frag_ref()
>>>> is called after memcpying the shinfo or doing "*fragto = *fragfrom".
>>>
>>> But we'll have to keep it for the skb head in this case.
>>
>> As above, I am not really look into skb head case:)
> 
> Let me take a step back here, because I think we drifted a bit. 
> The page signature was introduced in order to be able to identify skb
> fragments. The problem was that you couldn't rely on the pp_recycle bit of
> the skb head,  since fragments could come from anywhere.  So you use the
> skb bit as a hint for skb frags, and you eventually decide using the page
> signature.
> 
> So we got 3 options (Anything I've missed ?)
> - try to remove pp_recycle bit, since the page signature is enough for the
>   skb head and fragments.  That in my opinion is the cleanest option,  as
>   long as we can prove there's no performance hit on the standard network
>   path.
> 
> - Replace the page signature with frag->bv_page bit0.  In that case we
>   still have to keep the pp_recycle bit,  but we do have an 'easier'
>   indication that a skb frag comes from page_pool.  That's still pretty
>   safe, since you now have unique identifiers for the skb and page
>   fragments and you can be sure of their origin (page pool or not).
>   What I am missing here, is what do we get out of this?  I think the
>   advantage is not having to call virt_to_head_page() for frags ?

Not using the signature will free a word space in struct page for future
feature?

> 
> - Keep all of them(?) and use frag->bv_page bit0 similarly to pp_recycle
>   bit?  I don't see much value on this one,  I am just keeping it here for
>   completeness.


For safty and performance reason:
1. maybe we should move the pp_recycle bit from "struct sk_buff" to
   "struct skb_shared_info", and use it to only indicate if the head page of
   a skb is from page pool.

2. The frag->bv_page bit0 is used to indicate if the frag page of a skb is
   from page pool, and modify __skb_frag_unref() and __skb_frag_ref() to keep
   track of it.

3. For safty or debugging reason, keep the page signature for now, and put a
   page signature WARN_ON checking in page pool to catch any misbehaviour?

If there is not bug showing up later, maybe we can free the page signature space
for other usage?

> 
> Thanks
> /Ilias

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ