lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c06f6f59-6c35-4944-8f7a-7f6f0e076649@huawei.com>
Date: Thu, 15 Jun 2023 14:39:11 +0800
From: Yunsheng Lin <linyunsheng@...wei.com>
To: Alexander Duyck <alexander.duyck@...il.com>
CC: <davem@...emloft.net>, <kuba@...nel.org>, <pabeni@...hat.com>,
	<netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>, Lorenzo Bianconi
	<lorenzo@...nel.org>, Jesper Dangaard Brouer <hawk@...nel.org>, Ilias
 Apalodimas <ilias.apalodimas@...aro.org>, Eric Dumazet <edumazet@...gle.com>
Subject: Re: [PATCH net-next v3 3/4] page_pool: introduce page_pool_alloc()
 API

On 2023/6/14 22:18, Alexander Duyck wrote:
> On Tue, Jun 13, 2023 at 8:51 PM Yunsheng Lin <linyunsheng@...wei.com> wrote:
>>
>> On 2023/6/13 22:36, Alexander Duyck wrote:
>>> On Fri, Jun 9, 2023 at 6:20 AM Yunsheng Lin <linyunsheng@...wei.com> wrote:
>>
>> ...
>>
>>>>
>>>> +static inline struct page *page_pool_alloc(struct page_pool *pool,
>>>> +                                          unsigned int *offset,
>>>> +                                          unsigned int *size, gfp_t gfp)
>>>> +{
>>>> +       unsigned int max_size = PAGE_SIZE << pool->p.order;
>>>> +       struct page *page;
>>>> +
>>>> +       *size = ALIGN(*size, dma_get_cache_alignment());
>>>> +
>>>> +       if (WARN_ON(*size > max_size))
>>>> +               return NULL;
>>>> +
>>>> +       if ((*size << 1) > max_size || PAGE_POOL_DMA_USE_PP_FRAG_COUNT) {
>>>> +               *size = max_size;
>>>> +               *offset = 0;
>>>> +               return page_pool_alloc_pages(pool, gfp);
>>>> +       }
>>>> +
>>>> +       page = __page_pool_alloc_frag(pool, offset, *size, gfp);
>>>> +       if (unlikely(!page))
>>>> +               return NULL;
>>>> +
>>>> +       /* There is very likely not enough space for another frag, so append the
>>>> +        * remaining size to the current frag to avoid truesize underestimate
>>>> +        * problem.
>>>> +        */
>>>> +       if (pool->frag_offset + *size > max_size) {
>>>> +               *size = max_size - *offset;
>>>> +               pool->frag_offset = max_size;
>>>> +       }
>>>> +
>>>
>>> Rather than preventing a truesize underestimation this will cause one.
>>> You are adding memory to the size of the page reserved and not
>>> accounting for it anywhere as this isn't reported up to the network
>>> stack. I would suggest dropping this from your patch.
>>
>> I was thinking about the driver author reporting it up to the network
>> stack using the new API as something like below:
>>
>> int truesize = size;
>> struct page *page;
>> int offset;
>>
>> page = page_pool_dev_alloc(pool, &offset, &truesize);
>> if (unlikely(!page))
>>         goto err;
>>
>> skb_add_rx_frag(skb, skb_shinfo(skb)->nr_frags, page,
>>                 offset, size, truesize);
>>
>> and similiar handling for *_build_skb() case too.
>>
>> Does it make senses for that? or did I miss something obvious here?
> 
> It is more the fact that you are creating a solution in search of a
> problem. As I said before most of the drivers will deal with these
> sorts of issues by just doing the fragmentation themselves or
> allocating fixed size frags and knowing how it will be divided into
> the page.

It seems that there are already some drivers which using the page pool
API with different frag size for almost every calling, the virtio_net
and veth are the obvious ones.

When reviewing the page frag support for virtio_net, I found that it
was manipulating the page_pool->frag_offset directly to do something
as this patch does, see:

https://lore.kernel.org/lkml/CAKhg4tL9PrUebqQHL+s7A6-xqNnju3erNQejMr7UFjwTaOduZw@mail.gmail.com/

I am not sure we are both agreed that drivers should not be manipulating
the page_pool->frag_offset directly unless it is really necessary?

For the specific case for virtio_net, it seems we have the below options:
1. both the driver and page pool do not handle it.
2. the driver handles it by manipulating the page_pool->frag_offset
   directly.
3. the page pool handles it as this patch does.

Is there any other options I missed for the specific case for virtio_net?
What is your perfer option? And why?

> 
> If you are going to go down this path then you should have a consumer
> for the API and fully implement it instead of taking half measures and
> making truesize underreporting worse by evicting pages earlier.

I am not sure I understand what do you mean by "a consumer for the API",
Do you mean adding a new API something like page_pool_free() to do
something ligthweight, such as decrementing the frag user and adjusting
the frag_offset, which is corresponding to the page_pool_alloc() API
introduced in this patch?
If yes, I was considering about that before, but I am not sure it worth
the effort, as for most usecase, it is a very rare case for error handling
as my understanding.

I just note that we already have page_pool_free() used by the page pool
destroy process,we might need to do something to avoid the confusion
between page_pool_alloc() and page_pool_free() :(

> 
> .
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ