netdev - Re: [PATCH][v2] page_pool: handle page recycle for NUMA_NO

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <910156da-0b43-0a86-67a0-f4e7e6547373@huawei.com>
Date:   Wed, 11 Dec 2019 11:01:15 +0800
From:   Yunsheng Lin <linyunsheng@...wei.com>
To:     Saeed Mahameed <saeedm@...lanox.com>,
        "brouer@...hat.com" <brouer@...hat.com>
CC:     "jonathan.lemon@...il.com" <jonathan.lemon@...il.com>,
        "ilias.apalodimas@...aro.org" <ilias.apalodimas@...aro.org>,
        Li Rongqing <lirongqing@...du.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: [PATCH][v2] page_pool: handle page recycle for NUMA_NO_NODE
 condition

On 2019/12/11 3:45, Saeed Mahameed wrote:
>>> maybe assume that __page_pool_recycle_direct() is always called
>>> from
>>> the right node and change the current bogus check:
>>>
>>> from:
>>> page_to_nid(page) == pool->p.nid 
>>>
>>> to:
>>> page_to_nid(page) == numa_mem_id()
>>>
>>> This will allow recycling only if handling node is the same as
>>> where
>>> the page was allocated, regardless of pool->p.nid.
>>>
>>> so semantics are:
>>>
>>> 1) allocate from: pool->p.nid, as chosen by user.
>>> 2) recycle when: page_to_nid(page) == numa_mem_id().
>>> 3) pool user must guarantee that the handler will run on the right
>>> node. which should always be the case. otherwise recycling will be
>>> skipped (no cross numa recycling).
>>>
>>>
>>> a) if the pool migrates, we will stop recycling until the pool
>>> moves
>>> back to original node, or user calls pool_update_nid() as we do in
>>> mlx5.
>>> b) if pool is NUMA_NO_NODE, then allocation and handling will be
>>> done
>>> on numa_mem_id(), which means the above check will work.
>>
>> Only checking page_to_nid(page) == numa_mem_id() may not work for the
>> below
>> case in mvneta.c:
>>
>> static int mvneta_create_page_pool(struct mvneta_port *pp,
>> 				   struct mvneta_rx_queue *rxq, int
>> size)
>> {
>> 	struct bpf_prog *xdp_prog = READ_ONCE(pp->xdp_prog);
>> 	struct page_pool_params pp_params = {
>> 		.order = 0,
>> 		.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
>> 		.pool_size = size,
>> 		.nid = cpu_to_node(0),
>> 		.dev = pp->dev->dev.parent,
>> 		.dma_dir = xdp_prog ? DMA_BIDIRECTIONAL :
>> DMA_FROM_DEVICE,
>> 		.offset = pp->rx_offset_correction,
>> 		.max_len = MVNETA_MAX_RX_BUF_SIZE,
>> 	};
>>
>> the pool->p.nid is not NUMA_NO_NODE, then the node of page allocated
>> for rx
>> may not be numa_mem_id() when running in the NAPI polling, because
>> pool->p.nid
>> is not the same as the node of cpu running in the NAPI polling.
>>
>> Does the page pool support recycling for above case?
>>
> 
> I don't think you want to allow cross numa recycling.

Cross numa recycling is not what I want.

> 
>> Or we "fix' the above case by setting pool->p.nid to
>> NUMA_NO_NODE/dev_to_node(),
>> or by calling pool_update_nid() in NAPI polling as mlx5 does?
>>
> 
> Yes just update_nid when needed, and make sure the NAPI polling runs on
> a consistent core and eventually alloc/recycling will happen on the
> same core.

To me, passing NUMA_NO_NODE/dev_to_node() seems to always work.
Calling pool_update_nid() in NAPI polling is another way of passing
NUMA_NO_NODE to page_pool_init().

And it seems it is a copy & paste problem for mvneta and netsec
driver that uses cpu_to_node(0) as pool->p.nid but does not call
page_pool_nid_changed() in the NAPI polling as mlx5 does.

So I suggest to remove page_pool_nid_changed() and always use
NUMA_NO_NODE/dev_to_node() as pool->p.nid or make it clear (
by comment or warning?)that page_pool_nid_changed() should be
called when pool->p.nid is NUMA_NO_NODE/dev_to_node().

I prefer to remove page_pool_nid_changed() if we do not allow
cross numa recycling.


> 
>>
>>> Thanks,
>>> Saeed.
>>>
>>>
>>>
>>>
>>>
>>>
>>>