Message-ID: <03d7e8b0-8766-4f59-afd4-15b592693a83@intel.com>
Date: Mon, 11 Dec 2023 11:16:20 +0100
From: Alexander Lobakin <aleksander.lobakin@...el.com>
To: Yunsheng Lin <linyunsheng@...wei.com>
CC: "David S. Miller" <davem@...emloft.net>, Eric Dumazet
<edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni
<pabeni@...hat.com>, Maciej Fijalkowski <maciej.fijalkowski@...el.com>,
Michal Kubiak <michal.kubiak@...el.com>, Larysa Zaremba
<larysa.zaremba@...el.com>, Alexander Duyck <alexanderduyck@...com>, "David
Christensen" <drc@...ux.vnet.ibm.com>, Jesper Dangaard Brouer
<hawk@...nel.org>, Ilias Apalodimas <ilias.apalodimas@...aro.org>, "Paul
Menzel" <pmenzel@...gen.mpg.de>, <netdev@...r.kernel.org>,
<intel-wired-lan@...ts.osuosl.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net-next v6 08/12] libie: add Rx buffer management (via
Page Pool)
From: Yunsheng Lin <linyunsheng@...wei.com>
Date: Fri, 8 Dec 2023 17:28:21 +0800
> On 2023/12/8 1:20, Alexander Lobakin wrote:
> ...
>
>> +
>> +/**
>> + * libie_rx_page_pool_create - create a PP with the default libie settings
>> + * @bq: buffer queue struct to fill
>> + * @napi: &napi_struct covering this PP (no usage outside its poll loops)
>> + *
>> + * Return: 0 on success, -errno on failure.
>> + */
>> +int libie_rx_page_pool_create(struct libie_buf_queue *bq,
>> +			      struct napi_struct *napi)
>> +{
>> +	struct page_pool_params pp = {
>> +		.flags		= PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV,
>> +		.order		= LIBIE_RX_PAGE_ORDER,
>> +		.pool_size	= bq->count,
>> +		.nid		= NUMA_NO_NODE,
>
> Is there a reason the NUMA_NO_NODE is used here instead of
> dev_to_node(napi->dev->dev.parent)?
NUMA_NO_NODE creates a "dynamic" page_pool and makes sure the pages are
local to the CPU on which the PP allocation functions are called.
Setting ::nid to a "static" value pins the PP to a particular node.

But the main reason is that Rx queues can be distributed across several
nodes, and in that case NUMA_NO_NODE makes sure each page_pool is local
to the queue it's running on. dev_to_node() would return the same value
for every queue, forcing some PPs to allocate remote pages.
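
To illustrate (a sketch, not part of the patch; the pinned variant is
the one suggested above and shown only for contrast):

	/* Dynamic: with ::nid == NUMA_NO_NODE, pages are allocated on
	 * the node of the CPU calling the PP allocation functions, so
	 * each pool stays local to wherever its NAPI actually runs.
	 */
	pp.nid = NUMA_NO_NODE;

	/* Pinned: every pool would allocate from the device's home
	 * node, remote for queues whose NAPI runs on another node.
	 */
	pp.nid = dev_to_node(napi->dev->dev.parent);
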
Ideally, I'd like to pass the ID of the CPU this queue will run on and
use cpu_to_node(), but currently there are no NUMA-aware allocations in
the Intel drivers, and Rx queues don't get the corresponding CPU ID when
they're configured. I may revisit this later, but for now NUMA_NO_NODE
is the optimal solution here.
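
Should that ever happen, the change would be tiny (a sketch; bq->rx_cpu
is a hypothetical field which doesn't exist today):

	/* Hypothetical: pin each pool to the node of the CPU its
	 * queue will be run on, once the drivers pass that CPU ID.
	 */
	pp.nid = cpu_to_node(bq->rx_cpu);
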
[...]
Thanks,
Olek