Date:   Mon, 16 Dec 2019 14:34:26 +0200
From:   Ilias Apalodimas <ilias.apalodimas@...aro.org>
To:     Michal Hocko <mhocko@...nel.org>
Cc:     Yunsheng Lin <linyunsheng@...wei.com>,
        Saeed Mahameed <saeedm@...lanox.com>,
        "brouer@...hat.com" <brouer@...hat.com>,
        "jonathan.lemon@...il.com" <jonathan.lemon@...il.com>,
        Li Rongqing <lirongqing@...du.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        peterz@...radead.org,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        bhelgaas@...gle.com,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH][v2] page_pool: handle page recycle for NUMA_NO_NODE
 condition

Hi Michal, 
On Mon, Dec 16, 2019 at 01:15:57PM +0100, Michal Hocko wrote:
> On Thu 12-12-19 09:34:14, Yunsheng Lin wrote:
> > +CC Michal, Peter, Greg and Bjorn
> > Because there has been discussion about where and how the NUMA_NO_NODE
> > should be handled before.
> 
> I do not have a full context. What is the question here?

When we allocate pages for the page_pool API, the driver writer decides during
init which NUMA node to use. The API can, in some cases, recycle the memory
instead of freeing it and re-allocating it. If the NUMA node has changed (irq
affinity for example), we forbid recycling and free the memory, since recycling
and using memory on far NUMA nodes is more expensive (more expensive than
recycling, at least on the architectures we tried anyway).
Since doing this check per packet would be expensive, the burden falls on the
driver writer. Drivers *have* to call page_pool_update_nid() or
page_pool_nid_changed(), which runs once per NAPI cycle, if they want to check
for that.
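
A minimal userspace sketch of the once-per-cycle node check described above.
All names here (demo_pool, demo_page, demo_nid_changed, demo_can_recycle) are
hypothetical stand-ins, not the kernel's page_pool definitions; the sketch only
models the idea that the pool's preferred node is refreshed once per poll
cycle and that recycling is refused for pages on a different node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (assumption). */
struct demo_page { int nid; };                       /* node the page lives on */
struct demo_pool { int nid; unsigned int updates; }; /* preferred node + stats */

/* Recycle only pages that sit on the node the pool currently prefers. */
static bool demo_can_recycle(const struct demo_pool *pool,
                             const struct demo_page *page)
{
    assert(pool != NULL && page != NULL);
    return page->nid == pool->nid;
}

/*
 * Meant to be called once per poll cycle, not per packet, with the node
 * the poll loop is currently running on. Updates the pool only on change.
 */
static void demo_nid_changed(struct demo_pool *pool, int cur_nid)
{
    assert(pool != NULL);
    if (pool->nid != cur_nid) {
        pool->nid = cur_nid;
        pool->updates++;
    }
}
```

With a pool created for node 0, a page on node 0 is recyclable until the poll
loop migrates to node 1; after demo_nid_changed(&pool, 1) the same page is
rejected and would be freed instead.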

The current code in the API, though, does not account for NUMA_NO_NODE. That's
what this patch is trying to fix.
If the page_pool params are initialized with NUMA_NO_NODE, we *never* recycle
the memory. This happens because the API allocates memory with
'nid = numa_mem_id()' if NUMA_NO_NODE is configured, so the current check
'page_to_nid(page) == pool->p.nid' will never be true.
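
The mismatch can be reproduced in a small userspace model. This is only a
sketch under assumptions: sim_pool, sim_page, sim_alloc and sim_recycle_ok are
hypothetical names, local_nid stands in for whatever numa_mem_id() would
return, and SIM_NUMA_NO_NODE mirrors the kernel's NUMA_NO_NODE value of -1.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_NUMA_NO_NODE (-1) /* same value as the kernel's NUMA_NO_NODE */

struct sim_page { int nid; };
struct sim_pool { int nid; };

/*
 * Allocation model: with no node preference the page comes from the local
 * memory node, otherwise from the node the pool was configured with.
 */
static struct sim_page sim_alloc(const struct sim_pool *pool, int local_nid)
{
    struct sim_page page;

    assert(local_nid >= 0); /* real node ids are never negative */
    page.nid = (pool->nid == SIM_NUMA_NO_NODE) ? local_nid : pool->nid;
    return page;
}

/* The recycle check as described in the text: a plain node comparison. */
static bool sim_recycle_ok(const struct sim_pool *pool,
                           const struct sim_page *page)
{
    return page->nid == pool->nid;
}
```

Because a page's node id is always >= 0 and the pool's nid stays at -1, the
comparison fails for every page allocated from a NUMA_NO_NODE pool, whichever
node the allocation actually came from.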

The initial proposal was to check:
pool->p.nid == NUMA_NO_NODE && page_to_nid(page) == numa_mem_id()

After that the thread spun out of control :)
My question is: do we *really* have to check for
page_to_nid(page) == numa_mem_id()? If the architecture is not NUMA aware,
wouldn't pool->p.nid == NUMA_NO_NODE be enough?
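
To make the two options concrete, here is a hedged userspace sketch of both
variants side by side. The function names are hypothetical, plain ints stand
in for pool->p.nid, page_to_nid(page) and numa_mem_id(), and OR-ing the new
term with the existing node comparison is an assumption about how the check
would be wired in. It does not settle the question; it only shows where the
two variants differ.

```c
#include <assert.h>
#include <stdbool.h>

#define CMP_NUMA_NO_NODE (-1) /* same value as the kernel's NUMA_NO_NODE */

/* Variant A: with NUMA_NO_NODE, also require the page to be node-local. */
static bool cmp_recycle_with_local_check(int pool_nid, int page_nid,
                                         int local_nid)
{
    assert(page_nid >= 0 && local_nid >= 0);
    return page_nid == pool_nid ||
           (pool_nid == CMP_NUMA_NO_NODE && page_nid == local_nid);
}

/* Variant B: with NUMA_NO_NODE, accept a page from any node. */
static bool cmp_recycle_any_node(int pool_nid, int page_nid)
{
    assert(page_nid >= 0);
    return page_nid == pool_nid || pool_nid == CMP_NUMA_NO_NODE;
}
```

On a single-node system page_nid and local_nid are both always 0, so the two
variants agree. They diverge only on a multi-node system when a NUMA_NO_NODE
pool sees a page from a node other than the local one: variant A refuses to
recycle it, variant B recycles it.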

Thanks
/Ilias
> -- 
> Michal Hocko
> SUSE Labs
