Date:   Mon, 9 Dec 2019 03:47:50 +0000
From:   "Li,Rongqing" <lirongqing@...du.com>
To:     Yunsheng Lin <linyunsheng@...wei.com>,
        Saeed Mahameed <saeedm@...lanox.com>,
        "jonathan.lemon@...il.com" <jonathan.lemon@...il.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "brouer@...hat.com" <brouer@...hat.com>,
        "ilias.apalodimas@...aro.org" <ilias.apalodimas@...aro.org>
CC:     "ivan.khoronzhuk@...aro.org" <ivan.khoronzhuk@...aro.org>,
        "grygorii.strashko@...com" <grygorii.strashko@...com>
Subject: Re: [PATCH][v2] page_pool: handle page recycle for NUMA_NO_NODE condition

Cc: Grygorii Strashko, Ivan Khoronzhuk

I see that cpsw uses NUMA_NO_NODE when initializing its page pool.

> On 2019/12/7 11:52, Saeed Mahameed wrote:
> > On Fri, 2019-12-06 at 17:32 +0800, Li RongQing wrote:
> >> Some drivers use page pool but do not require pages to be allocated
> >> from a bound node, or simply set pool.p.nid to NUMA_NO_NODE. Commit
> >> d5394610b1ba ("page_pool: Don't recycle non-reusable pages") prevents
> >> such drivers from recycling pages.
> >>
> >> So treat a page as reusable when it belongs to the current memory
> >> node if nid is NUMA_NO_NODE.
> >>
> >> v1-->v2: add check with numa_mem_id from Yunsheng
> >>
> >> Fixes: d5394610b1ba ("page_pool: Don't recycle non-reusable pages")
> >> Signed-off-by: Li RongQing <lirongqing@...du.com>
> >> Suggested-by: Yunsheng Lin <linyunsheng@...wei.com>
> >> Cc: Saeed Mahameed <saeedm@...lanox.com>
> >> ---
> >>  net/core/page_pool.c | 7 ++++++-
> >>  1 file changed, 6 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> >> index a6aefe989043..3c8b51ccd1c1 100644
> >> --- a/net/core/page_pool.c
> >> +++ b/net/core/page_pool.c
> >> @@ -312,12 +312,17 @@ static bool __page_pool_recycle_direct(struct page *page,
> >>  /* page is NOT reusable when:
> >>   * 1) allocated when system is under some pressure. (page_is_pfmemalloc)
> >>   * 2) belongs to a different NUMA node than pool->p.nid.
> >> + * 3) belongs to a different memory node than current context
> >> + * if pool->p.nid is NUMA_NO_NODE
> >>   *
> >>   * To update pool->p.nid users must call page_pool_update_nid.
> >>   */
> >>  static bool pool_page_reusable(struct page_pool *pool, struct page *page)
> >>  {
> >> -	return !page_is_pfmemalloc(page) && page_to_nid(page) == pool->p.nid;
> >> +	return !page_is_pfmemalloc(page) &&
> >> +		(page_to_nid(page) == pool->p.nid ||
> >> +		(pool->p.nid == NUMA_NO_NODE &&
> >> +		page_to_nid(page) == numa_mem_id()));
> >>  }
> >>
> >
> > Cc'ed Jesper, Ilias & Jonathan.
> >
> > I don't think it is correct to check that the page nid is same as
> > numa_mem_id() if pool is NUMA_NO_NODE. In such case we should allow
> > all pages to recycle, because you can't assume where pages are
> > allocated from and where they are being handled.
> >
> > I suggest the following:
> >
> > return !page_is_pfmemalloc(page) &&
> >        (page_to_nid(page) == pool->p.nid || pool->p.nid == NUMA_NO_NODE);
> >
> > 1) never recycle emergency pages, regardless of pool nid.
> > 2) always recycle if pool is NUMA_NO_NODE.
> 
> As far as I can see, pool->p.nid could be NUMA_NO_NODE in the following
> cases:
> 
> 1. The kernel is built with CONFIG_NUMA off.
> 
> 2. The kernel is built with CONFIG_NUMA on, but the FW/BIOS does not provide
>    a valid node id through ACPI/DT. This has two sub-cases:
> 
>    a). The hardware is a single-NUMA-node system, so it may be valid
>        not to provide a node for the device.
>    b). The hardware is a multi-NUMA-node system, and the FW/BIOS is
>        broken in that it does not provide a valid one.
> 
> 3. The kernel is built with CONFIG_NUMA on and the FW/BIOS does provide a
>    valid node id through ACPI/DT, but the driver does not pass that
>    node id when calling page_pool_init().
> 
> I am not sure which case this patch is trying to fix; maybe Rongqing can
> clarify.
> 
> For case 1 and case 2 (a), I suppose checking whether pool->p.nid is
> NUMA_NO_NODE is enough.
> 
> For case 2 (b), one could argue that it should be fixed in the BIOS/FW.
> But one could also argue that drivers not yet using page pool already check
> numa_mem_id() when deciding whether to recycle, see [1]. If I understand
> correctly, recycling is normally done in NAPI polling, which has the same
> affinity as the rx interrupt, and the rx interrupt affinity does not change
> very often. If it does move to a different memory node, maybe it makes sense
> not to recycle the old pages that belong to the old node?
> 
> 
> For case 3, I am not sure whether any driver does that, or whether the page
> pool API even allows it.
> 

I think pool_page_reusable() should handle NUMA_NO_NODE regardless of which
case applies.

Recycling is normally done in NAPI polling, and NUMA_NO_NODE is a hint to
allocate from the local node unless memory usage is unbalanced. That is why I
added the check that the page's nid equals numa_mem_id() when the pool's nid
is NUMA_NO_NODE.
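
To make the difference between the variants under discussion concrete, here
is a rough userspace model of the three predicates. It is only a sketch: the
model_page and model_pool structs and the current_node parameter are
stand-ins I made up for struct page, struct page_pool and numa_mem_id(); none
of this is the kernel code itself.

```c
/* Userspace model only: the struct layouts and the current_node
 * parameter are hypothetical stand-ins, not the kernel definitions.
 */
#include <assert.h>
#include <stdbool.h>

#define NUMA_NO_NODE (-1)

struct model_page {
	int nid;         /* node the page lives on, as page_to_nid() would report */
	bool pfmemalloc; /* page came from emergency reserves */
};

struct model_pool {
	int nid;         /* stands in for pool->p.nid */
};

/* Behaviour after d5394610b1ba: strict node match only. */
static bool reusable_strict(const struct model_pool *pool,
			    const struct model_page *page)
{
	return !page->pfmemalloc && page->nid == pool->nid;
}

/* v2 of this patch: NUMA_NO_NODE means "match the local memory node". */
static bool reusable_v2(const struct model_pool *pool,
			const struct model_page *page, int current_node)
{
	return !page->pfmemalloc &&
	       (page->nid == pool->nid ||
		(pool->nid == NUMA_NO_NODE && page->nid == current_node));
}

/* Saeed's suggestion: NUMA_NO_NODE means "recycle pages from any node". */
static bool reusable_any(const struct model_pool *pool,
			 const struct model_page *page)
{
	return !page->pfmemalloc &&
	       (page->nid == pool->nid || pool->nid == NUMA_NO_NODE);
}

/* Small self-check: a NUMA_NO_NODE pool seeing a page from a remote node. */
void model_self_check(void)
{
	struct model_pool pool = { .nid = NUMA_NO_NODE };
	struct model_page remote = { .nid = 1, .pfmemalloc = false };

	assert(!reusable_strict(&pool, &remote));
	assert(!reusable_v2(&pool, &remote, 0)); /* running on node 0 */
	assert(reusable_any(&pool, &remote));
}
```

The only case where v2 and Saeed's suggestion disagree is a NUMA_NO_NODE pool
seeing a page from a node other than the one the recycling context runs on:
v2 drops it, the simpler check recycles it.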

-Li


> [1] https://elixir.bootlin.com/linux/latest/ident/numa_mem_id
> 
> >
> > the above change should not add any overhead, a modest branch
> > predictor will handle this with no effort.
> >
> > Jesper et al. what do you think?
> >
> > -Saeed.
> >
