Date:   Mon, 16 Dec 2019 12:13:50 +0200
From:   Ilias Apalodimas <ilias.apalodimas@...aro.org>
To:     "Li,Rongqing" <lirongqing@...du.com>
Cc:     Yunsheng Lin <linyunsheng@...wei.com>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        Saeed Mahameed <saeedm@...lanox.com>,
        "jonathan.lemon@...il.com" <jonathan.lemon@...il.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "mhocko@...nel.org" <mhocko@...nel.org>,
        "peterz@...radead.org" <peterz@...radead.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        "bhelgaas@...gle.com" <bhelgaas@...gle.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Björn Töpel <bjorn.topel@...el.com>
Subject: Re: Reply: [PATCH][v2] page_pool: handle page recycle for NUMA_NO_NODE condition

On Mon, Dec 16, 2019 at 04:02:04AM +0000, Li,Rongqing wrote:
> 
> 
> > -----Original Message-----
> > From: Yunsheng Lin [mailto:linyunsheng@...wei.com]
> > Sent: December 16, 2019 9:51
> > To: Jesper Dangaard Brouer <brouer@...hat.com>
> > Cc: Li,Rongqing <lirongqing@...du.com>; Saeed Mahameed
> > <saeedm@...lanox.com>; ilias.apalodimas@...aro.org;
> > jonathan.lemon@...il.com; netdev@...r.kernel.org; mhocko@...nel.org;
> > peterz@...radead.org; Greg Kroah-Hartman <gregkh@...uxfoundation.org>;
> > bhelgaas@...gle.com; linux-kernel@...r.kernel.org; Björn Töpel
> > <bjorn.topel@...el.com>
> > Subject: Re: [PATCH][v2] page_pool: handle page recycle for NUMA_NO_NODE
> > condition
> > 
> > On 2019/12/13 16:48, Jesper Dangaard Brouer wrote:
> > > You are basically saying that the NUMA check should be moved to
> > > allocation time, as it is running on the RX-CPU (NAPI).  And eventually,
> > > after some time, the pages will come from the correct NUMA node.
> > >
> > > I think we can do that, and only affect the semi-fast-path.
> > > We just need to handle that pages in the ptr_ring that are recycled
> > > can be from the wrong NUMA node.  In __page_pool_get_cached() when
> > > consuming pages from the ptr_ring (__ptr_ring_consume_batched), then
> > > we can evict pages from wrong NUMA node.
> > 
> > Yes, that's workable.
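For illustration, the eviction described above could look roughly like the
sketch below, following the shape of the refill from the ptr_ring in
net/core/page_pool.c. The function name, the PP_ALLOC_CACHE_REFILL bound and
__page_pool_return_page() are taken from the page_pool code of the time; the
exact control flow here is only a sketch, not a tested patch.

/* Sketch only: refill pool->alloc.cache from the ptr_ring, releasing
 * recycled pages that sit on the wrong NUMA node instead of reusing them.
 */
static struct page *page_pool_refill_alloc_cache(struct page_pool *pool)
{
	struct ptr_ring *r = &pool->ring;
	struct page *page;
	/* Softirq/NAPI context keeps the CPU, and thus the node, stable */
	int pref_nid = numa_mem_id();

	/* __ptr_ring_consume() requires the consumer lock */
	spin_lock(&r->consumer_lock);

	do {
		page = __ptr_ring_consume(r);
		if (unlikely(!page))
			break;

		if (likely(page_to_nid(page) == pref_nid)) {
			pool->alloc.cache[pool->alloc.count++] = page;
		} else {
			/* Wrong node: hand the page back to the page
			 * allocator rather than recycling it.
			 */
			__page_pool_return_page(pool, page);
			page = NULL;
		}
	} while (pool->alloc.count < PP_ALLOC_CACHE_REFILL);

	if (likely(pool->alloc.count > 0))
		page = pool->alloc.cache[--pool->alloc.count];

	spin_unlock(&r->consumer_lock);
	return page;
}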
> > 
> > >
> > > For the pool->alloc.cache we either accept that it will eventually,
> > > after some time, be emptied (it is only in a 100% XDP_DROP workload
> > > that it will continue to reuse the same pages), or we simply clear
> > > the pool->alloc.cache when calling page_pool_update_nid().
> > 
> > Simply clearing the pool->alloc.cache when calling page_pool_update_nid()
> > seems better.
> > 
> 
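Clearing on update could then be a small extension of the existing
page_pool_update_nid(). A rough sketch follows, again reusing the internal
__page_pool_return_page() helper and omitting the tracepoint the real
function carries:

/* Sketch: flush the lockless alloc cache when the preferred node changes,
 * so the refill path repopulates it with pages from the new node.
 * As with the existing function, the caller must provide a safe
 * (e.g. NAPI) context.
 */
void page_pool_update_nid(struct page_pool *pool, int new_nid)
{
	struct page *page;

	pool->p.nid = new_nid;

	/* Drop cached pages; they may belong to the old node */
	while (pool->alloc.count) {
		page = pool->alloc.cache[--pool->alloc.count];
		__page_pool_return_page(pool, page);
	}
}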
> How about the code below? The driver can configure p.nid to anything, and it will be adjusted during NAPI polling, so IRQ migration will not be a problem, but it does add a check to the hot path.

We'll have to check the impact on a high-speed (e.g. 100Gbit) interface
before doing anything like that. Saeed's current patch runs once per NAPI
poll; this runs once per packet, so the load might be measurable.
The READ_ONCE is needed in case all producers/consumers run on the same CPU,
right?
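
For comparison, the once-per-NAPI variant keeps the check out of the
per-packet path entirely. A sketch of that pattern is below;
page_pool_nid_changed() is the helper style from Saeed's series, while the
example_* driver names are made up for illustration:

/* Compare-and-update once per NAPI poll instead of once per packet */
static inline void page_pool_nid_changed(struct page_pool *pool, int new_nid)
{
	if (unlikely(pool->p.nid != new_nid))
		page_pool_update_nid(pool, new_nid);
}

/* Hypothetical driver poll routine, for illustration only */
static int example_napi_poll(struct napi_struct *napi, int budget)
{
	struct example_rx_queue *rq;

	rq = container_of(napi, struct example_rx_queue, napi);

	/* One node check per poll; cheap next to per-packet work */
	page_pool_nid_changed(rq->page_pool, numa_mem_id());

	return example_process_rx(rq, budget);
}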


Thanks
/Ilias
> 
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index a6aefe989043..4374a6239d17 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -108,6 +108,10 @@ static struct page *__page_pool_get_cached(struct page_pool *pool)
>                 if (likely(pool->alloc.count)) {
>                         /* Fast-path */
>                         page = pool->alloc.cache[--pool->alloc.count];
> +
> +                       if (unlikely(READ_ONCE(pool->p.nid) != numa_mem_id()))
> +                               WRITE_ONCE(pool->p.nid, numa_mem_id());
> +
>                         return page;
>                 }
>                 refill = true;
> @@ -155,6 +159,10 @@ static struct page *__page_pool_alloc_pages_slow(struct page_pool *pool,
>         if (pool->p.order)
>                 gfp |= __GFP_COMP;
>  
> +
> +       if (unlikely(READ_ONCE(pool->p.nid) != numa_mem_id()))
> +               WRITE_ONCE(pool->p.nid, numa_mem_id());
> +
>         /* FUTURE development:
>          *
>          * Current slow-path essentially falls back to single page
> Thanks
> 
> -Li
