[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190618171951.17128ed8@carbon>
Date: Tue, 18 Jun 2019 17:19:51 +0200
From: Jesper Dangaard Brouer <brouer@...hat.com>
To: Ivan Khoronzhuk <ivan.khoronzhuk@...aro.org>
Cc: Tariq Toukan <tariqt@...lanox.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Ilias Apalodimas <ilias.apalodimas@...aro.org>,
Toke Høiland-Jørgensen <toke@...e.dk>,
"toshiaki.makita1@...il.com" <toshiaki.makita1@...il.com>,
"grygorii.strashko@...com" <grygorii.strashko@...com>,
"mcroce@...hat.com" <mcroce@...hat.com>, brouer@...hat.com
Subject: Re: [PATCH net-next v1 08/11] xdp: tracking page_pool resources and
safe removal
On Tue, 18 Jun 2019 15:54:33 +0300 Ivan Khoronzhuk <ivan.khoronzhuk@...aro.org> wrote:
> On Sun, Jun 16, 2019 at 10:56:25AM +0000, Tariq Toukan wrote:
> >
> >On 6/15/2019 12:33 PM, Ivan Khoronzhuk wrote:
> >> On Thu, Jun 13, 2019 at 08:28:42PM +0200, Jesper Dangaard Brouer wrote:
[...]
> >>
> >> What would you recommend to do for the following situation:
> >>
> >> Same receive queue is shared between 2 network devices. The receive ring is
> >> filled by pages from page_pool, but you don't know the actual port (ndev)
> >> filling this ring, because a device is recognized only after packet is
> >> received.
> >>
> >> The API is so that xdp rxq is bind to network device, each frame has
> >> reference
> >> on it, so rxq ndev must be static. That means each netdev has it's own rxq
> >> instance even no need in it. Thus, after your changes, page must be
> >> returned to
> >> the pool it was taken from, or released from old pool and recycled in
> >> new one
> >> somehow.
> >>
> >> And that is inconvenience at least. It's hard to move pages between
> >> pools w/o performance penalty. No way to use common pool either,
> >> as unreg_rxq now drops the pool and 2 rxqa can't reference same
> >> pool.
> >
> >Within the single netdev, separate page_pool instances are anyway
> >created for different RX rings, working under different NAPI's.
>
> The circumstances are so that same RX ring is shared between 2
> netdevs... and netdev can be known only after descriptor/packet is
> received. Thus, while filling RX ring, there is no actual device,
> but when packet is received it has to be recycled to appropriate
> net device pool. Before this change there were no difference from
> which pool the page was allocated to fill RX ring, as there were no
> owner. After this change there is owner - netdev page pool.
It not really a dependency added in this patchset. A page_pool is
strictly bound to a single RX-queue, for performance, as this allow us
a NAPI fast-path return used for early drop (XDP_DROP).
I can see that the API xdp_rxq_info_reg_mem_model() make it possible to
call it on different xdp_rxq_info structs with the same page_pool
pointer. But it was never intended to be used like that, and I
consider it an API usage violation. I originally wanted to add the
allocator pointer to xdp_rxq_info_reg() call, but the API was extended
in different versions, so I didn't want to break users. I've actually
tried hard to catch when drivers use the API wrong, via WARN(), but I
guess you found a loop hole.
Besides, we already have a dependency from the RX-queue to the netdev
in the xdp_rxq_info structure. E.g. the xdp_rxq_info->dev is sort of
central, and dereferenced by BPF-code to read xdp_md->ingress_ifindex,
and also used by cpumap when creating SKBs.
> For cpsw the dma unmap is common for both netdevs and no difference
> who is freeing the page, but there is difference which pool it's
> freed to.
>
> So that, while filling RX ring the page is taken from page pool of
> ndev1, but packet is received for ndev2, it has to be later
> returned/recycled to page pool of ndev1, but when xdp buffer is
> handed over to xdp prog the xdp_rxq_info has reference on ndev2 ...
>
> And no way to predict the final ndev before packet is received, so no
> way to choose appropriate page pool as now it becomes page owner.
>
> So, while RX ring filling, the page/dma recycling is needed but should
> be some way to identify page owner only after receiving packet.
>
> Roughly speaking, something like:
>
> pool->pages_state_hold_cnt++;
>
> outside of page allocation API, after packet is received.
Don't EVER manipulate the internal state outside of page allocation
API. That kills the purpose of defining any API.
> and free of the counter while allocation (w/o owing the page).
You use-case of two netdev's sharing the same RX-queue sounds dubious,
and very hardware specific. I'm not sure why we want to bend the APIs
to support this?
If we had to allow page_pool to be registered twice, via
xdp_rxq_info_reg_mem_model() then I guess we could extend page_pool
with a usage/users reference count, and then only really free the
page_pool when refcnt reach zero. But it just seems and looks wrong
(in the code) as the hole trick to get performance is to only have one
user.
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
Powered by blists - more mailing lists