Message-ID: <MN2PR21MB1375C9F1F2EA6C9F5E95E873CA0C0@MN2PR21MB1375.namprd21.prod.outlook.com>
Date: Wed, 22 Jan 2020 20:29:22 +0000
From: Haiyang Zhang <haiyangz@...rosoft.com>
To: Jesper Dangaard Brouer <brouer@...hat.com>
CC: "sashal@...nel.org" <sashal@...nel.org>,
"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
KY Srinivasan <kys@...rosoft.com>,
Stephen Hemminger <sthemmin@...rosoft.com>,
"olaf@...fle.de" <olaf@...fle.de>, vkuznets <vkuznets@...hat.com>,
"davem@...emloft.net" <davem@...emloft.net>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Ilias Apalodimas <ilias.apalodimas@...aro.org>
Subject: RE: [PATCH V3,net-next, 1/2] hv_netvsc: Add XDP support
> -----Original Message-----
> From: Jesper Dangaard Brouer <brouer@...hat.com>
> Sent: Wednesday, January 22, 2020 2:52 PM
> To: Haiyang Zhang <haiyangz@...rosoft.com>
> Cc: brouer@...hat.com; sashal@...nel.org; linux-hyperv@...r.kernel.org;
> netdev@...r.kernel.org; KY Srinivasan <kys@...rosoft.com>; Stephen
> Hemminger <sthemmin@...rosoft.com>; olaf@...fle.de; vkuznets
> <vkuznets@...hat.com>; davem@...emloft.net; linux-kernel@...r.kernel.org;
> Ilias Apalodimas <ilias.apalodimas@...aro.org>
> Subject: Re: [PATCH V3,net-next, 1/2] hv_netvsc: Add XDP support
>
> On Wed, 22 Jan 2020 09:23:33 -0800
> Haiyang Zhang <haiyangz@...rosoft.com> wrote:
>
> > +u32 netvsc_run_xdp(struct net_device *ndev, struct netvsc_channel *nvchan,
> > + struct xdp_buff *xdp)
> > +{
> > + void *data = nvchan->rsc.data[0];
> > + u32 len = nvchan->rsc.len[0];
> > + struct page *page = NULL;
> > + struct bpf_prog *prog;
> > + u32 act = XDP_PASS;
> > +
> > + xdp->data_hard_start = NULL;
> > +
> > + rcu_read_lock();
> > + prog = rcu_dereference(nvchan->bpf_prog);
> > +
> > + if (!prog)
> > + goto out;
> > +
> > + /* allocate page buffer for data */
> > + page = alloc_page(GFP_ATOMIC);
>
> The alloc_page() + __free_page() pair alone[1] costs 231 cycles (tsc), or
> 64.395 ns. Thus, the XDP_DROP case is already limited to roughly 10 Gbit/s
> line rate, i.e. 14.88 Mpps (67.2 ns per packet).
>
> XDP is supposed to be done for performance reasons. This looks like a slowdown.
>
> Measurement tool:
> [1]
> https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/mm/bench/page_bench01.c
On the synthetic data path (netvsc), per-channel throughput is well below
10 Gbps because of the host-side software vSwitch. Also, most VMs on Azure
have Accelerated Networking (SR-IOV) enabled. So the alloc_page() overhead on
the synthetic data path won't impact performance significantly.
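For reference, the 67.2 ns figure quoted above is the per-packet time budget at 10 Gbit/s with minimum-size frames. A minimal sketch of that arithmetic, assuming 64-byte frames plus 20 bytes of preamble, start-of-frame delimiter, and inter-frame gap on the wire (the function names are illustrative and do not come from any kernel source):

```c
#include <assert.h>

/* Bytes one minimum-size Ethernet frame occupies on the wire:
 * 64-byte frame + 8-byte preamble/SFD + 12-byte inter-frame gap. */
#define MIN_FRAME_WIRE_BYTES (64 + 8 + 12)

/* Time budget per packet in nanoseconds at the given link speed.
 * Dividing bits by Gbit/s yields nanoseconds directly. */
double ns_per_packet(double link_gbps)
{
	assert(link_gbps > 0.0);
	return (MIN_FRAME_WIRE_BYTES * 8) / link_gbps;
}

/* Packet rate in millions of packets per second for that budget. */
double mpps(double link_gbps)
{
	return 1000.0 / ns_per_packet(link_gbps);
}

/* Fraction of the per-packet budget consumed by a fixed cost. */
double budget_fraction(double cost_ns, double link_gbps)
{
	return cost_ns / ns_per_packet(link_gbps);
}
```

With the measured 64.395 ns for alloc_page() + __free_page(), budget_fraction(64.395, 10.0) comes out at roughly 0.96, which is why the allocation alone would nearly exhaust a 10 Gbit/s budget. On a path that runs well below that rate, the same cost is a proportionally smaller share.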
>
> > + if (!page) {
> > + act = XDP_DROP;
> > + goto out;
> > + }
> > +
> > + xdp->data_hard_start = page_address(page);
> > + xdp->data = xdp->data_hard_start + NETVSC_XDP_HDRM;
> > + xdp_set_data_meta_invalid(xdp);
> > + xdp->data_end = xdp->data + len;
> > + xdp->rxq = &nvchan->xdp_rxq;
> > + xdp->handle = 0;
> > +
> > + memcpy(xdp->data, data, len);
>
> And a memcpy.
As stated in the commit log:
The Azure/Hyper-V synthetic NIC receive buffer doesn't provide headroom
for XDP. We considered reusing the RNDIS header space, but it's too
small. So we decided to copy the packets to a page buffer for XDP. Also,
most of our VMs on Azure have Accelerated Networking (SR-IOV) enabled, so
most of the packets go through the VF NIC. The synthetic NIC is considered
a fallback data path. So the data copy on netvsc won't impact performance
significantly.
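To illustrate the layout the copy produces, here is a userspace sketch of the pointer setup, not the driver code itself. The struct and function names are hypothetical, the 4096-byte page size and the 256-byte headroom (standing in for NETVSC_XDP_HDRM) are assumptions, and the length check is added for the sketch and is not part of the quoted hunk:

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */
#define SKETCH_XDP_HDRM   256u /* assumed headroom, stands in for NETVSC_XDP_HDRM */

/* Userspace stand-in for the xdp_buff pointers the patch fills in. */
struct sketch_xdp_buff {
	unsigned char *data_hard_start; /* start of the page */
	unsigned char *data;            /* first byte of the packet */
	unsigned char *data_end;        /* one past the last packet byte */
};

/* Copy a received packet into `page`, leaving headroom in front of it.
 * Returns 0 on success, -1 if the packet does not fit behind the headroom. */
int sketch_xdp_prepare(struct sketch_xdp_buff *xdp, unsigned char *page,
		       const void *pkt, size_t len)
{
	if (len > SKETCH_PAGE_SIZE - SKETCH_XDP_HDRM)
		return -1;

	xdp->data_hard_start = page;
	xdp->data = page + SKETCH_XDP_HDRM;
	xdp->data_end = xdp->data + len;
	memcpy(xdp->data, pkt, len);
	return 0;
}

/* Space an XDP program could use to grow the packet at the front. */
size_t sketch_xdp_headroom(const struct sketch_xdp_buff *xdp)
{
	return (size_t)(xdp->data - xdp->data_hard_start);
}
```

The gap between data_hard_start and data is what lets an XDP program prepend headers; the receive buffer described above has no such gap, hence the copy.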
>
> > +
> > + act = bpf_prog_run_xdp(prog, xdp);
> > +
> > + switch (act) {
> > + case XDP_PASS:
> > + case XDP_TX:
> > + case XDP_DROP:
> > + break;
> > +
> > + case XDP_ABORTED:
> > + trace_xdp_exception(ndev, prog, act);
> > + break;
> > +
> > + default:
> > + bpf_warn_invalid_xdp_action(act);
> > + }
> > +
> > +out:
> > + rcu_read_unlock();
> > +
> > + if (page && act != XDP_PASS && act != XDP_TX) {
> > + __free_page(page);
>
> Given this runs under NAPI, you could easily optimize this for XDP_DROP (and
> XDP_ABORTED) by recycling the page in a driver-local cache. (The page_pool
> also has a driver-local cache built in, but it might be overkill to use page_pool
> in this simple case.)
>
> You could do this in a followup patch.
I will do the optimization in a follow-up patch.
Thanks,
- Haiyang
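The recycling idea suggested above can be modeled in userspace as a small per-channel stack of free pages. This is only a sketch under stated assumptions, not the follow-up patch: all names are hypothetical, malloc()/free() stand in for alloc_page()/__free_page(), and the cache is assumed to be touched by a single NAPI context so it needs no locking.

```c
#include <stdlib.h>

#define SKETCH_PAGE_SIZE   4096u /* assumed page size */
#define SKETCH_CACHE_SLOTS    8u /* arbitrary cache depth for the sketch */

/* Hypothetical per-channel cache of pages released on XDP_DROP or
 * XDP_ABORTED. Assumed to be owned by one NAPI context. */
struct sketch_page_cache {
	void *slot[SKETCH_CACHE_SLOTS];
	unsigned int count;   /* pages currently cached */
	unsigned long allocs; /* times the real allocator was called */
};

/* Take a page from the cache, falling back to the allocator when empty. */
void *sketch_page_get(struct sketch_page_cache *c)
{
	if (c->count > 0)
		return c->slot[--c->count];

	c->allocs++;
	return malloc(SKETCH_PAGE_SIZE); /* stands in for alloc_page(GFP_ATOMIC) */
}

/* Return a page to the cache, freeing it only when the cache is full. */
void sketch_page_put(struct sketch_page_cache *c, void *page)
{
	if (!page)
		return;

	if (c->count < SKETCH_CACHE_SLOTS) {
		c->slot[c->count++] = page;
		return;
	}
	free(page); /* stands in for __free_page() */
}

/* Release every cached page, e.g. when the channel is torn down. */
void sketch_page_cache_drain(struct sketch_page_cache *c)
{
	while (c->count > 0)
		free(c->slot[--c->count]);
}
```

In a steady stream of dropped packets, each get is served by the page the previous put returned, so the allocator is hit once rather than once per packet.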