lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 22 Jan 2020 20:29:22 +0000
From:   Haiyang Zhang <haiyangz@...rosoft.com>
To:     Jesper Dangaard Brouer <brouer@...hat.com>
CC:     "sashal@...nel.org" <sashal@...nel.org>,
        "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        KY Srinivasan <kys@...rosoft.com>,
        Stephen Hemminger <sthemmin@...rosoft.com>,
        "olaf@...fle.de" <olaf@...fle.de>, vkuznets <vkuznets@...hat.com>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Ilias Apalodimas <ilias.apalodimas@...aro.org>
Subject: RE: [PATCH V3,net-next, 1/2] hv_netvsc: Add XDP support



> -----Original Message-----
> From: Jesper Dangaard Brouer <brouer@...hat.com>
> Sent: Wednesday, January 22, 2020 2:52 PM
> To: Haiyang Zhang <haiyangz@...rosoft.com>
> Cc: brouer@...hat.com; sashal@...nel.org; linux-hyperv@...r.kernel.org;
> netdev@...r.kernel.org; KY Srinivasan <kys@...rosoft.com>; Stephen
> Hemminger <sthemmin@...rosoft.com>; olaf@...fle.de; vkuznets
> <vkuznets@...hat.com>; davem@...emloft.net; linux-kernel@...r.kernel.org;
> Ilias Apalodimas <ilias.apalodimas@...aro.org>
> Subject: Re: [PATCH V3,net-next, 1/2] hv_netvsc: Add XDP support
> 
> On Wed, 22 Jan 2020 09:23:33 -0800
> Haiyang Zhang <haiyangz@...rosoft.com> wrote:
> 
> > +u32 netvsc_run_xdp(struct net_device *ndev, struct netvsc_channel *nvchan,
> > +		   struct xdp_buff *xdp)
> > +{
> > +	void *data = nvchan->rsc.data[0];
> > +	u32 len = nvchan->rsc.len[0];
> > +	struct page *page = NULL;
> > +	struct bpf_prog *prog;
> > +	u32 act = XDP_PASS;
> > +
> > +	xdp->data_hard_start = NULL;
> > +
> > +	rcu_read_lock();
> > +	prog = rcu_dereference(nvchan->bpf_prog);
> > +
> > +	if (!prog)
> > +		goto out;
> > +
> > +	/* allocate page buffer for data */
> > +	page = alloc_page(GFP_ATOMIC);
> 
> The alloc_page() + __free_page() alone[1] cost 231 cycles(tsc) 64.395 ns.
> Thus, the XDP_DROP case will already be limited to just around 10Gbit/s
> 14.88 Mpps (67.2ns).
> 
> XDP is suppose to be done for performance reasons. This looks like a slowdown.
> 
> Measurement tool:
> [1]
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.co
> m%2Fnetoptimizer%2Fprototype-
> kernel%2Fblob%2Fmaster%2Fkernel%2Fmm%2Fbench%2Fpage_bench01.c&am
> p;data=02%7C01%7Chaiyangz%40microsoft.com%7C681b5b13e50448d098d408
> d79f748522%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C63715319
> 5109318994&amp;sdata=pncqYIWm1yS5rDf%2BAIbWgskycmuofzl09yA1QmsRb
> M0%3D&amp;reserved=0

On synthetic data path (netvsc), the per channel throughput is much slower than 
10Gbps, because of the host side software based vSwitch. Also in most VMs on 
Azure, Accelerated Network (SRIOV) is enabled. So the alloc_page() overhead on 
synthetic data path won't impact performance significantly.

> 
> > +	if (!page) {
> > +		act = XDP_DROP;
> > +		goto out;
> > +	}
> > +
> > +	xdp->data_hard_start = page_address(page);
> > +	xdp->data = xdp->data_hard_start + NETVSC_XDP_HDRM;
> > +	xdp_set_data_meta_invalid(xdp);
> > +	xdp->data_end = xdp->data + len;
> > +	xdp->rxq = &nvchan->xdp_rxq;
> > +	xdp->handle = 0;
> > +
> > +	memcpy(xdp->data, data, len);
> 
> And a memcpy.


As in the commit log: 
The Azure/Hyper-V synthetic NIC receive buffer doesn't provide headroom
for XDP. We thought about re-use the RNDIS header space, but it's too
small. So we decided to copy the packets to a page buffer for XDP. And,
most of our VMs on Azure have Accelerated  Network (SRIOV) enabled, so
most of the packets run on VF NIC. The synthetic NIC is considered as a
fallback data-path. So the data copy on netvsc won't impact performance
significantly.

> 
> > +
> > +	act = bpf_prog_run_xdp(prog, xdp);
> > +
> > +	switch (act) {
> > +	case XDP_PASS:
> > +	case XDP_TX:
> > +	case XDP_DROP:
> > +		break;
> > +
> > +	case XDP_ABORTED:
> > +		trace_xdp_exception(ndev, prog, act);
> > +		break;
> > +
> > +	default:
> > +		bpf_warn_invalid_xdp_action(act);
> > +	}
> > +
> > +out:
> > +	rcu_read_unlock();
> > +
> > +	if (page && act != XDP_PASS && act != XDP_TX) {
> > +		__free_page(page);
> 
> Given this runs under NAPI you could optimize this easily for XDP_DROP (and
> XDP_ABORTED) by recycling the page in a driver local cache. (The page_pool
> also have a driver local cache build in, but it might be overkill to use page_pool
> in this simple case).
> 
> You could do this in a followup patch.

I will do the optimization in a follow-up patch.

Thanks,
- Haiyang

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ