lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210414162500.397ddb7f@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date:   Wed, 14 Apr 2021 16:25:00 -0700
From:   Jakub Kicinski <kuba@...nel.org>
To:     "Joseph, Jithu" <jithu.joseph@...el.com>
Cc:     "Nguyen, Anthony L" <anthony.l.nguyen@...el.com>,
        "Karlsson, Magnus" <magnus.karlsson@...el.com>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "Gomes, Vinicius" <vinicius.gomes@...el.com>,
        "Fijalkowski, Maciej" <maciej.fijalkowski@...el.com>,
        "sassmann@...hat.com" <sassmann@...hat.com>,
        "Lifshits, Vitaly" <vitaly.lifshits@...el.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "Neftin, Sasha" <sasha.neftin@...el.com>,
        "Desouza, Ederson" <ederson.desouza@...el.com>,
        "bjorn.topel@...el.com" <bjorn.topel@...el.com>,
        "dvorax.fuxbrumer@...ux.intel.com" <dvorax.fuxbrumer@...ux.intel.com>
Subject: Re: [PATCH net-next 8/9] igc: Enable RX via AF_XDP zero-copy

On Wed, 14 Apr 2021 23:14:04 +0000 Joseph, Jithu wrote:
> > > +static struct sk_buff *igc_construct_skb_zc(struct igc_ring *ring,
> > > +					    struct xdp_buff *xdp)
> > > +{
> > > +	unsigned int metasize = xdp->data - xdp->data_meta;
> > > +	unsigned int datasize = xdp->data_end - xdp->data;
> > > +	struct sk_buff *skb;
> > > +
> > > +	skb = __napi_alloc_skb(&ring->q_vector->napi,
> > > +			       xdp->data_end - xdp->data_hard_start,
> > > +			       GFP_ATOMIC | __GFP_NOWARN);
> > > +	if (unlikely(!skb))
> > > +		return NULL;
> > > +
> > > +	skb_reserve(skb, xdp->data - xdp->data_hard_start);
> > > +	memcpy(__skb_put(skb, datasize), xdp->data, datasize);
> > > +	if (metasize)
> > > +		skb_metadata_set(skb, metasize);  
> > 
> > But you haven't actually copied the matadata into the skb,
> > the metadata is before xdp->data, right?  
> 
> Today the igc driver doesn’t add any metadata (except for hw time
> stamps explained later) . So for most part, xdp->data and xdp-
> >data_meta point to the same address . That could be why in this  
> initial implementation we are not copying  the metadata into skb (as
> the driver doesn’t add any).  

I don't think the timestamp is supposed to be part of the metadata.
We're talking about BPF metadata here (added by the XDP prog).

> If the XDP program adds some metadata before xdp->data (and  xdp-
> >data_meta reflects this), that is NOT copied into the SKB as you  
> mentioned .   Is the expectation that meta_data (if any added by the
> bpf program) , should also be copied to the skb  in this XDP_PASS flow
> ? If so I can revise this patch to do that. 

Yes, I believe so.

> If h/w time-stamp is added by the NIC, then metasize will be non zero
> (as  xdp->data is advanced by the driver ) .  h/w ts  is still copied
> into "skb_hwtstamps(skb)->hwtstamp" by  the caller of this function
> igc_dispatch_skb_zc()  . Do you still want it to be copied into
> __skb_put(skb, ) area too ? 

If TS is prepended to the frame it should be saved (e.g. on the stack)
before XDP program is called and gets the chance to overwrite it. The
metadata length when XDP program is called should be 0.

> > > +	return skb;
> > > +}
> > > +static int igc_xdp_enable_pool(struct igc_adapter *adapter,
> > > +			       struct xsk_buff_pool *pool, u16
> > > queue_id)
> > > +{
> > > +	struct net_device *ndev = adapter->netdev;
> > > +	struct device *dev = &adapter->pdev->dev;
> > > +	struct igc_ring *rx_ring;
> > > +	struct napi_struct *napi;
> > > +	bool needs_reset;
> > > +	u32 frame_size;
> > > +	int err;
> > > +
> > > +	if (queue_id >= adapter->num_rx_queues)
> > > +		return -EINVAL;
> > > +
> > > +	frame_size = xsk_pool_get_rx_frame_size(pool);
> > > +	if (frame_size < ETH_FRAME_LEN + VLAN_HLEN * 2) {
> > > +		/* When XDP is enabled, the driver doesn't support
> > > frames that
> > > +		 * span over multiple buffers. To avoid that, we check
> > > if xsk
> > > +		 * frame size is big enough to fit the max ethernet
> > > frame size
> > > +		 * + vlan double tagging.
> > > +		 */
> > > +		return -EOPNOTSUPP;
> > > +	}
> > > +
> > > +	err = xsk_pool_dma_map(pool, dev, IGC_RX_DMA_ATTR);
> > > +	if (err) {
> > > +		netdev_err(ndev, "Failed to map xsk pool\n");
> > > +		return err;
> > > +	}
> > > +
> > > +	needs_reset = netif_running(adapter->netdev) &&
> > > igc_xdp_is_enabled(adapter);
> > > +
> > > +	rx_ring = adapter->rx_ring[queue_id];
> > > +	napi = &rx_ring->q_vector->napi;
> > > +
> > > +	if (needs_reset) {
> > > +		igc_disable_rx_ring(rx_ring);
> > > +		napi_disable(napi);
> > > +	}
> > > +
> > > +	set_bit(IGC_RING_FLAG_AF_XDP_ZC, &rx_ring->flags);
> > > +
> > > +	if (needs_reset) {
> > > +		napi_enable(napi);
> > > +		igc_enable_rx_ring(rx_ring);
> > > +
> > > +		err = igc_xsk_wakeup(ndev, queue_id, XDP_WAKEUP_RX);
> > > +		if (err)
> > > +			return err;  
> > 
> > No need for an unwind path here?
> > Does something call XDP_SETUP_XSK_POOL(NULL) on failure
> > automagically?  
> 
> I think we should add a xsk_pool_dma_unmap() in this failure path
> ?  Did I understand you correctly ?

Sounds right.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ