netdev - Re: [PATCH net-next v6 4/4] gve: Add support for raw addressing in the tx path

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAKgT0Ufex3HveUvkWofwtA2Y3L1C12n1oNVPY14Mcp+3kRsOGA@mail.gmail.com>
Date:   Thu, 19 Nov 2020 08:33:13 -0800
From:   Alexander Duyck <alexander.duyck@...il.com>
To:     David Awogbemila <awogbemila@...gle.com>
Cc:     Netdev <netdev@...r.kernel.org>,
        Catherine Sullivan <csully@...gle.com>,
        Yangchun Fu <yangchun@...gle.com>
Subject: Re: [PATCH net-next v6 4/4] gve: Add support for raw addressing in
 the tx path

On Wed, Nov 18, 2020 at 3:16 PM David Awogbemila <awogbemila@...gle.com> wrote:
>
> On Wed, Nov 11, 2020 at 9:29 AM Alexander Duyck
> <alexander.duyck@...il.com> wrote:
> >
> > On Mon, Nov 9, 2020 at 3:39 PM David Awogbemila <awogbemila@...gle.com> wrote:
> > >
> > > From: Catherine Sullivan <csully@...gle.com>
> > >
> > > During TX, skbs' data addresses are dma_map'ed and passed to the NIC.
> > > This means that the device can perform DMA directly from these addresses
> > > and the driver does not have to copy the buffer content into
> > > pre-allocated buffers/qpls (as in qpl mode).
> > >
> > > Reviewed-by: Yangchun Fu <yangchun@...gle.com>
> > > Signed-off-by: Catherine Sullivan <csully@...gle.com>
> > > Signed-off-by: David Awogbemila <awogbemila@...gle.com>

<snip>

> > > @@ -472,6 +499,100 @@ static int gve_tx_add_skb(struct gve_tx_ring *tx, struct sk_buff *skb,
> > >         return 1 + payload_nfrags;
> > >  }
> > >
> > > +static int gve_tx_add_skb_no_copy(struct gve_priv *priv, struct gve_tx_ring *tx,
> > > +                                 struct sk_buff *skb)
> > > +{
> > > +       const struct skb_shared_info *shinfo = skb_shinfo(skb);
> > > +       int hlen, payload_nfrags, l4_hdr_offset, seg_idx_bias;
> > > +       union gve_tx_desc *pkt_desc, *seg_desc;
> > > +       struct gve_tx_buffer_state *info;
> > > +       bool is_gso = skb_is_gso(skb);
> > > +       u32 idx = tx->req & tx->mask;
> > > +       struct gve_tx_dma_buf *buf;
> > > +       int last_mapped = 0;
> > > +       u64 addr;
> > > +       u32 len;
> > > +       int i;
> > > +
> > > +       info = &tx->info[idx];
> > > +       pkt_desc = &tx->desc[idx];
> > > +
> > > +       l4_hdr_offset = skb_checksum_start_offset(skb);
> > > +       /* If the skb is gso, then we want only up to the tcp header in the first segment
> > > +        * to efficiently replicate on each segment otherwise we want the linear portion
> > > +        * of the skb (which will contain the checksum because skb->csum_start and
> > > +        * skb->csum_offset are given relative to skb->head) in the first segment.
> > > +        */
> > > +       hlen = is_gso ? l4_hdr_offset + tcp_hdrlen(skb) :
> > > +                       skb_headlen(skb);
> > > +       len = skb_headlen(skb);
> > > +
> > > +       info->skb =  skb;
> > > +
> > > +       addr = dma_map_single(tx->dev, skb->data, len, DMA_TO_DEVICE);
> > > +       if (unlikely(dma_mapping_error(tx->dev, addr))) {
> > > +               rtnl_lock();
> > > +               priv->dma_mapping_error++;
> > > +               rtnl_unlock();
> >
> > Do you really need an rtnl_lock for updating this statistic? That
> > seems like a glaring issue to me.
>
> I thought this would be the way to protect the stat from parallel
> access as was suggested in a comment in v3 of the patchset but I
> understand now that rtnl_lock/unlock ought only to be used for net
> device configurations and not in the data path. Also I now believe
> that since this driver is very rarely not running on a 64-bit
> platform, the stat update is atomic anyway and shouldn't need the locks.

If nothing else it might be good to look at just creating a per-ring
stat for this and then aggregating the value before you report it to
the stack. Then you don't have to worry about multiple threads trying
to update it simultaneously since it will be protected by the Tx queue
lock.