Message-ID: <CA+sq2CdX31cqsSc=qRhbcZ5fOk2zGOrhTMGqhsPddbhW=YQPCQ@mail.gmail.com>
Date:   Thu, 23 Jan 2020 01:20:51 +0530
From:   Sunil Kovvuri <sunil.kovvuri@...il.com>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     Linux Netdev List <netdev@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Michal Kubecek <mkubecek@...e.cz>,
        Sunil Goutham <sgoutham@...vell.com>,
        Geetha sowjanya <gakula@...vell.com>
Subject: Re: [PATCH v4 07/17] octeontx2-pf: Add packet transmission support

On Tue, Jan 21, 2020 at 10:24 PM Jakub Kicinski <kuba@...nel.org> wrote:
>
> On Tue, 21 Jan 2020 18:51:41 +0530, sunil.kovvuri@...il.com wrote:
> > From: Sunil Goutham <sgoutham@...vell.com>
> >
> > This patch adds packet transmission support.
> > For a given skb it prepares send queue descriptors (SQEs) and pushes
> > them to HW. The driver doesn't maintain its own SQ rings; SQEs are
> > pushed to HW using silicon-specific store operations called LMTST.
> > From the instruction HW derives the transmit queue number and queues
> > the SQE to that queue. These LMTST instructions are designed to avoid
> > queue maintenance in SW and to provide lockless behavior, i.e. when
> > multiple cores add SQEs to the same queue, HW takes care of
> > serialization and SW doesn't need to hold locks.
> >
> > Also supports scatter/gather.
> >
> > Co-developed-by: Geetha sowjanya <gakula@...vell.com>
> > Signed-off-by: Geetha sowjanya <gakula@...vell.com>
> > Signed-off-by: Sunil Goutham <sgoutham@...vell.com>
>
> > +static netdev_tx_t otx2_xmit(struct sk_buff *skb, struct net_device *netdev)
> > +
>
> Spurious new line
>
> > +{
> > +     struct otx2_nic *pf = netdev_priv(netdev);
> > +     int qidx = skb_get_queue_mapping(skb);
> > +     struct otx2_snd_queue *sq;
> > +     struct netdev_queue *txq;
> > +
> > +     /* Check for minimum and maximum packet length */
>
> You only check for min

Hmm.. will fix the comment.
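
Presumably, since only the minimum is enforced here, the fix is just to
narrow the comment, something like:

	/* Check for minimum packet length */
	if (skb->len <= ETH_HLEN) {
		dev_kfree_skb(skb);
		return NETDEV_TX_OK;
	}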

>
> > +     if (skb->len <= ETH_HLEN) {
> > +             dev_kfree_skb(skb);
> > +             return NETDEV_TX_OK;
> > +     }
> > +
> > +     sq = &pf->qset.sq[qidx];
> > +     txq = netdev_get_tx_queue(netdev, qidx);
> > +
> > +     if (netif_tx_queue_stopped(txq)) {
> > +             dev_kfree_skb(skb);
>
> This should never happen.
>
> > +     } else if (!otx2_sq_append_skb(netdev, sq, skb, qidx)) {
> > +             netif_tx_stop_queue(txq);
> > +
> > +     /* Check again, in case SQBs got freed up */
> > +             smp_mb();
> > +             if (((sq->num_sqbs - *sq->aura_fc_addr) * sq->sqe_per_sqb)
> > +                                                     > sq->sqe_thresh)
> > +                     netif_tx_wake_queue(txq);
> > +
> > +             return NETDEV_TX_BUSY;
> > +     }
> > +
> > +     return NETDEV_TX_OK;
> > +}
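
For context, the smp_mb() in the branch above is one half of the usual
stop/wake handshake; the TX completion path would carry the matching
half, roughly like this (a sketch with a hypothetical helper, not the
patch's actual completion code):

	/* Completion side (sketch): release SQBs first, then re-check
	 * the queue state. The barrier pairs with smp_mb() in otx2_xmit().
	 */
	otx2_free_tx_sqbs(sq);		/* hypothetical helper */
	smp_mb();
	if (netif_tx_queue_stopped(txq) &&
	    ((sq->num_sqbs - *sq->aura_fc_addr) * sq->sqe_per_sqb)
						> sq->sqe_thresh)
		netif_tx_wake_queue(txq);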
>
> > +/* NIX send memory subdescriptor structure */
> > +struct nix_sqe_mem_s {
> > +#if defined(__BIG_ENDIAN_BITFIELD)  /* W0 */
> > +     u64 subdc         : 4;
> > +     u64 alg           : 4;
> > +     u64 dsz           : 2;
> > +     u64 wmem          : 1;
> > +     u64 rsvd_52_16    : 37;
> > +     u64 offset        : 16;
> > +#else
> > +     u64 offset        : 16;
> > +     u64 rsvd_52_16    : 37;
> > +     u64 wmem          : 1;
> > +     u64 dsz           : 2;
> > +     u64 alg           : 4;
> > +     u64 subdc         : 4;
> > +#endif
>
> Traditionally we prefer to extract the bitfields with masks and shifts
> manually in the kernel, rather than having those (subjectively) ugly
> and finicky bitfield structs. But I guess if nobody else complains this
> can stay :/
>
> > +     u64 addr;
>
> Why do you care about big endian bitfields tho, if you don't care about
> endianness of the data itself?

We are not addressing big-endian functionality at this point, so a few
things might be broken in that respect. If it's preferred that I remove
it, I will.
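
For what it's worth, the masks-and-shifts style mentioned above would
look roughly like this for W0 of nix_sqe_mem_s, using the helpers from
<linux/bitfield.h> (a sketch; these macro names are invented here, not
taken from the patch):

#define NIX_SQE_MEM_OFFSET	GENMASK_ULL(15, 0)
#define NIX_SQE_MEM_WMEM	BIT_ULL(53)
#define NIX_SQE_MEM_DSZ		GENMASK_ULL(55, 54)
#define NIX_SQE_MEM_ALG		GENMASK_ULL(59, 56)
#define NIX_SQE_MEM_SUBDC	GENMASK_ULL(63, 60)

	/* Build W0 without a bitfield struct */
	u64 w0 = FIELD_PREP(NIX_SQE_MEM_SUBDC, subdc) |
		 FIELD_PREP(NIX_SQE_MEM_ALG, alg) |
		 FIELD_PREP(NIX_SQE_MEM_DSZ, dsz) |
		 FIELD_PREP(NIX_SQE_MEM_WMEM, wmem) |
		 FIELD_PREP(NIX_SQE_MEM_OFFSET, offset);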

>
> > +};
> > +
> >  #endif /* OTX2_STRUCT_H */
> > diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c
> > index e6be18d..f416603 100644
> > --- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c
> > +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_txrx.c
> > @@ -32,6 +32,78 @@ static struct nix_cqe_hdr_s *otx2_get_next_cqe(struct otx2_cq_queue *cq)
> >       return cqe_hdr;
> >  }
> >
> > +static void otx2_sqe_flush(struct otx2_snd_queue *sq, int size)
> > +{
> > +     u64 status;
> > +
> > +     /* Packet data stores should finish before SQE is flushed to HW */
>
> Packet data is synced by the dma operations the barrier shouldn't be
> needed AFAIK (and if it would be, dma_wmb() would not be the one, as it
> only works for iomem AFAIU).
>
> > +     dma_wmb();


Due to out-of-order execution by the CPU, the HW folks have suggested
adding a barrier to avoid scenarios where a packet is transmitted before
all stores from the CPU are committed. On arm64 a dmb() is less costly
than a dsb() barrier, and per the HW folks a dmb(st) is sufficient to
ensure all stores from the CPU are committed. dma_wmb() uses dmb(st),
hence it is used. It's more a choice of architecture-specific
instruction than of the API.
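
In other words, the intended ordering is: fill the SQE (and packet
data) with normal stores, execute dma_wmb() (dmb(st) on arm64), and
only then issue the LMTST that hands the SQE to HW, roughly (a sketch;
the LMTST helper name is made up here):

	/* stores building the SQE / packet data come first ... */
	memcpy(sq->sqe_base, sqe, size);
	/* ... dmb(st) commits them before HW can see the SQE ... */
	dma_wmb();
	/* ... then the LMTST store triggers HW to fetch and queue it. */
	otx2_lmt_flush(sq->io_addr);	/* hypothetical LMTST operation */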

Thanks,
Sunil.
