Date:   Mon, 1 Aug 2022 15:49:26 +0000
From:   Maxim Mikityanskiy <maximmi@...dia.com>
To:     "maciej.fijalkowski@...el.com" <maciej.fijalkowski@...el.com>
CC:     "magnus.karlsson@...el.com" <magnus.karlsson@...el.com>,
        "davem@...emloft.net" <davem@...emloft.net>,
        Tariq Toukan <tariqt@...dia.com>,
        Gal Pressman <gal@...dia.com>,
        "john.fastabend@...il.com" <john.fastabend@...il.com>,
        "bjorn@...nel.org" <bjorn@...nel.org>,
        "daniel@...earbox.net" <daniel@...earbox.net>,
        "jonathan.lemon@...il.com" <jonathan.lemon@...il.com>,
        "pabeni@...hat.com" <pabeni@...hat.com>,
        "kuba@...nel.org" <kuba@...nel.org>,
        "edumazet@...gle.com" <edumazet@...gle.com>,
        Saeed Mahameed <saeedm@...dia.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "hawk@...nel.org" <hawk@...nel.org>,
        "ast@...nel.org" <ast@...nel.org>
Subject: Re: [PATCH net] net/mlx5e: xsk: Discard unaligned XSK frames on
 striding RQ

First of all, this patch is a temporary kludge. I found a bug in the
current implementation of the unaligned mode: frames not aligned to at
least 8 bytes are misplaced. There is a proper fix in the driver, but it
will be pushed to net-next because it's huge. In the meantime, this
workaround, which discards frames not aligned to 8 bytes, will go to
stable kernels.

On Mon, 2022-08-01 at 15:41 +0200, Maciej Fijalkowski wrote:
> On Fri, Jul 29, 2022 at 03:13:56PM +0300, Maxim Mikityanskiy wrote:
> > Striding RQ uses MTT page mapping, where each page corresponds to an XSK
> > frame. MTT pages have alignment requirements, and XSK frames don't have
> > any alignment guarantees in the unaligned mode. Frames with improper
> > alignment must be discarded, otherwise the packet data will be written
> > at a wrong address.
> 
> Hey Maxim,
> can you explain what MTT stands for?

MTT stands for Memory Translation Table. It's a mechanism for virtual
mapping in the NIC: essentially a table of pages, where each virtual
page maps to a physical page.
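
To make the "table of pages" idea concrete, here is a minimal sketch of
a page translation table in plain C. It is not the mlx5 data structure:
struct toy_mtt, toy_mtt_translate() and the 4 KiB page size are all
hypothetical names and values picked for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12u	/* assumed 4 KiB pages, illustration only */
#define TOY_PAGE_SIZE  ((uint64_t)1 << TOY_PAGE_SHIFT)

/* Hypothetical table: entry i holds the physical base of virtual page i. */
struct toy_mtt {
	const uint64_t *phys_base;
	size_t n_pages;
};

/* Translate a virtual offset to a physical address.
 * Returns 0 on success, -1 if the virtual page is not in the table.
 */
static int toy_mtt_translate(const struct toy_mtt *t, uint64_t virt,
			     uint64_t *phys)
{
	uint64_t idx = virt >> TOY_PAGE_SHIFT;

	assert(t && phys);
	if (idx >= t->n_pages)
		return -1;

	/* Page base comes from the table, the in-page offset is kept. */
	*phys = t->phys_base[idx] + (virt & (TOY_PAGE_SIZE - 1));
	return 0;
}
```

With a two-entry table, a virtual offset of 0x1010 selects entry 1 and
keeps the in-page offset 0x10.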

> 
> > 
> > Fixes: 282c0c798f8e ("net/mlx5e: Allow XSK frames smaller than a page")
> > Signed-off-by: Maxim Mikityanskiy <maximmi@...dia.com>
> > Reviewed-by: Tariq Toukan <tariqt@...dia.com>
> > Reviewed-by: Saeed Mahameed <saeedm@...dia.com>
> > ---
> >  .../net/ethernet/mellanox/mlx5/core/en/xsk/rx.h    | 14 ++++++++++++++
> >  include/net/xdp_sock_drv.h                         | 11 +++++++++++
> >  2 files changed, 25 insertions(+)
> > 
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h
> > index a8cfab4a393c..cc18d97d8ee0 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h
> > @@ -7,6 +7,8 @@
> >  #include "en.h"
> >  #include <net/xdp_sock_drv.h>
> >  
> > +#define MLX5E_MTT_PTAG_MASK 0xfffffffffffffff8ULL
> 
> What if PAGE_SIZE != 4096 ? Is aligned mode with 2k frame fine for MTT
> case?

PAGE_SIZE doesn't affect this value. Aligned mode doesn't suffer from
this bug, because frames of 2k or bigger are all aligned to 8 bytes.
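
As a standalone illustration of the check, the sketch below applies the
mask from the patch to a few addresses. The mask value is taken from the
patch. The helper names mtt_addr_ok() and mtt_low_bits() are made up
here, and how the hardware treats the low bits is not shown, only which
bits fall outside the mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MLX5E_MTT_PTAG_MASK 0xfffffffffffffff8ULL	/* from the patch */

/* True if no address bits fall outside the mask, i.e. addr % 8 == 0. */
static bool mtt_addr_ok(uint64_t addr)
{
	return (addr & ~MLX5E_MTT_PTAG_MASK) == 0;
}

/* Hypothetical helper: the part of the address the mask cannot hold. */
static uint64_t mtt_low_bits(uint64_t addr)
{
	uint64_t off = addr & ~MLX5E_MTT_PTAG_MASK;

	assert(off < 8);	/* the mask clears exactly the low three bits */
	return off;
}
```

Any multiple of 2048 passes mtt_addr_ok(), which is why aligned mode is
not affected, while an address such as 0x1005 fails with low bits of 5.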

> 
> > +
> >  /* RX data path */
> >  
> >  struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq,
> > @@ -21,6 +23,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq,
> >  static inline int mlx5e_xsk_page_alloc_pool(struct mlx5e_rq *rq,
> >  					    struct mlx5e_dma_info *dma_info)
> >  {
> > +retry:
> >  	dma_info->xsk = xsk_buff_alloc(rq->xsk_pool);
> >  	if (!dma_info->xsk)
> >  		return -ENOMEM;
> > @@ -32,6 +35,17 @@ static inline int mlx5e_xsk_page_alloc_pool(struct mlx5e_rq *rq,
> >  	 */
> >  	dma_info->addr = xsk_buff_xdp_get_frame_dma(dma_info->xsk);
> >  
> > +	/* MTT page mapping has alignment requirements. If they are not
> > +	 * satisfied, leak the descriptor so that it won't come again, and try
> > +	 * to allocate a new one.
> > +	 */
> > +	if (rq->wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ) {
> > +		if (unlikely(dma_info->addr & ~MLX5E_MTT_PTAG_MASK)) {
> > +			xsk_buff_discard(dma_info->xsk);
> > +			goto retry;
> > +		}
> > +	}
> 
> I don't know your hardware much, but how would this work out performance
> wise? Are there any config combos (page size vs chunk size in unaligned
> mode) that you would forbid during pool attach to queue or would you
> better allow anything?

This issue isn't related to page or frame sizes, but rather to frame
locations. As far as I understand, frames can be located anywhere in
the unaligned mode (even at odd addresses), regardless of their size.
Frames whose addr % 8 != 0 don't really work with MTT, but it's not
something that can be enforced on attach. Enforcing it in xp_alloc
won't be any faster either (well, only a tiny bit, because of one fewer
function call).
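
For reference, the discard-and-retry flow can be modelled outside the
kernel roughly like this. It is only a sketch: struct toy_pool,
toy_pool_alloc() and toy_alloc_aligned() are hypothetical stand-ins for
the XSK pool and the driver allocation path, and the array plays the
role of the addresses userspace put on the fill ring.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MLX5E_MTT_PTAG_MASK 0xfffffffffffffff8ULL	/* from the patch */

/* Hypothetical stand-in for the pool: a queue of frame addresses. */
struct toy_pool {
	const uint64_t *addrs;
	size_t len;
	size_t next;
	size_t discarded;
};

/* Pop the next frame address; -ENOMEM when the queue is empty. */
static int toy_pool_alloc(struct toy_pool *p, uint64_t *addr)
{
	if (p->next >= p->len)
		return -ENOMEM;
	*addr = p->addrs[p->next++];
	return 0;
}

/* Same shape as the patched path: skip misaligned frames and retry. */
static int toy_alloc_aligned(struct toy_pool *p, uint64_t *addr)
{
	assert(p && addr);
	for (;;) {
		int err = toy_pool_alloc(p, addr);

		if (err)
			return err;
		if ((*addr & ~MLX5E_MTT_PTAG_MASK) == 0)
			return 0;
		/* Dropped for good: the frame is not put back on the queue. */
		p->discarded++;
	}
}
```

The per-frame cost is one mask test, plus one extra allocation attempt
for every frame that gets discarded.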

In any case, upcoming kernels will get another page mapping mechanism
that supports arbitrary addresses and, as preliminary testing shows, is
almost as fast as MTT. It will be used for unaligned XSK, this kludge
will be removed altogether, and I also plan to remove
xsk_buff_discard.

> Also would be helpful if you would describe the use case you're fixing.

Sure, it's described at the beginning of this email.

> 
> Thanks!
> 
> > +
> >  	return 0;
> >  }
> >  
> > diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h
> > index 4aa031849668..0774ce97c2f1 100644
> > --- a/include/net/xdp_sock_drv.h
> > +++ b/include/net/xdp_sock_drv.h
> > @@ -95,6 +95,13 @@ static inline void xsk_buff_free(struct xdp_buff *xdp)
> >  	xp_free(xskb);
> >  }
> >  
> > +static inline void xsk_buff_discard(struct xdp_buff *xdp)
> > +{
> > +	struct xdp_buff_xsk *xskb = container_of(xdp, struct xdp_buff_xsk, xdp);
> > +
> > +	xp_release(xskb);
> > +}
> > +
> >  static inline void xsk_buff_set_size(struct xdp_buff *xdp, u32 size)
> >  {
> >  	xdp->data = xdp->data_hard_start + XDP_PACKET_HEADROOM;
> > @@ -238,6 +245,10 @@ static inline void xsk_buff_free(struct xdp_buff *xdp)
> >  {
> >  }
> >  
> > +static inline void xsk_buff_discard(struct xdp_buff *xdp)
> > +{
> > +}
> > +
> >  static inline void xsk_buff_set_size(struct xdp_buff *xdp, u32 size)
> >  {
> >  }
> > -- 
> > 2.25.1
> > 
