Message-ID: <d57971bf4ff780782e68ccb1d9fd0c5bb1577ea9.camel@redhat.com>
Date:   Tue, 02 Aug 2022 12:54:15 +0200
From:   Paolo Abeni <pabeni@...hat.com>
To:     Maxim Mikityanskiy <maximmi@...dia.com>,
        "maciej.fijalkowski@...el.com" <maciej.fijalkowski@...el.com>
Cc:     "magnus.karlsson@...el.com" <magnus.karlsson@...el.com>,
        "davem@...emloft.net" <davem@...emloft.net>,
        Tariq Toukan <tariqt@...dia.com>,
        Gal Pressman <gal@...dia.com>,
        "john.fastabend@...il.com" <john.fastabend@...il.com>,
        "bjorn@...nel.org" <bjorn@...nel.org>,
        "daniel@...earbox.net" <daniel@...earbox.net>,
        "jonathan.lemon@...il.com" <jonathan.lemon@...il.com>,
        "kuba@...nel.org" <kuba@...nel.org>,
        "edumazet@...gle.com" <edumazet@...gle.com>,
        Saeed Mahameed <saeedm@...dia.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "hawk@...nel.org" <hawk@...nel.org>,
        "ast@...nel.org" <ast@...nel.org>
Subject: Re: [PATCH net] net/mlx5e: xsk: Discard unaligned XSK frames on
 striding RQ

On Mon, 2022-08-01 at 15:49 +0000, Maxim Mikityanskiy wrote:
> First of all, this patch is a temporary kludge. I found a bug in the
> current implementation of the unaligned mode: frames not aligned to
> at least 8 bytes are misplaced. There is a proper fix in the driver,
> but it will be pushed to net-next, because it's huge. In the
> meantime, this workaround, which drops packets not aligned to 8
> bytes, will go to stable kernels.
> 
> On Mon, 2022-08-01 at 15:41 +0200, Maciej Fijalkowski wrote:
> > On Fri, Jul 29, 2022 at 03:13:56PM +0300, Maxim Mikityanskiy wrote:
> > > Striding RQ uses MTT page mapping, where each page corresponds to an XSK
> > > frame. MTT pages have alignment requirements, and XSK frames don't have
> > > any alignment guarantees in the unaligned mode. Frames with improper
> > > alignment must be discarded, otherwise the packet data will be written
> > > at a wrong address.
> > 
> > Hey Maxim,
> > can you explain what MTT stands for?
> 
> MTT stands for Memory Translation Table; it's a mechanism for virtual
> memory mapping in the NIC. It's essentially a table of pages, where
> each virtual page maps to a physical page.
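
As a rough illustration (a conceptual sketch in plain C only; struct
mtt_sketch and mtt_sketch_resolve are made-up names, not the actual
mlx5 firmware or driver layout), an MTT-style lookup amounts to
indexing a flat table of per-page DMA addresses:

#include <stdint.h>
#include <stddef.h>

/* Conceptual sketch only: a flat table of per-page DMA addresses that
 * the device indexes to resolve an offset in the mapped region to a
 * physical address.  Bounds checking is omitted for brevity.
 */
struct mtt_sketch {
	uint64_t *page_addr;	/* page_addr[i] = DMA address of page i */
	size_t num_pages;
	unsigned int page_shift; /* log2 of the mapped page/frame size */
};

static inline uint64_t mtt_sketch_resolve(const struct mtt_sketch *mtt,
					   uint64_t offset)
{
	size_t idx = offset >> mtt->page_shift;
	uint64_t in_page = offset & ((1ULL << mtt->page_shift) - 1);

	return mtt->page_addr[idx] + in_page;
}

Presumably the low bits of each real table entry are reserved by the
hardware, which would be consistent with the MLX5E_MTT_PTAG_MASK value
in the patch and with the 8-byte alignment requirement on frame
addresses.
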
> 
> > 
> > > 
> > > Fixes: 282c0c798f8e ("net/mlx5e: Allow XSK frames smaller than a page")
> > > Signed-off-by: Maxim Mikityanskiy <maximmi@...dia.com>
> > > Reviewed-by: Tariq Toukan <tariqt@...dia.com>
> > > Reviewed-by: Saeed Mahameed <saeedm@...dia.com>
> > > ---
> > >  .../net/ethernet/mellanox/mlx5/core/en/xsk/rx.h    | 14 ++++++++++++++
> > >  include/net/xdp_sock_drv.h                         | 11 +++++++++++
> > >  2 files changed, 25 insertions(+)
> > > 
> > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h
> > > index a8cfab4a393c..cc18d97d8ee0 100644
> > > --- a/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h
> > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xsk/rx.h
> > > @@ -7,6 +7,8 @@
> > >  #include "en.h"
> > >  #include <net/xdp_sock_drv.h>
> > >  
> > > +#define MLX5E_MTT_PTAG_MASK 0xfffffffffffffff8ULL
> > 
> > What if PAGE_SIZE != 4096 ? Is aligned mode with 2k frame fine for MTT
> > case?
> 
> PAGE_SIZE doesn't affect this value. Aligned mode doesn't suffer from
> this bug: frame addresses there are multiples of the frame size, and
> with frames of 2k or bigger they are always aligned to at least 8.
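
Put differently, the check added by the patch reduces to an 8-byte
alignment test on the frame's DMA address. A small standalone
illustration (the addresses below are made up):

#include <stdio.h>
#include <stdint.h>

#define MLX5E_MTT_PTAG_MASK 0xfffffffffffffff8ULL

int main(void)
{
	/* Made-up example frame addresses. */
	uint64_t addr[] = { 0x1000, 0x1800, 0x1803, 0x2ab1 };

	for (size_t i = 0; i < sizeof(addr) / sizeof(addr[0]); i++) {
		/* Any of the low 3 bits set => not 8-byte aligned =>
		 * the workaround discards the frame.
		 */
		int bad = (addr[i] & ~MLX5E_MTT_PTAG_MASK) != 0;

		printf("0x%llx: %s\n", (unsigned long long)addr[i],
		       bad ? "discard" : "ok");
	}
	return 0;
}

Addresses that are multiples of 2048 always have the low 3 bits clear,
which is why aligned mode with 2k frames is unaffected.
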
> 
> > 
> > > +
> > >  /* RX data path */
> > >  
> > >  struct sk_buff *mlx5e_xsk_skb_from_cqe_mpwrq_linear(struct mlx5e_rq *rq,
> > > @@ -21,6 +23,7 @@ struct sk_buff *mlx5e_xsk_skb_from_cqe_linear(struct mlx5e_rq *rq,
> > >  static inline int mlx5e_xsk_page_alloc_pool(struct mlx5e_rq *rq,
> > >  					    struct mlx5e_dma_info *dma_info)
> > >  {
> > > +retry:
> > >  	dma_info->xsk = xsk_buff_alloc(rq->xsk_pool);
> > >  	if (!dma_info->xsk)
> > >  		return -ENOMEM;
> > > @@ -32,6 +35,17 @@ static inline int mlx5e_xsk_page_alloc_pool(struct mlx5e_rq *rq,
> > >  	 */
> > >  	dma_info->addr = xsk_buff_xdp_get_frame_dma(dma_info->xsk);
> > >  
> > > +	/* MTT page mapping has alignment requirements. If they are not
> > > +	 * satisfied, leak the descriptor so that it won't come again, and try
> > > +	 * to allocate a new one.
> > > +	 */
> > > +	if (rq->wq_type == MLX5_WQ_TYPE_LINKED_LIST_STRIDING_RQ) {
> > > +		if (unlikely(dma_info->addr & ~MLX5E_MTT_PTAG_MASK)) {
> > > +			xsk_buff_discard(dma_info->xsk);
> > > +			goto retry;
> > > +		}
> > > +	}
> > 
> > I don't know your hardware much, but how would this work out performance
> > wise? Are there any config combos (page size vs chunk size in unaligned
> > mode) that you would forbid during pool attach to queue or would you
> > better allow anything?
> 
> This issue isn't related to page or frame sizes, but rather to frame
> locations. As far as I understand, frames can be located at any place
> in unaligned mode (even at odd addresses), regardless of their size.
> Frames whose addr % 8 != 0 don't really work with MTT, but that's not
> something that can be enforced at attach time. Enforcing it in
> xp_alloc wouldn't be any faster either (well, only a tiny bit, thanks
> to one fewer function call).
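
For context, a sketch of how such addresses reach the driver in the
first place (assuming the libxdp xsk ring helpers; umem and socket
setup omitted, and fill_one_unaligned is just an illustrative name):
in unaligned mode the application writes raw UMEM addresses into the
fill ring, and nothing prevents it from submitting an odd one.

#include <stdint.h>
#include <xdp/xsk.h>	/* libxdp helpers; older setups used <bpf/xsk.h> */

static int fill_one_unaligned(struct xsk_ring_prod *fill, uint64_t addr)
{
	uint32_t idx;

	if (xsk_ring_prod__reserve(fill, 1, &idx) != 1)
		return -1;	/* fill ring full */

	/* In unaligned mode this address may point anywhere in the
	 * umem, including odd offsets; the driver only learns it when
	 * the descriptor is consumed, hence the per-frame check in the
	 * driver's allocation path.
	 */
	*xsk_ring_prod__fill_addr(fill, idx) = addr;
	xsk_ring_prod__submit(fill, 1);
	return 0;
}
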
> 
> In any case, future kernels will get another page mapping mechanism
> that supports arbitrary addresses and is almost as fast as MTT, as
> preliminary testing shows. It will be used for unaligned XSK, this
> kludge will be removed altogether, and I also plan to remove
> xsk_buff_discard.
> 
> > Also would be helpful if you would describe the use case you're fixing.
> 
> Sure - it's described at the beginning of this email.

@Maciej: are you satisfied with Maxim's answers?

/P
