[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID:
<MN6PR16MB545089E086BF99087773B76EB7E8A@MN6PR16MB5450.namprd16.prod.outlook.com>
Date: Wed, 15 Oct 2025 20:32:55 +0800
From: Mingrui Cui <mingruic@...look.com>
To: jacob.e.keller@...el.com
Cc: andrew+netdev@...n.ch,
davem@...emloft.net,
dtatulea@...dia.com,
edumazet@...gle.com,
kuba@...nel.org,
leon@...nel.org,
linux-kernel@...r.kernel.org,
linux-rdma@...r.kernel.org,
mbloch@...dia.com,
mingruic@...look.com,
netdev@...r.kernel.org,
pabeni@...hat.com,
saeedm@...dia.com,
tariqt@...dia.com
Subject: Re: [PATCH net v2] net/mlx5e: Make DEFAULT_FRAG_SIZE relative to page size
That's all the effects of changing DEFAULT_FRAG_SIZE. DEFAULT_FRAG_SIZE is only
used as the initial value for frag_size_max. It is primarily used to calculate
frag_size and frag_stride in arr of mlx5e_rq_frags_info, representing the actual
data size and the size used for page allocation, respectively. In
mlx5e_init_frags_partition(), an mlx5e_wqe_frag_info is allocated for each
fragment according to its frag_stride, which determines the layout of fragments
on a page.
> > static int mlx5e_build_rq_frags_info(struct mlx5_core_dev *mdev,
> > struct mlx5e_params *params,
> > @@ -756,18 +756,13 @@ static int mlx5e_build_rq_frags_info(struct mlx5_=
> core_dev *mdev,
> > /* No WQE can start in the middle of a page. */
> > info->wqe_index_mask =3D 0;
> > } else {
> > - /* PAGE_SIZEs starting from 8192 don't use 2K-sized fragments,
> > - * because there would be more than MLX5E_MAX_RX_FRAGS of them.
> > - */
> > - WARN_ON(PAGE_SIZE !=3D 2 * DEFAULT_FRAG_SIZE);
> > -
>
> So previously we would warn, but now we just fix the DEFAULT_FRAG_SIZE
> so this warning is redundant.. Ok.
>
> > /* Odd number of fragments allows to pack the last fragment of
> > * the previous WQE and the first fragment of the next WQE into
> > * the same page.
> > - * As long as DEFAULT_FRAG_SIZE is 2048, and MLX5E_MAX_RX_FRAGS
> > - * is 4, the last fragment can be bigger than the rest only if
> > - * it's the fourth one, so WQEs consisting of 3 fragments will
> > - * always share a page.
> > + * As long as DEFAULT_FRAG_SIZE is (PAGE_SIZE / 2), and
> > + * MLX5E_MAX_RX_FRAGS is 4, the last fragment can be bigger than
> > + * the rest only if it's the fourth one, so WQEs consisting of 3
> > + * fragments will always share a page.
> > * When a page is shared, WQE bulk size is 2, otherwise just 1.
> > */
> > info->wqe_index_mask =3D info->num_frags % 2;
>
> Would it be possible to fix the other logic so that it works for a
> DEFAULT_FRAG_SIZE of 2k on 8K pages? I guess if there's no negative to
> increasing the frag size then this fix makes sense since it is simple.
To maintain 2K DEFAULT_FRAG_SIZE on 8K pages, one of two alternatives would be
necessary: either find a method to calculate the occurrence period of
page-aligned WQEs for the current MTU to replace wqe_index_mask, or use a more
complex logic to manage fragment allocation and release on shared pages to avoid
conflicts. This would make the page allocation logic for 8K pages significantly
different from the 4K page case. Therefore, I believe directly modifying
DEFAULT_FRAG_SIZE is a cleaner solution.
Please note that frag_size_max is not fixed at 2048. If the current MTU exceeds
the maximum size that 2K fragments can store, frag_size_max will be set to
PAGE_SIZE. Therefore, changing DEFAULT_FRAG_SIZE to PAGE_SIZE/2 should
theoretically be safe. The only downside is a potential for slightly more wasted
space when filling a page.
> I guess the noted fixed commit limits to 4 fragments, but it make some
> assumptions that were wrong for 8K pages?
That's right. Specifically, on 4KB pages, once a small fragment and a 2K
fragment are placed, there is no room for another 2K fragment. This results in
the predictable page-sharing pattern for 3-fragment WQEs that breaks on larger
page sizes.
Powered by blists - more mailing lists