lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHS8izM-ee5C8W2D2x9ChQz667PQEaYFOtgKZcFCMT4HRHL0fQ@mail.gmail.com>
Date: Fri, 23 May 2025 10:55:54 -0700
From: Mina Almasry <almasrymina@...gle.com>
To: Byungchul Park <byungchul@...com>
Cc: willy@...radead.org, netdev@...r.kernel.org, linux-kernel@...r.kernel.org, 
	linux-mm@...ck.org, kernel_team@...ynix.com, kuba@...nel.org, 
	ilias.apalodimas@...aro.org, harry.yoo@...cle.com, hawk@...nel.org, 
	akpm@...ux-foundation.org, davem@...emloft.net, john.fastabend@...il.com, 
	andrew+netdev@...n.ch, asml.silence@...il.com, toke@...hat.com, 
	tariqt@...dia.com, edumazet@...gle.com, pabeni@...hat.com, saeedm@...dia.com, 
	leon@...nel.org, ast@...nel.org, daniel@...earbox.net, david@...hat.com, 
	lorenzo.stoakes@...cle.com, Liam.Howlett@...cle.com, vbabka@...e.cz, 
	rppt@...nel.org, surenb@...gle.com, mhocko@...e.com, horms@...nel.org, 
	linux-rdma@...r.kernel.org, bpf@...r.kernel.org, vishal.moola@...il.com
Subject: Re: [PATCH 18/18] mm, netmem: remove the page pool members in struct page

On Thu, May 22, 2025 at 8:26 PM Byungchul Park <byungchul@...com> wrote:
>
> Now that all the users of the page pool members in struct page have been
> gone, the members can be removed from struct page.
>
> However, since struct netmem_desc might still use the space in struct
> page, the size of struct netmem_desc should be checked, until struct
> netmem_desc has its own instance from slab, to avoid conficting with
> other members within struct page.
>
> Remove the page pool members in struct page and add a static checker for
> the size.
>
> Signed-off-by: Byungchul Park <byungchul@...com>
> ---
>  include/linux/mm_types.h | 11 -----------
>  include/net/netmem.h     | 28 +++++-----------------------
>  2 files changed, 5 insertions(+), 34 deletions(-)
>
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 873e820e1521..5a7864eb9d76 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -119,17 +119,6 @@ struct page {
>                          */
>                         unsigned long private;
>                 };
> -               struct {        /* page_pool used by netstack */
> -                       unsigned long _pp_mapping_pad;
> -                       /**
> -                        * @pp_magic: magic value to avoid recycling non
> -                        * page_pool allocated pages.
> -                        */
> -                       unsigned long pp_magic;
> -                       struct page_pool *pp;
> -                       unsigned long dma_addr;
> -                       atomic_long_t pp_ref_count;
> -               };
>                 struct {        /* Tail pages of compound page */
>                         unsigned long compound_head;    /* Bit zero is set */
>                 };
> diff --git a/include/net/netmem.h b/include/net/netmem.h
> index c63a7e20f5f3..257c22398d7a 100644
> --- a/include/net/netmem.h
> +++ b/include/net/netmem.h
> @@ -77,30 +77,12 @@ struct net_iov_area {
>         unsigned long base_virtual;
>  };
>
> -/* These fields in struct page are used by the page_pool and net stack:
> - *
> - *        struct {
> - *                unsigned long _pp_mapping_pad;
> - *                unsigned long pp_magic;
> - *                struct page_pool *pp;
> - *                unsigned long dma_addr;
> - *                atomic_long_t pp_ref_count;
> - *        };
> - *
> - * We mirror the page_pool fields here so the page_pool can access these fields
> - * without worrying whether the underlying fields belong to a page or net_iov.
> - *
> - * The non-net stack fields of struct page are private to the mm stack and must
> - * never be mirrored to net_iov.
> +/* XXX: The page pool fields in struct page have been removed but they
> + * might still use the space in struct page.  Thus, the size of struct
> + * netmem_desc should be under control until struct netmem_desc has its
> + * own instance from slab.
>   */
> -#define NET_IOV_ASSERT_OFFSET(pg, iov)             \
> -       static_assert(offsetof(struct page, pg) == \
> -                     offsetof(struct net_iov, iov))
> -NET_IOV_ASSERT_OFFSET(pp_magic, pp_magic);
> -NET_IOV_ASSERT_OFFSET(pp, pp);
> -NET_IOV_ASSERT_OFFSET(dma_addr, dma_addr);
> -NET_IOV_ASSERT_OFFSET(pp_ref_count, pp_ref_count);
> -#undef NET_IOV_ASSERT_OFFSET
> +static_assert(sizeof(struct netmem_desc) <= offsetof(struct page, _refcount));
>

Removing these asserts is actually a bit dangerous. Functions like
netmem_or_pp_magic() rely on the fact that the offsets are the same
between struct page and struct net_iov to access these fields without
worrying about the type of the netmem. What we do in these helpers is
we we clear the least significant bit of the netmem, and then  access
the field. This works only because we verified at build time that the
offset is the same.

I think we have 3 options here:

1. Keep the asserts as-is, then in the follow up patch where we remove
netmem_desc from struct page, we update the asserts to make sure
struct page and struct net_iov can grab the netmem_desc in a uniform
way.

2. We remove the asserts, but all the helpers that rely on
__netmem_clear_lsb need to be modified to do custom handling of
net_iov vs page. Something like:

static inline void netmem_or_pp_magic(netmem_ref netmem, unsigned long pp_magic)
{
  if (netmem_is_net_iov(netmem)
     netmem_to_net_iov(netmem)->pp_magic |= pp_magic;
  else
    netmem_to_page(netmem)->pp_magic |= pp_magic;
}

Option #2 requires extra checks, which may affect the performance
reported by page_pool_bench_simple that I pointed you to before.

3. We could swap out all the individual asserts for one assert, if
both page and net_iov have a netmem_desc subfield. This will also need
to be reworked when netmem_desc is eventually moved out of struct page
and is slab allocated:

NET_IOV_ASSERT_OFFSET(netmem_desc, netmem_desc);

-- 
Thanks,
Mina

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ