[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CANn89i+GoVZLcdHxuf33HpmgyPNKxGqEjXGpi=XiB-QOsAG52A@mail.gmail.com>
Date: Mon, 22 Sep 2025 14:12:57 -0700
From: Eric Dumazet <edumazet@...gle.com>
To: Daniil Tatianin <d-tatianin@...dex-team.ru>
Cc: Pablo Neira Ayuso <pablo@...filter.org>, Jozsef Kadlecsik <kadlec@...filter.org>,
Florian Westphal <fw@...len.de>, Phil Sutter <phil@....cc>, "David S. Miller" <davem@...emloft.net>,
David Ahern <dsahern@...nel.org>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>, netfilter-devel@...r.kernel.org, coreteam@...filter.org,
linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [PATCH 1/3] netfilter/x_tables: go back to using vmalloc for xt_table_info
On Mon, Sep 22, 2025 at 12:48 PM Daniil Tatianin
<d-tatianin@...dex-team.ru> wrote:
>
> This code previously always used vmalloc for anything above
> PAGE_ALLOC_COSTLY_ORDER, but this logic was changed in
> commit eacd86ca3b036 ("net/netfilter/x_tables.c: use kvmalloc() in xt_alloc_table_info()").
>
> The commit that changed it did so because "xt_alloc_table_info()
> basically opencodes kvmalloc()", which is not actually what it was
> doing. kvmalloc() does not attempt to go directly to vmalloc if the
> order the caller is trying to allocate is "expensive", instead it only
> uses vmalloc as a fallback in case the buddy allocator is not able to
> fullfill the request.
>
> The difference between the two is actually huge in case the system is
> under memory pressure and has no free pages of a large order. Before the
> change to kvmalloc we wouldn't even try going to the buddy allocator for
> large orders, but now we would force it to try to find a page of the
> required order by waking up kswapd/kcompactd and dropping reclaimable memory
> for no reason at all to satisfy our huge order allocation that could easily
> exist within vmalloc'ed memory instead.
This would hint at an issue with kvmalloc(), why not fixing it, instead
of trying to fix all its users ?
There was a time where PAGE_ALLOC_COSTLY_ORDER was used.
>
> Revert the change to always call vmalloc, since this code doesn't really
> benefit from contiguous physical memory, and the size it allocates is
> directly dictated by the userspace-passed table buffer thus allowing it to
> torture the buddy allocator by carefully crafting a huge table that fits
> right at the maximum available memory order on the system.
>
> Signed-off-by: Daniil Tatianin <d-tatianin@...dex-team.ru>
> ---
> net/netfilter/x_tables.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
> index 90b7630421c4..c98f4b05d79d 100644
> --- a/net/netfilter/x_tables.c
> +++ b/net/netfilter/x_tables.c
> @@ -1190,7 +1190,7 @@ struct xt_table_info *xt_alloc_table_info(unsigned int size)
> if (sz < sizeof(*info) || sz >= XT_MAX_TABLE_SIZE)
> return NULL;
>
> - info = kvmalloc(sz, GFP_KERNEL_ACCOUNT);
> + info = __vmalloc(sz, GFP_KERNEL_ACCOUNT);
> if (!info)
> return NULL;
>
> @@ -1210,7 +1210,7 @@ void xt_free_table_info(struct xt_table_info *info)
> kvfree(info->jumpstack);
> }
>
> - kvfree(info);
> + vfree(info);
> }
> EXPORT_SYMBOL(xt_free_table_info);
>
> --
> 2.34.1
>
Powered by blists - more mailing lists