netdev - Re: [RFC PATCH net] bpf: Fix unregister memory model in BPF_PROG

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <867eb8dc-0c4c-4d12-7c71-5fbd44fc348d@engleder-embedded.com>
Date:   Sun, 19 Mar 2023 20:42:58 +0100
From:   Gerhard Engleder <gerhard@...leder-embedded.com>
To:     netdev@...r.kernel.org, bpf@...r.kernel.org, ast@...nel.org,
        daniel@...earbox.net, john.fastabend@...il.com, hawk@...nel.org
Cc:     davem@...emloft.net, kuba@...nel.org, edumazet@...gle.com,
        pabeni@...hat.com, toke@...hat.com, song@...nel.org
Subject: Re: [RFC PATCH net] bpf: Fix unregister memory model in BPF_PROG_RUN
 in test runner

On 11.03.23 22:37, Gerhard Engleder wrote:
> After executing xdp-trafficgen the following kernel output appeared:
> [  214.564375] page_pool_release_retry() stalled pool shutdown 64 inflight 60 sec
> 
> xdp_test_run_batch() in combination with xdp-trafficgen uses a batch
> size of 64. So it seems that a single batch does find its way back to
> the page pool. I checked my tsnep driver, but the page pool entries were
> not lost in the driver according to my analysis.
> 
> Executing xdp-trafficgen with n=1000 resulted in this output:
> [  251.652376] page_pool_release_retry() stalled pool shutdown 40 inflight 60 sec
> 
> Executing xdp-trafficgen with n=10000 resulted in this output:
> [  267.332374] page_pool_release_retry() stalled pool shutdown 16 inflight 60 sec
> 
> So interestingly in both cases the last batch with a size lower than 64
> does not find its way back to the page pool. So what's the problem with
> the last batch?
> 
> After xdp_test_run_batch() clean up is done in xdp_test_run_teardown()
> no matter if page pool entries are still in flight. No problem for
> page_pool_destroy() as the page pool is released later when no entries
> are in flight.
> 
> With commit 425d239379db0 unregistering memory model has been added to
> xdp_test_run_teardown(). Otherwise the page pool would not be released.
> But xdp_unreg_mem_model() resets the memory model type immediately to 0
> (which is actually MEM_TYPE_PAGE_SHARED). So the memory model type
> MEM_TYPE_PAGE_POOL is lost and any inflight page pool entries have no
> chance to find its way back to the page pool. I assume that's the reason
> why the last batch does not find its way back to the page pool.
> 
> A simple sleep before xdp_unreg_mem_model() solved this problem, but
> this is no valid fix of course. It needs to be ensured that the memory
> model is not in use anymore. This is the case when the page pool has no
> entries in flight.
> 
> How could it be ensured that a call to xdp_unreg_mem_model() is safe?
> In my opinion drivers suffer the same problem. Is there a pattern how
> this can be solved? Or did I misinterprete something?
> 
> Fixes: 425d239379db0 ("bpf: Fix release of page_pool in BPF_PROG_RUN in test runner")
> Signed-off-by: Gerhard Engleder <gerhard@...leder-embedded.com>
> ---
>   net/bpf/test_run.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index b766a84c8536..eaccfdab0be3 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -202,6 +202,8 @@ static int xdp_test_run_setup(struct xdp_test_data *xdp, struct xdp_buff *orig_c
>   
>   static void xdp_test_run_teardown(struct xdp_test_data *xdp)
>   {
> +	usleep_range(10000, 11000);
> +
>   	xdp_unreg_mem_model(&xdp->mem);
>   	page_pool_destroy(xdp->pp);
>   	kfree(xdp->frames);

I did not get any reply. Did I post it the wrong the way?

I did post it as RFC as I have no clear idea how to fix that. But maybe
I'm wrong and there is no problem. Would be great to hear your opinion 
about this issue.

Gerhard