Message-ID: <bc65340d-99cf-2971-3e4e-738d220b68de@redhat.com>
Date: Fri, 28 Apr 2023 17:46:46 +0200
From: Jesper Dangaard Brouer <jbrouer@...hat.com>
To: Toke Høiland-Jørgensen <toke@...hat.com>,
Ilias Apalodimas <ilias.apalodimas@...aro.org>,
netdev@...r.kernel.org, Eric Dumazet <eric.dumazet@...il.com>,
linux-mm@...ck.org, Mel Gorman <mgorman@...hsingularity.net>
Cc: brouer@...hat.com, lorenzo@...nel.org, linyunsheng@...wei.com,
bpf@...r.kernel.org, "David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>, willy@...radead.org
Subject: Re: [PATCH RFC net-next/mm V2 1/2] page_pool: Remove workqueue in new
shutdown scheme
On 27/04/2023 22.53, Toke Høiland-Jørgensen wrote:
>> @@ -868,11 +890,13 @@ void page_pool_destroy(struct page_pool *pool)
>> if (!page_pool_release(pool))
>> return;
>>
>> - pool->defer_start = jiffies;
>> - pool->defer_warn = jiffies + DEFER_WARN_INTERVAL;
>> + /* PP have pages inflight, thus cannot immediately release memory.
>> + * Enter into shutdown phase.
>> + */
>> + pool->p.flags |= PP_FLAG_SHUTDOWN;
>
> I think there's another race here: once the flag is set in this line
> (does this need a memory barrier, BTW?), another CPU can return the last
> outstanding page, read the flag and call page_pool_empty_ring(). If this
> happens before the call to page_pool_empty_ring() below, you'll get a
> use-after-free.
>
> To avoid this, we could artificially bump the pool->hold_cnt *before*
> setting the flag above; that way we know that the page_pool_empty_ring()
> won't trigger a release, because inflight pages will never go below 1.
> And then, below the page_pool_empty_ring() call below, we can add an
> artificial bump of the release_cnt as well, which means we'll get proper
> atomic semantics on the counters and only ever release once. I.e.,:
>
>> - INIT_DELAYED_WORK(&pool->release_dw, page_pool_release_retry);
>> - schedule_delayed_work(&pool->release_dw, DEFER_TIME);
>> + /* Concurrent CPUs could have returned last pages into ptr_ring */
>> + page_pool_empty_ring(pool);
> release_cnt = atomic_inc_return(&pool->pages_state_release_cnt);
> page_pool_free_attempt(pool, release_cnt);
>
I agree, and I've implemented this solution (see V3 soon).
I've used smp_store_release() instead of WRITE_ONCE(), because AFAIK
smp_store_release() provides the needed memory barrier.
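The counter dance Toke describes can be modelled in userspace roughly
like below (a simplified sketch with C11 atomics; fake_pool,
free_attempt() and the other names are made up for illustration, not
the kernel API, and the real code uses the kernel's atomic_t and
smp_store_release()):

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Model: the pool is freed only when release_cnt catches up with
 * hold_cnt.  The artificial hold_cnt bump *before* publishing the
 * shutdown flag guarantees a racing page-return path can never see
 * hold == release until destroy()'s own matching release_cnt bump. */
struct fake_pool {
	atomic_uint hold_cnt;    /* pages handed out (+1 artificial) */
	atomic_uint release_cnt; /* pages returned  (+1 artificial) */
	atomic_bool shutdown;    /* stands in for PP_FLAG_SHUTDOWN */
	int freed;               /* times we "freed" the pool (model only) */
};

/* Free the pool iff all inflight pages are back. */
static void free_attempt(struct fake_pool *pool, unsigned int release_cnt)
{
	unsigned int hold = atomic_load(&pool->hold_cnt);

	if (release_cnt == hold)
		pool->freed++;   /* real code would release the memory */
}

/* Concurrent page-return path: bump release_cnt, maybe free. */
static void page_return(struct fake_pool *pool)
{
	unsigned int rel = atomic_fetch_add(&pool->release_cnt, 1) + 1;

	if (atomic_load(&pool->shutdown))
		free_attempt(pool, rel);
}

static void destroy(struct fake_pool *pool)
{
	/* Bump hold_cnt *before* publishing shutdown, so a racing
	 * page_return() can never observe hold == release. */
	atomic_fetch_add(&pool->hold_cnt, 1);
	atomic_store(&pool->shutdown, true); /* smp_store_release() in V3 */

	/* ... page_pool_empty_ring() would drain the ptr_ring here ... */

	/* Matching artificial release: exactly one path frees. */
	free_attempt(pool, atomic_fetch_add(&pool->release_cnt, 1) + 1);
}
```

With this ordering the final free happens either in destroy() (nothing
inflight) or in whichever return path performs the last release_cnt
increment, but never in both.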
--Jesper