[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87mt2s6zy5.fsf@toke.dk>
Date: Fri, 28 Apr 2023 12:52:50 +0200
From: Toke Høiland-Jørgensen <toke@...hat.com>
To: Jesper Dangaard Brouer <jbrouer@...hat.com>,
Ilias Apalodimas <ilias.apalodimas@...aro.org>,
netdev@...r.kernel.org, Eric Dumazet <eric.dumazet@...il.com>,
linux-mm@...ck.org, Mel Gorman <mgorman@...hsingularity.net>
Cc: brouer@...hat.com, lorenzo@...nel.org, linyunsheng@...wei.com,
bpf@...r.kernel.org, "David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>, willy@...radead.org
Subject: Re: [PATCH RFC net-next/mm V2 1/2] page_pool: Remove workqueue in
new shutdown scheme
Jesper Dangaard Brouer <jbrouer@...hat.com> writes:
> On 27/04/2023 22.53, Toke Høiland-Jørgensen wrote:
>>> +noinline
>>> static void page_pool_empty_ring(struct page_pool *pool)
>>> {
>>> struct page *page;
>>> @@ -796,39 +828,29 @@ static void page_pool_scrub(struct page_pool *pool)
>>> page_pool_empty_ring(pool);
>>> }
>> So this is not in the diff context, but page_pool_empty_ring() does
>> this:
>>
>> static void page_pool_empty_ring(struct page_pool *pool)
>> {
>> struct page *page;
>>
>> /* Empty recycle ring */
>> while ((page = ptr_ring_consume_bh(&pool->ring))) {
>> /* Verify the refcnt invariant of cached pages */
>> if (!(page_ref_count(page) == 1))
>> pr_crit("%s() page_pool refcnt %d violation\n",
>> __func__, page_ref_count(page));
>>
>> page_pool_return_page(pool, page);
>> }
>> }
>>
>> ...and with this patch, that page_pool_return_page() call will now free
>> the pool memory entirely when the last page is returned. When it does
>> this, the condition in the while loop will still execute afterwards; it
>> would return false, but if the pool was freed, it's now referencing
>> freed memory when trying to read from pool->ring.
>
> Yes, that sounds like a problem.
>
>> So I think page_pool_empty_ring needs to either pull out all the pages
>> in the ring to an on-stack buffer before calling page_pool_return_page()
>> on them, or there needs to be some other way to break the loop early.
>
> Let me address this one first, I'll get back to the other in another
> reply. The usual/idiom way of doing this is to have a next pointer that
> is populated inside the loop before freeing the object.
> It should look like this (only compile tested):
>
> static void page_pool_empty_ring(struct page_pool *pool)
> {
> struct page *page, *next;
>
> next = ptr_ring_consume_bh(&pool->ring);
>
> /* Empty recycle ring */
> while (next) {
> page = next;
> next = ptr_ring_consume_bh(&pool->ring);
>
> /* Verify the refcnt invariant of cached pages */
> if (!(page_ref_count(page) == 1))
> pr_crit("%s() page_pool refcnt %d violation\n",
> __func__, page_ref_count(page));
>
> page_pool_return_page(pool, page);
> }
> }
Yup, that works!
>> There are a couple of other places where page_pool_return_page() is
>> called in a loop where the loop variable lives inside struct page_pool,
>> so we need to be absolutely sure they will never be called in the
>> shutdown stage, or they'll have to be fixed as well.
>
> The other loops are okay, but I spotted another problem in
> __page_pool_put_page() in "Fallback/non-XDP mode", but that is fixable.
Alright, great!
-Toke
Powered by blists - more mailing lists