lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 28 Apr 2023 12:42:32 +0200
From:   Jesper Dangaard Brouer <jbrouer@...hat.com>
To:     Toke Høiland-Jørgensen <toke@...hat.com>,
        Ilias Apalodimas <ilias.apalodimas@...aro.org>,
        netdev@...r.kernel.org, Eric Dumazet <eric.dumazet@...il.com>,
        linux-mm@...ck.org, Mel Gorman <mgorman@...hsingularity.net>
Cc:     brouer@...hat.com, lorenzo@...nel.org, linyunsheng@...wei.com,
        bpf@...r.kernel.org, "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>, willy@...radead.org
Subject: Re: [PATCH RFC net-next/mm V2 1/2] page_pool: Remove workqueue in new
 shutdown scheme



On 27/04/2023 22.53, Toke Høiland-Jørgensen wrote:
>> +noinline
>>   static void page_pool_empty_ring(struct page_pool *pool)
>>   {
>>   	struct page *page;
>> @@ -796,39 +828,29 @@ static void page_pool_scrub(struct page_pool *pool)
>>   	page_pool_empty_ring(pool);
>>   }
> So this is not in the diff context, but page_pool_empty_ring() does
> this:
> 
> static void page_pool_empty_ring(struct page_pool *pool)
> {
> 	struct page *page;
> 
> 	/* Empty recycle ring */
> 	while ((page = ptr_ring_consume_bh(&pool->ring))) {
> 		/* Verify the refcnt invariant of cached pages */
> 		if (!(page_ref_count(page) == 1))
> 			pr_crit("%s() page_pool refcnt %d violation\n",
> 				__func__, page_ref_count(page));
> 
> 		page_pool_return_page(pool, page);
> 	}
> }
> 
> ...and with this patch, that page_pool_return_page() call will now free
> the pool memory entirely when the last page is returned. When it does
> this, the condition in the while loop will still execute afterwards; it
> would return false, but if the pool was freed, it's now referencing
> freed memory when trying to read from pool->ring.

Yes, that sounds like a problem.

> So I think page_pool_empty_ring needs to either pull out all the pages
> in the ring to an on-stack buffer before calling page_pool_return_page()
> on them, or there needs to be some other way to break the loop early.

Let me address this one first, I'll get back to the other in another
reply.  The usual/idiom way of doing this is to have a next pointer that
is populated inside the loop before freeing the object.
It should look like this (only compile tested):

  static void page_pool_empty_ring(struct page_pool *pool)
  {
	struct page *page, *next;

	next = ptr_ring_consume_bh(&pool->ring);

	/* Empty recycle ring */
	while (next) {
		page = next;
		next = ptr_ring_consume_bh(&pool->ring);

		/* Verify the refcnt invariant of cached pages */
		if (!(page_ref_count(page) == 1))
			pr_crit("%s() page_pool refcnt %d violation\n",
				__func__, page_ref_count(page));

		page_pool_return_page(pool, page);
	}
  }


> There are a couple of other places where page_pool_return_page() is
> called in a loop where the loop variable lives inside struct page_pool,
> so we need to be absolutely sure they will never be called in the
> shutdown stage, or they'll have to be fixed as well.

The other loops are okay, but I spotted another problem in 
__page_pool_put_page() in "Fallback/non-XDP mode", but that is fixable.

--Jesper

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ