Message-ID: <e0bbcd20-77ec-4dc9-ada9-94aaf4ea44bb@redhat.com>
Date:   Thu, 27 Apr 2023 12:47:28 +0200
From:   Jesper Dangaard Brouer <jbrouer@...hat.com>
To:     Yunsheng Lin <linyunsheng@...wei.com>,
        Ilias Apalodimas <ilias.apalodimas@...aro.org>,
        netdev@...r.kernel.org, Eric Dumazet <eric.dumazet@...il.com>,
        linux-mm@...ck.org, Mel Gorman <mgorman@...hsingularity.net>
Cc:     brouer@...hat.com, lorenzo@...nel.org,
        Toke Høiland-Jørgensen <toke@...hat.com>,
        bpf@...r.kernel.org, "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>, willy@...radead.org
Subject: Re: [PATCH RFC net-next/mm V1 1/3] page_pool: Remove workqueue in new
 shutdown scheme



On 27/04/2023 02.57, Yunsheng Lin wrote:
> On 2023/4/26 1:15, Jesper Dangaard Brouer wrote:
>> @@ -609,6 +609,8 @@ void page_pool_put_defragged_page(struct page_pool *pool, struct page *page,
>>   		recycle_stat_inc(pool, ring_full);
>>   		page_pool_return_page(pool, page);
>>   	}
>> +	if (pool->p.flags & PP_FLAG_SHUTDOWN)
>> +		page_pool_shutdown_attempt(pool);
> 
> It seems we have allowed page_pool_shutdown_attempt() to be called
> concurrently here, isn't there a time window between atomic_inc_return_relaxed()
> and page_pool_inflight() for pool->pages_state_release_cnt, which may cause
> double calling of page_pool_free()?
> 

Yes, I think that is correct.
I actually woke up this morning thinking of this case of double freeing,
and this time window.  Thanks for spotting and confirming this issue.

Basically: two CPUs executing page_pool_shutdown_attempt() concurrently
can both end up seeing inflight equal to zero, so both of them kfree the
memory (in page_pool_free()), because each thinks it is the last user of
the PP instance.
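
To make the window concrete, here is a minimal userspace model of the
interleaving, in plain C11 rather than kernel code. The struct, the
model_* names and the hold counter are assumptions made up for this
illustration; only the shape (an atomic increment of the release count,
then a separate read to compute inflight) follows the discussion above.
The bad ordering is replayed by hand instead of with real threads so it
is deterministic.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the pool: just the counters that matter here. */
struct pool_model {
	unsigned int hold_cnt;     /* pages handed out (assumed counter) */
	atomic_uint  release_cnt;  /* pages returned so far */
	int          free_calls;   /* how many times the "free" path ran */
};

/* Step 1 of a page return: bump the release counter atomically. */
static void model_release_page(struct pool_model *p)
{
	atomic_fetch_add_explicit(&p->release_cnt, 1, memory_order_relaxed);
}

/* Step 2: a separate read that computes the number of inflight pages. */
static int model_inflight(struct pool_model *p)
{
	unsigned int rel = atomic_load_explicit(&p->release_cnt,
						memory_order_relaxed);
	return (int)(p->hold_cnt - rel);
}

/* Models the shutdown attempt: free when nothing is inflight. */
static void model_shutdown_attempt(struct pool_model *p)
{
	if (model_inflight(p) == 0)
		p->free_calls++;	/* stands in for page_pool_free() */
}

/* Replay: both CPUs finish step 1 before either one runs step 2. */
static int model_racy_interleaving(void)
{
	struct pool_model p = { .hold_cnt = 2, .free_calls = 0 };

	atomic_init(&p.release_cnt, 0);
	model_release_page(&p);		/* CPU A returns its page */
	model_release_page(&p);		/* CPU B returns its page */
	model_shutdown_attempt(&p);	/* CPU A sees inflight == 0 */
	model_shutdown_attempt(&p);	/* CPU B sees inflight == 0 too */
	return p.free_calls;		/* 2: the double free */
}
```

Each increment is atomic on its own, but the increment and the inflight
check are not one atomic step, which is exactly the window.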

I've been thinking about how to address this.
This is my current idea:

(1) An atomic inc-and-test (or cmpxchg) on a variable, to resolve the
    last-user race.
(2) Defer the free to a call_rcu callback, to let other CPUs finish.
(3) Might need rcu_read_lock() in page_pool_shutdown_attempt().
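
A rough userspace sketch of idea (1) only, again in plain C11 and not
the actual patch: a compare-and-exchange on a "claimed" flag lets exactly
one caller win the right to free, even when several see inflight equal
to zero. The struct, the free_claimed field and the fix_* names are
hypothetical. Ideas (2) and (3) depend on kernel RCU and are not
modelled here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the pool with an extra "claimed" flag. */
struct pool_fix_model {
	unsigned int hold_cnt;      /* pages handed out (assumed counter) */
	atomic_uint  release_cnt;   /* pages returned so far */
	atomic_int   free_claimed;  /* 0 = unclaimed, 1 = one CPU owns the free */
	int          free_calls;    /* how many times the "free" path ran */
};

static void fix_release_page(struct pool_fix_model *p)
{
	atomic_fetch_add_explicit(&p->release_cnt, 1, memory_order_relaxed);
}

/* Returns true for exactly one caller once nothing is inflight. */
static bool fix_claim_free(struct pool_fix_model *p)
{
	int expected = 0;

	if (p->hold_cnt != atomic_load(&p->release_cnt))
		return false;	/* pages still inflight */

	/* Only the first caller flips 0 -> 1; later callers fail here. */
	return atomic_compare_exchange_strong(&p->free_claimed, &expected, 1);
}

static void fix_shutdown_attempt(struct pool_fix_model *p)
{
	if (fix_claim_free(p))
		p->free_calls++;	/* in the kernel this would be deferred, per (2) */
}

/* Same hand-replayed interleaving as the racy case. */
static int fix_interleaving(void)
{
	struct pool_fix_model p = { .hold_cnt = 2, .free_calls = 0 };

	atomic_init(&p.release_cnt, 0);
	atomic_init(&p.free_claimed, 0);
	fix_release_page(&p);		/* CPU A returns its page */
	fix_release_page(&p);		/* CPU B returns its page */
	fix_shutdown_attempt(&p);	/* CPU A wins the cmpxchg */
	fix_shutdown_attempt(&p);	/* CPU B loses, does not free */
	return p.free_calls;		/* 1: single free */
}
```

The flag settles who frees, but the losing CPU may still be touching the
pool after the winner decides to free it, which is why (2) and (3) would
still be needed on top of this.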

--Jesper
