Date:   Fri, 28 Apr 2023 17:46:46 +0200
From:   Jesper Dangaard Brouer <jbrouer@...hat.com>
To:     Toke Høiland-Jørgensen <toke@...hat.com>,
        Ilias Apalodimas <ilias.apalodimas@...aro.org>,
        netdev@...r.kernel.org, Eric Dumazet <eric.dumazet@...il.com>,
        linux-mm@...ck.org, Mel Gorman <mgorman@...hsingularity.net>
Cc:     brouer@...hat.com, lorenzo@...nel.org, linyunsheng@...wei.com,
        bpf@...r.kernel.org, "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>, willy@...radead.org
Subject: Re: [PATCH RFC net-next/mm V2 1/2] page_pool: Remove workqueue in new
 shutdown scheme


On 27/04/2023 22.53, Toke Høiland-Jørgensen wrote:
>> @@ -868,11 +890,13 @@ void page_pool_destroy(struct page_pool *pool)
>>   	if (!page_pool_release(pool))
>>   		return;
>>   
>> -	pool->defer_start = jiffies;
>> -	pool->defer_warn  = jiffies + DEFER_WARN_INTERVAL;
>> +	/* PP have pages inflight, thus cannot immediately release memory.
>> +	 * Enter into shutdown phase.
>> +	 */
>> +	pool->p.flags |= PP_FLAG_SHUTDOWN;
 >
> I think there's another race here: once the flag is set in this line
> (does this need a memory barrier, BTW?), another CPU can return the last
> outstanding page, read the flag and call page_pool_empty_ring(). If this
> happens before the call to page_pool_empty_ring() below, you'll get a
> use-after-free.
> 
> To avoid this, we could artificially bump the pool->hold_cnt *before*
> setting the flag above; that way we know that the page_pool_empty_ring()
> won't trigger a release, because inflight pages will never go below 1.
> And then, after the page_pool_empty_ring() call below, we can add an
> artificial bump of the release_cnt as well, which means we'll get proper
> atomic semantics on the counters and only ever release once. I.e.,:
> 
>> -	INIT_DELAYED_WORK(&pool->release_dw, page_pool_release_retry);
>> -	schedule_delayed_work(&pool->release_dw, DEFER_TIME);
>> +	/* Concurrent CPUs could have returned last pages into ptr_ring */
>> +	page_pool_empty_ring(pool);
>          release_cnt = atomic_inc_return(&pool->pages_state_release_cnt);
>          page_pool_free_attempt(pool, release_cnt);
> 

I agree and I've implemented this solution (see V3 soon).

I've used smp_store_release() instead of WRITE_ONCE(), because AFAIK
smp_store_release() provides the needed memory barrier.
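
Roughly, the combined ordering would look something like the sketch
below. This is only a sketch, not the V3 patch itself: it reuses the
field and helper names from the RFC diff above (pages_state_hold_cnt,
pages_state_release_cnt, page_pool_empty_ring(),
page_pool_free_attempt()), and it assumes the page-return path reads
the flag with at least acquire/READ_ONCE() semantics.

void page_pool_destroy(struct page_pool *pool)
{
	u32 release_cnt;

	/* ... user/refcount checks as before ... */
	if (!page_pool_release(pool))
		return;

	/* Artificially bump the inflight hold count *before* publishing
	 * the shutdown flag, so a concurrent return of the last page
	 * cannot drive inflight to zero and free the pool under us.
	 */
	pool->pages_state_hold_cnt++;

	/* PP has pages inflight, thus cannot immediately release memory.
	 * Enter shutdown phase.  smp_store_release() orders the hold_cnt
	 * bump before the flag becomes visible to other CPUs.
	 */
	smp_store_release(&pool->p.flags, pool->p.flags | PP_FLAG_SHUTDOWN);

	/* Concurrent CPUs could have returned last pages into ptr_ring */
	page_pool_empty_ring(pool);

	/* Drop the artificial hold: only this final increment can bring
	 * release_cnt level with hold_cnt, so the free runs exactly once.
	 */
	release_cnt = atomic_inc_return(&pool->pages_state_release_cnt);
	page_pool_free_attempt(pool, release_cnt);
}

The matching read side would then need to observe the flag with
acquire (or at least READ_ONCE()) semantics on the page-return path
before deciding to attempt the final free.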

--Jesper
