Message-ID: <8734h8qgmz.fsf@toke.dk>
Date: Fri, 24 Jan 2025 18:13:40 +0100
From: Toke Høiland-Jørgensen <toke@...hat.com>
To: Yunsheng Lin <linyunsheng@...wei.com>, Yunsheng Lin
<yunshenglin0825@...il.com>, davem@...emloft.net, kuba@...nel.org,
pabeni@...hat.com
Cc: zhangkun09@...wei.com, liuyonglong@...wei.com, fanghaiqing@...wei.com,
Alexander Lobakin <aleksander.lobakin@...el.com>, Xuan Zhuo
<xuanzhuo@...ux.alibaba.com>, Jesper Dangaard Brouer <hawk@...nel.org>,
Ilias Apalodimas <ilias.apalodimas@...aro.org>, Eric Dumazet
<edumazet@...gle.com>, Simon Horman <horms@...nel.org>,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net-next v7 2/8] page_pool: fix timing for checking and
disabling napi_local

Yunsheng Lin <linyunsheng@...wei.com> writes:
>> So I really don't see a way for this race to happen with correct usage
>> of the page_pool and NAPI APIs, which means there's no reason to make
>> the change you are proposing here.
>
> I looked at one driver setting pp->napi; it seems the bnxt driver doesn't
> call page_pool_disable_direct_recycling() when unloading, see
> bnxt_half_close_nic(). page_pool_disable_direct_recycling() seems to be
> called only for the new queue_mgmt API:
>
> /* rtnl_lock held, this call can only be made after a previous successful
>  * call to bnxt_half_open_nic().
>  */
> void bnxt_half_close_nic(struct bnxt *bp)
> {
> 	bnxt_hwrm_resource_free(bp, false, true);
> 	bnxt_del_napi(bp);        /*---- calls napi del and rcu sync ----*/
> 	bnxt_free_skbs(bp);
> 	bnxt_free_mem(bp, true);  /*---- calls page_pool_destroy() ----*/
> 	clear_bit(BNXT_STATE_HALF_OPEN, &bp->state);
> }
>
> Even if page_pool_disable_direct_recycling() were called between
> bnxt_del_napi() and bnxt_free_mem(), the timing window would still exist,
> as the rcu sync needs to happen after page_pool_disable_direct_recycling();
> it seems some refactoring is needed in the bnxt driver to reuse the rcu
> sync from the NAPI API, in order to avoid calling the rcu sync for
> page_pool_destroy().

Well, I would consider that usage buggy. A page pool object is created
with a reference to the napi struct; so the page pool should also be
destroyed (clearing its reference) before the napi memory is freed. I
guess this is not really documented anywhere, but it's pretty standard
practice to free objects in the opposite order of their creation.
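
Roughly, the ordering I have in mind looks like this (just a sketch with
made-up names, not actual bnxt code; example_queue_close() and struct
example_queue are hypothetical):

	/* Sketch only: tear things down in the reverse order of creation,
	 * so the page pool (which holds a reference to the napi struct via
	 * pp_params.napi) is destroyed before the napi memory is freed.
	 */
	static void example_queue_close(struct example_queue *q)
	{
		/* stop packet processing first */
		napi_disable(&q->napi);

		/* destroy the pool (clearing its napi reference) while the
		 * napi struct it points to is still valid
		 */
		page_pool_destroy(q->page_pool);

		/* only now delete the napi context and free its memory */
		netif_napi_del(&q->napi);
	}
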
So no, I don't think this is something that should be fixed on the page
pool side (and certainly not by adding another synchronize_rcu() call
per queue!); rather, we should fix the drivers that get this wrong (and
probably document the requirement a bit better).
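
As for the documentation, something along these lines for the @napi member
of struct page_pool_params is what I have in mind (a wording sketch only,
not an actual patch):

	/* Wording sketch, not an actual patch:
	 *
	 * @napi: NAPI instance whose context pages from this pool are
	 *        directly recycled in. The driver must destroy the page
	 *        pool (or call page_pool_disable_direct_recycling())
	 *        before the NAPI instance is deleted and its memory freed,
	 *        i.e. tear down in the reverse order of creation.
	 */
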
-Toke