[<prev] [next>] [day] [month] [year] [list]
Message-Id: <20250822101634.129855-1-jackzxcui1989@163.com>
Date: Fri, 22 Aug 2025 18:16:34 +0800
From: Xin Zhao <jackzxcui1989@....com>
To: willemdebruijn.kernel@...il.com,
edumazet@...gle.com,
ferenc@...es.dev
Cc: davem@...emloft.net,
kuba@...nel.org,
pabeni@...hat.com,
horms@...nel.org,
netdev@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH net-next v6] net: af_packet: Use hrtimer to do the retire operation
On Fri, 2025-08-22 at 14:37 +0800, Willem wrote:
> The hrtimer callback is called by __run_hrtimer, if we only use hrtimer_forward_now in the callback,
> it will not restart the time within the callback. The timer will be enqueued after the callback
> return. So when the timer is being enqueued, it is not protected by sk_receive_queue.lock.
I see.
> > Consider the following timing sequence:
> > timer cpu0 (softirq context, hrtimer timeout) cpu
> > 0 hrtimer_run_softirq
> > 1 __hrtimer_run_queues
> > 2 __run_hrtimer
> > 3 prb_retire_rx_blk_timer_expired
> > 4 spin_lock(&po->sk.sk_receive_queue.lock);
> > 5 _prb_refresh_rx_retire_blk_timer
> > 6 hrtimer_forward_now
> > 7 spin_unlock(&po->sk.sk_receive_queue.lock)
> > 8 raw_spin_lock_irq(&cpu_base->lock); tpacket_rcv
> > 9 enqueue_hrtimer spin_lock(&sk->sk_receive_queue.lock);
> > 10 packet_current_rx_frame
> > 11 __packet_lookup_frame_in_block
> > 12 finish enqueue_hrtimer prb_open_block
> > 13 _prb_refresh_rx_retire_blk_timer
> > 14 hrtimer_is_queued(&pkc->retire_blk_timer) == true
> > 15 hrtimer_forward_now
> > 16 WARN_ON
> >
> > On cpu0 in the timing sequence above, enqueue_hrtimer is not protected by sk_receive_queue.lock,
> > while the hrtimer_forward_now is not protected by raw_spin_lock_irq(&cpu_base->lock).
> >
> > It will cause WARN_ON if we only use 'hrtimer_is_queued(&pkc->retire_blk_timer) == true' to check
> > whether to call hrtimer_forward_now.
>
> One way around this may be to keep the is_timer_queued state inside
> tpacket_kbdq_core protected by a relevant lock, like the receive queue
> lock. Similar to pkc->delete_blk_timer.
>
> Admittedly I have not given this much thought yet. Am traveling for a
> few days, have limited time.
Thank you for replying to me during your break. I later thought of a way to ensure that the enqueue of
the hrtimer can be set in an ordered manner without adding new locks or using local_irq_save. I will
reflect this in version 7.
Additionally, we still need the 'bool callback' parameter to determine whether we are inside the
hrtimer's callback, while 'bool start' parameter is unnecessary.
Why should we keep the callback parameter?
As mentioned earlier, the enqueue action occurs after exiting the hrtimer callback, and this enqueue
action is not performed in the af_packet code. This could lead to the timer's state being changed back
to enqueued at an uncertain time, which does not provide a timing guarantee for our logic in
_prb_refresh_rx_retire_blk_timer, where we check the status using hrtimer_is_queued.
As previously discussed, I said that we must accurately determine whether we are in the hrtimer callback
in _prb_refresh_rx_retire_blk_timer. Relying on either hrtimer_is_queued or hrtimer_callback_running
would not provide sufficient accuracy. However, adding a callback variable as a parameter would be a
reliable approach, requiring minimal code changes and making it easier for future readers to understand.
Therefore, I will also add this 'bool callback' parameter in version 7.
Thanks
Xin Zhao
Powered by blists - more mailing lists