[<prev] [next>] [day] [month] [year] [list]
Message-Id: <20250816164257.3908597-1-jackzxcui1989@163.com>
Date: Sun, 17 Aug 2025 00:42:57 +0800
From: Xin Zhao <jackzxcui1989@....com>
To: willemdebruijn.kernel@...il.com,
edumazet@...gle.com,
ferenc@...es.dev
Cc: davem@...emloft.net,
kuba@...nel.org,
pabeni@...hat.com,
horms@...nel.org,
netdev@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] net: af_packet: Use hrtimer to do the retire operation
On Sat, 2025-08-16 at 17:33 +0800, Willem wrote:
> > > > - p1->tov_in_jiffies = msecs_to_jiffies(p1->retire_blk_tov);
> > >
> > > Since the hrtimer API takes ktime, and there is no other user for
> > > retire_blk_tov, remove that too and instead have interval_ktime.
> > >
> > > > p1->blk_sizeof_priv = req_u->req3.tp_sizeof_priv;
> >
> > We cannot simply remove the retire_blk_tov field, because in net/packet/diag.c
> > retire_blk_tov is being used in function pdiag_put_ring. Since there are still
> > people using it, I personally prefer not to remove this variable for now. If
> > you think it is still necessary, I can remove it later and adjust the logic in
> > diag.c accordingly, using ktime_to_ms to convert the ktime_t format value back
> > to the u32 type needed in the pdiag_put_ring function.
>
> Yes, let's remove the unnecessary extra field.
>
> >
OK, I will delete the 'retire_blk_tov' and add field named 'interval_ktime' instead.
And Accordingly, we also need to modify the logic in diag.c. The related changes are
as follows:
index 6ce1dcc28..c8f43e0c1 100644
--- a/net/packet/diag.c
+++ b/net/packet/diag.c
@@ -83,7 +83,7 @@ static int pdiag_put_ring(struct packet_ring_buffer *ring, int ver, int nl_type,
pdr.pdr_frame_nr = ring->frame_max + 1;
if (ver > TPACKET_V2) {
- pdr.pdr_retire_tmo = ring->prb_bdqc.retire_blk_tov;
+ pdr.pdr_retire_tmo = ktime_to_ms(ring->prb_bdqc.interval_ktime);
pdr.pdr_sizeof_priv = ring->prb_bdqc.blk_sizeof_priv;
pdr.pdr_features = ring->prb_bdqc.feature_req_word;
} else {
> >
> > > > + hrtimer_set_expires(&pkc->retire_blk_timer,
> > > > + ktime_add(ktime_get(), ms_to_ktime(pkc->retire_blk_tov)));
> > >
> > > More common for HRTIMER_RESTART timers is hrtimer_forward_now.
> > >
> > > > pkc->last_kactive_blk_num = pkc->kactive_blk_num;
> >
> > As I mentioned in my previous response, we cannot use hrtimer_forward_now here
> > because the function _prb_refresh_rx_retire_blk_timer can be called not only
> > when the retire timer expires, but also when the kernel logic for receiving
> > network packets detects that a network packet has filled up a block and calls
> > prb_open_block to use the next block. This can lead to a WARN_ON being triggered
> > in hrtimer_forward_now when it checks if the timer has already been enqueued
> > (WARN_ON(timer->state & HRTIMER_STATE_ENQUEUED)).
> > I encountered this issue when I initially used hrtimer_forward_now. This is the
> > reason why the existing logic for the regular timer uses mod_timer instead of
> > add_timer, as mod_timer is designed to handle such scenarios. A relevant comment
> > in the mod_timer implementation states:
> > * Note that if there are multiple unserialized concurrent users of the
> > * same timer, then mod_timer() is the only safe way to modify the timeout,
> > * since add_timer() cannot modify an already running timer.
>
> Please add a comment above the call to hrtimer_set_expires and/or in
> the commit message. As this is non-obvious and someone may easily
> later miss that and update.
I will add a comment in PATCH v3 as below:
static void _prb_refresh_rx_retire_blk_timer(struct tpacket_kbdq_core *pkc)
{
- mod_timer(&pkc->retire_blk_timer,
- jiffies + pkc->tov_in_jiffies);
+ /* We cannot use hrtimer_forward_now here because the function
+ * _prb_refresh_rx_retire_blk_timer can be called not only when
+ * the retire timer expires, but also when the kernel logic for
+ * receiving network packets detects that a network packet has
+ * filled up a block and calls prb_open_block to use the next
+ * block. This can lead to a WARN_ON being triggered in
+ * hrtimer_forward_now when it checks if the timer has already
+ * been enqueued.
+ */
+ hrtimer_set_expires(&pkc->retire_blk_timer,
+ ktime_add(ktime_get(), pkc->interval_ktime));
pkc->last_kactive_blk_num = pkc->kactive_blk_num;
}
>
> So mod_timer can also work as add_timer.
>
> But for hrtimer, is it safe for an hrtimer_setup call to run while a
> timer is already queued? And same for hrtimer_start.
In the current patch, the hrtimer_setup and hrtimer_start will be called only when
the af_packet-mmap user call setsockopt, and then the following call chain:
packet_setsockopt
packet_set_ring
init_prb_bdqc
prb_setup_retire_blk_timer
hrtimer_setup
hrtimer_start
So once the socket setup, hrtimer_setup and hrtimer_start will never be called again.
However, I also tested calling hrtimer_setup and hrtimer_start within the hrtimer
expiration callback function, and it seems that it does not affect the normal
operation of the timer (the first line of the hrtimer_start comment states that
* hrtimer_start - (re)start an hrtimer, indicating that it can handle this scenario).
As you previously suggested, the hrtimer expiration callback functions typically use
hrtimer_set_expires or hrtimer_forward_now. In addition, repeatedly calling
hrtimer_setup and hrtimer_start within the hrtimer expiration callback function is
also a waste of CPU resources. Therefore, my current patch does not use hrtimer_start
within the hrtimer expiration callback function. I already change it in PATCH v1 as I
mentioned it in Changes in v1:
...
- Use hrtimer_set_expires instead of hrtimer_start_range_ns when retire timer needs update
as suggested by Willem de Bruijn. Start the hrtimer in prb_setup_retire_blk_timer;
...
The implementation of _prb_refresh_rx_retire_blk_timer in the current patch is as follows:
static void _prb_refresh_rx_retire_blk_timer(struct tpacket_kbdq_core *pkc)
{
/* We cannot use hrtimer_forward_now here because the function
* _prb_refresh_rx_retire_blk_timer can be called not only when
* the retire timer expires, but also when the kernel logic for
* receiving network packets detects that a network packet has
* filled up a block and calls prb_open_block to use the next
* block. This can lead to a WARN_ON being triggered in
* hrtimer_forward_now when it checks if the timer has already
* been enqueued.
*/
hrtimer_set_expires(&pkc->retire_blk_timer,
ktime_add(ktime_get(), pkc->interval_ktime));
pkc->last_kactive_blk_num = pkc->kactive_blk_num;
}
Thanks
Xin Zhao
Powered by blists - more mailing lists