linux-kernel - Re: [PATCH v2] net: af_packet: Use hrtimer to do the retire operation

Message-Id: <20250815170825.3585310-1-jackzxcui1989@163.com>
Date: Sat, 16 Aug 2025 01:08:25 +0800
From: Xin Zhao <jackzxcui1989@....com>
To: willemdebruijn.kernel@...il.com,
	edumazet@...gle.com,
	ferenc@...es.dev
Cc: davem@...emloft.net,
	kuba@...nel.org,
	pabeni@...hat.com,
	horms@...nel.org,
	netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] net: af_packet: Use hrtimer to do the retire operation

On Fri, 2025-08-15 at 18:12 +0800, Willem wrote:

> Signed-off-by: Xin Zhao <jackzxcui1989@....com>

> Please clearly label PATCH net-next and include a changelog and link
> to previous versions.
> 
> See also other recently sent patches and
> https://www.kernel.org/doc/html/latest/process/maintainer-netdev.html
> https://docs.kernel.org/process/submitting-patches.html
> 
> > ---

Dear Willem,

I will add the details in PATCH v3.


> > -	p1->tov_in_jiffies = msecs_to_jiffies(p1->retire_blk_tov);
> 
> Since the hrtimer API takes ktime, and there is no other user for
> retire_blk_tov, remove that too and instead have interval_ktime.
> 
> >  	p1->blk_sizeof_priv = req_u->req3.tp_sizeof_priv;

We cannot simply remove the retire_blk_tov field, because pdiag_put_ring in
net/packet/diag.c still reads it. Since the field still has a user, I would
prefer to keep it for now. If you think removing it is still necessary, I can
do so and adjust diag.c accordingly, using ktime_to_ms to convert the ktime_t
value back to the u32 that pdiag_put_ring needs.
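
For illustration, the conversion mentioned above could look roughly like the
following. This is a userspace sketch, not kernel code: ktime_t, ms_to_ktime
and ktime_to_ms are re-declared here as simple stand-ins for the kernel
helpers of the same names, and struct kbdq_model, its interval_ktime member
and diag_retire_tmo_ms are hypothetical names used only for this example.

```c
#include <stdint.h>

/* Userspace stand-ins for kernel helpers (assumption: for non-negative
 * millisecond values they behave like the kernel versions). */
typedef int64_t ktime_t;                /* nanoseconds, as in the kernel */
#define NSEC_PER_MSEC 1000000LL

static inline ktime_t ms_to_ktime(uint64_t ms)
{
	return (ktime_t)(ms * NSEC_PER_MSEC);
}

static inline int64_t ktime_to_ms(ktime_t kt)
{
	return kt / NSEC_PER_MSEC;
}

/* Hypothetical: the block descriptor queue stores only the ktime interval. */
struct kbdq_model {
	ktime_t interval_ktime;
};

/* Hypothetical helper: recover the u32 millisecond value that the diag
 * code reports today from retire_blk_tov. */
static uint32_t diag_retire_tmo_ms(const struct kbdq_model *pkc)
{
	return (uint32_t)ktime_to_ms(pkc->interval_ktime);
}
```

The round trip is lossless as long as the interval is set from a whole number
of milliseconds, which is the case when it originates from tp_retire_blk_tov.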


> > +	hrtimer_set_expires(&pkc->retire_blk_timer,
> > +			    ktime_add(ktime_get(), ms_to_ktime(pkc->retire_blk_tov)));
> 
> More common for HRTIMER_RESTART timers is hrtimer_forward_now.
> 
> >  	pkc->last_kactive_blk_num = pkc->kactive_blk_num;

As I mentioned in my previous response, we cannot use hrtimer_forward_now here
because _prb_refresh_rx_retire_blk_timer is called not only when the retire
timer expires, but also when the packet receive path detects that a packet has
filled up a block and calls prb_open_block to move to the next block. In the
latter case the timer may still be enqueued, which triggers the WARN_ON in
hrtimer_forward_now that checks for exactly this
(WARN_ON(timer->state & HRTIMER_STATE_ENQUEUED)).
I ran into this warning when I initially used hrtimer_forward_now. It is the
same reason the existing timer_list code uses mod_timer instead of add_timer:
mod_timer is designed to handle such scenarios. A relevant comment in the
mod_timer implementation states:
 * Note that if there are multiple unserialized concurrent users of the
 * same timer, then mod_timer() is the only safe way to modify the timeout,
 * since add_timer() cannot modify an already running timer.
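
The guard described above can be modelled in userspace as follows. This is
only a sketch of the check, not the kernel implementation: struct
model_hrtimer, the MODEL_STATE_* flags and model_forward are hypothetical
names, and the warnings counter stands in for the WARN_ON splat.

```c
#include <stdint.h>

#define MODEL_STATE_INACTIVE 0x00u
#define MODEL_STATE_ENQUEUED 0x01u

/* Hypothetical, simplified stand-in for struct hrtimer. */
struct model_hrtimer {
	int64_t expires_ns;
	unsigned int state;
	unsigned int warnings;  /* counts what would be WARN_ON hits */
};

/* Model of forwarding a timer past 'now_ns' in steps of 'interval_ns'.
 * Returns the number of intervals added, or 0 if nothing was changed. */
static uint64_t model_forward(struct model_hrtimer *t, int64_t now_ns,
			      int64_t interval_ns)
{
	int64_t delta = now_ns - t->expires_ns;
	uint64_t orun;

	if (delta < 0 || interval_ns <= 0)
		return 0;       /* not expired yet: nothing to forward */

	if (t->state & MODEL_STATE_ENQUEUED) {
		t->warnings++;  /* the kernel would warn here and bail out */
		return 0;
	}

	orun = (uint64_t)(delta / interval_ns) + 1;
	t->expires_ns += (int64_t)orun * interval_ns;
	return orun;
}
```

In this model a call made from the timer callback (timer not enqueued) moves
the expiry forward, while a call made from the receive path on a timer that
is still enqueued is refused and leaves the expiry untouched.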


> > +static enum hrtimer_restart prb_retire_rx_blk_timer_expired(struct hrtimer *t)
> >  {
> >  	struct packet_sock *po =
> >  		timer_container_of(po, t, rx_ring.prb_bdqc.retire_blk_timer);
> > @@ -790,6 +790,7 @@ static void prb_retire_rx_blk_timer_expired(struct timer_list *t)
> > 
> >  out:
> >  	spin_unlock(&po->sk.sk_receive_queue.lock);
> > +	return HRTIMER_RESTART;
> 
> This always restart the timer. But that is not the current behavior.
> Per prb_retire_rx_blk_timer_expired:
> 
>    * 1) We refresh the timer only when we open a block.
> 
> Look at the five different paths that can reach label out.
> 
> In particular, if the block is retired in this timer, and no new block
> is available to be opened, no timer should be armed.
> 
> >  }

I have worked through the logic in this area; please take a look and see if it's correct.

The question is under which conditions we should return HRTIMER_NORESTART. We only
need to look at the three 'goto out' statements in this function: any path that does
not take 'goto out' cannot skip the 'refresh_timer:' label, so it always executes the
_prb_refresh_rx_retire_blk_timer function, which expects HRTIMER_RESTART to be
returned.
Case 1:
  if (unlikely(pkc->delete_blk_timer))
    goto out;
  This case indicates that the hrtimer has already been stopped. In this situation, it 
  should return HRTIMER_NORESTART, and I will make this change in PATCH v3.
Case 2:
  if (!prb_dispatch_next_block(pkc, po))
    goto refresh_timer;
  else
    goto out;
  In this case, the execution will only reach the out label if prb_dispatch_next_block
  returns a non-zero value. If prb_dispatch_next_block returns a non-zero value, it must
  have executed prb_open_block, which in turn will call _prb_refresh_rx_retire_blk_timer
  to set the new timeout for the retire timer. Therefore, in this scenario, the hrtimer
  should return HRTIMER_RESTART.
Case 3:
  } else {
     ...
     prb_open_block(pkc, pbd);
     goto out;
  }
  This goto out directly follows a call to prb_open_block which, as described in case 2,
  sets a new timeout and expects the hrtimer to restart.
Based on the analysis above, only case 1 needs to change: in PATCH v3 it will return
HRTIMER_NORESTART. If anything here is inaccurate, please provide further guidance.
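
The case analysis above can be summarised as a small decision function. This
is a userspace sketch of the proposed return values only, not the actual v3
patch: enum expiry_path, its members and model_expiry_result are hypothetical
names, and MODEL_NORESTART / MODEL_RESTART stand in for HRTIMER_NORESTART /
HRTIMER_RESTART.

```c
/* Stand-ins for the kernel's enum hrtimer_restart values. */
enum model_restart {
	MODEL_NORESTART,
	MODEL_RESTART,
};

/* Hypothetical labels for the paths through the expiry handler. */
enum expiry_path {
	PATH_DELETE_BLK_TIMER,  /* case 1: pkc->delete_blk_timer is set */
	PATH_DISPATCHED_NEXT,   /* case 2: prb_dispatch_next_block() != 0 */
	PATH_OPENED_BLOCK,      /* case 3: prb_open_block() then goto out */
	PATH_REFRESH_TIMER,     /* any path that reaches refresh_timer: */
};

/* Return value the analysis above proposes for each path. */
static enum model_restart model_expiry_result(enum expiry_path path)
{
	switch (path) {
	case PATH_DELETE_BLK_TIMER:
		/* The timer is being torn down: do not re-arm it. */
		return MODEL_NORESTART;
	case PATH_DISPATCHED_NEXT:
	case PATH_OPENED_BLOCK:
	case PATH_REFRESH_TIMER:
		/* A new expiry was set via the refresh helper. */
		return MODEL_RESTART;
	}
	return MODEL_NORESTART; /* unknown path: be conservative */
}
```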


Thanks
Xin Zhao