linux-kernel - Re: [PATCH net-next v10 2/2] net: af_packet: Use hrtimer to do the retire operation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-Id: <20250903170716.595528-1-jackzxcui1989@163.com>
Date: Thu,  4 Sep 2025 01:07:16 +0800
From: Xin Zhao <jackzxcui1989@....com>
To: kerneljasonxing@...il.com,
	willemdebruijn.kernel@...il.com,
	edumazet@...gle.com,
	ferenc@...es.dev
Cc: davem@...emloft.net,
	kuba@...nel.org,
	pabeni@...hat.com,
	horms@...nel.org,
	netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH net-next v10 2/2] net: af_packet: Use hrtimer to do the retire operation

On Wed, Sep 3, 2025 at 00:42 +0800 Jason Xing <kerneljasonxing@...il.com> wrote:

> One more review from my side is that as to the removal of
> delete_blk_timer, I'm afraid it deserves a clarification in the commit
> message.
> 
> > > -       spin_unlock_bh(&rb_queue->lock);
> > > -
> > > -       prb_del_retire_blk_timer(pkc);
> > > -}
> > > -

In the description of [PATCH net-next v10 0/2] net: af_packet: optimize retire operation:

Changes in v8:
- Delete delete_blk_timer field, as suggested by Willem de Bruijn,
  hrtimer_cancel will check and wait until the timer callback return and ensure
  enter enter callback again;

I will also emphasize the removal of delete_blk_timer in the commit message for this 2/2
commit. The updated commit message for the 2/2 patch is as follows：

Changes in v8:
- Simplify the logic related to setting timeout.
- Delete delete_blk_timer field, hrtimer_cancel will check and wait until
  the timer callback return.

> I gradually understand your thought behind this modification. You're
> trying to move the timer operation out of prb_open_block() and then
> spread the timer operation into each caller.
> 
> You probably miss the following call trace:
> packet_current_rx_frame() -> __packet_lookup_frame_in_block() ->
> prb_open_block() -> _prb_refresh_rx_retire_blk_timer()
> ?
> 
> May I ask why bother introducing so many changes like this instead of
> leaving it as-is?

Consider the following timing sequence:
timer   cpu0 (softirq context, hrtimer timeout)                cpu1 (process context)
0       hrtimer_run_softirq
1         __hrtimer_run_queues
2           __run_hrtimer
3             prb_retire_rx_blk_timer_expired
4               spin_lock(&po->sk.sk_receive_queue.lock);
5               _prb_refresh_rx_retire_blk_timer
6                 hrtimer_forward_now
7               spin_unlock(&po->sk.sk_receive_queue.lock)
8             raw_spin_lock_irq(&cpu_base->lock);              tpacket_rcv
9             enqueue_hrtimer                                    spin_lock(&sk->sk_receive_queue.lock);
10                                                               packet_current_rx_frame
11                                                                 __packet_lookup_frame_in_block
12            finish enqueue_hrtimer                                 prb_open_block
13                                                                     _prb_refresh_rx_retire_blk_timer
14                                                                       hrtimer_is_queued(&pkc->retire_blk_timer) == true
15                                                                       hrtimer_forward_now
16                                                                         WARN_ON
On cpu0 in the timing sequence above, enqueue_hrtimer is not protected by sk_receive_queue.lock,
while the hrtimer_forward_now is not protected by raw_spin_lock_irq(&cpu_base->lock).

In my previous email, I provided an explanation. As a supplement, I would
like to reiterate a paragraph from my earlier response to Willem.
The point is that when the hrtimer is in the enqueued state, you cannot
call interfaces like hrtimer_forward_now. The kernel has a WARN_ON check
in hrtimer_forward_now for this reason. Similarly, you also cannot call
interfaces like hrtimer_set_expires. The kernel does not include a WARN_ON
check in hrtimer_set_expires to avoid increasing the code size, as
hrtimer_set_expires is an inline function.

Thanks
Xin Zhao