lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-Id: <20250822101634.129855-1-jackzxcui1989@163.com>
Date: Fri, 22 Aug 2025 18:16:34 +0800
From: Xin Zhao <jackzxcui1989@....com>
To: willemdebruijn.kernel@...il.com,
	edumazet@...gle.com,
	ferenc@...es.dev
Cc: davem@...emloft.net,
	kuba@...nel.org,
	pabeni@...hat.com,
	horms@...nel.org,
	netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH net-next v6] net: af_packet: Use hrtimer to do the retire operation

On Fri, 2025-08-22 at 14:37 +0800, Willem wrote:

> The hrtimer callback is called by __run_hrtimer, if we only use hrtimer_forward_now in the callback,
> it will not restart the time within the callback. The timer will be enqueued after the callback
> return. So when the timer is being enqueued, it is not protected by sk_receive_queue.lock.

I see.

> > Consider the following timing sequence:
> > timer   cpu0 (softirq context, hrtimer timeout)                cpu
> > 0       hrtimer_run_softirq
> > 1         __hrtimer_run_queues
> > 2           __run_hrtimer
> > 3             prb_retire_rx_blk_timer_expired
> > 4               spin_lock(&po->sk.sk_receive_queue.lock);
> > 5               _prb_refresh_rx_retire_blk_timer
> > 6                 hrtimer_forward_now
> > 7               spin_unlock(&po->sk.sk_receive_queue.lock)
> > 8             raw_spin_lock_irq(&cpu_base->lock);              tpacket_rcv
> > 9             enqueue_hrtimer                                    spin_lock(&sk->sk_receive_queue.lock);
> > 10                                                               packet_current_rx_frame
> > 11                                                                 __packet_lookup_frame_in_block
> > 12            finish enqueue_hrtimer                                 prb_open_block
> > 13                                                                     _prb_refresh_rx_retire_blk_timer
> > 14                                                                       hrtimer_is_queued(&pkc->retire_blk_timer) == true
> > 15                                                                       hrtimer_forward_now
> > 16                                                                         WARN_ON
> > 
> > On cpu0 in the timing sequence above, enqueue_hrtimer is not protected by sk_receive_queue.lock,
> > while the hrtimer_forward_now is not protected by raw_spin_lock_irq(&cpu_base->lock).
> > 
> > It will cause WARN_ON if we only use 'hrtimer_is_queued(&pkc->retire_blk_timer) == true' to check
> > whether to call hrtimer_forward_now.
> 
> One way around this may be to keep the is_timer_queued state inside
> tpacket_kbdq_core protected by a relevant lock, like the receive queue
> lock. Similar to pkc->delete_blk_timer.
> 
> Admittedly I have not given this much thought yet. Am traveling for a
> few days, have limited time.

Thank you for replying to me during your break. I later thought of a way to ensure that the enqueue of
the hrtimer can be set in an ordered manner without adding new locks or using local_irq_save. I will
reflect this in version 7.

Additionally, we still need the 'bool callback' parameter to determine whether we are inside the
hrtimer's callback, while 'bool start' parameter is unnecessary.


Why should we keep the callback parameter?

As mentioned earlier, the enqueue action occurs after exiting the hrtimer callback, and this enqueue
action is not performed in the af_packet code. This could lead to the timer's state being changed back
to enqueued at an uncertain time, which does not provide a timing guarantee for our logic in
_prb_refresh_rx_retire_blk_timer, where we check the status using hrtimer_is_queued.

As previously discussed, I said that we must accurately determine whether we are in the hrtimer callback
in _prb_refresh_rx_retire_blk_timer. Relying on either hrtimer_is_queued or hrtimer_callback_running
would not provide sufficient accuracy. However, adding a callback variable as a parameter would be a
reliable approach, requiring minimal code changes and making it easier for future readers to understand.
Therefore, I will also add this 'bool callback' parameter in version 7.


Thanks
Xin Zhao


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ