lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <willemdebruijn.kernel.e7f6fdfe20e3@gmail.com>
Date: Wed, 27 Aug 2025 17:53:08 -0400
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Xin Zhao <jackzxcui1989@....com>, 
 willemdebruijn.kernel@...il.com, 
 edumazet@...gle.com, 
 ferenc@...es.dev
Cc: davem@...emloft.net, 
 kuba@...nel.org, 
 pabeni@...hat.com, 
 horms@...nel.org, 
 netdev@...r.kernel.org, 
 linux-kernel@...r.kernel.org, 
 Xin Zhao <jackzxcui1989@....com>
Subject: Re: [PATCH net-next v8] net: af_packet: Use hrtimer to do the retire
 operation

Xin Zhao wrote:
> In a system with high real-time requirements, the timeout mechanism of
> ordinary timers with jiffies granularity is insufficient to meet the
> demands for real-time performance. Meanwhile, the optimization of CPU
> usage with af_packet is quite significant. Use hrtimer instead of timer
> to help compensate for the shortcomings in real-time performance.
> In HZ=100 or HZ=250 system, the update of TP_STATUS_USER is not real-time
> enough, with fluctuations reaching over 8ms (on a system with HZ=250).
> This is unacceptable in some high real-time systems that require timely
> processing of network packets. By replacing it with hrtimer, if a timeout
> of 2ms is set, the update of TP_STATUS_USER can be stabilized to within
> 3 ms.
> 
> Signed-off-by: Xin Zhao <jackzxcui1989@....com>
> 
> ---
> Changes in v8:
> - Delete delete_blk_timer field, as suggested by Willem de Bruijn,
>   hrtimer_cancel will check and wait until the timer callback return and ensure
>   enter enter callback again;
> - Simplify the logic related to setting timeout, as suggestd by Willem de Bruijn.
>   Currently timer callback just restarts itself unconditionally, so delete the
>  'out:' label, do not forward hrtimer in prb_open_block, call hrtimer_forward_now
>   directly and always return HRTIMER_RESTART. The only special case is when
>   prb_open_block is called from tpacket_rcv. That would set the timeout further
>   into the future than the already queued timer. An earlier timeout is not
>   problematic. No need to add complexity to avoid that.

This simplifies the timer logic tremendously. I like this direction a lot.
 
>  static void prb_setup_retire_blk_timer(struct packet_sock *po)
> @@ -603,9 +592,10 @@ static void prb_setup_retire_blk_timer(struct packet_sock *po)
>  	struct tpacket_kbdq_core *pkc;
>  
>  	pkc = GET_PBDQC_FROM_RB(&po->rx_ring);
> -	timer_setup(&pkc->retire_blk_timer, prb_retire_rx_blk_timer_expired,
> -		    0);
> -	pkc->retire_blk_timer.expires = jiffies;
> +	hrtimer_setup(&pkc->retire_blk_timer, prb_retire_rx_blk_timer_expired,
> +		      CLOCK_MONOTONIC, HRTIMER_MODE_REL_SOFT);
> +	hrtimer_start(&pkc->retire_blk_timer, pkc->interval_ktime,
> +		      HRTIMER_MODE_REL_SOFT);

Since this is only called from init_prb_bdqc, we can further remove
this whole function and move the two hrtimer calls to the parent.

>  }
>  
>  static int prb_calc_retire_blk_tmo(struct packet_sock *po,
> @@ -672,11 +662,10 @@ static void init_prb_bdqc(struct packet_sock *po,
>  	p1->last_kactive_blk_num = 0;
>  	po->stats.stats3.tp_freeze_q_cnt = 0;
>  	if (req_u->req3.tp_retire_blk_tov)
> -		p1->retire_blk_tov = req_u->req3.tp_retire_blk_tov;
> +		p1->interval_ktime = ms_to_ktime(req_u->req3.tp_retire_blk_tov);
>  	else
> -		p1->retire_blk_tov = prb_calc_retire_blk_tmo(po,
> -						req_u->req3.tp_block_size);
> -	p1->tov_in_jiffies = msecs_to_jiffies(p1->retire_blk_tov);
> +		p1->interval_ktime = ms_to_ktime(prb_calc_retire_blk_tmo(po,
> +						req_u->req3.tp_block_size));
>  	p1->blk_sizeof_priv = req_u->req3.tp_sizeof_priv;
>  	rwlock_init(&p1->blk_fill_in_prog_lock);
>  
> @@ -686,16 +675,6 @@ static void init_prb_bdqc(struct packet_sock *po,
>  	prb_open_block(p1, pbd);
>  }
>  
> -/*  Do NOT update the last_blk_num first.
> - *  Assumes sk_buff_head lock is held.
> - */
> -static void _prb_refresh_rx_retire_blk_timer(struct tpacket_kbdq_core *pkc)
> -{
> -	mod_timer(&pkc->retire_blk_timer,
> -			jiffies + pkc->tov_in_jiffies);
> -	pkc->last_kactive_blk_num = pkc->kactive_blk_num;

last_kactive_blk_num is now only updated on prb_open_block. It still
needs to be updated on each timer callback? To see whether the active
block did not change since the last callback.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ