netdev - Re: [PATCH net-next v5] net: af_packet: Use hrtimer to do the retire operation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <willemdebruijn.kernel.85f8d3f87b8b@gmail.com>
Date: Tue, 19 Aug 2025 16:21:43 -0400
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>, 
 Xin Zhao <jackzxcui1989@....com>, 
 willemdebruijn.kernel@...il.com, 
 edumazet@...gle.com, 
 ferenc@...es.dev
Cc: davem@...emloft.net, 
 kuba@...nel.org, 
 pabeni@...hat.com, 
 horms@...nel.org, 
 netdev@...r.kernel.org, 
 linux-kernel@...r.kernel.org, 
 Xin Zhao <jackzxcui1989@....com>
Subject: Re: [PATCH net-next v5] net: af_packet: Use hrtimer to do the retire
 operation

Willem de Bruijn wrote:
> Xin Zhao wrote:
> > In a system with high real-time requirements, the timeout mechanism of
> > ordinary timers with jiffies granularity is insufficient to meet the
> > demands for real-time performance. Meanwhile, the optimization of CPU
> > usage with af_packet is quite significant. Use hrtimer instead of timer
> > to help compensate for the shortcomings in real-time performance.
> > In HZ=100 or HZ=250 system, the update of TP_STATUS_USER is not real-time
> > enough, with fluctuations reaching over 8ms (on a system with HZ=250).
> > This is unacceptable in some high real-time systems that require timely
> > processing of network packets. By replacing it with hrtimer, if a timeout
> > of 2ms is set, the update of TP_STATUS_USER can be stabilized to within
> > 3 ms.
> > 
> > Signed-off-by: Xin Zhao <jackzxcui1989@....com>
> 
> > -static void _prb_refresh_rx_retire_blk_timer(struct tpacket_kbdq_core *pkc)
> > +static void _prb_refresh_rx_retire_blk_timer(struct tpacket_kbdq_core *pkc,
> > +					     bool start, bool callback)
> >  {
> > -	mod_timer(&pkc->retire_blk_timer,
> > -			jiffies + pkc->tov_in_jiffies);
> > +	unsigned long flags;
> > +
> > +	local_irq_save(flags);
> 
> The two environments that can race are the timer callback running in
> softirq context or the open_block from tpacket_rcv in process context.

I meant prb_open_block

tpacket_rcv runs in softirq context (from __netif_receive_skb_core)
or with bottom halves disabled (from __dev_queue_xmit, or if rx uses
napi_threaded).

That is likely why the spin_lock_bh variant is not explicitly needed.

> So worst case the process context path needs to disable bh?
> 
> As you pointed out, the accesses to the hrtimer fields are already
> protected, by the caller holding sk.sk_receive_queue.lock.
> 
> So it should be sufficient to just test hrtimer_is_queued inside that
> critical section before calling hrtimer_start?
> 
> Side-note: tpacket_rcv calls spin_lock, not spin_lock_bh. But if the
> same lock can also be taken in softirq context, the process context
> caller should use the _bh variant. This is not new with your patch.
> Classical timers also run in softirq context. I may be overlooking
> something, will need to take a closer look at that.
> 
> In any case, I don't think local_irq_save is needed. 
> 
> > +	if (start && !callback)
> > +		hrtimer_start(&pkc->retire_blk_timer, pkc->interval_ktime,
> > +			      HRTIMER_MODE_REL_SOFT);
> > +	else
> > +		hrtimer_forward_now(&pkc->retire_blk_timer, pkc->interval_ktime);
> > +	local_irq_restore(flags);
> >  	pkc->last_kactive_blk_num = pkc->kactive_blk_num;
> >  }