linux-kernel - Re: [PATCH net-next v9] net: af_packet: Use hrtimer to do the retire operation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <willemdebruijn.kernel.801a3a33fec@gmail.com>
Date: Thu, 28 Aug 2025 19:16:41 -0400
From: Willem de Bruijn <willemdebruijn.kernel@...il.com>
To: Xin Zhao <jackzxcui1989@....com>, 
 willemdebruijn.kernel@...il.com, 
 edumazet@...gle.com, 
 ferenc@...es.dev
Cc: davem@...emloft.net, 
 kuba@...nel.org, 
 pabeni@...hat.com, 
 horms@...nel.org, 
 netdev@...r.kernel.org, 
 linux-kernel@...r.kernel.org, 
 Xin Zhao <jackzxcui1989@....com>
Subject: Re: [PATCH net-next v9] net: af_packet: Use hrtimer to do the retire
 operation

Xin Zhao wrote:
> In a system with high real-time requirements, the timeout mechanism of
> ordinary timers with jiffies granularity is insufficient to meet the
> demands for real-time performance. Meanwhile, the optimization of CPU
> usage with af_packet is quite significant. Use hrtimer instead of timer
> to help compensate for the shortcomings in real-time performance.
> In HZ=100 or HZ=250 system, the update of TP_STATUS_USER is not real-time
> enough, with fluctuations reaching over 8ms (on a system with HZ=250).
> This is unacceptable in some high real-time systems that require timely
> processing of network packets. By replacing it with hrtimer, if a timeout
> of 2ms is set, the update of TP_STATUS_USER can be stabilized to within
> 3 ms.
> 
> Signed-off-by: Xin Zhao <jackzxcui1989@....com>
> 
> ---
> Changes in v9:
> - Remove the function prb_setup_retire_blk_timer and move hrtimer setup and start
>   logic into function init_prb_bdqc
>   as suggested by Willem de Bruijn;
> - Always update last_kactive_blk_num before hrtimer callback return as the origin
>   logic does, as suggested by Willem de Bruijn.
>   In tpacket_rcv, it may call prb_close_block but do not call prb_open_block in
>   prb_dispatch_next_block, leading to inconsistency between last_kactive_blk_num
>   and kactive_blk_num. In hrtimer callback, we should update last_kactive_blk_num
>   in this case.

Overall I'm on-board with this version.

One remaining question is the intended behavior of kactive_blk_num (K)
and last_kactive_blk_num (L).

K is incremented on block close. L is set to match K on block open and
each timer. So the only time that they differ is if a block is closed
in tpacket_rcv and no new block could be opened.

The only use of L is that the core of the timer callback is skipped if
L != K. Based on the above that can only happen if a block was closed
in tpacket_rcv and no new block could be opened (because the ring is
full), so the ring is frozen. So it only skips the frozen part of the
timer callback. Until the next timeout. But why? If the queue is
frozen and the next block is no longer in use, just call
prb_open_block right away?

Unless I'm missing something, I think we can make that simplification.
Then we won't have to worry that the behavior changes after this
patch. It should be a separate precursor patch though.