lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20250903161709.563847-1-jackzxcui1989@163.com>
Date: Thu,  4 Sep 2025 00:17:09 +0800
From: Xin Zhao <jackzxcui1989@....com>
To: kerneljasonxing@...il.com,
	willemdebruijn.kernel@...il.com,
	edumazet@...gle.com,
	ferenc@...es.dev
Cc: davem@...emloft.net,
	kuba@...nel.org,
	pabeni@...hat.com,
	horms@...nel.org,
	netdev@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH net-next v10 2/2] net: af_packet: Use hrtimer to do the retire operation

On Tue, Sep 2, 2025 at 23:43 +0800 Jason Xing <kerneljasonxing@...il.com> wrote:

> >         p1->max_frame_len = p1->kblk_size - BLK_PLUS_PRIV(p1->blk_sizeof_priv);
> >         prb_init_ft_ops(p1, req_u);
> > -       prb_setup_retire_blk_timer(po);
> > +       hrtimer_setup(&p1->retire_blk_timer, prb_retire_rx_blk_timer_expired,
> > +                     CLOCK_MONOTONIC, HRTIMER_MODE_REL_SOFT);
> > +       hrtimer_start(&p1->retire_blk_timer, p1->interval_ktime,
> > +                     HRTIMER_MODE_REL_SOFT);
> 
> You expect to see it start at the setsockopt phase? Even if it's far
> from the real use of recv at the moment.
> 
> >         prb_open_block(p1, pbd);
> >  }

Before applying this patch, init_prb_bdqc also start the timer by mod_timer:

init_prb_bdqc
  prb_open_block
    _prb_refresh_rx_retire_blk_timer
      mod_timer

So the current timer's start time is almost the same as it was before applying
the patch.


> > @@ -917,7 +873,6 @@ static void prb_open_block(struct tpacket_kbdq_core *pkc1,
> >         pkc1->pkblk_end = pkc1->pkblk_start + pkc1->kblk_size;
> >
> >         prb_thaw_queue(pkc1);
> > -       _prb_refresh_rx_retire_blk_timer(pkc1);
> 
> Could you say more on why you remove this here and only reset/update
> the expiry time in the timer handler? Probably I missed something
> appearing in the previous long discussion.
> 
> >
> >         smp_wmb();
> >  }

In the description of [PATCH net-next v10 0/2] net: af_packet: optimize retire operation:

Changes in v7:
  When the callback return, without sk_buff_head lock protection, __run_hrtimer will
  enqueue the timer if return HRTIMER_RESTART. Setting the hrtimer expires while
  enqueuing a timer may cause chaos in the hrtimer red-black tree.

Neither hrtimer_set_expires nor hrtimer_forward_now is allowed when the hrtimer has
already been enqueued. Therefore, the only place where the hrtimer timeout can be set is
within the callback, at which point the hrtimer is in a non-enqueued state and can have
its timeout set.


Changes in v8:
  Simplify the logic related to setting timeout, as suggestd by Willem de Bruijn.
  Currently timer callback just restarts itself unconditionally, so delete the
 'out:' label, do not forward hrtimer in prb_open_block, call hrtimer_forward_now
  directly and always return HRTIMER_RESTART. The only special case is when
  prb_open_block is called from tpacket_rcv. That would set the timeout further
  into the future than the already queued timer. An earlier timeout is not
  problematic. No need to add complexity to avoid that.

This paragraph explains that if the block's retire timeout is not adjusted within
the timer callback, it will only result in an earlier-than-expected retire timeout,
which is not problematic. Therefore, it is unnecessary to increase the logical complexity
to ensure block retire timeout occurs as expected each time.


> The whole structure needs a new organization?
> 
> Before:
>         /* size: 152, cachelines: 3, members: 22 */
>         /* sum members: 144, holes: 2, sum holes: 8 */
>         /* paddings: 1, sum paddings: 4 */
>         /* last cacheline: 24 bytes */
> After:
>         /* size: 176, cachelines: 3, members: 19 */
>         /* sum members: 163, holes: 4, sum holes: 13 */
>         /* paddings: 1, sum paddings: 4 */
>         /* forced alignments: 1, forced holes: 1, sum forced holes: 6 */
>         /* last cacheline: 48 bytes */

What about the following organization:?

/* kbdq - kernel block descriptor queue */
struct tpacket_kbdq_core {
	struct pgv	*pkbdq;
	unsigned int	feature_req_word;
	unsigned int	hdrlen;
	unsigned short	kactive_blk_num;
	unsigned short	blk_sizeof_priv;
	unsigned char	reset_pending_on_curr_blk;

	char		*pkblk_start;
	char		*pkblk_end;
	int		kblk_size;
	unsigned int	max_frame_len;
	unsigned int	knum_blocks;
	char		*prev;
	char		*nxt_offset;

	unsigned short  version;
	
	uint64_t	knxt_seq_num;
	struct sk_buff	*skb;

	rwlock_t	blk_fill_in_prog_lock;

	/* timer to retire an outstanding block */
	struct hrtimer  retire_blk_timer;

	/* Default is set to 8ms */
#define DEFAULT_PRB_RETIRE_TOV	(8)

	ktime_t		interval_ktime;
};



Thanks
Xin Zhao


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ