[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAL+tcoAgsWdNuwBep=Kch+YRkFnAmHS+OTA0Ffxqy=p6bH9hag@mail.gmail.com>
Date: Thu, 4 Sep 2025 10:09:54 +0800
From: Jason Xing <kerneljasonxing@...il.com>
To: Xin Zhao <jackzxcui1989@....com>
Cc: willemdebruijn.kernel@...il.com, edumazet@...gle.com, ferenc@...es.dev,
davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com, horms@...nel.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH net-next v10 1/2] net: af_packet: remove
last_kactive_blk_num field
On Wed, Sep 3, 2025 at 10:58 PM Xin Zhao <jackzxcui1989@....com> wrote:
>
> On Tue, Sep 2, 2025 at 22:04 PM Jason Xing <kerneljasonxing@...il.com> wrote:
>
> > > kactive_blk_num (K) is incremented on block close. last_kactive_blk_num (L)
> > > is set to match K on block open and each timer. So the only time that they
> > > differ is if a block is closed in tpacket_rcv and no new block could be
> > > opened.
> > > So the origin check L==K in timer callback only skip the case 'no new block
> > > to open'. If we remove L==K check, it will make prb_curr_blk_in_use check
> > > earlier, which will not cause any side effect.
> >
> > I believe the above commit message needs to be revised:
> > 1) the above sentence (starting from 'if we remove L....') means
> > nothing because your modification doesn't change the behaviour when
> > the queue is not frozen.
> > 2) lack of proofs/reasons on why exposing the prb_open_block() logic doesn't
> > cause side effects. It's the key proof that shows to future readers to
> > make sure this patch will not bring trouble.
>
> This diff file may not clearly demonstrate the changes made to the
> prb_retire_rx_blk_timer_expired function. We have simply removed the check for
> pkc->last_kactive_blk_num == pkc->kactive_blk_num; the other logic remains unchanged.
> Therefore, we should only need to explain why removing the check for
> pkc->last_kactive_blk_num == pkc->kactive_blk_num will not cause any negative impacts.
>
> I will clarify in the commit that our change to prb_retire_rx_blk_timer_expired only
> involves removing the check for pkc->last_kactive_blk_num == pkc->kactive_blk_num,
> to ensure that everyone understands this point, as it may not be clearly visible from
> the diff file.
>
> The commit may be changed as follow:
>
> kactive_blk_num (K) is only incremented on block close.
> In timer callback prb_retire_rx_blk_timer_expired, except delete_blk_timer
> is true, last_kactive_blk_num (L) is set to match kactive_blk_num (K) in
> all cases. L is also set to match K in prb_open_block.
> The only case K not equal to L is when scheduled by tpacket_rcv
> and K is just incremented on block close but no new block could be opened,
> so that it does not call prb_open_block in prb_dispatch_next_block.
> This patch modifies the prb_retire_rx_blk_timer_expired function by simply
> removing the check for L == K. Why can we remove the check for L == K in
> timer callback?
> In prb_freeze_queue, reset_pending_on_curr_blk (R) is set to 1. If R == 1,
> prb_queue_frozen return 1. In prb_retire_rx_blk_timer_expired,
> frozen = prb_queue_frozen(pkc); so frozen is 1 when R == 1.
>
> Consider the following case:
> (before applying this patch)
> cpu0 cpu1
> tpacket_rcv
> ...
> prb_dispatch_next_block
> prb_freeze_queue (R = 1)
> prb_retire_rx_blk_timer_expired
> L != K
> _prb_refresh_rx_retire_blk_timer
> refresh timer
> set L = K
I do not think the above can happen because:
1) tpacket_rcv() owns the sk_receive_queue.lock and then calls
packet_current_rx_frame()->__packet_lookup_frame_in_block()->prb_dispatch_next_block()
2) the timer prb_retire_rx_blk_timer_expired() also needs to acquire
the same lock first.
> (after applying this patch)
> cpu0 cpu1
> tpacket_rcv
> ...
> prb_dispatch_next_block
> prb_freeze_queue (R = 1)
> prb_retire_rx_blk_timer_expired
> !forzen is 0
> check prb_curr_blk_in_use
> if true
> same as (before apply)
> if false
> prb_open_block
> Before applying this patch, prb_retire_rx_blk_timer_expired will do nothing
> but refresh timer and set L = K in the case above. After applying this
> patch, it will check prb_curr_blk_in_use and call prb_open_block if
> user-space caught up.
The major difference after this patch is that even if L != K we would
call prb_open_block(). So I think the key point is that this patch
provides another checkpoint to thaw the might-be-frozen block in any
case. It doesn't have any effect because
__packet_lookup_frame_in_block() has the same logic and does it again
without this patch when detecting the ring is frozen. The patch only
advances checking the status of the ring.
Thanks,
Jason
>
>
> Please check if the above description is appropriate. I will change the
> description as above in PATCH v11.
>
>
> >
> > >
> > > Signed-off-by: Xin Zhao <jackzxcui1989@....com>
> >
> > It was suggested by Willem, so please add:
> > Suggested-by: Willem de Bruijn <willemdebruijn.kernel@...il.com>
>
> Okay, I will add it in the PATCH v11.
>
> >
> > So far, it looks good to me as well:
> > Reviewed-by: Jason Xing <kerneljasonxing@...il.com>
> >
> > And I will finish reviewing the other patch by tomorrow :)
>
>
> Thanks
> Xin Zhao
>
Powered by blists - more mailing lists