Date: Sun, 7 Apr 2024 15:51:22 +0800
From: Menglong Dong <menglong8.dong@...il.com>
To: Jason Xing <kerneljasonxing@...il.com>, Eric Dumazet <edumazet@...gle.com>, jmaloy@...hat.com
Cc: netdev@...r.kernel.org, davem@...emloft.net, kuba@...nel.org, 
	passt-dev@...st.top, sbrivio@...hat.com, lvivier@...hat.com, 
	dgibson@...hat.com, eric.dumazet@...il.com, dongmenglong.8@...edance.com
Subject: Re: [net-next 2/2] tcp: correct handling of extreme menory squeeze

On Sun, Apr 7, 2024 at 2:52 PM Jason Xing <kerneljasonxing@...il.com> wrote:
>
> On Sun, Apr 7, 2024 at 2:38 AM Eric Dumazet <edumazet@...gle.com> wrote:
> >
> > On Sat, Apr 6, 2024 at 8:21 PM <jmaloy@...hat.com> wrote:
> > >
> > > From: Jon Maloy <jmaloy@...hat.com>
> > >
> > > Testing of the previous commit ("tcp: add support for SO_PEEK_OFF")
> > > in this series along with the pasta protocol splicer revealed a bug in
> > > the way tcp handles window advertising during extreme memory squeeze
> > > situations.
> > >
> > > The excerpt from the logging session below shows what is happening:
> > >
> > > [5201<->54494]:     ==== Activating log @ tcp_select_window()/268 ====
> > > [5201<->54494]:     (inet_csk(sk)->icsk_ack.pending & ICSK_ACK_NOMEM) --> TRUE
> > > [5201<->54494]:   tcp_select_window(<-) tp->rcv_wup: 2812454294, tp->rcv_wnd: 5812224, tp->rcv_nxt 2818016354, returning 0
> > > [5201<->54494]:   ADVERTISING WINDOW SIZE 0
> > > [5201<->54494]: __tcp_transmit_skb(<-) tp->rcv_wup: 2812454294, tp->rcv_wnd: 5812224, tp->rcv_nxt 2818016354
> > >
> > > [5201<->54494]: tcp_recvmsg_locked(->)
> > > [5201<->54494]:   __tcp_cleanup_rbuf(->) tp->rcv_wup: 2812454294, tp->rcv_wnd: 5812224, tp->rcv_nxt 2818016354
> > > [5201<->54494]:     (win_now: 250164, new_win: 262144 >= (2 * win_now): 500328))? --> time_to_ack: 0
> > > [5201<->54494]:     NOT calling tcp_send_ack()
> > > [5201<->54494]:   __tcp_cleanup_rbuf(<-) tp->rcv_wup: 2812454294, tp->rcv_wnd: 5812224, tp->rcv_nxt 2818016354
> > > [5201<->54494]: tcp_recvmsg_locked(<-) returning 131072 bytes, window now: 250164, qlen: 83
> > >
> > > [...]
> >
> > I would prefer a packetdrill test, it is not clear what is happening...
> >
> > In particular, have you used SO_RCVBUF ?
> >
> > >
> > > [5201<->54494]: tcp_recvmsg_locked(->)
> > > [5201<->54494]:   __tcp_cleanup_rbuf(->) tp->rcv_wup: 2812454294, tp->rcv_wnd: 5812224, tp->rcv_nxt 2818016354
> > > [5201<->54494]:     (win_now: 250164, new_win: 262144 >= (2 * win_now): 500328))? --> time_to_ack: 0
> > > [5201<->54494]:     NOT calling tcp_send_ack()
> > > [5201<->54494]:   __tcp_cleanup_rbuf(<-) tp->rcv_wup: 2812454294, tp->rcv_wnd: 5812224, tp->rcv_nxt 2818016354
> > > [5201<->54494]: tcp_recvmsg_locked(<-) returning 131072 bytes, window now: 250164, qlen: 1
> > >
> > > [5201<->54494]: tcp_recvmsg_locked(->)
> > > [5201<->54494]:   __tcp_cleanup_rbuf(->) tp->rcv_wup: 2812454294, tp->rcv_wnd: 5812224, tp->rcv_nxt 2818016354
> > > [5201<->54494]:     (win_now: 250164, new_win: 262144 >= (2 * win_now): 500328))? --> time_to_ack: 0
> > > [5201<->54494]:     NOT calling tcp_send_ack()
> > > [5201<->54494]:   __tcp_cleanup_rbuf(<-) tp->rcv_wup: 2812454294, tp->rcv_wnd: 5812224, tp->rcv_nxt 2818016354
> > > [5201<->54494]: tcp_recvmsg_locked(<-) returning 57036 bytes, window now: 250164, qlen: 0
> > >
> > > [5201<->54494]: tcp_recvmsg_locked(->)
> > > [5201<->54494]:   __tcp_cleanup_rbuf(->) tp->rcv_wup: 2812454294, tp->rcv_wnd: 5812224, tp->rcv_nxt 2818016354
> > > [5201<->54494]:     NOT calling tcp_send_ack()
> > > [5201<->54494]:   __tcp_cleanup_rbuf(<-) tp->rcv_wup: 2812454294, tp->rcv_wnd: 5812224, tp->rcv_nxt 2818016354
> > > [5201<->54494]: tcp_recvmsg_locked(<-) returning -11 bytes, window now: 250164, qlen: 0
> > >
> > > We can see that although we are advertising a window size of zero,
> > > tp->rcv_wnd is not updated accordingly. This leads to a discrepancy
> > > between this side's and the peer's view of the current window size.
> > > - The peer thinks the window is zero, and stops sending.

Hi!

In my original logic, the client sends a zero-window
ACK when it drops the skb because it is out of
memory, and the peer SHOULD keep retransmitting the
dropped packet.

Does the peer retransmit in this case? The receive
window of the peer SHOULD recover once the
retransmission succeeds.

> > > - This side ends up in a cycle where it repeatedly calculates a new
> > >   window size it finds too small to advertise.

Yeah, the zero window suppresses the sending of the ACK in
__tcp_cleanup_rbuf(), which I wasn't aware of.
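
For reference, the arithmetic in the quoted log can be reproduced with a
small userspace sketch. This is not the kernel code: struct win_state,
receive_window_now() and window_update_worth_ack() are hypothetical names,
and the logic is only assumed to mirror the window computation and the
"new_win >= 2 * win_now" check that the log prints.

```c
#include <stdint.h>

/* Hypothetical stand-in for the three tcp_sock fields shown in the log. */
struct win_state {
    uint32_t rcv_wup;  /* rcv_nxt at the time of the last window update */
    uint32_t rcv_wnd;  /* window this side believes it last offered */
    uint32_t rcv_nxt;  /* next sequence number expected */
};

/* Window this side still considers open: rcv_wup + rcv_wnd - rcv_nxt,
 * computed modulo 2^32 and clamped at zero (assumed to match the
 * "win_now" value in the log). */
static uint32_t receive_window_now(const struct win_state *s)
{
    int32_t win = (int32_t)(s->rcv_wup + s->rcv_wnd - s->rcv_nxt);

    return win < 0 ? 0 : (uint32_t)win;
}

/* A window-update ACK is considered worthwhile only if the newly
 * computed window is non-zero and at least twice the current one. */
static int window_update_worth_ack(uint32_t win_now, uint32_t new_win)
{
    return new_win != 0 && new_win >= 2 * win_now;
}
```

With the logged values (rcv_wup 2812454294, rcv_wnd 5812224, rcv_nxt
2818016354), receive_window_now() gives 250164, and 262144 is less than
2 * 250164 = 500328, so no ACK is sent. Since rcv_wnd was not reset when
the zero window was advertised, every later read sees the same numbers.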

The ACK will recover the receive window of the peer. Does
it make the peer retransmit the dropped data immediately?
In my opinion, the peer still has to wait for the
retransmission timer to expire before it retransmits the
dropped packet. Is that right?

If so, maybe we can retransmit immediately
when we are in a zero-window state caused by a window shrink,
which would make the recovery faster.

[......]
> > Any particular reason to not cc Menglong Dong ?
> > (I just did)
>
> He is not working at Tencent any more. Let me CC here one more time.

Thanks for CCing my new email address, it's very kind of you,
Xing :/
