Message-ID: <7a32cbad-ea81-49a1-970d-faa731a6041e@mail.uni-paderborn.de>
Date: Wed, 28 May 2025 01:40:57 +0200
From: Dennis Baurichter <dennisba@...l.uni-paderborn.de>
To: Neal Cardwell <ncardwell@...gle.com>
Cc: netdev@...r.kernel.org, netfilter@...r.kernel.org,
 Eric Dumazet <edumazet@...gle.com>
Subject: Re: Issue with delayed segments despite TCP_NODELAY

Hi Neal,

On 26.05.25 at 15:50, Neal Cardwell wrote:
>> We would very much appreciate it if someone could help us on the
>> following questions:
>> - Why are the remaining segments not sent out immediately, despite
>> TCP_NODELAY?
>> - Is there a way to change this?
>> - If not, do you have better workarounds than injecting a fake ACK
>> pretending to come "from the server" via a raw socket?
>>     Actually, we haven't tried this yet, but probably will soon.
> 
> Sounds like you are probably seeing the effects of TCP Small Queues
> (TSQ) limiting the number of skbs queued in various layers of the
> sending machine. See tcp_small_queue_check() for details.

Thank you so much! I compiled v6.15 with a tcp_small_queue_check() that 
I patched to always return false and things just worked (again)! Now I 
wrote a small module using kretprobe and regs_set_return_value() to 
allow us to apply this change a bit more selectively (and without 
recompiling the entire kernel). That's probably not optimal for anything 
that should be widely deployed, but since we are currently just 
experimenting and don't even know what might be actually used later on, 
it seems good enough for now.

> Probably with shorter RTTs the incoming ACKs clear skbs from the rtx
> queue, and thus the tcp_small_queue_check() call to
> tcp_rtx_queue_empty_or_single_skb(sk) returns true and
> tcp_small_queue_check() returns false, enabling transmissions.

Honestly, I still don't quite understand why this works the way it does. 
We intercept all outgoing (initial) payload segments before we NF_ACCEPT 
any of them (i.e., collect all first, then release), so after the 
handshake itself there shouldn't be any skb clearing triggered by new 
ACKs from our server... Oh well. In any case, it does work, and I'm 
happy with that.
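
For readers trying to follow the logic discussed above, here is a minimal userspace sketch of the decision that tcp_small_queue_check() makes, as far as it can be paraphrased from the description in this thread. It is not the kernel code: struct tsq_model, tsq_limit() and tsq_should_throttle() are hypothetical names, the fields only loosely mirror sk_wmem_alloc, sk_pacing_rate, sk_pacing_shift and the tcp_limit_output_bytes sysctl, and the tcp_tx_delay adjustment and the TSQ_THROTTLED flag handling are left out. One plausible reading, stated here as an assumption, is that segments held back in a netfilter queue have not been freed yet and so still count toward the socket's write memory while also sitting in the rtx queue, which would keep the check returning true until an ACK or a TX completion arrives.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the socket state the check reads. */
struct tsq_model {
    unsigned long wmem_alloc;         /* bytes of skbs handed down but not yet freed */
    unsigned long pacing_rate;        /* pacing rate in bytes per second */
    unsigned int  pacing_shift;       /* limit is roughly pacing_rate >> pacing_shift */
    unsigned long limit_output_bytes; /* upper bound on the per-socket limit */
    unsigned int  rtx_queue_len;      /* skbs sent but not yet ACKed */
};

/* Per-socket byte budget: at least two skbs' worth, capped by the sysctl. */
static unsigned long tsq_limit(const struct tsq_model *sk,
                               unsigned long skb_truesize,
                               unsigned int factor)
{
    unsigned long limit = 2 * skb_truesize;
    unsigned long paced = sk->pacing_rate >> sk->pacing_shift;

    if (paced > limit)
        limit = paced;
    if (limit > sk->limit_output_bytes)
        limit = sk->limit_output_bytes;
    return limit << factor;
}

/* Returns true when further transmissions should be deferred. */
static bool tsq_should_throttle(const struct tsq_model *sk,
                                unsigned long skb_truesize,
                                unsigned int factor)
{
    if (sk->wmem_alloc > tsq_limit(sk, skb_truesize, factor)) {
        /* Escape hatch: with an empty or single-skb rtx queue, always send. */
        if (sk->rtx_queue_len <= 1)
            return false;
        return true;
    }
    return false;
}
```

With three unfreed skbs of 2304 bytes each (an arbitrary illustrative size) and a low pacing rate, the budget is two skbs, so the model throttles as long as more than one skb remains in the rtx queue. Patching the real function to always return false, as described above, bypasses exactly this decision.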

Thanks again,
Dennis
