netdev - Re: [PATCH net-next] tcp: Add tracepoint for rxtstamp coalescing


Message-ID: <8fe6208c-c408-4a14-acbc-84a1130b3ddf@linux.alibaba.com>
Date: Fri, 14 Jun 2024 19:27:46 +0800
From: Philo Lu <lulie@...ux.alibaba.com>
To: Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
 Mike Maloney <maloney@...gle.com>, Willem de Bruijn <willemb@...gle.com>
Cc: netdev@...r.kernel.org, rostedt@...dmis.org, mhiramat@...nel.org,
 mathieu.desnoyers@...icios.com, davem@...emloft.net, dsahern@...nel.org,
 kuba@...nel.org, xuanzhuo@...ux.alibaba.com, dust.li@...ux.alibaba.com,
 Soheil Hassas Yeganeh <soheil@...gle.com>
Subject: Re: [PATCH net-next] tcp: Add tracepoint for rxtstamp coalescing



On 2024/6/14 16:25, Eric Dumazet wrote:
> On Fri, Jun 14, 2024 at 10:09 AM Paolo Abeni <pabeni@...hat.com> wrote:
>>
>> On Tue, 2024-06-11 at 12:58 +0800, Philo Lu wrote:
>>> During tcp coalescence, rx timestamps of the former skb ("to" in
>>> tcp_try_coalesce), will be lost. This may lead to inaccurate
>>> timestamping results if skbs come out of order.
>>>
>>> Here is an example.
>>> Assume a message consists of 3 skbs, namely A, B, and C. And these skbs
>>> are processed by tcp in the following order:
>>> A -(1us)-> C -(1ms)-> B
>>
>> IMHO the above order makes the changelog confusing
>>
>>> If C is coalesced to B, the final rx timestamps of the message will be
>>> those of C. That is, the timestamps show that we received the message
>>> when C came (including hardware and software). However, we actually
>>> received it 1ms later (when B came).
>>>
>>> With the added tracepoint, we can recognize such cases and report them
>>> if we want.
>>
>> We really need very good reasons to add new tracepoints to TCP. I'm
>> unsure if the above example match such requirement. The reported
>> timestamp actually matches the first byte in the aggregate segment,
>> inferring anything more is IMHO stretching too far the API semantic.
>>
> 
> Note the current behavior was a conscious choice, see
> commit 98aaa913b4ed2503244 ("tcp: Extend SOF_TIMESTAMPING_RX_SOFTWARE
> to TCP recvmsg")
> for the rationale.
> 

IIUC, returning the timestamp of the skb with the highest sequence
number works well as long as packets arrive in order. Once reordering
occurs, however, tcp coalescing can cause this issue.
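
To make the A/C/B example from the changelog concrete, here is a toy
userspace model of it. This is only a sketch, not kernel code: the names
Skb and coalesce are invented for illustration, and the only behavior it
assumes is what the changelog describes, namely that after coalescing the
merged skb carries the timestamp of the skb that was merged in ("from")
and the timestamp of "to" is lost.

```python
from dataclasses import dataclass


@dataclass
class Skb:
    """Toy stand-in for an sk_buff: a byte range plus one rx timestamp."""
    name: str
    seq: int        # first sequence number covered
    end_seq: int    # one past the last sequence number covered
    tstamp_us: int  # arrival time in microseconds


def coalesce(to: Skb, frm: Skb) -> Skb:
    """Merge `frm` into `to`; the result keeps `frm`'s timestamp (assumed)."""
    assert to.end_seq == frm.seq, "only adjacent segments are merged"
    return Skb(to.name + frm.name, to.seq, frm.end_seq, frm.tstamp_us)


# Arrival order from the changelog: A -(1us)-> C -(1ms)-> B
a = Skb("A", 0, 100, 0)
c = Skb("C", 200, 300, 1)      # arrives out of order, has to wait for B
b = Skb("B", 100, 200, 1001)   # fills the hole about 1 ms later

merged = coalesce(b, c)        # "C is coalesced to B"
last_arrival_us = max(s.tstamp_us for s in (a, b, c))

print(merged.name, merged.tstamp_us, last_arrival_us)
```

In this model the merged skb reports C's arrival time, while the message
was only complete when B arrived, which is the gap described above.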

> Perhaps another application would need to add a new timestamp to report
> both the oldest and newest timestamps.

I prefer this approach: we do need both the oldest and newest timestamps
of a message to detect whether any packet was unexpectedly delayed after
being sent. But since there can be both hardware and software timestamps,
we may need more fields in sk_buff to carry these new timestamps.
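
As a rough illustration of why more fields would be needed, the sketch
below tracks an oldest/newest pair per timestamp source and merges the
pairs on coalescing. Everything here is hypothetical: Range, RxStamps and
single are made-up names, the timestamp values are invented, and this is
a userspace model of one possible reading of the idea, not a proposed
sk_buff layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Oldest and newest arrival time (microseconds) seen on one clock."""
    oldest_us: int
    newest_us: int

    def merge(self, other: "Range") -> "Range":
        return Range(min(self.oldest_us, other.oldest_us),
                     max(self.newest_us, other.newest_us))


@dataclass(frozen=True)
class RxStamps:
    """Hypothetical per-skb state: one range per timestamp source."""
    sw: Range  # software timestamps
    hw: Range  # hardware timestamps

    def merge(self, other: "RxStamps") -> "RxStamps":
        return RxStamps(self.sw.merge(other.sw), self.hw.merge(other.hw))


def single(sw_us: int, hw_us: int) -> RxStamps:
    """State for one freshly received skb: oldest == newest."""
    return RxStamps(Range(sw_us, sw_us), Range(hw_us, hw_us))


# Invented values: B arrives about 1 ms after C.
b = single(sw_us=1001, hw_us=1000)
c = single(sw_us=1, hw_us=0)

bc = b.merge(c)  # state after C is coalesced to B
print(bc.sw, bc.hw)
```

With two sources and two values each, a coalesced skb would carry four
timestamps instead of the two it has today.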

> 
> Or add a socket flag to prevent coalescing for applications needing
> precise timestamps.
> 
> Willem might know better about this.
> 
> I agree the tracepoint seems not needed. What about solving the issue instead ?
Thanks.

-- 
Philo