netdev - Re: [PATCH net-next] tcp: add tracepoint trace_tcp_retransmit

Message-ID: <876c1022-7922-b7b5-8b1f-95f013ed6097@fb.com>
Date:   Fri, 27 Oct 2017 13:58:03 -0700
From:   Alexei Starovoitov <ast@...com>
To:     Alban Crequy <alban.crequy@...il.com>,
        Song Liu <songliubraving@...com>
CC:     <alexei.starovoitov@...il.com>, <kafai@...com>,
        <netdev@...r.kernel.org>, <liu.song.a23@...il.com>
Subject: Re: [PATCH net-next] tcp: add tracepoint
 trace_tcp_retransmit_synack()

On 10/27/17 1:38 PM, Alban Crequy wrote:
> Hi,
>
> On 25 October 2017 at 01:57, Song Liu <songliubraving@...com> wrote:
>> This tracepoint can be used to trace synack retransmits. It maintains
>> a pointer to struct request_sock.
>>
>> We cannot simply reuse trace_tcp_retransmit_skb() here, because the
>> sk here is the LISTEN socket. The IP addresses and ports should be
>> extracted from struct request_sock.
>>
>> Signed-off-by: Song Liu <songliubraving@...com>
>> ---
>>  include/trace/events/tcp.h | 56 ++++++++++++++++++++++++++++++++++++++++++++++
>>  net/ipv4/tcp_output.c      |  1 +
>>  2 files changed, 57 insertions(+)
>>
>> diff --git a/include/trace/events/tcp.h b/include/trace/events/tcp.h
>> index 03699ba..07cccca 100644
>> --- a/include/trace/events/tcp.h
>> +++ b/include/trace/events/tcp.h
>> @@ -237,6 +237,62 @@ TRACE_EVENT(tcp_set_state,
>>                   show_tcp_state_name(__entry->newstate))
>>  );
>>
>> +TRACE_EVENT(tcp_retransmit_synack,
>> +
>> +       TP_PROTO(const struct sock *sk, const struct request_sock *req),
>> +
>> +       TP_ARGS(sk, req),
>> +
>> +       TP_STRUCT__entry(
>> +               __field(const void *, skaddr)
>> +               __field(const void *, req)
>> +               __field(__u16, sport)
>> +               __field(__u16, dport)
>> +               __array(__u8, saddr, 4)
>> +               __array(__u8, daddr, 4)
>> +               __array(__u8, saddr_v6, 16)
>> +               __array(__u8, daddr_v6, 16)
>
> Would it make sense to add the inode of the network namespace that
> owns the socket? (along with the major/minor of the nsfs)

We cannot do this.
The netns ino is not a unique identifier of a netns.
Such a hack is possible only inside bpf programs, by
walking skb->dev->nd_net->net with bpf_probe_read(), while accepting
that this is an unstable interface and not technically correct.

> If the kernel later gains tracepoints for TCP connect, accept, close
> including the netns ino, then I might be able to replace some
> ebpf-kprobes code by ebpf-tracepoints code :)

What is the use case for tracepoints in connect/accept/close?
Just because some _useful_ bcc script uses a kprobe on a particular
kernel function doesn't mean that we need a tracepoint there.
imo the general rule for tracepoints is to add them only when it's 100%
certain that this is the right place for one and the kprobe approach
is not enough or not possible.
In the case of the recently added tcp tracepoints, the main thing
they achieve (vs our old kprobe approach) is that they are accurate.
In this particular patch the kprobe on tcp_rtx_synack() is not
the same as trace_tcp_retransmit_synack(), since it incorrectly
counts failed send_synack(). It's solvable via a kretprobe
on tcp_rtx_synack() and checking %rax inside the bpf program,
but kretprobes add runtime overhead and are much slower than tracepoints.
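
To illustrate the accuracy point, here is a rough user-space sketch. It is not kernel code: every name in it (rtx_counters, model_send_synack, model_rtx_synack, model_kretprobe) is hypothetical, and it assumes the tracepoint sits in the success branch after send_synack(), as the comparison above implies. It only models which counter moves when.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct rtx_counters {
    unsigned long entry_hits;    /* what a kprobe at function entry would count */
    unsigned long success_hits;  /* what the tracepoint would count */
    unsigned long ret_zero_hits; /* what a kretprobe checking the return value would count */
};

/* Stand-in for send_synack(): returns 0 on success, nonzero on failure. */
static int model_send_synack(bool fail)
{
    return fail ? -1 : 0;
}

/* Stand-in for tcp_rtx_synack(), assuming the tracepoint is in the success branch. */
static int model_rtx_synack(struct rtx_counters *c, bool fail)
{
    int res;

    c->entry_hits++;            /* an entry kprobe fires regardless of the outcome */
    res = model_send_synack(fail);
    if (!res)
        c->success_hits++;      /* the tracepoint fires only for a successful send */
    return res;
}

/* Stand-in for a kretprobe handler that inspects the return value (%rax on x86-64). */
static void model_kretprobe(struct rtx_counters *c, int ret)
{
    if (ret == 0)
        c->ret_zero_hits++;
}
```

Feeding the return value of model_rtx_synack() into model_kretprobe() for a mix of successful and failed sends leaves success_hits equal to ret_zero_hits, while entry_hits also includes the failures. That is the over-count attributed to the plain kprobe above.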