lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230109103922.656eb286@gandalf.local.home>
Date:   Mon, 9 Jan 2023 10:39:22 -0500
From:   Steven Rostedt <rostedt@...dmis.org>
To:     Eric Dumazet <edumazet@...gle.com>
Cc:     运辉崔 <cuiyunhui@...edance.com>,
        mhiramat@...nel.org, davem@...emloft.net, kuba@...nel.org,
        pabeni@...hat.com, kuniyu@...zon.com, xiyou.wangcong@...il.com,
        duanxiongchun@...edance.com, linux-kernel@...r.kernel.org,
        linux-trace-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: [External] Re: [PATCH v4] sock: add tracepoint for send recv
 length

On Mon, 9 Jan 2023 16:20:58 +0100
Eric Dumazet <edumazet@...gle.com> wrote:

> On Mon, Jan 9, 2023 at 4:08 PM Steven Rostedt <rostedt@...dmis.org> wrote:
> >
> > On Mon, 9 Jan 2023 15:54:38 +0100
> > Eric Dumazet <edumazet@...gle.com> wrote:
> >  
> > > > static inline int sock_sendmsg_nosec(struct socket *sock, struct msghdr *msg)
> > > > {
> > > >         int ret = INDIRECT_CALL_INET(sock->ops->sendmsg, inet6_sendmsg,
> > > >                                      inet_sendmsg, sock, msg,
> > > >                                      msg_data_left(msg));
> > > >         BUG_ON(ret == -EIOCBQUEUED);
> > > >
> > > >         if (trace_sock_send_length_enabled()) {  
> > >
> > > A barrier() is needed here, with the current state of affairs.
> > >
> > > IMO, ftrace/x86 experts should take care of this generic issue ?  
> >
> > trace_*_enabled() is a static_branch() (aka. jump label).
> >
> > It's a nop, where the if block is in the out-of-line code and skipped. When
> > the tracepoint is enabled, it gets turned into a jump to the if block
> > (which returns back to this location).
> >  
> 
> This is not a nop, as shown in the generated assembly, I copied in
> this thread earlier.

But I thought that was for the patch before the added noinline helper
function to force the tracepoint into its own function, and this now just
has the static branch.

> 
> Compiler does all sorts of things before the point the static branch
> is looked at.
> 
> Let's add the extract again with <<*>> tags on added instructions/dereferences.
> 
> 

Ug, bad line wrapping

> sock_recvmsg_nosec:
>         pushq   %r12    #
>         movl    %edx, %r12d     # tmp123, flags
>         pushq   %rbp    #
> # net/socket.c:999:     int ret =
> INDIRECT_CALL_INET(sock->ops->recvmsg, inet6_recvmsg,
>         movl    %r12d, %ecx     # flags,
> # net/socket.c:998: {
>         movq    %rdi, %rbp      # tmp121, sock
>         pushq   %rbx    #
> # net/socket.c:999:     int ret =
> INDIRECT_CALL_INET(sock->ops->recvmsg, inet6_recvmsg,
>         movq    32(%rdi), %rax  # sock_19(D)->ops, sock_19(D)->ops
> # ./include/linux/uio.h:270:    return i->count;
>         movq    32(%rsi), %rdx  # MEM[(const struct iov_iter
> *)msg_20(D) + 16B].count, pretmp_48
> # net/socket.c:999:     int ret =
> INDIRECT_CALL_INET(sock->ops->recvmsg, inet6_recvmsg,
>         movq    144(%rax), %rax # _1->recvmsg, _2
>         cmpq    $inet6_recvmsg, %rax    #, _2
>         jne     .L107   #,
>         call    inet6_recvmsg   #
>  <<*>>       movl    %eax, %ebx      # tmp124, <retval>
> .L108:
> # net/socket.c:1003:    trace_sock_recv_length(sock->sk, sock->sk->sk_family,
>   <<*>>      xorl    %r8d, %r8d      # tmp127
>    <<*>>     testl   %ebx, %ebx      # <retval>
> # net/socket.c:1004:                           sock->sk->sk_protocol,
>  <<*>>       movq    24(%rbp), %rsi  # sock_19(D)->sk, _10
> # net/socket.c:1003:    trace_sock_recv_length(sock->sk, sock->sk->sk_family,
>  <<*>>       cmovle  %ebx, %r8d      # <retval>,, tmp119
>   <<*>>      testb   $2, %r12b       #, flags
> # net/socket.c:1004:                           sock->sk->sk_protocol,
>   <<*>>      movzwl  516(%rsi), %ecx # _10->sk_protocol,
> # net/socket.c:1003:    trace_sock_recv_length(sock->sk, sock->sk->sk_family,
>   <<*>>      movzwl  16(%rsi), %edx  # _10->__sk_common.skc_family,
> # net/socket.c:1003:    trace_sock_recv_length(sock->sk, sock->sk->sk_family,
>   <<*>>      cmove   %ebx, %r8d      # tmp119,, <retval>, iftmp.54_16
> # ./arch/x86/include/asm/jump_label.h:27:       asm_volatile_goto("1:"
> #APP
> # 27 "./arch/x86/include/asm/jump_label.h" 1
>         1:jmp .L111 # objtool NOPs this         #
>         .pushsection __jump_table,  "aw"
>          .balign 8
>         .long 1b - .
>         .long .L111 - .         #
>          .quad __tracepoint_sock_recv_length+8 + 2 - .  #,
>         .popsection
> 
> # 0 "" 2
> #NO_APP
> .L106:
> # net/socket.c:1008: }
>  <<*>>       movl    %ebx, %eax      # <retval>,
>         popq    %rbx    #
>         popq    %rbp    #
>         popq    %r12    #
>         ret
> .L111:
> # ./include/trace/events/sock.h:308: DEFINE_EVENT(sock_msg_length,
> sock_recv_length,
> 
> > That is, when the tracepoint in the block gets enabled so does the above
> > branch. Sure, there could be a race between the two being enabled, but I
> > don't see any issue if there is. But the process to modify the jump labels,
> > does a bunch of synchronization between the CPUs.
> >
> > What barrier are you expecting?  
> 
> Something preventing the compiler being 'smart', forcing expression evaluations
> before TP_fast_assign() is eventually called.

There's no good way to generically keep the compiler from being 'smart',
that I know of. That's because the tracepoint injection is defined by:

#define __DECLARE_TRACE(name, proto, args, cond, data_proto)		\
	extern int __traceiter_##name(data_proto);			\
	DECLARE_STATIC_CALL(tp_func_##name, __traceiter_##name);	\
	extern struct tracepoint __tracepoint_##name;			\
	static inline void __trace_##name(proto)			\
	{								

That (proto) is the prototype being passed in. And because macros can't
create other macros, I don't know how to have a way to inject a barrier()
before and after that call, or better yet, to have the prototype hidden
behind the static_branch.


But looking at this tracepoint again, I see a issue that can help with the
dereferencing.

Why is family and protocol passed in?

+	trace_sock_send_length(sock->sk, sock->sk->sk_family,
+			       sock->sk->sk_protocol, ret, 0);


Where the TP_fast_assign() is:

+	TP_fast_assign(
+		__entry->sk = sk;
+		__entry->family = sk->sk_family;
+		__entry->protocol = sk->sk_protocol;
+		__entry->length = ret > 0 ? ret : 0;
+		__entry->error = ret < 0 ? ret : 0;
+		__entry->flags = flags;
+	),

The family and protocol is taken from the sk, and not the parameters. I bet
dropping those would help.

-- Steve

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ