lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 16 Jan 2019 11:16:07 +0100
From:   Andreas Ziegler <andreas.ziegler@....de>
To:     Masami Hiramatsu <mhiramat@...nel.org>
Cc:     Steven Rostedt <rostedt@...dmis.org>,
        Ingo Molnar <mingo@...hat.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: uprobes: bug in comm/string output?

Hi,

thanks for your reply!

On 1/16/19 11:00 AM, Masami Hiramatsu wrote:
> On Tue, 15 Jan 2019 14:36:48 +0100
> Andreas Ziegler <andreas.ziegler@....de> wrote:
> 
>> Hi again,
>>
>> On 1/14/19 1:38 PM, Andreas Ziegler wrote:
>>> Hi,
>>>
>>> I've been playing around with uprobes today and found the following weird behaviour/output when using more than one string argument (or using the $comm argument). This was run on a v4.20 mainline build on Ubuntu 18.04.
>>>
>>> root@...ntu1810:~# uname -a
>>> Linux ubuntu1810 4.20.0-042000-generic #201812232030 SMP Mon Dec 24 01:32:58 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>> I'm trying to track calls to dlopen so I'm looking up the correct offset in libdl.so:
>>>
>>> root@...ntu1810:/sys/kernel/debug/tracing# readelf -s /lib/x86_64-linux-gnu/libdl-2.28.so | grep dlopen
>>>      34: 00000000000012a0   133 FUNC    GLOBAL DEFAULT   14 dlopen@@GLIBC_2.2.5
>>>
>>> Then I'm creating a uprobe with two prints of $comm and two prints of the first argument to dlopen, and enable that probe. The '/root/test' program only does a dlopen("libc.so.6", RTLD_LAZY) in main().
>>>
>>> root@...ntu1810:/sys/kernel/debug/tracing# echo 'p:dlopen /lib/x86_64-linux-gnu/libdl-2.28.so:0x12a0 $comm $comm +0(%di):string +0(%di):string' > uprobe_events
>>> root@...ntu1810:/sys/kernel/debug/tracing# echo 1 > events/uprobes/dlopen/enable
>>> root@...ntu1810:/sys/kernel/debug/tracing# /root/test
>>>
>>> The trace output looks like this:
>>>
>>> root@...ntu1810:/sys/kernel/debug/tracing# cat trace
>>> # tracer: nop
>>> #
>>> #                              _-----=> irqs-off
>>> #                             / _----=> need-resched
>>> #                            | / _---=> hardirq/softirq
>>> #                            || / _--=> preempt-depth
>>> #                            ||| /     delay
>>> #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
>>> #              | |       |   ||||       |         |
>>>              test-1617  [000] d...  1237.959168: dlopen: (0x7fbd5272e2a0) arg1=(fault) arg2=(fault) arg3="libc.so.6libc.so.6" arg4="libc.so.6"
>>>
>>> That's very weird for two reasons:
>>> - fetching $comm seems to fail with an invalid pointer
>>> - arg3 contains the text twice (if I add another print of the argument, arg3 will contain the wanted string three times, arg4 two times and the last argument will contain it once).
>>
>> at least for the second problem I think I found the answer, and for the
>> first problem I have a suspicion (see last paragraph for that).
> 
> OK, this looks broken. Thank you very much for reporting it!
> 
> BTW, I tried to reproduce it with kprobe event, but it seems working well. e.g.
> 
>   # echo 'p ksys_chdir $comm $comm +0(%di):string +0(%di):string' > kprobe_events
>   # echo 1 > events/kprobes/enable
>   # cd /sys/kernel/debug/tracing
>   # cat trace
>                sh-812   [003] ...1   229.344360: p_ksys_chdir_0: (ksys_chdir+0x0/0xc0) arg1="sh" arg2="sh" arg3="/sys/kernel/debug/tracing" arg4="/sys/kernel/debug/tracing"
> 
> So, it might be an issue on uprobe_event.
> 

yes, kprobes work because they use strncpy_from_unsafe which *includes* 
the null byte in its return value... the fact that both are called 
strncpy_* but behave differently is really annoying...

I just sent a patch for this case half an hour ago which simply adds 1 
to the returned value for uprobes if it didn't hit the maximum allowed 
length.

>>
>> For this, I installed a uprobe for libdl.so/dlopen once again, the
>> command would be
>>
>> 'p:dlopen /lib/x86_64-linux-gnu/libdl-2.28.so:0x12a0 $comm $comm'
>>
>> so it should print the process name twice (using a kernel v4.18 on
>> Ubuntu 18.10).
>>
>> The code which prints the collected data (print_type_string, defined by
>> PRINT_TYPE_FUNC_NAME(string) in kernel/trace/trace_probe.c) is the
>> following, it simply passes the respective data to trace_seq_printf with
>> a "%s" format string:
>>
>> int PRINT_TYPE_FUNC_NAME(string)(struct trace_seq *s, void *data, void *ent)
>> {
>>           int len = *(u32 *)data >> 16;
>>
>>           if (!len)
>>                   trace_seq_puts(s, "(fault)");
>>           else
>>                   trace_seq_printf(s, "\"%s\"",
>>                                    (const char *)get_loc_data(data, ent));
>>           return !trace_seq_has_overflowed(s);
>> }
>>
>> I dug into that function with KGDB and found the following: 'data' holds
>> the size and offset for the member in question, which is 0xf and 0x18,
>> respectively. 'ent' holds the base address for event. When we print the
>> string at ent + 0x18, we can see that the output for 'arg1' will be
>> "update-notifierupdate-notifier"
>>
>> Thread 511 hit Breakpoint 6, print_type_string (s=0xffff880078fd1090,
>> name=0xffff880078fe4458 "arg1", data=0xffff88007d01f05c,
>> ent=0xffff88007d01f04c) at
>> /build/linux-EsXT4r/linux-4.18.0/kernel/trace/trace_probe.c:67
>> 67	in /build/linux-EsXT4r/linux-4.18.0/kernel/trace/trace_probe.c
>> gdb$ p *(uint32_t *) data
>> $46 = 0xf0018
>> gdb$ p ent
>> $47 = (void *) 0xffff88007d01f04c
>> gdb$ p ((char *)ent + 0x18)
>> $48 = 0xffff88007d01f064 "update-notifierupdate-notifier"
>>
>> Moving on printing 'arg2' (e.g., the other $comm string). Here we see
>> that the string in question is at offset 0x27, and if we print that, we
>> see the correct "update-notifier".
>>
>> Thread 511 hit Breakpoint 6, print_type_string (s=0xffff880078fd1090,
>> name=0xffff880078fe4d70 "arg2", data=0xffff88007d01f060,
>> ent=0xffff88007d01f04c) at
>> /build/linux-EsXT4r/linux-4.18.0/kernel/trace/trace_probe.c:67
>> 67	in /build/linux-EsXT4r/linux-4.18.0/kernel/trace/trace_probe.c
>> gdb$ p *(uint32_t *) data
>> $49 = 0xf0027
>> gdb$ p ((char *)ent + 0x27)
>> $50 = 0xffff88007d01f073 "update-notifier"
>>
>> Looking at the bytes in memory and the offsets it becomes clear that
>> there is no \0 byte at the end of the first entry (which would need to
>> be at address 0xffff88007d01f064 + 0xf = 0xffff88007d01f073 but instead
>> that's the start address of the second entry which simply gets consumed
>> by the (first) "%s" as well.
>>
>> gdb$ x/32x ent
>> 0xffff88007d01f04c:	0x00010592	0x00002143	0xe83522a0	0x00007f7f
>> 0xffff88007d01f05c:	0x000f0018	0x000f0027	0x61647075	0x6e2d6574
>> 0xffff88007d01f06c:	0x6669746f	0x75726569	0x74616470	0x6f6e2d65
>> 0xffff88007d01f07c:	0x69666974	0x00007265	0x0045feee	0x00010592
>> 0xffff88007d01f08c:	0x00002143	0xe83522a0	0x00007f7f	0x000f0018
>> 0xffff88007d01f09c:	0x000f0027	0x61647075	0x6e2d6574	0x6669746f
>> 0xffff88007d01f0ac:	0x75726569	0x74616470	0x6f6e2d65	0x69666974
>> 0xffff88007d01f0bc:	0x00007265	0x0038806e	0x00010592	0x00002143
>>
>> Should we simply also store the terminating \0 at the end of the string?
>> The $comm string is saved by fetch_comm_string (in v4.18) which uses
>> 'strlcpy' and its return value as the size of the respective data...
>> that value however does NOT include the terminating \0 byte (as it's
>> simply the return value of a call to strlen). The same holds for
>> "regular" string arguments where the code uses 'strncpy_from_user' which
>> has the same return value semantics. Or should we use the information in
>> 'len' to only print that many characters?
>>
>> As fetch_store_string has changed between v4.18 and v4.20, I could try
>> to reproduce this with a v4.20 kernel but from looking at the code I
>> suspect this should be the problem in v4.20 as well.
>>
>> As for $comm only printing "(fault)" I suspect this has to do with
>> commit 533059281ee5 ("tracing: probeevent: Introduce new argument
>> fetching code") as we lost the 'fetch_comm_string' function in that
>> commit and (I think, didn't have the newer kernel running yet) go
>> through 'fetch_store_string' now. This calls 'strncpy_from_user' instead
>> of accessing current->comm directly (and thus does not succeed in
>> fetching it). I'm adding Masami to Cc: as the author of said patch.
> 
> Ah, OK. I have to check fetch_store_string() implementation differences
> between trace_kprobe.c and trace_uprobe.c. Well, in the uprobes, we may
> need more careful steps.
> 

I went into this a bit deeper today, and right now it is simply failing 
to parse the code because there is no FETCH_OP_COMM case in 
process_fetch_insn() for uprobes so that will return -EILSEQ, leading to 
a make_data_loc(0, ...) in store_trace_args(). If we just add 
FETCH_OP_COMM and let val point to current->comm (that's what 
trace_kprobe.c does), we get an -EFAULT return value from 
fetch_store_string because strncpy_from_user() checks if the argument is 
in user space.

So I think we might need a special case for that, something like 
FETCH_OP_ST_COMM_STRING which is only used for FETCH_OP_COMM and copies 
current->comm over to the dynamic area. The implementation could be 
similar to the old fetch_comm_string implementation before your rewrite.

> Anyway, that is my fault. I will fix the issue. Hmm, and I may need to
> consider to add some test program for uprobes, which including a target
> application to be probed.
> 
> Thank you!
>

Thanks, let me know if I can help you.

Andreas


Download attachment "smime.p7s" of type "application/pkcs7-signature" (5450 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ