lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20efff94-de74-dcbe-68e4-a72476fab209@linux.vnet.ibm.com>
Date:   Fri, 16 Mar 2018 19:19:34 +0530
From:   Ravi Bangoria <ravi.bangoria@...ux.vnet.ibm.com>
To:     Oleg Nesterov <oleg@...hat.com>
Cc:     mhiramat@...nel.org, peterz@...radead.org,
        srikar@...ux.vnet.ibm.com, acme@...nel.org,
        ananth@...ux.vnet.ibm.com, akpm@...ux-foundation.org,
        alexander.shishkin@...ux.intel.com, alexis.berlemont@...il.com,
        corbet@....net, dan.j.williams@...el.com,
        gregkh@...uxfoundation.org, huawei.libin@...wei.com,
        hughd@...gle.com, jack@...e.cz, jglisse@...hat.com,
        jolsa@...hat.com, kan.liang@...el.com,
        kirill.shutemov@...ux.intel.com, kjlx@...pleofstupid.com,
        kstewart@...uxfoundation.org, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org, mhocko@...e.com,
        milian.wolff@...b.com, mingo@...hat.com, namhyung@...nel.org,
        naveen.n.rao@...ux.vnet.ibm.com, pc@...ibm.com,
        pombredanne@...b.com, rostedt@...dmis.org, tglx@...utronix.de,
        tmricht@...ux.vnet.ibm.com, willy@...radead.org,
        yao.jin@...ux.intel.com, fengguang.wu@...el.com,
        Ravi Bangoria <ravi.bangoria@...ux.vnet.ibm.com>
Subject: Re: [PATCH 6/8] trace_uprobe/sdt: Fix multiple update of same
 reference counter



On 03/16/2018 05:42 PM, Ravi Bangoria wrote:
>
> On 03/15/2018 08:19 PM, Oleg Nesterov wrote:
>> On 03/13, Ravi Bangoria wrote:
>>> For tiny binaries/libraries, different mmap regions points to the
>>> same file portion. In such cases, we may increment reference counter
>>> multiple times.
>> Yes,
>>
>>> But while de-registration, reference counter will get
>>> decremented only by once
>> could you explain why this happens? sdt_increment_ref_ctr() and
>> sdt_decrement_ref_ctr() look symmetrical, _decrement_ should see
>> the same mappings?
> Sorry, I thought this happens only for tiny binaries. But that is not the case.
> This happens for binary / library of any length.
>
> Also, it's not a problem with sdt_increment_ref_ctr() / sdt_increment_ref_ctr().
> The problem happens with trace_uprobe_mmap_callback().
>
> To illustrate in detail, I'm adding a pr_info() in trace_uprobe_mmap_callback():
>
>                 vaddr = vma_offset_to_vaddr(vma, tu->ref_ctr_offset);
> +             pr_info("0x%lx-0x%lx : 0x%lx\n", vma->vm_start, vma->vm_end, vaddr);
>                 sdt_update_ref_ctr(vma->vm_mm, vaddr, 1);
>
>
> Ok now, libpython has SDT markers with reference counter:
>
>     # readelf -n /usr/lib64/libpython2.7.so.1.0 | grep -A2 Provider
>         Provider: python
>         Name: function__entry
>         ... Semaphore: 0x00000000002899d8
>
> Probing on that marker:
>
>     # cd /sys/kernel/debug/tracing/
>     # echo "p:sdt_python/function__entry /usr/lib64/libpython2.7.so.1.0:0x16a4d4(0x2799d8)" > uprobe_events
>     # echo 1 > events/sdt_python/function__entry/enable
>
> When I run python:
>
>     # strace -o out python
>       mmap(NULL, 2738968, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fff92460000
>       mmap(0x7fff926a0000, 327680, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x230000) = 0x7fff926a0000
>       mprotect(0x7fff926a0000, 65536, PROT_READ) = 0
>
> The first mmap() maps the whole library into one region. Second mmap()
> and third mprotect() split out the whole region into smaller vmas and sets
> appropriate protection flags.
>
> Now, in this case, trace_uprobe_mmap_callback() updates reference counter
> twice -- by second mmap() call and by third mprotect() call -- because both
> regions contain reference counter offset. This I can verify in dmesg:
>
>     # dmesg | tail
>       trace_kprobe: 0x7fff926a0000-0x7fff926f0000 : 0x7fff926e99d8
>       trace_kprobe: 0x7fff926b0000-0x7fff926f0000 : 0x7fff926e99d8
>
> Final vmas of libpython:
>
>     # cat /proc/`pgrep python`/maps | grep libpython
>       7fff92460000-7fff926a0000 r-xp 00000000 08:05 403934  /usr/lib64/libpython2.7.so.1.0
>       7fff926a0000-7fff926b0000 r--p 00230000 08:05 403934  /usr/lib64/libpython2.7.so.1.0
>       7fff926b0000-7fff926f0000 rw-p 00240000 08:05 403934  /usr/lib64/libpython2.7.so.1.0
>
>
> I see similar problem with normal binary as well. I'm using Brendan Gregg's
> example[1]:
>
>     # readelf -n /tmp/tick | grep -A2 Provider
>         Provider: tick
>         Name: loop2
>         ... Semaphore: 0x000000001005003c
>
> Probing that marker:
>
>     # echo "p:sdt_tick/loop2 /tmp/tick:0x6e4(0x10036)" > uprobe_events
>     # echo 1 > events/sdt_tick/loop2/enable
>
> Now when I run the binary
>
>     # /tmp/tick
>
> load_elf_binary() internally calls mmap() and I see trace_uprobe_mmap_callback()
> updating reference counter twice:
>
>     # dmesg | tail
>       trace_kprobe: 0x10010000-0x10030000 : 0x10020036
>       trace_kprobe: 0x10020000-0x10030000 : 0x10020036
>
> proc/<pid>/maps of the tick:
>
>     # cat /proc/`pgrep tick`/maps
>       10000000-10010000 r-xp 00000000 08:05 1335712  /tmp/tick
>       10010000-10020000 r--p 00000000 08:05 1335712  /tmp/tick
>       10020000-10030000 rw-p 00010000 08:05 1335712  /tmp/tick
>
> [1] https://github.com/iovisor/bcc/issues/327#issuecomment-200576506

Also, while de-registration, we look for all existing mms using
uprobe_build_mmap_info() and decrement the counter in each
of the mm. i.e. we decrement the counter only once.

-Ravi

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ