lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAOgjDMgcKZAGP0-QW9uzCM28eKdZvA49KoDQ=RnAnJL=7_zL5g@mail.gmail.com>
Date:   Wed, 16 Jun 2021 15:34:21 +0800
From:   郑琦 <zhengqi.arch@...edance.com>
To:     Andy Lutomirski <luto@...nel.org>
Cc:     tglx@...utronix.de, mingo@...hat.com, bp@...en8.de, hpa@...or.com,
        peterz@...radead.org, x86@...nel.org, linux-kernel@...r.kernel.org,
        songmuchun@...edance.com
Subject: Re: [External] Re: [PATCH] x86: fix get_wchan() not support the ORC unwinder

On Tue, Jun 15, 2021 at 12:21 AM Andy Lutomirski <luto@...nel.org> wrote:
>
> On 6/11/21 5:46 AM, Qi Zheng wrote:
> > Currently, the kernel CONFIG_UNWINDER_ORC option is enabled by
> > default on x86, but the implementation of get_wchan() is still
> > based on the frame pointer unwinder, so the /proc/<pid>/wchan
> > always return 0 regardless of whether the task <pid> is running.
> >
> > We reimplement the get_wchan() by calling stack_trace_save_tsk(),
> > which is adapted to the ORC and frame pointer unwinders.
>
> How much slower does this make ps?

I used the bpftrace tool to test the running time of get_wchan() in the two
cases of the ORC and frame pointer unwinders, the test script and
the result are as follows:

the test script:
bpftrace -e 'kprobe:get_wchan { @start[tid] = nsecs; } kretprobe: get_wchan
/@...rt[tid]/ { @ns[comm] = hist(nsecs - @start[tid]); delete(@start[tid]); }'

the result:
1) ORC unwinder ( before applying this patch )

@ns[ps]:
[512, 1K)     4609   |@@@@@@@@@@@@ |
[1K, 2K)      18599  |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[2K, 4K)      1848    |@@@@@ |
[4K, 8K)       307     | |
[8K, 16K)     74       | |
[16K, 32K)   12       | |

73% of the cases are in the [1K, 2K) range.
Notice: In this case, the get_wchan() always returns the wrong value of 0.

2) ORC unwinder ( after applying this patch )

@ns[ps]:
[512, 1K)    536      |@ |
[1K, 2K)     19945   |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[2K, 4K)     5604     |@@@@@@@@@@@@@@ |
[4K, 8K)     246       | |
[8K, 16K)   154       | |
[16K, 32K) 18         | |

75% of the cases are in the [1K, 2K) range.

3) frame point unwinder ( before applying this patch )

@ns[ps]:
[512, 1K)    245      | |
[1K, 2K)     16577   |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[2K, 4K)     2788     |@@@@@@@@ |
[4K, 8K)     190       | |
[8K, 16K)   74         | |
[16K, 32K) 9           | |

83% of the cases are in the [1K, 2K) range.

4) frame point unwinder ( after applying this patch )

@ns[ps]:
[512, 1K)    85        | |
[1K, 2K)     12023  |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[2K, 4K)     7418    |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[4K, 8K)     232      |@ |
[8K, 16K)   104      | |
[16K, 32K) 18        | |

60% of the cases are in the [1K, 2K) range.

In summary, the running time of get_wchan() has increased after applying this
patch. But the get_wchan() is not the hotspot function, and this is a bug in the
default ORC option, so I think these increased runtimes are acceptable.

In addition, this issue has existed for nearly 4 years and no one has
fixed it, if
nobody cares about the return value of the get_wchan(), maybe we can return
0 or remove this function directly. What do you think?

Best regards,
Qi Zheng

>
> --Andy

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ