Message-ID: <930f0c5e-0fd4-aae7-334f-ec9cc42998a4@bytedance.com>
Date:   Wed, 22 Sep 2021 11:30:01 +0800
From:   Qi Zheng <zhengqi.arch@...edance.com>
To:     Josh Poimboeuf <jpoimboe@...hat.com>,
        Vito Caputo <vcaputo@...garu.com>
Cc:     linux-kernel <linux-kernel@...r.kernel.org>, x86@...nel.org,
        peterz@...radead.org, luto@...nel.org, jannh@...gle.com,
        Kees Cook <keescook@...omium.org>
Subject: Re: CONFIG_ORC_UNWINDER=y breaks get_wchan()?



On 9/22/21 8:15 AM, Josh Poimboeuf wrote:
> On Tue, Sep 21, 2021 at 12:32:49PM -0700, Vito Caputo wrote:
>> Is this an oversight of the ORC_UNWINDER implementation?  It's
>> arguably a regression to completely break wchans for tools like `ps -o
>> wchan` and `top`, or my window manager and its separate monitoring
>> utility.  Presumably there are other tools out there sampling wchans
>> for monitoring as well; there's also an internal use of get_wchan() in
>> kernel/sched/fair.c for sleep profiling.
>>
>> I've occasionally seen when monitoring at a high sample rate (60hz) on
>> something churny like a parallel kernel or systemd build, there's a
>> spurious non-zero sample coming out of /proc/[pid]/wchan containing a
>> hexadecimal address like 0xffffa9ebc181bcf8.  This all smells broken,
>> is get_wchan() occasionally spitting out random junk here kallsyms
>> can't resolve, because get_wchan() is completely ignorant of
>> ORC_UNWINDER's effects?
> 
> Hi Vito,
> 
> Thanks for reporting this.  Does this patch fix your issue?
> 
>    https://lkml.kernel.org/r/20210831083625.59554-1-zhengqi.arch@bytedance.com
> 
> Though, considering wchan has been silently broken for four years, I do
> wonder what the impact would be if we were to just continue to show "0"
> (and change frame pointers to do the same).

Agreed. Or remove get_wchan() entirely.

> 
> The kernel is much more cautious than it used to be about exposing this
> type of thing.  Can you elaborate on your use case?
> 
> If we do keep it, we might want to require CAP_SYS_ADMIN anyway, for
> similar reasons as
> 
>    f8a00cef1720 ("proc: restrict kernel stack dumps to root")
> 
> ... since presumably proc_pid_wchan()'s use of '%ps' can result in an
> actual address getting printed if the unwind gets confused, thanks to
> __sprint_symbol()'s backup option if kallsyms_lookup_buildid() doesn't
> find a name.
> 
> Though, instead of requiring CAP_SYS_ADMIN, maybe we can just fix
> __sprint_symbol() to not expose addresses?
> 
> Or is there some other reason for needing CAP_SYS_ADMIN?  Jann?
> 
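
To make the '%ps' concern above concrete, here is a minimal userspace
sketch of the two behaviors being discussed. It is not kernel code: the
symbol table, lookup(), format_leaky() and format_safe() are all
hypothetical names made up for illustration, and the real
__sprint_symbol() path is more involved. The sketch only assumes what
the quoted text says, namely that an address with no matching symbol is
printed as a raw hexadecimal value, and contrasts that with a variant
that prints "0" instead.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the kernel symbol table (not kallsyms itself). */
struct sym {
	unsigned long long start;
	unsigned long long size;
	const char *name;
};

static const struct sym symtab[] = {
	{ 0x1000, 0x100, "schedule_timeout" },
	{ 0x2000, 0x080, "do_wait" },
};

/* Return the name of the symbol covering addr, or NULL if none does. */
static const char *lookup(unsigned long long addr)
{
	for (size_t i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++)
		if (addr >= symtab[i].start &&
		    addr - symtab[i].start < symtab[i].size)
			return symtab[i].name;
	return NULL;
}

/* Models the fallback described above: raw address when lookup fails. */
static void format_leaky(char *buf, size_t len, unsigned long long addr)
{
	const char *name = lookup(addr);

	if (name)
		snprintf(buf, len, "%s", name);
	else
		snprintf(buf, len, "0x%llx", addr);
}

/* Models the suggested alternative: never print an unresolved address. */
static void format_safe(char *buf, size_t len, unsigned long long addr)
{
	const char *name = lookup(addr);

	snprintf(buf, len, "%s", name ? name : "0");
}
```

With a confused unwind handing back something like a stack address,
format_leaky() would emit the hexadecimal value, which matches the
spurious samples Vito describes, while format_safe() would emit "0",
the same thing wchan already shows when nothing can be resolved.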
