Message-ID: <c283c978-2563-06b9-4c21-59bedceda9ea@oracle.com>
Date:   Thu, 30 Sep 2021 11:05:35 -0700
From:   Stephen Brennan <stephen.s.brennan@...cle.com>
To:     Kees Cook <keescook@...omium.org>,
        Thomas Gleixner <tglx@...utronix.de>
Cc:     Josh Poimboeuf <jpoimboe@...hat.com>,
        Vito Caputo <vcaputo@...garu.com>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>, Jens Axboe <axboe@...nel.dk>,
        Mark Rutland <mark.rutland@....com>,
        Peter Zijlstra <peterz@...radead.org>,
        Stefan Metzmacher <metze@...ba.org>,
        Andy Lutomirski <luto@...nel.org>,
        Lai Jiangshan <laijs@...ux.alibaba.com>,
        Christian Brauner <christian.brauner@...ntu.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "Kenta.Tada@...y.com" <Kenta.Tada@...y.com>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Michael Weiß <michael.weiss@...ec.fraunhofer.de>,
        Anand K Mistry <amistry@...gle.com>,
        Alexey Gladkov <legion@...nel.org>,
        Michal Hocko <mhocko@...e.com>, Helge Deller <deller@....de>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Andrea Righi <andrea.righi@...onical.com>,
        Ohhoon Kwon <ohoono.kwon@...sung.com>,
        Kalesh Singh <kaleshsingh@...gle.com>,
        YiFei Zhu <yifeifz2@...inois.edu>,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        linux-kernel@...r.kernel.org, x86@...nel.org,
        linux-fsdevel@...r.kernel.org, linux-hardening@...r.kernel.org
Subject: Re: [PATCH] proc: Disable /proc/$pid/wchan

On 9/23/21 4:31 PM, Kees Cook wrote:
> The /proc/$pid/wchan file has been broken by default on x86_64 for 4
> years now[1]. As this remains a potential leak of either kernel
> addresses (when symbolization fails) or limited observation of kernel
> function progress, just remove the contents for good.
> 
> Unconditionally set the contents to "0" and also mark the wchan
> field in /proc/$pid/stat with 0.

Hi all,

It looks like there's already been pushback on this idea, but I wanted
to add another voice from a frequent user of /proc/$pid/wchan (via ps).
Much of my job involves diagnosing kernel issues and performance issues
on stable kernels, frequently on production systems where I can't do
anything too invasive. wchan is incredibly useful for these situations,
so much so that we store regular snapshots of ps output, and we expand
the size of the WCHAN column to fit more data (e.g. ps -e -o
pid,wchan=WCHAN-WIDE-COLUMN). Disabling wchan would remove a critical
tool for me and my team.
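
A minimal sketch of that kind of snapshot, reading /proc/$pid/wchan
directly rather than going through ps. The function names and the
proc_root parameter are hypothetical, chosen for illustration; this is
not the tooling described above.

```python
import os
from collections import Counter


def read_wchan(pid, proc_root="/proc"):
    """Return the wchan string for one task, or None if it is unreadable."""
    try:
        with open(os.path.join(proc_root, str(pid), "wchan")) as f:
            return f.read().strip()
    except OSError:
        # The task may have exited, or we may lack permission.
        return None


def snapshot_wchan(proc_root="/proc"):
    """Map each numeric pid under proc_root to its current wchan string."""
    snapshot = {}
    for entry in os.listdir(proc_root):
        if entry.isdigit():
            wchan = read_wchan(entry, proc_root)
            if wchan is not None:
                snapshot[int(entry)] = wchan
    return snapshot


def blocked_summary(snapshot):
    """Count tasks per wait function, skipping tasks that report "0"."""
    return Counter(w for w in snapshot.values() if w != "0")


if __name__ == "__main__":
    for symbol, count in blocked_summary(snapshot_wchan()).most_common(10):
        print(f"{count:6d}  {symbol}")
```

Grouping by symbol gives a quick view of where tasks are piling up,
which is the main thing a stored snapshot is useful for.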

From my team's feedback:
1. It's fine if reading wchan requires CAP_SYS_ADMIN for tasks not
    owned by the calling user, and if, for non-admin users, a failed
    symbolization returns 0 just like kallsyms does for unprivileged users.
2. We don't care about the stack of an actively running process
    (/proc/$pid/stack is there for that). We only need WCHAN for
    understanding why a task is blocked.
3. Keeping the function / symbol name in the wchan is ideal (so we can
    pinpoint the exact area that a task is blocked at).
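
One possible reading of the three points above, written as a small
model rather than kernel code. The function and its parameters are
hypothetical. Two choices here are assumptions that the list does not
settle: a reader without access gets "0" instead of an error, and a
failed symbolization yields "0" for admins as well.

```python
def wchan_for_reader(symbol, reader_owns_task, reader_is_admin, task_is_blocked):
    """Model what a wchan read could return under the policy above.

    symbol is the resolved function name, or None if symbolization failed.
    reader_is_admin stands in for holding CAP_SYS_ADMIN.
    """
    if not (reader_owns_task or reader_is_admin):
        # Point 1: other users' tasks require CAP_SYS_ADMIN.
        # Assumption: report "0" rather than fail the read.
        return "0"
    if not task_is_blocked:
        # Point 2: running tasks are out of scope for wchan.
        return "0"
    if symbol is None:
        # Point 1: never fall back to a raw kernel address.
        # Assumption: this applies to admins too.
        return "0"
    # Point 3: keep the function name.
    return symbol
```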

> This leaves kernel/sched/fair.c as the only user of get_wchan(). But
> again, since this was broken for 4 years, was this profiling logic
> actually doing anything useful?

This was only broken with CONFIG_UNWINDER_ORC. You may say this is the
default, but Ubuntu's latest kernel (5.11 in Hirsute) still ships with
CONFIG_UNWINDER_FRAME_POINTER, and many other distributions are the
same. Stable distributions have a lag time picking up new code, and even
longer lag picking up new configurations -- even new defaults.
(Especially when frame pointers are so useful for debugging...) So
saying that this was broken for 4 years is at best misleading. Plenty of
users have been happily using recent kernels when this was supposedly
"broken", on valid configurations, without any issues.
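
A minimal sketch for checking which unwinder a given kernel was built
with, by scanning its config text for the CONFIG_UNWINDER_* option set
to "y". The /boot/config-<release> path is an assumption; where the
config lives depends on the distribution, and some kernels expose it
only via /proc/config.gz.

```python
def unwinder_from_config(config_text):
    """Return the CONFIG_UNWINDER_* option set to 'y', or None if none is."""
    for line in config_text.splitlines():
        line = line.strip()
        # "# CONFIG_UNWINDER_ORC is not set" starts with "#" and is skipped.
        if line.startswith("CONFIG_UNWINDER_") and line.endswith("=y"):
            return line[: -len("=y")]
    return None


if __name__ == "__main__":
    import platform

    path = f"/boot/config-{platform.release()}"  # assumed location
    try:
        with open(path) as f:
            print(unwinder_from_config(f.read()))
    except OSError as exc:
        print(f"could not read {path}: {exc}")
```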

It looks like we've backed off of the decision to rip out 
/proc/$pid/wchan, but I just wanted to chime in, since it feels like the 
discussion is happening without much input from users.

Thanks,
Stephen
