lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <202005121111.6BECC45@keescook>
Date:   Tue, 12 May 2020 11:45:54 -0700
From:   Kees Cook <keescook@...omium.org>
To:     Petr Mladek <pmladek@...e.com>
Cc:     Pavel Tatashin <pasha.tatashin@...een.com>,
        Anton Vorontsov <anton@...msg.org>,
        Colin Cross <ccross@...roid.com>,
        Tony Luck <tony.luck@...el.com>,
        Jonathan Corbet <corbet@....net>,
        Rob Herring <robh+dt@...nel.org>,
        Benson Leung <bleung@...omium.org>,
        Enric Balletbo i Serra <enric.balletbo@...labora.com>,
        Sergey Senozhatsky <sergey.senozhatsky@...il.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        James Morris <jmorris@...ei.org>,
        Sasha Levin <sashal@...nel.org>,
        Linux Doc Mailing List <linux-doc@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>, devicetree@...r.kernel.org
Subject: Re: [PATCH v3 0/6] allow ramoops to collect all kmesg_dump events

On Tue, May 12, 2020 at 05:52:07PM +0200, Petr Mladek wrote:
> On Tue 2020-05-12 10:03:44, Pavel Tatashin wrote:
> > > OK, I personally see this as two separate problems:
> > >
> > >    1. Missing support to set loglevel per console.
> > >    2. Missing support to dump messages for other reasons.
> > >
> > > I would remove the paragraph about console log levels completely.
> > 
> > OK, I see your point, this paragraph can be removed, however, I think
> > it makes it clear to understand the rationale for this change. As I
> > understand, the per console loglevel has been proposed but were never
> > accepted.

I understood Pavel's rationale as an output from my questions in the v1
series, that went like this, paraphrased:

Pavel: "I need to have other kmsg dump reasons available to pstore."
Kees:  "Why can't you just use the pstore console dumper?"
Pavel: "It's too much for the slow device; we only need to know about
        specific events that are already provided by kmsg dump."
Kees:  "Ah! Sounds good, max_reasons it is."

So, AIUI, loglevel remains orthogonal to this, and it's my fault for
even causing to be be brought up. Please disregard! :)

> > printk.always_kmsg_dump is not working for me because ramoops has its
> > own filtering based on dump_oops boolean, and ignores everything but
> > panics and conditionally oops.
> > max_reason makes the ramoops internal logic cleaner compared to using dump_oops.
> 
> I see. Just to be sure. Is the main reason to add max_reason parameter
> to keep complatibility of the deprecated dump_oops parameter? Or is
> there any real use case for this granularity?

In my mind it seemed like a nice mapping, so it was an easy port.

> I wonder if anyone is actually using the ramoops.dump_oops parameter
> in reality. I would personally make it deprecated and change the
> default behavior to work according to printk.always_kmsg_dump parameter.

Yes. For things I'm aware of: ARM devices with very tiny persistent RAM
were using ramoops and setting dump_oops to 0 (specifically, setting
the DT "no-dump-oops" to 1), and larger Android and Chrome OS devices
using ramoops were setting to dump_oops to 1[1].

The logic built into pstore recognizes a difference between panic and
non-panic dumps as well, as the expectation is that there is little to
no kernel infrastructure available for use during a panic kmsg.

> IMHO, ramoops.dump_oops just increases complexity and should not have
> been introduced at all. I would try hard to avoid introducing even bigger
> complecity and mess.

I think dump_oops was the wrong implementation, but granularity control
is still needed. It is an old parameter, and is baked into many device
trees on systems, so I can't just drop it. (In fact, I've had to support
some other DT compat issues[2] as well.)

> I know that there is the "do not break existing userspace" rule. The
> question is if there is any user and if it is worth it.

For dump_oops, yes, there is unfortunately.

> > I agree, the reasons in kmsg_dump_reason do not order well  (I
> > actually want to add another reason for kexec type reboots, and where
> > do I put it?), so how about if we change the ordering list to
> > bitfield/flags, and instead of max_reason provide: "reasons" bitset?
> 
> It looks too complicated. I would really try hard to avoid the
> parameter at all.

Here are the problems I see being solved by this:

- lifting kmsg dump reason filtering out of the individual pstore
  backends and making it part of the "infrastructure", so that
  there is a central place to set expectations. Right now there
  is a mix of explicit and implicit kmsg dump handling:

  - arch/powerpc/kernel/nvram_64.c has a hard-coded list
  - drivers/firmware/efi/efi-pstore.c doesn't expect anything but
    OOPS and PANIC.
  - drivers/mtd/mtdoops.c tries to filter using its own dump_oops
    and doesn't expect anything but OOPS and PANIC.
  - fs/pstore/ram.c: has a hard-coded list and uses its own
    dump_oops.
  - drivers/mtd/mtdpstore.c (under development[3]) expected only
    OOPS and PANIC and had its own dump_oops.

- providing a way for backends that can deal with all kmsg dump reasons
  to do so without breaking existing default behavior (i.e. getting
  Pavel what he's interested in).

So, that said, I'm totally fine with a bit field. I just need a way to
map the kmsg dump reasons onto the existing backend expectations and to
have Pavel's needs addressed.

-Kees

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/platform/chrome/chromeos_pstore.c?h=v5.6#n60
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/pstore/ram.c?h=v5.6#n708
[3] https://lore.kernel.org/lkml/20200511233229.27745-11-keescook@chromium.org/

-- 
Kees Cook

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ