linux-kernel - Re: pstore/ramoops - why only collect a partial dmesg?


Message-ID: <c5a04638-90c2-8ec0-4573-a0e5d2e24b6b@igalia.com>
Date:   Tue, 4 Jan 2022 15:03:54 -0300
From:   "Guilherme G. Piccoli" <gpiccoli@...lia.com>
To:     "Luck, Tony" <tony.luck@...el.com>,
        "keescook@...omium.org" <keescook@...omium.org>,
        "anton@...msg.org" <anton@...msg.org>,
        "ccross@...roid.com" <ccross@...roid.com>
Cc:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Linux-Fsdevel <linux-fsdevel@...r.kernel.org>,
        "Guilherme G. Piccoli" <kernel@...ccoli.net>
Subject: Re: pstore/ramoops - why only collect a partial dmesg?

On 04/01/2022 14:00, Luck, Tony wrote:
> [...] 
> Guilherme,
> 
> Linux is indeed somewhat reluctant to hand out allocations > 2MB. :-(
> 
> Do you really need the whole dmesg in the pstore dump?  The expectation
> is that systems run normally for a while. During that time console logs are
> saved off to /var/log/messages.
> 
> When the system crashes, the last part (the interesting bit!) of the console
> log is lost.  The purpose of pstore is to save that last bit.
> 
> So while you could add code to ramoops to save multiple 2MB chunks, it
> doesn't seem (to me) that it would add much value.
> 

Thanks again Tony, for the interesting points. So, I partially agree
with you: indeed, in a normal situation we have all messages collected
by some userspace daemon, and when some issue/oops happens, we can rely
on pstore to collect the latest portion of the log buffer (2M is a
bunch!) and "merge" that with the previously collected portion, likely
saved in a /var/log/ file.

The problem is that our use case is a bit different: the idea is to rely
on pstore/ramoops to collect the most information we can in a panic
event, without the need of kdump. The latter is a pretty
comprehensive/complete approach, but requires a bunch of memory reserved
- it's a bit too much if we want just the task list, backtraces and
memory state of the system, for example. And for that...we have the
"panic_print" setting!

There lies the issue: if I set panic_print to dump all backtraces, task
info and memory state in a panic event, that information + the
panic/oops and some previous relevant stuff, does it all fit in the 2M
chunk? Likely so, but *if it doesn't fit*, we may lose _exactly_ the
most important piece, which is the panic cause.
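
To illustrate the concern, here is a toy model in Python (not kernel code). It assumes the dump keeps only the last record_size bytes of the log, and that the panic_print output is logged after the panic message and before the dump is taken. The function names, the log line formats and the 64 KiB record size are all made up for the example.

```python
# Toy model only: not kernel code; sizes and message formats are invented.

def keep_tail(log: bytes, record_size: int) -> bytes:
    """Return the last record_size bytes of the log, as a tail-only dump would."""
    return log[-record_size:] if record_size > 0 else b""


def build_log(history_lines: int, panic_print_lines: int) -> bytes:
    """Build a fake log: older messages, the panic line, then panic_print-style output."""
    lines = [f"[boot] message {i}" for i in range(history_lines)]
    lines.append("Kernel panic - not syncing: example cause")
    lines += [
        f"[panic_print] task/backtrace/memory line {i}"
        for i in range(panic_print_lines)
    ]
    return ("\n".join(lines) + "\n").encode()


def panic_cause_survives(log: bytes, record_size: int) -> bool:
    """Check whether the panic line is still present in the truncated dump."""
    return b"Kernel panic" in keep_tail(log, record_size)


RECORD = 64 * 1024  # hypothetical record size, chosen small to make the point

small = build_log(history_lines=100, panic_print_lines=10)
large = build_log(history_lines=100, panic_print_lines=10_000)

print(panic_cause_survives(small, RECORD))  # whole log fits in the record
print(panic_cause_survives(large, RECORD))  # extra output pushes the panic line out
```

In this model, the more output follows the panic line, the more likely the panic line itself falls outside the saved tail.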

The same way I have the "log_buf_len" tuning to determine how large my
log buffer is, I'd like to be able to effectively collect that much
information using pstore/ramoops. Requiring that amount of space in an
efi-pstore, for example, would indeed be really crazy! But ramoops is
just a way to use some portion of the system RAM to save the log
buffer, so I feel it'd be interesting to be able to properly collect
full logs there, no matter the size of the logs. Of course, I'd like to
see that as a setting, because the current behavior is great/enough for
most users I guess, as you pointed out, and there's no need to change it
by default.

Let me know your thoughts and maybe others also have good opinions about
that!
Cheers,

Guilherme