linux-kernel - Re: [PATCH 1/1] binfmt_elf, coredump: Log the reason of the failed core dumps


Message-ID: <1465b0a4-6a99-45d5-b170-7d2e470f555d@linux.microsoft.com>
Date: Thu, 20 Jun 2024 12:10:10 -0700
From: Roman Kisel <romank@...ux.microsoft.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
Cc: Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
 akpm@...ux-foundation.org, apais@...ux.microsoft.com, ardb@...nel.org,
 brauner@...nel.org, jack@...e.cz, keescook@...omium.org,
 linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
 linux-mm@...ck.org, nagvijay@...rosoft.com, oleg@...hat.com,
 tandersen@...flix.com, vincent.whitchurch@...s.com, viro@...iv.linux.org.uk,
 apais@...rosoft.com, ssengar@...rosoft.com, sunilmut@...rosoft.com,
 vdso@...bites.dev
Subject: Re: [PATCH 1/1] binfmt_elf, coredump: Log the reason of the failed
 core dumps



On 6/18/2024 2:21 PM, Eric W. Biederman wrote:
> Roman Kisel <romank@...ux.microsoft.com> writes:
> 
>> On 6/17/2024 11:18 PM, Sebastian Andrzej Siewior wrote:
>>> On 2024-06-17 16:41:30 [-0700], Roman Kisel wrote:
>>>> Missing, failed, or corrupted core dumps might impede crash
>>>> investigations. To improve reliability of that process and consequently
>>>> the programs themselves, one needs to trace the path from producing
>>>> a core dump file to analyzing it. That path starts from the core dump file
>>>> written to the disk by the kernel or to the standard input of a user
>>>> mode helper program to which the kernel streams the coredump contents.
>>>> There are cases where the kernel will interrupt writing the core out or
>>>> produce a truncated/not-well-formed core dump.
>>> How much of this happened and how much of this is just "let me handle
>>> everything that could go wrong"?
>> Some of that must be happening as there are truncated dump files. I haven't run
>> the logging code at large scale yet, with the systems being stressed enough by
>> the customer workloads to hit all edge cases. I sent the changes to the kernel
>> mailing list first out of an abundance of caution, and I am ecstatic about that:
>> on the other thread Kees noticed I didn't use the ratelimited logging. That has
>> absolutely made my day and whole week, just glowing :) Might've been a close
>> call due to something in a crash loop.
> 
> Another reason you could have truncated coredumps is the coredumping
> process being killed.
> 
> I suspect if you want reasons why the coredump is truncated you are
> going to want to instrument dump_interrupted, dump_skip and dump_emit
> rather than their callers.  As they don't actually report why they
> failed.
I'll add logging there as well, thanks for the great idea!

> 
> Are you using systemd-coredump?  Or another pipe based coredump
> collector?  It might be the dump collector is truncating things.
There is a collector program set via core_pattern so that the core dump
is streamed to its standard input. It is very simple memcpy-like
bytes-in..bytes-out code. It logs how many bytes it receives and how
many bytes it writes, and no bytes are lost in this path. As for the
system itself, it is built from the latest stable LTS kernel and a small
user land, and is not based on any distribution or package management.
One might say it resembles an appliance.
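
For illustration only, a pipe collector of that kind might look roughly
like the sketch below. This is not the collector discussed in this
thread. The function name copy_stream, the 1 MiB chunk size, the output
path taken from argv[1], and logging to stderr are all assumptions made
for the example.

```python
#!/usr/bin/env python3
"""Hypothetical core_pattern pipe helper: copy stdin to a file, counting bytes."""
import sys

CHUNK_SIZE = 1 << 20  # assumed buffer size (1 MiB), not taken from the thread


def copy_stream(src, dst, chunk_size=CHUNK_SIZE):
    """Copy src to dst until EOF and return (bytes_received, bytes_written)."""
    received = 0
    written = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # EOF: the kernel closed its end of the pipe
            break
        received += len(chunk)
        view = memoryview(chunk)
        while len(view) > 0:  # tolerate short writes on raw streams
            n = dst.write(view)
            if not n:
                raise OSError("write made no progress")
            written += n
            view = view[n:]
    return received, written


def main(argv):
    # Assumed invocation: the output path is passed as the first argument.
    out_path = argv[1]
    with open(out_path, "wb") as out:
        received, written = copy_stream(sys.stdin.buffer, out)
    # Assumed log destination; a real helper may need syslog or a file instead.
    print(f"received={received} written={written}", file=sys.stderr)
    return 0 if received == written else 1


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

Such a helper would be registered with a core_pattern that starts with
a pipe character, for example "|/path/to/helper /var/crash/core.%p",
where both paths are placeholders. If the two counters match and the
file is still truncated, the loss happened before the data reached the
pipe.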

> 
> Do you know if your application uses io_uring?  There were some weird
> issues with io_uring and coredumps that were causing things to get
> truncated at one point.  As I recall a hack was put in the coredump
> code so that it worked but maybe there is another odd case that still
> needs to be handled.
Couldn't appreciate the pointer more! There are cases where the user land
reaches out to io_uring, though the workhorse does not.

>>
>> I think it'd be fair to say that I am asking to please "let me handle (log)
>> everything that could go wrong", ratelimited, as these error cases are present
>> in the code, and logging can give a clue why the core dump collection didn't
>> succeed and what one would need to explore to increase reliability of the
>> system.
> 
> If you are looking for reasons you definitely want to instrument
> fs/coredump.c much more than fs/binfmt_elf.c.  As fs/coredump.c is the
> code that actually performs the writes.
Understood, thank you very much!

> 
> One of these days if someone is ambitious we should probably merge the
> coredump code from fs/binfmt_elf.c and fs/binfmt_elf_fdpic.c and just
> hardcode the coredump code to always produce an elf format coredump.
> Just for the simplicity of it all.
I've had loads of experience with collecting and analyzing ELF core dump 
files, including a tool that parses machine state, rebuilds the 
necessary Linux kernel structures and produces ELF core dump files for 
the user land processes from that. Perhaps I could embark on that 
ambitious journey if no one else has time :)

> 
> Eric

-- 
Thank you,
Roman