Message-ID: <E3873B59-D80F-42E7-B571-DBE3A63A0C77@juniper.net>
Date: Mon, 5 Aug 2024 17:56:11 +0000
From: Brian Mak <makb@...iper.net>
To: Linus Torvalds <torvalds@...ux-foundation.org>
CC: Oleg Nesterov <oleg@...hat.com>,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        Kees Cook <kees@...nel.org>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Christian Brauner <brauner@...nel.org>,
        Jan Kara <jack@...e.cz>,
        "linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH] piped/ptraced coredump (was: Dump smaller VMAs first
 in ELF cores)

On Aug 4, 2024, at 10:47 AM, Linus Torvalds <torvalds@...ux-foundation.org> wrote:

> On Sun, 4 Aug 2024 at 08:23, Oleg Nesterov <oleg@...hat.com> wrote:
>> 
>> What do you think?
> 
> Eww. I really don't like giving the dumper ptrace rights.
> 
> I think the real limitations of the "dump to pipe" is that it's just
> being very stupid. Which is fine in the sense that core dumps aren't
> likely to be a huge priority. But if or when they _are_ a priority,
> it's not a great model.
> 
> So I prefer the original patch because it's also small, but it's
> conceptually much smaller.
> 
> That said, even that simplified v2 looks a bit excessive to me.
> 
> Does it really help so much to create a new array of core_vma_metadata
> pointers - could we not just sort those things in place?

Hi Linus,

Thanks for taking the time to reply.

Yep, I don't see any immediate reason why we can't sort this in place
to begin with.

Thanks, Eric, for originally bringing this up. Will send out a v3 with
these edits.
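
For readers following along, the in-place approach can be sketched in
userspace C as below. This is only an illustration under stated
assumptions, not the patch itself: struct vma_meta and its two fields
are a hypothetical stand-in for the kernel's core_vma_metadata, and
qsort() stands in for whatever sort helper the kernel code would use.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's per-VMA dump metadata. */
struct vma_meta {
	unsigned long start;      /* start address of the mapping */
	unsigned long dump_size;  /* bytes of this mapping to be dumped */
};

/* Order by dump size, smallest first; avoids overflow from subtraction. */
static int cmp_vma_size(const void *a, const void *b)
{
	const struct vma_meta *va = a;
	const struct vma_meta *vb = b;

	return (va->dump_size > vb->dump_size) -
	       (va->dump_size < vb->dump_size);
}

/* Sort the metadata array itself; no second array of pointers is needed. */
void sort_vmas_by_size(struct vma_meta *vmas, size_t count)
{
	qsort(vmas, count, sizeof(*vmas), cmp_vma_size);
}
```

Sorting the array directly means anything that walks it afterwards sees
the smallest mappings first. Note that qsort() is not guaranteed to be
stable, so mappings of equal size may end up in either order.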

> Also, honestly, if the issue is core dump truncation, at some point we
> should just support truncating individual mappings rather than the
> whole core file anyway. No?

Do you mean support truncating VMAs in addition to sorting, or as a
replacement for sorting? If you mean in addition, then I agree: there
may be some VMAs that are known not to contain information critical to
debugging, but that may still aid it, and that therefore have lower
priority.

If you mean as a replacement for sorting, then we'd need to know exactly
which VMAs to keep or discard, which is a non-trivial task, as discussed
in v1 of my patch, and so it doesn't seem like a viable alternative.

> Depending on what the major issue is, we might also tweak the
> heuristics for which vmas get written out.
> 
> For example, I wouldn't be surprised if there's a fair number of "this
> read-only private file mapping gets written out because it has been
> written to" due to runtime linking. And I kind of suspect that in many
> cases that's not all that interesting.
> 
> Anyway, I assume that Brian had some specific problem case that
> triggered this all, and I'd like to know a bit more.

Yes, there were a couple of problem cases that triggered the need for
this patch. I'll repeat what I said in v1 about this:

At Juniper, we have some daemons that can consume a lot of memory and
that, upon crashing, can produce core dumps of several GBs. While
dumping, we've encountered these two scenarios resulting in an unusable
core:

1. Disk space is low at the time of core dump, resulting in a truncated
core once the disk is full.

2. A daemon has a TimeoutStopSec option configured in its systemd unit
file, so the core dump can time out (triggering a SIGKILL) if the core
is too large and is taking too long to dump.

In both scenarios, we see that the core file is already several GB, and
still does not contain the information necessary to form a backtrace,
thus creating the need for this change. In the second scenario, we are
unable to increase the timeout option due to our recovery time objective
requirements.

Best,
Brian Mak

>           Linus

