linux-kernel - Re: [PATCH 1/8] signal: Make SIGKILL during coredumps an explicit special case

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <87mtk2qf5s.fsf@email.froward.int.ebiederm.org>
Date:   Tue, 11 Jan 2022 12:51:43 -0600
From:   "Eric W. Biederman" <ebiederm@...ssion.com>
To:     Olivier Langlois <olivier@...llion01.com>
Cc:     Heiko Carstens <hca@...ux.ibm.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        "<linux-arch@...r.kernel.org>" <linux-arch@...r.kernel.org>,
        Linux API <linux-api@...r.kernel.org>,
        Alexey Gladkov <legion@...nel.org>,
        Kyle Huey <me@...ehuey.com>, Oleg Nesterov <oleg@...hat.com>,
        Kees Cook <keescook@...omium.org>,
        Al Viro <viro@...iv.linux.org.uk>,
        Jens Axboe <axboe@...nel.dk>,
        Pavel Begunkov <asml.silence@...il.com>
Subject: Re: [PATCH 1/8] signal: Make SIGKILL during coredumps an explicit
 special case

"Eric W. Biederman" <ebiederm@...ssion.com> writes:

> Olivier Langlois <olivier@...llion01.com> writes:
>
>> On Mon, 2022-01-10 at 15:11 -0600, Eric W. Biederman wrote:
>>> 
>>> 
>>> I have been able to confirm that changing wait_event_interruptible to
>>> wait_event_killable was the culprit.  Something about the way
>>> systemd-coredump handles coredumps is not compatible with
>>> wait_event_killable.
>>
>> This is my experience too that systemd-coredump is doing something
>> unexpected. When I tested the patch:
>> https://lore.kernel.org/lkml/cover.1629655338.git.olivier@trillion01.com/
>>
>> to make sure that the patch worked, sending coredumps to systemd-
>> coredump was making systemd-coredump, well, core dump... Not very
>> useful...
>
> Oh.  Wow....
>
>> Sending the dumps through a pipe to anything else than systemd-coredump
>> was working fine.
>
> Interesting.
>
> I need to read through the pipe code and see how all of that works.  For
> writing directly to disk only ignoring killable interruptions are the
> usual semantics.  Ordinary pipe code has different semantics, and I
> suspect that is what is tripping things up.
>
> As for systemd-coredump it does whatever it does and I suspect some
> versions of systemd-coredump are simply not robust if a coredump stops
> unexpectedly.
>
> The good news is the pipe code is simple enough, it will be possible to
> completely read through that code.

My bug, obvious in hindsight is that "try_to_wait_up(TASK_INTERRUPTIBLE)"
does not work on a task that is in sleeping in TASK_KILLABLE.
That looks fixable in wait_for_dump_helpers it just won't be as easy
as changing wait_event_interruptible to wait_event_killable.

To prevent short pipe write from causing short writes during a coredump
I believe all we need to do handle -ERSTARTSYS with TIF_NOTIFY_SIGNAL.
Something like what I have below.

Until wait_for_dump_helpers is sorted out the coredump won't wait for
the dump helper the way it should, but otherwise things should work.

diff --git a/fs/coredump.c b/fs/coredump.c
index 7dece20b162b..0db1baf91420 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -796,6 +796,10 @@ static int __dump_emit(struct coredump_params *cprm, const void *addr, int nr)
 	if (dump_interrupted())
 		return 0;
 	n = __kernel_write(file, addr, nr, &pos);
+	while ((n == -ERESTARTSYS) && test_thread_flag(TIF_NOTIFY_SIGNAL)) {
+		tracehook_notify_signal();
+		n = __kernel_write(file, addr, nr, &pos);
+	}
 	if (n != nr)
 		return 0;
 	file->f_pos = pos;

Eric