lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Fri, 21 Jun 2024 00:46:15 -0500
From: "Eric W. Biederman" <ebiederm@...ssion.com>
To: Oleg Nesterov <oleg@...hat.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,  Tejun Heo <tj@...nel.org>,
  linux-kernel@...r.kernel.org
Subject: Re: [PATCH 01/17] signal: Make SIGKILL during coredumps an explicit
 special case

Oleg Nesterov <oleg@...hat.com> writes:

> On 06/19, Eric W. Biederman wrote:
>>
>> Oleg Nesterov <oleg@...hat.com> writes:
>>
>> > Hi Eric,
>> >
>> > I'll _try_ to read this (nontrivial) changes this week. To be honest,
>> > right now I don't really understand your goals after the quick glance...
>> >
>> > So far I have only looked at this simple 1/17 and it doesn't look right
>> > to me.
>>
>> It might be worth applying them all on a branch and just looking at the
>> end result.
>
> Perhaps. Say, the next 2/17 patch. I'd say it is very difficult to understand
> the purpose unless you read the next patches. OK, at least the change log
> mentions "in preparation".
>
>> > 	- complete_signal() won't be called, so signal->group_exit_code
>> > 	  won't be updated.
>> >
>> > 	  coredump_finish() won't change it too so the process will exit
>> > 	  with group_exit_code == signr /* coredumping signal */.
>> >
>> > 	  Yes, the fix is obvious and trivial...
>>
>> The signal handling from the coredump is arguably correct.  The process
>> has already exited, and gotten an exit code.
>
> And zap_process() sets roup_exit_code = signr. But,
>
>> But I really don't care about the exit_code either way.  I just want to
>> make ``killing'' a dead process while it core dumps independent of
>> complete_signal.
>>
>> That ``killing'' of a dead process is a completely special case.
>
> Sorry I fail to understand...
>
> If the coredumping process is killed by SIGKILL, it should exit with
> group_exit_code = SIGKILL, right? At least this is what we have now.

In general when a fatal signal is sent:
- It is short circuit delivered.
- If SIGNAL_GROUP_EXIT is not set
   SIGNAL_GROUP_EXIT is set
   signal->group_exit_code is set

Under those rules group_exit_code should not be updated.  Frankly no
signals should even be processed (except to be queued) after a fatal
signal comes in.

There is an issue that short circuit delivery does not work on coredump
signals (because of the way the coredump code works).  So it winds up
being zap_threads that tests if SIGNAL_GROUP_EXIT is already set and
zap_process that sets SIGNAL_GROUP_EXIT.  Essentially the logic remains
the same, and importantly no later signal is able to set
group_exit_code.  Or really have any effect because the signal sent was
fatal.

Except except except when the kernel in the context of the userspace
process is writing a coredump for that userspace process.  Then we allow
the kernel to be sent SIGKILL to stop it's coredumping activities
because sometimes it can block indefinitely otherwise.

Which is why I call handling that SIGKILL after a coredump has
begun and SIGNAL_GROUP_EXIT is already set a completely special case.

We might have to change group_exit_code to SIGKILL in that special case,
if someone in userspace cares.  However I expect no one cares.

Further if adding support for SIGKILL during a coredump were being added
from scratch.  The semantics I would choose would be for that SIGKILL
and indeed all of the coredumping activity would be invisible to
userspace except for the delay to make it happen.  Otherwise a coredump
which every occasionally gets it's return code changed could introduce
heisenbugs.

But none of this is documented in the change description and at a bare
minimum this change of behavior should be before such code is merged.

Eric


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ