[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <878rai7u0l.fsf@email.froward.int.ebiederm.org>
Date: Thu, 10 Aug 2023 11:16:10 -0500
From: "Eric W. Biederman" <ebiederm@...ssion.com>
To: Petr Skocik <pskocik@...il.com>
Cc: Kees Cook <keescook@...omium.org>, Oleg Nesterov <oleg@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
Marco Elver <elver@...gle.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/1] *** Fix kill(-1,s) returning 0 on 0 kills ***
Petr Skocik <pskocik@...il.com> writes:
> Hi.
>
> Is there anything else I can do to help get this (or some other equivalent
> change that results in kill(-1,s) returning -ESRCH when it has nothing to kill
> (like it does on the BSDs),
> as opposed to the current return value of 0 in that case) incorporated into
> mainline Linux?
I think you are talking about a rare enough case that we can safely
change the error handling behavior without real risk of trouble.
I think there is room for cleanup here.
I don't think we can change the set of processes signaled. The linux
man page should be updated to note that we skip sending a signal
to ourselves in the event of -1.
Reading the code the error handling logic is dubious.
POSIX provides some guidance it says:
If pid is -1, sig shall be sent to all processes (excluding an
unspecified set of system processes) for which the process has
permission to send that signal.
[EINVAL]
The value of the sig argument is an invalid or unsupported signal number.
[EPERM]
The process does not have permission to send the signal to any receiving process.
[ESRCH]
No process or process group can be found corresponding to that specified by pid.
> if (err != -EPERM)
> ret = err; /*either all 0 or all -EINVAL*/
The comment in your proposed patch is wrong:
-EAGAIN an be returned in the case of real time signals.
-ENOMEM can be returned due to linux audit.
-EINVAL can be returned, but arguably it should be caught
before we even go into the loop.
Given that the comment is wrong I don't like what you have done with the
error handling logic. It just adds confusion.
My question: What would a good and carefully implemented version
of kill(2) return?
Today for -pgrp we return 0 if any signal delivery succeeds and
the error from the last process in the signal group otherwise.
For -1 we return -EINVAL if the signal is invalid.
For -1 we return -ESRCH only if we are the init process and
there are no other processes in the system, aka never except
when we are the init process in a pid namespace.
For -1 otherwise we return the return value of the last
process signaled.
I would argue that what needs to happen for -1 is:
- Return 0 if the signal was sent to any process successfully.
- Return -EINVAL for invalid signals.
- When everything errors return some error value and not 0.
What error value should we return when everything errors?
Especially what error value should we return when everything
says -EPERM?
Should we follow BSD and return ESRCH?
Should we follow Posix and return EPERM?
Should we follow SYSV unix?
Looking at FreeBSD aka:
https://cgit.freebsd.org/src/tree/sys/kern/kern_sig.c?id=9e283cf9a2fe2b3aa6e4a47a012fd43b4e49ebec
kill(2) aka killpg1 only returns 0 or ESRCH when sending a signal
to multiple processes (after passing the EINVAL) check.
The man pages for AIX and Solaris suggest that -EPERM is returned when
things don't work.
So what should Linux do?
Historically Linux signaling is very SysV unix with a some compatibility
flags for BSD where it matters. So I am not convinced that return
ESRCH in this case is the right answer.
Basing the logic off of __kill_pgrp_info I would do:
diff --git a/kernel/signal.c b/kernel/signal.c
index b5370fe5c198..369499ebb8e2 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -1602,7 +1602,8 @@ static int kill_something_info(int sig, struct kernel_siginfo *info, pid_t pid)
ret = __kill_pgrp_info(sig, info,
pid ? find_vpid(-pid) : task_pgrp(current));
} else {
- int retval = 0, count = 0;
+ bool found = false, success = false;
+ int retval = 0;
struct task_struct * p;
for_each_process(p) {
@@ -1610,12 +1611,12 @@ static int kill_something_info(int sig, struct kernel_siginfo *info, pid_t pid)
!same_thread_group(p, current)) {
int err = group_send_sig_info(sig, info, p,
PIDTYPE_MAX);
- ++count;
- if (err != -EPERM)
- retval = err;
+ found = true;
+ success |= !err;
+ retval = err;
}
}
- ret = count ? retval : -ESRCH;
+ ret = success ? 0 : (found ? retval : -ESRCH);
}
read_unlock(&tasklist_lock);
I think that fits in better with Linux, and doesn't have any surprising
behavior.
> It would rather help some of the user software I'm developing, and the slightly
> new semantics are IMO definitely reasonable (BSDs have them).
Would my proposal above work for the software you are developing?
The behavior your patch was implementing was:
ret = success ? 0 : ((retval == -EINVAL)? -EINVAL : -ESRCH);
Which gives less information. So I am not thrilled by it.
Eric
Powered by blists - more mailing lists