lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <874k45633f.fsf@email.froward.int.ebiederm.org>
Date:   Thu, 10 Mar 2022 13:04:52 -0600
From:   "Eric W. Biederman" <ebiederm@...ssion.com>
To:     Kees Cook <keescook@...omium.org>
Cc:     linux-kernel@...r.kernel.org,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Alexey Gladkov <legion@...nel.org>,
        Kyle Huey <me@...ehuey.com>, Oleg Nesterov <oleg@...hat.com>,
        Al Viro <viro@...iv.linux.org.uk>, Jens Axboe <axboe@...nel.dk>
Subject: Re: [PATCH 08/13] task_work: Call tracehook_notify_signal from
 get_signal on all architectures

Kees Cook <keescook@...omium.org> writes:

> On Wed, Mar 09, 2022 at 10:24:49AM -0600, Eric W. Biederman wrote:
>> Always handle TIF_NOTIFY_SIGNAL in get_signal.  With commit 35d0b389f3b2
>> ("task_work: unconditionally run task_work from get_signal()") always
>> calling task_wofffffffrk_run all of the work of tracehook_notify_signal is
>
> typo: cat on keyboard
>
>> already happening except clearing TIF_NOTIFY_SIGNAL.
>> 
>> Factor clear_notify_signal out of tracehook_notify_signal and use it in
>> get_signal so that get_signal only needs one call of trask_work_run.
>
> typo: trask -> task
>
>> 
>> To keep the semantics in sync update xfer_to_guest_mode_work (which
>> does not call get_signal) to call tracehook_notify_signal if either
>> _TIF_SIGPENDING or _TIF_NOTIFY_SIGNAL.

First let me say thanks for the close look at this work.

> I see three logical changes in this patch, I think?
>
> - creation and use of clear_notify_signal()
> - removal of handle_signal_work() and removal of
>   arch_do_signal_or_restart() has_signal arg
> - something with get_signal() I don't understand yet:
>   - why is clear_notify_signal() added?
>   - why is tracehook_notify_signal() removed?


The spoiler is the change to get_signal is the logical change.
The rest of the changes follow from that change.  Please see below.

The inline expansion of tracehook_notify_signal in get_signal and
in it's other two callers in the next change is the only real kernel
internal api change in this series of changes.

The optimization that was tried with TIF_NOTIFY_SIGNAL and being able to
only call task_work_run() when TIF_NOTIFY_SIGNAL was set instead of when
get_signal was called failed, and caused a regression.  The removal of
calling task_work_run from get_signal has been reverted but the rest
of the change had not been.  So this change just removes the rest of
the failed optimization.

Please see below for my detailed description of the get_signal change.

I hope this helps.

>> 
>> Signed-off-by: "Eric W. Biederman" <ebiederm@...ssion.com>
>> ---
>>  arch/s390/kernel/signal.c    |  4 ++--
>>  arch/x86/kernel/signal.c     |  4 ++--
>>  include/linux/entry-common.h |  2 +-
>>  include/linux/tracehook.h    |  9 +++++++--
>>  kernel/entry/common.c        | 12 ++----------
>>  kernel/entry/kvm.c           |  2 +-
>>  kernel/signal.c              | 14 +++-----------
>>  7 files changed, 18 insertions(+), 29 deletions(-)
>> 
>> diff --git a/arch/s390/kernel/signal.c b/arch/s390/kernel/signal.c
>> index 307f5d99514d..ea9e5e8182cd 100644
>> --- a/arch/s390/kernel/signal.c
>> +++ b/arch/s390/kernel/signal.c
>> @@ -453,7 +453,7 @@ static void handle_signal(struct ksignal *ksig, sigset_t *oldset,
>>   * stack-frames in one go after that.
>>   */
>>  
>> -void arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal)
>> +void arch_do_signal_or_restart(struct pt_regs *regs)
>>  {
>>  	struct ksignal ksig;
>>  	sigset_t *oldset = sigmask_to_save();
>> @@ -466,7 +466,7 @@ void arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal)
>>  	current->thread.system_call =
>>  		test_pt_regs_flag(regs, PIF_SYSCALL) ? regs->int_code : 0;
>>  
>> -	if (has_signal && get_signal(&ksig)) {
>
> Right, the only caller of arch_do_signal_or_restart(),
> handle_signal_work(), only happens after its caller has already checked
> _TIF_SIGPENDING.

It could be TIF_SIGPENDING or TIF_NOTIFY_SIGNAL.  The work for
TIF_NOTIFY_SIGNAL has been moved unconditionally into get_signal.
So it no longer makes sense to care which flag has been set.

>> +	if (get_signal(&ksig)) {
>>  		/* Whee!  Actually deliver the signal.  */
>>  		if (current->thread.system_call) {
>>  			regs->int_code = current->thread.system_call;
>> diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
>> index ec71e06ae364..de3d5b5724d8 100644
>> --- a/arch/x86/kernel/signal.c
>> +++ b/arch/x86/kernel/signal.c
>> @@ -861,11 +861,11 @@ static inline unsigned long get_nr_restart_syscall(const struct pt_regs *regs)
>>   * want to handle. Thus you cannot kill init even with a SIGKILL even by
>>   * mistake.
>>   */
>> -void arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal)
>> +void arch_do_signal_or_restart(struct pt_regs *regs)
>>  {
>>  	struct ksignal ksig;
>>  
>> -	if (has_signal && get_signal(&ksig)) {
>> +	if (get_signal(&ksig)) {
>>  		/* Whee! Actually deliver the signal.  */
>>  		handle_signal(&ksig, regs);
>>  		return;
>> diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
>> index 9efbdda61f7a..3537fd25f14e 100644
>> --- a/include/linux/entry-common.h
>> +++ b/include/linux/entry-common.h
>> @@ -257,7 +257,7 @@ static __always_inline void arch_exit_to_user_mode(void) { }
>>   *
>>   * Invoked from exit_to_user_mode_loop().
>>   */
>> -void arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal);
>> +void arch_do_signal_or_restart(struct pt_regs *regs);
>>  
>>  /**
>>   * exit_to_user_mode - Fixup state when exiting to user mode
>> diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
>> index fa834a22e86e..b44a7820c468 100644
>> --- a/include/linux/tracehook.h
>> +++ b/include/linux/tracehook.h
>> @@ -106,6 +106,12 @@ static inline void tracehook_notify_resume(struct pt_regs *regs)
>>  	rseq_handle_notify_resume(NULL, regs);
>>  }
>>  
>> +static inline void clear_notify_signal(void)
>> +{
>> +	clear_thread_flag(TIF_NOTIFY_SIGNAL);
>> +	smp_mb__after_atomic();
>> +}
>> +
>>  /*
>>   * called by exit_to_user_mode_loop() if ti_work & _TIF_NOTIFY_SIGNAL. This
>>   * is currently used by TWA_SIGNAL based task_work, which requires breaking
>> @@ -113,8 +119,7 @@ static inline void tracehook_notify_resume(struct pt_regs *regs)
>>   */
>>  static inline void tracehook_notify_signal(void)
>>  {
>> -	clear_thread_flag(TIF_NOTIFY_SIGNAL);
>> -	smp_mb__after_atomic();
>> +	clear_notify_signal();
>>  	if (task_work_pending(current))
>>  		task_work_run();
>>  }
>> diff --git a/kernel/entry/common.c b/kernel/entry/common.c
>> index f0b1daa1e8da..79eaf9b4b10d 100644
>> --- a/kernel/entry/common.c
>> +++ b/kernel/entry/common.c
>> @@ -139,15 +139,7 @@ void noinstr exit_to_user_mode(void)
>>  }
>>  
>>  /* Workaround to allow gradual conversion of architecture code */
>> -void __weak arch_do_signal_or_restart(struct pt_regs *regs, bool has_signal) { }
>> -
>> -static void handle_signal_work(struct pt_regs *regs, unsigned long ti_work)
>> -{
>> -	if (ti_work & _TIF_NOTIFY_SIGNAL)
>> -		tracehook_notify_signal();
>> -
>> -	arch_do_signal_or_restart(regs, ti_work & _TIF_SIGPENDING);
>> -}

With the work of tracehook_notify_signal happening in get_signal (called
from arch_do_signal_or_restart) there is no longer a reason to call
tracehook_notify_signal on it's own, or to remember if it was
TIF_NOTIFY_SIGNAL or TIF_SIGPENDING which was set.

>> +void __weak arch_do_signal_or_restart(struct pt_regs *regs) { }
>>  
>>  static unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
>>  					    unsigned long ti_work)
>> @@ -170,7 +162,7 @@ static unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
>>  			klp_update_patch_state(current);
>>  
>>  		if (ti_work & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL))
>> -			handle_signal_work(regs, ti_work);
>> +			arch_do_signal_or_restart(regs);
>>  
>>  		if (ti_work & _TIF_NOTIFY_RESUME)
>>  			tracehook_notify_resume(regs);
>> diff --git a/kernel/entry/kvm.c b/kernel/entry/kvm.c
>> index 96d476e06c77..cabf36a489e4 100644
>> --- a/kernel/entry/kvm.c
>> +++ b/kernel/entry/kvm.c
>> @@ -8,7 +8,7 @@ static int xfer_to_guest_mode_work(struct kvm_vcpu *vcpu, unsigned long ti_work)
>>  	do {
>>  		int ret;
>>  
>> -		if (ti_work & _TIF_NOTIFY_SIGNAL)
>> +		if (ti_work & (_TIF_SIGPENDING | _TIF_NOTIFY_SIGNAL))
>>  			tracehook_notify_signal();
>>  
>>  		if (ti_work & _TIF_SIGPENDING) {
>> diff --git a/kernel/signal.c b/kernel/signal.c
>> index 3b4cf25fb9b3..8632b88982c9 100644
>> --- a/kernel/signal.c
>> +++ b/kernel/signal.c
>> @@ -2626,20 +2626,12 @@ bool get_signal(struct ksignal *ksig)
>>  	struct signal_struct *signal = current->signal;
>>  	int signr;
>>  
>> +	clear_notify_signal();
>
> Why is this added?

See below.
>
>>  	if (unlikely(task_work_pending(current)))
>>  		task_work_run();
>>  
>> -	/*
>> -	 * For non-generic architectures, check for TIF_NOTIFY_SIGNAL so
>> -	 * that the arch handlers don't all have to do it. If we get here
>> -	 * without TIF_SIGPENDING, just exit after running signal work.
>> -	 */
>> -	if (!IS_ENABLED(CONFIG_GENERIC_ENTRY)) {
>> -		if (test_thread_flag(TIF_NOTIFY_SIGNAL))
>> -			tracehook_notify_signal();
>
> I don't see why this gets removed?

This is the core change of this change, the rest
follows from this change.

The definition of tracehook_notify_signal is:

  static inline void tracehook_notify_signal(void)
  {
  	clear_notify_signal();
  	if (task_work_pending(current))
  		task_work_run();
  }


Which means the only difference between:
	if (unlikely(task_work_pending(current)))
		task_work_run()

and tracehook_notify_signal is the work done by clear_notify_signal.
So I added the missing work and remove the fancy conditional.

There are only two architectures that define CONFIG_GENERIC_ENTRY
x86 and s390.  At some point the task_work_run was moved completely
out of get_signal (on x86 and s390) and it was assumed it was enough
to call tracehook_notify_signal when TIF_NOTIFY_SIGNAL was set.

That turned out to be false and kernel regressions were encountered.  So
the call to task_work_run was added back to get_signal.


Which is where this change comes in.  There is an unconditional call to
task_work_run() in get_signal, and a funny conditional call to
task_work_run().

So this change is just changing the kernel to unconditionally call
task_work_run() in get_signal(), as well as clearing TIF_NOTIFY_SIGNAL.

The result is that on the common path tracehook_notify_signal no longer
needs to be called.

My next change removes tracehook_notify_signal entirely.  Which is
the other reason I add clear_notify_signal.

>
>> -		if (!task_sigpending(current))
>> -			return false;
>> -	}
>> +	if (!task_sigpending(current))
>> +		return false;
>>  
>>  	if (unlikely(uprobe_deny_signal()))
>>  		return false;
>> -- 
>> 2.29.2
>> 

Eric

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ