linux-kernel - Re: [PATCH] signal/x86: Delay calling signals in atomic

Message-ID: <87o81nl3b6.fsf@email.froward.int.ebiederm.org>
Date:   Wed, 30 Mar 2022 13:10:05 -0500
From:   "Eric W. Biederman" <ebiederm@...ssion.com>
To:     Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc:     x86@...nel.org, linux-kernel@...r.kernel.org,
        Oleg Nesterov <oleg@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>,
        Andy Lutomirski <luto@...nel.org>,
        Ben Segall <bsegall@...gle.com>,
        Borislav Petkov <bp@...en8.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Ingo Molnar <mingo@...hat.com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Mel Gorman <mgorman@...e.de>,
        Peter Zijlstra <peterz@...radead.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Vincent Guittot <vincent.guittot@...aro.org>
Subject: Re: [PATCH] signal/x86: Delay calling signals in atomic

Sebastian Andrzej Siewior <bigeasy@...utronix.de> writes:

> On 2022-03-28 17:07:25 [-0500], Eric W. Biederman wrote:
>> Sebastian Andrzej Siewior <bigeasy@...utronix.de> writes:
>> 
>> We have a few other cases where we deliver signals from interrupts.
>> Off the top of my head there is SAK and magic sysrq, but I think there
>> are more.  So I am also not convinced that all signals you care about
>> will go through force_sig_info_to_task.
>> 
>> What I really don't know and this is essentially a PREEMPT_RT question
>> is what makes int3 special?  Why don't other faults have this problem?
>
> int3 on x86 is delivered from the debug interrupt and at this point
> interrupts are in general disabled even on PREEMPT_RT.
> If you are in a section which disables interrupts via
> spin_lock_irqsave() then interrupts are not disabled on PREEMPT_RT.
> If you are in an interrupt handler (as per request_irq(), not the
> special vector that is used for int3 handling) then interrupts are also
> not disabled because the interrupt handler is threaded by default.
>
> In both cases in_atomic() reports 0 on PREEMPT_RT. So the exception is:
> - explicit usage of local_irq_{save|disable}, preempt_disable().
> - usage of raw_spinlock_t locks.
> - interrupt vectors which are not threaded (like int3 or the SMP IPI
>   function call).
>
>> I remember there was a change where we had threaded interrupt handlers
>> to get around this for interrupt service routines.  I wonder if we need
>> to do something similar with faults.  Have a fast part and a threaded
>> part that runs in a schedulable way.  Although given that for a fault
>> you need to be fundamentally bound to the task/thread you faulted from
>> it probably just means having a way to switch to a kernel stack that you
>> can schedule from, and not use a reserved per cpu stack.  The
>> task_struct would certainly need to stay the same for all of the pieces.
>> 
>> Or maybe for PREEMPT_RT you pick the i386 mechanism.  How does PREEMPT_RT
>> deal with page faults, or general protection faults?
>
> An in-kernel stack overflow will panic() with interrupts disabled.
> An in-kernel NULL-pointer is also entered with disabled interrupts and
> complains later about sleeping locks in do_exit(). I do remember that
> the arch code conditionally enabled interrupts based on IRQ-flags on
> stack.
>
>> This is my long winded way of saying that I rather expect that if
>> PREEMPT_RT is going to call code it has modified to be sleeping that it
>> would also make it safe for that code to sleep.
>> 
>> Further (and this is probably my ignorance) I just don't see what makes
>> any of this specific to just int3.  Why aren't other faults affected?
>
> That NULL-pointer in kernel doesn't look good. If you have a test-case
> (like do this) then I can definitely look into it in case more is
> missed.

The Linux kernel dump test module, aka drivers/misc/lkdtm, should have
that and several other nasty ways to die as test cases.

So I am confused.  The call path that the patch is trying to fix is:

idtentry
  idtentry_body
    exc_int3
      do_int3_user
        do_trap
          force_sig

There are multiple "am I from userspace" checks and I did not verify
they are all equivalent, so I may have missed something subtle.

But it looks like if we are coming from userspace then we use the same
stack as any other time we would come from userspace.  AKA a stack
that allows the kernel to sleep.
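
For anyone who wants to exercise that path from userspace, here is a
minimal sketch (not part of the patch under discussion).  It installs a
SIGTRAP handler and executes a single int3 from user mode, which on x86
should take the exc_int3 -> do_int3_user -> force_sig route listed
above.  The names run_int3_demo, user_breakpoint and on_sigtrap are
hypothetical, and the raise() fallback for non-x86 builds is only an
assumption so the code compiles elsewhere; it does not exercise int3.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Counts SIGTRAP deliveries; sig_atomic_t is safe to touch in a handler. */
static volatile sig_atomic_t trap_count;

static void on_sigtrap(int signo)
{
    (void)signo;
    trap_count++;
}

/* Execute one breakpoint instruction from user mode (hypothetical helper). */
static void user_breakpoint(void)
{
#if defined(__x86_64__) || defined(__i386__)
    /* int3 is a trap: the saved IP points past it, so we resume after it. */
    __asm__ volatile("int3");
#else
    /* Assumption: no int3 here, so just send the same signal directly. */
    raise(SIGTRAP);
#endif
}

/* Returns the number of SIGTRAPs seen for one breakpoint, or -1 on error. */
int run_int3_demo(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigtrap;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGTRAP, &sa, NULL) != 0)
        return -1;

    trap_count = 0;
    user_breakpoint();
    return (int)trap_count;
}
```

Calling run_int3_demo() from main() outside a debugger should return 1.
On a kernel with sleeping-lock debugging enabled, any "sleeping function
called from invalid context" splat on this path would then show up in
dmesg.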

So I don't see what problem the patch is trying to fix.

I know that code has changed over the years.  Perhaps this is
something that was fixed upstream and the real time tree didn't realize
there was no longer a need to fix anything?

Or am I missing something subtle when reading the idtentry assembly?

Eric