linux-kernel - Re: perf_event_open+clone = unkillable process

Message-ID: <878syu7tcm.fsf@xmission.com>
Date:   Mon, 04 Feb 2019 22:27:21 -0600
From:   ebiederm@...ssion.com (Eric W. Biederman)
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     Dmitry Vyukov <dvyukov@...gle.com>, Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        jolsa@...hat.com, Namhyung Kim <namhyung@...nel.org>,
        luca abeni <luca.abeni@...tannapisa.it>,
        syzkaller <syzkaller@...glegroups.com>,
        Oleg Nesterov <oleg@...hat.com>
Subject: Re: perf_event_open+clone = unkillable process

ebiederm@...ssion.com (Eric W. Biederman) writes:

> Thomas Gleixner <tglx@...utronix.de> writes:
>
>> On Mon, 4 Feb 2019, Dmitry Vyukov wrote:
>>
>>> On Mon, Feb 4, 2019 at 10:27 AM Thomas Gleixner <tglx@...utronix.de> wrote:
>>> >
>>> > On Fri, 1 Feb 2019, Dmitry Vyukov wrote:
>>> >
>>> > > On Fri, Feb 1, 2019 at 5:48 PM Dmitry Vyukov <dvyukov@...gle.com> wrote:
>>> > > >
>>> > > > Hello,
>>> > > >
>>> > > > The following program creates an unkillable process that eats CPU.
>>> > > > /proc/pid/stack is empty, I am not sure what other info I can provide.
>>> > > >
>>> > > > Tested on upstream commit 4aa9fc2a435abe95a1e8d7f8c7b3d6356514b37a.
>>> > > > Config is attached.
>>> > >
>>> > > Looking through other reproducers that create unkillable processes, I
>>> > > think I found a much simpler reproducer (below). It's single threaded
>>> > > and just sets up a SIGBUS handler and does timer_create+timer_settime to
>>> > > send repeated SIGBUS. The resulting process can't be killed with
>>> > > SIGKILL.
>>> > > +Thomas for timers.
>>> >
>>> > +Oleg, Eric
>>> >
>>> > That's odd. With some tracing I can see that SIGKILL is generated and
>>> > queued, but it's not delivered for some weird reason. I'm traveling in the
>>> > next days, so I won't be able to do much about it. Will look later this
>>> > week.
>>> 
>>> Just a random thought looking at the repro: can constant SIGBUS
>>> delivery starve delivery of all other signals (incl SIGKILL)?
>>
>> Indeed. SIGBUS is 7, SIGKILL is 9 and next_signal() delivers the lowest
>> number first....
>
> We do have the special case in complete_signal that causes most of the
> signal delivery work of SIGKILL to happen when SIGKILL is queued.
>
> I need to look at your reproducer.  It would require being a per-thread
> signal to cause problems in next_signal.
>
> It is definitely worth fixing if there is any way for userspace to block
> SIGKILL.

Ugh.

The practical problem appears much worse.

Tracing the code, I see that we attempt to deliver SIGBUS, I presume in a
per-thread way.

At some point the delivery of SIGBUS fails.  Then the kernel attempts
to synchronously force SIGSEGV, which should be the end of it.

Unfortunately, at that point our heuristic for dealing with synchronous
signals fails in next_signal and we attempt to deliver the timer's
SIGBUS instead.
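
To illustrate the failure mode described above, here is a minimal
userspace sketch of a "synchronous signals first, then lowest number"
selection. It is a simplified model written for this note, not the
kernel's next_signal(): the MODEL_* names and model_next_signal() are
hypothetical, the signal numbers are hard-coded to the Linux x86 values
mentioned in the thread (SIGBUS 7, SIGKILL 9, SIGSEGV 11), and the set of
signals treated as synchronous is an assumption.

```c
#include <stdint.h>

/* Linux x86 signal numbers, hard-coded so the model does not depend on
 * the host's <signal.h>.  All MODEL_* names are hypothetical. */
enum {
    MODEL_SIGILL  = 4,
    MODEL_SIGTRAP = 5,
    MODEL_SIGBUS  = 7,
    MODEL_SIGFPE  = 8,
    MODEL_SIGKILL = 9,
    MODEL_SIGSEGV = 11,
    MODEL_SIGSYS  = 31
};

/* Bit (sig - 1) represents signal number sig. */
#define MODEL_SIGMASK(sig) (UINT64_C(1) << ((sig) - 1))

/* Assumed set of signals that are normally raised synchronously. */
#define MODEL_SYNCHRONOUS_MASK                                      \
    (MODEL_SIGMASK(MODEL_SIGSEGV) | MODEL_SIGMASK(MODEL_SIGBUS) |   \
     MODEL_SIGMASK(MODEL_SIGILL)  | MODEL_SIGMASK(MODEL_SIGTRAP) |  \
     MODEL_SIGMASK(MODEL_SIGFPE)  | MODEL_SIGMASK(MODEL_SIGSYS))

/* Return the next signal to deliver, or 0 if nothing is deliverable. */
static int model_next_signal(uint64_t pending, uint64_t blocked)
{
    uint64_t x = pending & ~blocked;
    int sig = 1;

    if (x == 0)
        return 0;

    /* Heuristic: if any "synchronous" signal is pending, consider only
     * those.  The mask cannot tell a forced SIGSEGV from a SIGBUS that
     * a timer queued asynchronously. */
    if (x & MODEL_SYNCHRONOUS_MASK)
        x &= MODEL_SYNCHRONOUS_MASK;

    /* Deliver the lowest-numbered remaining signal first. */
    while ((x & 1) == 0) {
        x >>= 1;
        sig++;
    }
    return sig;
}
```

With SIGBUS, SIGKILL and SIGSEGV all pending and nothing blocked, this
model returns 7 (SIGBUS): SIGKILL is filtered out because a signal in the
synchronous set is pending, and SIGSEGV loses to the lower-numbered
SIGBUS. If the timer re-queues SIGBUS each time, neither of the other two
is ever selected in this model.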

I suspect it is time to bite the bullet and handle the synchronous
unblockable signals differently.  I will see if I can cook up an
appropriate patch.

Eric