linux-kernel - Re: Signal handling in a page fault handler

Message-ID: <152275879566.32747.9293394837417347482@mail.alporthouse.com>
Date:   Tue, 03 Apr 2018 13:33:15 +0100
From:   Chris Wilson <chris@...is-wilson.co.uk>
To:     Matthew Wilcox <willy@...radead.org>,
        dri-devel@...ts.freedesktop.org, linux-mm@...ck.org,
        "Souptick Joarder" <jrdr.linux@...il.com>
Cc:     linux-kernel@...r.kernel.org
Subject: Re: Signal handling in a page fault handler

Quoting Matthew Wilcox (2018-04-02 15:10:58)
> 
> Souptick and I have been auditing the various page fault handler routines
> and we've noticed that graphics drivers assume that a signal should be
> able to interrupt a page fault.  In contrast, the page cache takes great
> care to allow only fatal signals to interrupt a page fault.
> 
> I believe (but have not verified) that a non-fatal signal being delivered
> to a task which is in the middle of a page fault may well end up in an
> infinite loop, attempting to handle the page fault and failing forever.
> 
> Here's one of the simpler ones:
> 
>         ret = mutex_lock_interruptible(&etnaviv_obj->lock);
>         if (ret)
>                 return VM_FAULT_NOPAGE;
> 
> (many other drivers do essentially the same thing including i915)
> 
> On seeing NOPAGE, the fault handler believes the PTE is in the page
> table, so does nothing before it returns to arch code at which point
> I get lost in the magic assembler macros.  I believe it will end up
> returning to userspace if the signal is non-fatal, at which point it'll
> go right back into the page fault handler, and mutex_lock_interruptible()
> will immediately fail.  So we've converted a sleeping lock into the most
> expensive spinlock.

I'll ask the obvious question: why isn't the signal handled on return to
userspace?
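
To make the question concrete, here is a toy userspace model (not kernel
code; every name in it is hypothetical and chosen only for illustration).
It assumes the handler returns NOPAGE whenever a signal is pending, and
compares two hypotheses: the pending signal is delivered on the way back
to userspace, or it is not.

```c
#include <stdbool.h>

/* Toy userspace model, NOT kernel code. All names are hypothetical. */
enum fault_result { FAULT_DONE, FAULT_NOPAGE };

struct task_model {
    bool signal_pending;    /* a non-fatal signal is queued for the task */
};

/*
 * Models the driver's fault handler: the interruptible lock fails for as
 * long as a signal is pending, and the handler reports NOPAGE.
 */
static enum fault_result model_fault_handler(const struct task_model *t)
{
    if (t->signal_pending)
        return FAULT_NOPAGE;    /* stands in for mutex_lock_interruptible() failing */
    return FAULT_DONE;
}

/*
 * Models the return to userspace. If deliver_signals is true, the pending
 * signal is handled here (assumption under test); otherwise it stays pending.
 */
static void model_return_to_user(struct task_model *t, bool deliver_signals)
{
    if (deliver_signals)
        t->signal_pending = false;
}

/*
 * Returns the number of faults taken until the handler succeeds,
 * or -1 if the cap max_faults is reached without success.
 */
int faults_until_resolved(bool deliver_signals, int max_faults)
{
    struct task_model t = { .signal_pending = true };

    for (int n = 1; n <= max_faults; n++) {
        if (model_fault_handler(&t) == FAULT_DONE)
            return n;
        model_return_to_user(&t, deliver_signals);
    }
    return -1;
}
```

In this model, faults_until_resolved(true, 1000) returns 2 (one failed
fault, signal delivered, retry succeeds), while
faults_until_resolved(false, 1000) returns -1, which is the "most
expensive spinlock" case. Which hypothesis matches the real return path
is exactly what I am asking.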

> I don't think the graphics drivers really want to be interrupted by
> any signal.

In the worst case we may block for 10s, and even a 10ms delay may be
unacceptable to some signal handlers (one presumes). For the number-one
use case, ^C, yes, we could reduce that to being interruptible only by
fatal signals (killable), but I wonder whether there are timing loops
(e.g. setitimer in Xorg < 1.19) that want to be able to interrupt
random blockages.
-Chris