linux-kernel - [REGRESSION] Hang in 5.17.4+ that appears to be due to Xen

Message-ID: <YoZy6BRIkfoeY8af@itl-email>
Date:   Thu, 19 May 2022 12:39:40 -0400
From:   Demi Marie Obenour <demi@...isiblethingslab.com>
To:     Juergen Gross <jgross@...e.com>,
        Xen developer discussion <xen-devel@...ts.xenproject.org>,
        Andrew Morton <akpm@...ux-foundation.org>
Cc:     Boris Ostrovski <boris.ostrovsky@...cle.com>,
        Marek Marczykowski-Górecki <marmarek@...isiblethingslab.com>,
        linux-kernel@...r.kernel.org,
        Jani Nikula <jani.nikula@...ux.intel.com>,
        Joonas Lahtinen <joonas.lahtinen@...ux.intel.com>,
        Rodrigo Vivi <rodrigo.vivi@...el.com>,
        Tvrtko Ursulin <tvrtko.ursulin@...ux.intel.com>,
        David Airlie <airlied@...ux.ie>,
        Daniel Vetter <daniel@...ll.ch>,
        Intel Graphics Development <intel-gfx@...ts.freedesktop.org>,
        DRI Development <dri-devel@...ts.freedesktop.org>,
        Linux Memory Management <linux-mm@...ck.org>,
        regressions@...ts.linux.dev
Subject: [REGRESSION] Hang in 5.17.4+ that appears to be due to Xen

On Mon, May 16, 2022 at 10:00:07AM -0400, Demi Marie Obenour wrote:
> On Mon, May 16, 2022 at 08:48:17AM +0200, Juergen Gross wrote:
> > On 14.05.22 17:55, Demi Marie Obenour wrote:
> > > In https://github.com/QubesOS/qubes-issues/issues/7481, a user reported
> > > that Xorg locked up when resizing a VM window.  While I do not have the
> > > same hardware the user does and thus cannot reproduce the bug, the stack
> > > trace seems to indicate a deadlock between xen_gntdev and i915.  It
> > > appears that gnttab_unmap_refs_sync() is waiting for i915 to free the
> > > pages, while i915 is waiting for the MMU notifier that called
> > > gnttab_unmap_refs_sync() to return.  Result: deadlock.
> > > 
> > > The problem appears to be that a mapped grant in PV mode will stay in
> > > the “invalidating” state until it is freed.  While MMU notifiers are
> > > allowed to sleep, it appears that they cannot wait for the page to be
> > > freed, as is happening here.  That said, I am not very familiar with
> > > this code, so my diagnosis might be incorrect.
> > 
> > All I can say for now is that your patch seems to introduce a
> > use-after-free issue, as the parameters of the delayed work might now get
> > freed before the delayed work is executed.
> 
> I figured it was wrong, not least because I don’t think it compiles
> (invalid use of void value).  That said, the current behavior is quite
> suspicious to me.  For one, it appears that munmap() on a grant in a PV
> domain will not return until nobody else is using the page.  This is not
> what I would expect, and I can easily imagine it causing deadlocks in
> userspace.  Instead, I would expect gntdev to automatically release
> the grant when the reference count hits zero.  This would also allow for
> the same grant to be mapped in multiple processes, and might even unlock
> DMA-BUF support.
> 
> > I don't know why this is happening only with rather recent kernels, as the
> > last gntdev changes in this area have been made in kernel 4.13.
> > 
> > I'd suggest looking at i915, as quite a bit of work has happened rather
> > recently in the code visible in your stack backtraces. Maybe it would be
> > possible to free the pages in i915 before calling the MMU notifier?
> 
> While I agree that the actual problem is almost certainly in i915, the
> gntdev code does appear rather fragile.  Since so few people use i915 +
> Xen, problems with the combination generally don’t show up until some
> Qubes user makes a bug report, which isn’t great.  It would be better if
> Xen didn’t introduce requirements on other kernel code that did not hold
> when not running on Xen.
> 
> In this case, if it is actually an invariant that one must not call MMU
> notifiers for pages that are still in use, it would be better if this
> was caught by a WARN_ON() or BUG_ON() in the core memory management
> code.  That would have found the bug instantly and deterministically on
> all platforms, whereas the current failure is nondeterministic and only
> happens under Xen.
> 
> I also wonder if this is a bug in the core MMU notifier infrastructure.
> My reading of the mmu_interval_notifier_remove() documentation is that
> it should only wait for the specific notifier being removed to finish,
> not for all notifiers to finish.  Adding the memory management
> maintainers.
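
To make the suspected circular wait concrete, here is a toy userspace
model of it.  It is only a sketch and not kernel code: the Page class,
simulate(), and the two thread functions are hypothetical stand-ins,
and the timeouts exist only so the example terminates instead of
hanging.  The "notifier" thread plays the role of the unmap path that
waits for the page to become unused (as gnttab_unmap_refs_sync()
appears to do), and the "driver" thread plays the role of a page user
that waits for the notifier to return before dropping its reference.

```python
import threading


class Page:
    """Toy stand-in for a granted page with a reference count."""

    def __init__(self):
        self.refs = 1  # one reference, held by the driver stand-in
        self.cond = threading.Condition()

    def put(self):
        # Drop one reference and wake anyone waiting for the page.
        with self.cond:
            self.refs -= 1
            self.cond.notify_all()

    def wait_unused(self, timeout):
        # Returns True if the refcount reached zero before the timeout.
        with self.cond:
            return self.cond.wait_for(lambda: self.refs == 0, timeout)


def simulate(release_first, timeout=0.4):
    """Run both sides once; the timeouts replace an indefinite hang."""
    page = Page()
    notifier_done = threading.Event()
    result = {}

    def notifier():
        # Unmap path: does not return until nobody uses the page.
        result["unmap_completed"] = page.wait_unused(timeout)
        notifier_done.set()

    def driver():
        if release_first:
            # Suggested ordering: drop the reference before waiting.
            page.put()
        # Page user: waits for the notifier to return.
        returned = notifier_done.wait(timeout / 2)
        result["driver_saw_notifier_return"] = returned
        if returned and not release_first:
            # Problematic ordering: reference dropped only afterwards.
            page.put()

    threads = [threading.Thread(target=notifier),
               threading.Thread(target=driver)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result


if __name__ == "__main__":
    print("wait, then release:", simulate(release_first=False))
    print("release, then wait:", simulate(release_first=True))
```

With release_first=False neither side makes progress until its timeout
expires, which corresponds to the hang described above.  With
release_first=True (dropping the reference before waiting on the
notifier, along the lines of the suggestion to free the pages first)
both sides complete.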

Also adding the kernel regression tracker.

#regzbot introduced v5.16..v5.17.4
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab
