lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 30 May 2022 08:41:20 +0200
From:   Juergen Gross <jgross@...e.com>
To:     Demi Marie Obenour <demi@...isiblethingslab.com>,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        Stefano Stabellini <sstabellini@...nel.org>,
        Jennifer Herbert <jennifer.herbert@...rix.com>,
        Marek Marczykowski-Górecki 
        <marmarek@...isiblethingslab.com>
Cc:     xen-devel@...ts.xenproject.org, linux-kernel@...r.kernel.org,
        stable@...r.kernel.org
Subject: Re: [PATCH v2] xen/gntdev: Avoid blocking in unmap_grant_pages()

On 25.05.22 20:41, Demi Marie Obenour wrote:
> unmap_grant_pages() currently waits for the pages to no longer be used.
> In https://github.com/QubesOS/qubes-issues/issues/7481, this lead to a
> deadlock against i915: i915 was waiting for gntdev's MMU notifier to
> finish, while gntdev was waiting for i915 to free its pages.  I also
> believe this is responsible for various deadlocks I have experienced in
> the past.
> 
> Avoid these problems by making unmap_grant_pages async.  This requires
> making it return void, as any errors will not be available when the
> function returns.  Fortunately, the only use of the return value is a
> WARN_ON(), which can be replaced by a WARN_ON when the error is
> detected.  Additionally, a failed call will not prevent further calls
> from being made, but this is harmless.
> 
> Because unmap_grant_pages is now async, the grant handle will be sent to
> INVALID_GRANT_HANDLE too late to prevent multiple unmaps of the same
> handle.  Instead, a separate bool array is allocated for this purpose.
> This wastes memory, but stuffing this information in padding bytes is
> too fragile.  Furthermore, it is necessary to grab a reference to the
> map before making the asynchronous call, and release the reference when
> the call returns.

I think there is even more syncing needed:

- In the error path of gntdev_mmap() unmap_grant_pages() is being called and
   it is assumed, map is available afterwards again. This should be rather easy
   to avoid by adding a counter of active mappings to struct gntdev_grant_map
   (number of pages not being unmapped yet). In case this counter is not zero
   gntdev_mmap() should bail out early.

- gntdev_put_map() is calling unmap_grant_pages() in case the refcount has
   dropped to zero. This call can set the refcount to 1 again, so there is
   another delay needed before freeing map. I think unmap_grant_pages() should
   return in case the count of mapped pages is zero (see above), thus avoiding
   to increment the refcount of map if nothing is to be done. This would enable
   gntdev_put_map() to just return after the call of unmap_grant_pages() in case
   the refcount has been incremented again.


Juergen

Download attachment "OpenPGP_0xB0DE9DD628BF132F.asc" of type "application/pgp-keys" (3099 bytes)

Download attachment "OpenPGP_signature" of type "application/pgp-signature" (496 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ