lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <cover.60659b549d8509ddecafad4f498ee7f03bb23c69.1664366292.git-series.apopple@nvidia.com>
Date:   Wed, 28 Sep 2022 22:01:14 +1000
From:   Alistair Popple <apopple@...dia.com>
To:     Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org
Cc:     Alistair Popple <apopple@...dia.com>, linux-kernel@...r.kernel.org,
        amd-gfx@...ts.freedesktop.org, nouveau@...ts.freedesktop.org,
        dri-devel@...ts.freedesktop.org
Subject: [PATCH v2 0/8] Fix several device private page reference counting issues

This series aims to fix a number of page reference counting issues in
drivers dealing with device private ZONE_DEVICE pages. These result in
use-after-free type bugs, either from accessing a struct page which no
longer exists because it has been removed or accessing fields within the
struct page which are no longer valid because the page has been freed.

During normal usage it is unlikely these will cause any problems. However
without these fixes it is possible to crash the kernel from userspace.
These crashes can be triggered either by unloading the kernel module or
unbinding the device from the driver prior to a userspace task exiting. In
modules such as Nouveau it is also possible to trigger some of these issues
by explicitly closing the device file-descriptor prior to the task exiting
and then accessing device private memory.

This involves some minor changes to both PowerPC and AMD GPU code.
Unfortunately I lack hardware to test either of those so any help there
would be appreciated. The changes mimic what is done in for both Nouveau
and hmm-tests though so I doubt they will cause problems.

To: Andrew Morton <akpm@...ux-foundation.org>
To: linux-mm@...ck.org
Cc: linux-kernel@...r.kernel.org
Cc: amd-gfx@...ts.freedesktop.org
Cc: nouveau@...ts.freedesktop.org
Cc: dri-devel@...ts.freedesktop.org

Alistair Popple (8):
  mm/memory.c: Fix race when faulting a device private page
  mm: Free device private pages have zero refcount
  mm/memremap.c: Take a pgmap reference on page allocation
  mm/migrate_device.c: Refactor migrate_vma and migrate_deivce_coherent_page()
  mm/migrate_device.c: Add migrate_device_range()
  nouveau/dmem: Refactor nouveau_dmem_fault_copy_one()
  nouveau/dmem: Evict device private memory during release
  hmm-tests: Add test for migrate_device_range()

 arch/powerpc/kvm/book3s_hv_uvmem.c       |  17 +-
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c |  19 +-
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.h |   2 +-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c     |  11 +-
 drivers/gpu/drm/nouveau/nouveau_dmem.c   | 108 +++++++----
 include/linux/memremap.h                 |   1 +-
 include/linux/migrate.h                  |  15 ++-
 lib/test_hmm.c                           | 129 ++++++++++---
 lib/test_hmm_uapi.h                      |   1 +-
 mm/memory.c                              |  16 +-
 mm/memremap.c                            |  30 ++-
 mm/migrate.c                             |  34 +--
 mm/migrate_device.c                      | 239 +++++++++++++++++-------
 mm/page_alloc.c                          |   8 +-
 tools/testing/selftests/vm/hmm-tests.c   |  49 +++++-
 15 files changed, 516 insertions(+), 163 deletions(-)

base-commit: 088b8aa537c2c767765f1c19b555f21ffe555786
-- 
git-series 0.9.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ