lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20201026233150.371577-1-mike.kravetz@oracle.com>
Date:   Mon, 26 Oct 2020 16:31:46 -0700
From:   Mike Kravetz <mike.kravetz@...cle.com>
To:     linux-mm@...ck.org, linux-kernel@...r.kernel.org
Cc:     Hugh Dickins <hughd@...gle.com>, Michal Hocko <mhocko@...nel.org>,
        Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
        "Aneesh Kumar K . V" <aneesh.kumar@...ux.vnet.ibm.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        Davidlohr Bueso <dave@...olabs.net>,
        Prakash Sangappa <prakash.sangappa@...cle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Mike Kravetz <mike.kravetz@...cle.com>
Subject: [RFC PATCH v2 0/4] hugetlbfs: introduce hinode_rwsem for pmd sharing synchronization

In commit c0d0381ade79, changes were made to use i_mmap_rwsem for pmd
sharing synchronization.  This required changes to mm locking order that
are hugetlb specific.  Specifically, i_mmap_rwsem must be taken before
the page lock.  This is not not a huge issue in hugetlb specific code,
but becomes more problematic in the areas of page migration and memory
failure where generic mm code had to deal with this change to lock
ordering.  An ugly routine 'hugetlb_page_mapping_lock_write' was added
to help with these issues.

Recently, Hugh Dickins diagnosed a migration BUG as caused by code
introduced with hugetlb i_mmap_rwsem synchronization [1].  Subsequent
discussion in that thread pointed out additional problems in the code.

Adding a rw_semaphore to the hugetlbfs inode for this type of synchronization
was mentioned.  Such an approach is actually 'cleaner' as it can be
inserted in the lock hierarchy where needed.  And, there is no issue
with other parts of the mm using this rw_semaphore.

This series adds a rw_semaphore (hinode_rwsem) to the hugetlbfs inode.

The first patch reverts all commits having to deal with the current use
of i_mmap_rwsem for pmd sharing and fault/truncate synchronization.  The
revert of 5 commits was combined into a single patch.  I am looking for
feedback on this approach.  I considered:
- 5 Patches to revert the 5 commits
- Reverting patches depending on c0d0381ade79, then having a patch to
  change from i_mmap_rwsem to hinode_rwsem.
To me, a 'clean slate' approach seemed best but I am open to whatever
would be easiest to review.

Changes in RFC v2
  - Added missing locking pointed out by Naoya Horiguchi
  - Cleaned up some comments as suggested by Naoya Horiguchi
  - Cleaned up and documented hinode_lock_read() helper and added
    hinode_lock_write() helper.
  - Split out addition of hinode_rwsem and helper routines to a separate
    patch.

[1] https://lore.kernel.org/linux-mm/alpine.LSU.2.11.2010071833100.2214@eggly.anvils/

Mike Kravetz (4):
  hugetlbfs: revert use of i_mmap_rwsem for pmd sharing and more sync
  hugetlbfs: add hinode_rwsem to hugetlb specific inode
  hugetlbfs: use hinode_rwsem for pmd sharing synchronization
  huegtlbfs: handle page fault/truncate races

 fs/hugetlbfs/inode.c    |  87 +++++++------
 include/linux/fs.h      |  15 ---
 include/linux/hugetlb.h | 135 ++++++++++++++++++--
 mm/hugetlb.c            | 267 ++++++++++++++++------------------------
 mm/memory-failure.c     |  34 ++---
 mm/memory.c             |   5 +
 mm/migrate.c            |  34 +++--
 mm/rmap.c               |  17 +--
 mm/userfaultfd.c        |  19 +--
 9 files changed, 322 insertions(+), 291 deletions(-)

-- 
2.25.4

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ