Date:   Mon,  6 Jul 2020 13:26:12 -0700
From:   Mike Kravetz <mike.kravetz@...cle.com>
To:     linux-mm@...ck.org, linux-kernel@...r.kernel.org
Cc:     Michal Hocko <mhocko@...nel.org>, Hugh Dickins <hughd@...gle.com>,
        Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
        "Aneesh Kumar K . V" <aneesh.kumar@...ux.vnet.ibm.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        Davidlohr Bueso <dave@...olabs.net>,
        Prakash Sangappa <prakash.sangappa@...cle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Mike Kravetz <mike.kravetz@...cle.com>
Subject: [RFC PATCH 0/3] hugetlbfs: address fault time regression

Commits c0d0381ade79 and 87bf91d39bb5 changed the way hugetlb locking
was performed to address BUGs.  One specific change was to always take
the i_mmap_rwsem in read mode during fault processing.  One result of
this change was a 33% regression for anon non-shared page faults [1].

Technically, i_mmap_rwsem only needs to be taken during page faults
if the pmd can potentially be shared.  pmd sharing is not possible for
anon non-shared mappings (as in the reported regression), therefore the
code can be modified to not acquire the semaphore in this case.
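
To illustrate the idea above, here is a rough userspace sketch, not code
from the patches.  Every toy_* name, the flag value, and the 1GB region
size are assumptions made up for the example.  The sharing test is a
simplified guess at the conditions for pmd sharing: the mapping may be
shared and it covers the whole aligned region around the faulting address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are kernel definitions. */
#define TOY_MAYSHARE 0x1u
#define TOY_PUD_SIZE (UINT64_C(1) << 30)   /* assumed 1GB sharing unit */

struct toy_vma {
	uint64_t start;
	uint64_t end;
	unsigned int flags;
};

/* A counter-based toy lock so the example can show when it was taken. */
struct toy_rwsem {
	int readers;
	unsigned long read_acquisitions;
};

static void toy_down_read(struct toy_rwsem *sem)
{
	sem->readers++;
	sem->read_acquisitions++;
}

static void toy_up_read(struct toy_rwsem *sem)
{
	assert(sem->readers > 0);
	sem->readers--;
}

/* Sharing is only possible for a shareable mapping that spans the
 * whole aligned region containing addr. */
static bool toy_pmd_may_be_shared(const struct toy_vma *vma, uint64_t addr)
{
	uint64_t base = addr & ~(TOY_PUD_SIZE - 1);
	uint64_t end = base + TOY_PUD_SIZE;

	return (vma->flags & TOY_MAYSHARE) &&
	       vma->start <= base && end <= vma->end;
}

/* Take the semaphore in read mode only when sharing is possible.
 * Returns true if the lock was taken for this fault. */
static bool toy_fault_maybe_locked(const struct toy_vma *vma,
				   struct toy_rwsem *sem, uint64_t addr)
{
	bool locked = toy_pmd_may_be_shared(vma, addr);

	if (locked)
		toy_down_read(sem);
	/* ... the actual fault handling would happen here ... */
	if (locked)
		toy_up_read(sem);
	return locked;
}
```

In this sketch an anon non-shared mapping never touches the semaphore,
which is the case the reported regression exercised.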

Unfortunately, commit 87bf91d39bb5 depends on i_mmap_rwsem always being
taken in the fault path to prevent fault/truncation races.  So, that
approach is no longer appropriate.  Rather, the code now detects races
and backs out operations.
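
As a rough illustration of "detect the race and back out", here is a
single-threaded userspace sketch.  It is not the code in these patches,
and all toy_* names are hypothetical.  The racing truncate is simulated
with a callback that runs between the initial size check and the point
where the page is added, so the truncate misses the new page and the
fault path has to notice and undo its own work.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 8

/* Hypothetical toy inode: size in huge pages, a tiny "page cache",
 * and a count of pages charged to the file. */
struct toy_inode {
	size_t size_pages;
	bool cached[TOY_NPAGES];
	size_t reserved;
};

/* Shrink the file and drop any cached pages past the new size. */
static void toy_truncate(struct toy_inode *ino, size_t new_size)
{
	ino->size_pages = new_size;
	for (size_t i = new_size; i < TOY_NPAGES; i++) {
		if (ino->cached[i]) {
			ino->cached[i] = false;
			ino->reserved--;
		}
	}
}

/* Example racer: truncate the whole file. */
static void toy_truncate_to_zero(struct toy_inode *ino)
{
	toy_truncate(ino, 0);
}

/* Fault in page idx without holding a lock against truncate.
 * 'racer' (may be NULL) simulates a truncate running in the window.
 * Returns true if the page stays installed, false otherwise. */
static bool toy_fault_with_backout(struct toy_inode *ino, size_t idx,
				   void (*racer)(struct toy_inode *))
{
	if (idx >= TOY_NPAGES || idx >= ino->size_pages)
		return false;

	if (racer)
		racer(ino);		/* truncate runs, sees no page yet */

	ino->cached[idx] = true;	/* add the page and charge it */
	ino->reserved++;

	/* Re-check the size: if truncate won the race, undo our work. */
	if (idx >= ino->size_pages) {
		ino->cached[idx] = false;
		ino->reserved--;
		return false;
	}
	return true;
}
```

The point of the sketch is only the ordering: the re-check after the page
is added is what lets the fault path run without the semaphore.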

This code "works" in that it only takes i_mmap_rwsem when necessary and
addresses the original BUGs.  However, I am sending as an RFC because:
- I am unsure if the added complexity is worth the performance benefit.
- There needs to be a better way/location to make a decision about taking
  the semaphore.  See the FIXMEs in the code.

Comments and suggestions would be appreciated.

[1] https://lore.kernel.org/lkml/20200622005551.GK5535@shao2-debian

Mike Kravetz (3):
  Revert: "hugetlbfs: Use i_mmap_rwsem to address page fault/truncate
    race"
  hugetlbfs: Only take i_mmap_rwsem when sharing is possible
  huegtlbfs: handle page fault/truncate races

 fs/hugetlbfs/inode.c |  69 +++++++++-----------
 mm/hugetlb.c         | 150 ++++++++++++++++++++++++++++++-------------
 2 files changed, 137 insertions(+), 82 deletions(-)

-- 
2.25.4
