linux-kernel - Re: [PATCH 4/8] hugetlb: handle truncate racing with page faults

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <Yxiv0SkMkZ0JWGGp@monkey>
Date:   Wed, 7 Sep 2022 07:50:57 -0700
From:   Mike Kravetz <mike.kravetz@...cle.com>
To:     Sven Schnelle <svens@...ux.ibm.com>
Cc:     Miaohe Lin <linmiaohe@...wei.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org,
        Muchun Song <songmuchun@...edance.com>,
        David Hildenbrand <david@...hat.com>,
        Michal Hocko <mhocko@...e.com>, Peter Xu <peterx@...hat.com>,
        Naoya Horiguchi <naoya.horiguchi@...ux.dev>,
        "Aneesh Kumar K . V" <aneesh.kumar@...ux.vnet.ibm.com>,
        Andrea Arcangeli <aarcange@...hat.com>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        Davidlohr Bueso <dave@...olabs.net>,
        Prakash Sangappa <prakash.sangappa@...cle.com>,
        James Houghton <jthoughton@...gle.com>,
        Mina Almasry <almasrymina@...gle.com>,
        Pasha Tatashin <pasha.tatashin@...een.com>,
        Axel Rasmussen <axelrasmussen@...gle.com>,
        Ray Fucillo <Ray.Fucillo@...ersystems.com>,
        linux-s390@...r.kernel.org, hca@...ux.ibm.com, gor@...ux.ibm.com,
        Alexander Gordeev <agordeev@...ux.ibm.com>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH 4/8] hugetlb: handle truncate racing with page faults

On 09/07/22 10:22, Sven Schnelle wrote:
> Mike Kravetz <mike.kravetz@...cle.com> writes:
> 
> > Would you be willing to try the patch below in your environment?
> > It addresses the stall I can create with a file that has a VERY large hole.
> > In addition, it passes libhugetlbfs tests and has run for a while in my
> > truncate/page fault race stress test.  However, it is very early code.
> > It would be nice to see if it addresses the issue in your environment.
> 
> Yes, that fixes the issue for me. I added some debugging yesterday
> evening after sending the initial report, and the end value in the loop
> was indeed quite large - i didn't record the exact number, but it was
> something like 0xffffffffff800001. Feel free to add my Tested-by.
> 

Thank you!

When thinking about this some more, the new vma_lock introduced by this series
may address truncation/fault races without the need of involving the fault
mutex.

How?

Before truncating or hole punching, we need to unmap all users of that range.
To unmap, we need to acquire the vma_lock for each vma mapping the file.  This
same lock is acquired in the page fault path.  As such, it provides the same
type of synchronization around i_size as provided by the fault mutex in this
patch.  So, I think we can make the code much simpler (and faster) by removing
the code taking the fault mutex for holes in files.  Of course, this can not
happen until the vma_lock is actually put into use which is done in the last
patch of this series.
-- 
Mike Kravetz