linux-kernel - Re: [PATCH v3 1/3] s390/pci: Fix s390_mmio_read/write syscall page fault handling

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <b7d247a8759982f3fda64ad250151672452aed84.camel@linux.ibm.com>
Date: Tue, 11 Jun 2024 17:56:46 +0200
From: Niklas Schnelle <schnelle@...ux.ibm.com>
To: David Hildenbrand <david@...hat.com>,
        Gerald Schaefer
 <gerald.schaefer@...ux.ibm.com>,
        Heiko Carstens <hca@...ux.ibm.com>, Vasily
 Gorbik <gor@...ux.ibm.com>,
        Alexander Gordeev <agordeev@...ux.ibm.com>,
        Christian Borntraeger <borntraeger@...ux.ibm.com>,
        Sven Schnelle
 <svens@...ux.ibm.com>,
        Alex Williamson <alex.williamson@...hat.com>,
        Gerd
 Bayer <gbayer@...ux.ibm.com>,
        Matthew Rosato <mjrosato@...ux.ibm.com>,
        Jason Gunthorpe <jgg@...pe.ca>, Suren Baghdasaryan <surenb@...gle.com>
Cc: linux-s390@...r.kernel.org, linux-kernel@...r.kernel.org,
        kvm@...r.kernel.org
Subject: Re: [PATCH v3 1/3] s390/pci: Fix s390_mmio_read/write syscall page
 fault handling

On Tue, 2024-06-11 at 17:10 +0200, David Hildenbrand wrote:
> > > 
> > > which checks mmap_assert_write_locked().
> > > 
> > > Setting VMA flags would be racy with the mmap lock in read mode.
> > > 
> > > 
> > > remap_pfn_range() documents: "this is only safe if the mm semaphore is
> > > held when called." which doesn't spell out if it needs to be held in
> > > write mode (which I think it does) :)
> > 
> > Logically this makes sense to me. At the same time it looks like
> > fixup_user_fault() expects the caller to only hold mmap_read_lock() as
> > I do here. In there it even retakes mmap_read_lock(). But then wouldn't
> > any fault handling by its nature need to hold the write lock?
> 
> Well, if you're calling remap_pfn_range() right now the expectation is 
> that we hold it in write mode. :)
> 
> Staring at some random users, they all call it from mmap(), where you 
> hold the mmap lock in write mode.
> 
> 
> I wonder why we are not seeing that splat with vfio all of the time?
> 
> That mmap lock check was added "recently". In 1c71222e5f23 we started 
> using vm_flags_set(). That (including the mmap_assert_write_locked()) 
> check was added via bc292ab00f6c almost 1.5 years ago.
> 
> Maybe vfio is a bit special and was never really run with lockdep?
> 
> > 
> > > 
> > > 
> > > My best guess is: if you are using remap_pfn_range() from a fault
> > > handler (not during mmap time) you are doing something wrong, that's why
> > > you get that report.
> > 
> > @Alex: I guess so far the vfio_pci_mmap_fault() handler is only ever
> > triggered by "normal"/"actual" page faults where this isn't a problem?
> > Or could it be a problem there too?
> > 
> 
> I think we should see it there as well, unless I am missing something.
> 
> > > 
> > > vmf_insert_pfn() and friends might be better alternatives, that make
> > > sure that the VMA already received the proper VMA flags at mmap time.
> > > 
> 
> 
> There would be ways of silencing that check: for example, making sure at 
> mmap time that these flags are already set, and skipping modifications 
> if the flags are already set.
> 
> But, we'll run into more similar checks in x86 VM_PAT code, where we 
> would do vm_flags_set(vma, VM_PAT) from track_pfn_remap. Some of that 
> code really doesn't want to be called concurrently (e.g., "vma->vm_pgoff 
> = pfn;").
> 
> I thought that we silenced some of these warnings in the past using 
> __vm_flags_mod(). But it sounds hacky.
> 
> CCing Sureen.
> 

Before I forget it, thanks a lot for your incredible help David!