Message-ID: <20190107193718.GB6310@bombadil.infradead.org>
Date:   Mon, 7 Jan 2019 11:37:18 -0800
From:   Matthew Wilcox <willy@...radead.org>
To:     Dave Hansen <dave.hansen@...el.com>
Cc:     Vincent Whitchurch <vincent.whitchurch@...s.com>,
        akpm@...ux-foundation.org, viro@...iv.linux.org.uk,
        linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, mcgrof@...nel.org,
        keescook@...omium.org, corbet@....net, linux-doc@...r.kernel.org,
        Vincent Whitchurch <rabinv@...s.com>
Subject: Re: [PATCH] drop_caches: Allow unmapping pages

On Mon, Jan 07, 2019 at 11:25:16AM -0800, Dave Hansen wrote:
> On 1/7/19 6:15 AM, Matthew Wilcox wrote:
> > You're going to get data corruption doing this.  try_to_unmap_one()
> > does:
> > 
> >          /* Move the dirty bit to the page. Now the pte is gone. */
> >          if (pte_dirty(pteval))
> >                  set_page_dirty(page);
> > 
> > so PageDirty() can be false above, but made true by calling
> > try_to_unmap().
> 
> I don't think that PageDirty() check is _required_ for correctness.  You
> can always safely try_to_unmap() no matter the state of the PTE.  We
> can't lock out the hardware from setting the Dirty bit, ever.
> 
> It's also just fine to unmap PageDirty() pages, as long as when the PTE
> is torn down, we move the dirty bit from the PTE into the 'struct page'
> (which try_to_unmap() does, as you noticed).

Right, but the very next thing the patch does is call
invalidate_complete_page(), which calls __remove_mapping() which ... oh,
re-checks PageDirty() and refuses to drop the page.  So this isn't the
data corruptor that I had thought it was.
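
For anyone following along, the re-check inside __remove_mapping()
(mm/vmscan.c) looks roughly like this (simplified from memory; locking
and error paths trimmed):

	if (!page_ref_freeze(page, refcount))
		goto cannot_free;
	/* A dirty bit that try_to_unmap() moved over is visible here. */
	if (unlikely(PageDirty(page))) {
		page_ref_unfreeze(page, refcount);
		goto cannot_free;	/* page stays in the page cache */
	}

so a page that gets dirtied through a PTE after the caller's PageDirty()
check still won't be dropped.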

> > I also think the way you've done this is expedient at the cost of
> > efficiency and layering violations.  I think you should first tear
> > down the mappings of userspace processes (which will reclaim a lot
> > of pages allocated to page tables), then you won't need to touch the
> > invalidate_inode_pages paths at all.
> 
> By "tear down the mappings", do you mean something analogous to munmap()
> where the VMA goes away?  Or madvise(MADV_DONTNEED) where the PTE is
> destroyed but the VMA remains?
> 
> Last time I checked, we only did free_pgtables() when tearing down VMAs,
> but not for pure unmappings like reclaim or MADV_DONTNEED.  I've thought
> it might be fun to make a shrinker that scanned page tables looking for
> zero'd pages, but I've never run into a system where empty page table
> pages were actually causing a noticeable problem.
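
For reference, the split described above is roughly this (a sketch from
memory; the exact call sites may have shifted):

	/* madvise(MADV_DONTNEED): PTEs go, page table pages stay */
	zap_page_range(vma, start, end - start);

	/*
	 * munmap()/exit: unmap_region()/exit_mmap() also reap the
	 * now-empty page table pages after clearing the PTEs.
	 */
	unmap_vmas(&tlb, vma, start, end);
	free_pgtables(&tlb, vma, floor, ceiling);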

A few hours ago, when I thought this patch had the ordering of the
PageDirty() check the wrong way round, I had the madvise analogy in
mind: destroy the PTEs and transfer the dirty information to the
struct page first, before trying to drop the pages.
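
In other words, something along these lines per address_space
(illustrative only, untested):

	/*
	 * Tear down userspace PTEs for this file first.  The zap path
	 * transfers pte_dirty() into the struct page via set_page_dirty(),
	 * so the dirty information is in place before we try to drop
	 * anything.  holebegin == holelen == 0 covers the whole file.
	 */
	unmap_mapping_range(mapping, 0, 0, 1);

	/* Now drop whatever is clean; dirty pages are simply skipped. */
	invalidate_mapping_pages(mapping, 0, -1);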
