lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20140402151825.GA3614@gmail.com>
Date:	Wed, 2 Apr 2014 11:18:27 -0400
From:	Jerome Glisse <j.glisse@...il.com>
To:	Haggai Eran <haggaie@...lanox.com>
Cc:	Andrea Arcangeli <aarcange@...hat.com>,
	Mike Rapoport <mike.rapoport@...ellosystems.com>,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	Izik Eidus <izik.eidus@...ellosystems.com>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Or Gerlitz <ogerlitz@...lanox.com>,
	Sagi Grimberg <sagig@...lanox.com>,
	Shachar Raindel <raindel@...lanox.com>
Subject: Re: [PATCH] mm/mmu_notifier: restore set_pte_at_notify semantics

On Wed, Apr 02, 2014 at 03:52:45PM +0300, Haggai Eran wrote:
> On 03/30/2014 11:33 PM, Jerome Glisse wrote:
> >On Wed, Jan 22, 2014 at 04:01:15PM +0200, Haggai Eran wrote:
> >>I'm worried about the following scenario:
> >>
> >>Given a read-only page, suppose one host thread (thread 1) writes to
> >>that page, and performs COW, but before it calls the
> >>mmu_notifier_invalidate_page_if_missing_change_pte function another host
> >>thread (thread 2) writes to the same page (this time without a page
> >>fault). Then we have a valid entry in the secondary page table to a
> >>stale page, and someone (thread 3) may read stale data from there.
> >>
> >>Here's a diagram that shows this scenario:
> >>
> >>Thread 1                                | Thread 2        | Thread 3
> >>========================================================================
> >>do_wp_page(page 1)                      |                 |
> >>   ...                                   |                 |
> >>   set_pte_at_notify                     |                 |
> >>   ...                                   | write to page 1 |
> >>                                         |                 | stale access
> >>   pte_unmap_unlock                      |                 |
> >>   invalidate_page_if_missing_change_pte |                 |
> >>
> >>This is currently prevented by the use of the range start and range end
> >>notifiers.
> >>
> >>Do you agree that this scenario is possible with the new patch, or am I
> >>missing something?
> >>
> >I believe you are right, but of all the upstream user of the mmu_notifier
> >API only xen would suffer from this ie any user that do not have a proper
> >change_pte callback can see the bogus scenario you describe above.
> Yes. I sent our RDMA paging RFC patch-set on linux-rdma [1] last
> month, and it would also suffer from this scenario, but it's not
> upstream yet.
> >The issue i see is with user that want to/or might sleep when they are
> >invalidation the secondary page table. The issue being that change_pte is
> >call with the cpu page table locked (well at least for the affected pmd).
> >
> >I would rather keep the invalidate_range_start/end bracket around change_pte
> >and invalidate page. I think we can fix the kvm regression by other means.
> Perhaps another possibility would be to do the
> invalidate_range_start/end bracket only when the mmu_notifier is
> missing a change_pte implementation.

This would imply either to scan all mmu_notifier currently register or to
have a global flags for the mm to know if there is one mmu_notifier without
change_pte. Moreover this would means that kvm would remain "broken" if one
of the mmu notifier do not have the change_pte callback.

Solution i have in mind and is part of a patchset i am working on, just
involve passing along an enum value to mmu notifier callback. The enum
value would tell what are the exact event that actually triggered the
mmu notifier call (vmscan, migrate, ksm, ...). Knowing this kvm could then
simply ignore invalidate_range_start/end for event it knows it will get
a change_pte callback.

Cheers,
Jérôme

> 
> Best regards,
> Haggai
> 
> [1] http://www.spinics.net/lists/linux-rdma/msg18906.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ