lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <C47A7C40-BE3E-4F0F-B854-D40D4795A236@vmware.com>
Date:   Thu, 9 Nov 2023 10:16:57 +0000
From:   Nadav Amit <namit@...are.com>
To:     Byungchul Park <byungchul@...com>
CC:     Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-mm <linux-mm@...ck.org>,
        "kernel_team@...ynix.com" <kernel_team@...ynix.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "ying.huang@...el.com" <ying.huang@...el.com>,
        "xhao@...ux.alibaba.com" <xhao@...ux.alibaba.com>,
        "mgorman@...hsingularity.net" <mgorman@...hsingularity.net>,
        "hughd@...gle.com" <hughd@...gle.com>,
        "willy@...radead.org" <willy@...radead.org>,
        "david@...hat.com" <david@...hat.com>,
        "peterz@...radead.org" <peterz@...radead.org>,
        Andy Lutomirski <luto@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "bp@...en8.de" <bp@...en8.de>,
        "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>
Subject: Re: [v3 2/3] mm: Defer TLB flush by keeping both src and dst folios
 at migration



> On Nov 8, 2023, at 6:12 AM, Byungchul Park <byungchul@...com> wrote:
> 
> !! External Email
> 
> On Mon, Oct 30, 2023 at 09:51:30PM +0900, Byungchul Park wrote:
>>>> diff --git a/mm/memory.c b/mm/memory.c
>>>> index 6c264d2f969c..75dc48b6e15f 100644
>>>> --- a/mm/memory.c
>>>> +++ b/mm/memory.c
>>>> @@ -3359,6 +3359,19 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
>>>>  if (vmf->page)
>>>>          folio = page_folio(vmf->page);
>>>> 
>>>> + /*
>>>> +  * This folio has its read copy to prevent inconsistency while
>>>> +  * deferring TLB flushes. However, the problem might arise if
>>>> +  * it's going to become writable.
>>>> +  *
>>>> +  * To prevent it, give up the deferring TLB flushes and perform
>>>> +  * TLB flush right away.
>>>> +  */
>>>> + if (folio && migrc_pending_folio(folio)) {
>>>> +         migrc_unpend_folio(folio);
>>>> +         migrc_try_flush_free_folios(NULL);
>>> 
>>> So many potential function calls… Probably they should have been combined
>>> into one and at least migrc_pending_folio() should have been an inline
>>> function in the header.
>> 
>> I will try to change it as you mention.
>> 
>>>> + }
>>>> +
>>> 
>>> What about mprotect? I thought David has changed it so it can set writable
>>> PTEs.
>> 
>> I will check it out.
> 
> I found mprotect stuff is already performing TLB flushes needed for it.
> So some redundant TLB flushes might happen by migrc but it's not that
> harmful I think. Thanks.

Let me explain the scenario I am concerned with. Assume page P is RO, and
moves from Psrc to Pdst. Pointer “p” points to P. Initially (*p == 0).

Let’s also assume we also have an atomic variable “a”. Initially (a == 0).

I hope I got the migration function names right, but I hope the problem
itself can be clear regardless. 

CPU0			CPU1			CPU2		CPU3
----			----			----		----
			(user-mode)		(user-mode)		

			Access *p
			[Psrc cached in TLB]
 
migrate_pages_batch()
-> migrate_folio_unmap()

[ PTE updated, 
  still no flush ]

								mprotect(p,
									RW)

								[ Psrc is
								  RW ]

								[ flush
								  deferred]


						*p = 1  # Pdst
						
						xchg(&a, 1)
			mfence
			if (a == 1)
			  assert(*p == 1);


				
Now at this point the assertion might fail. CPU2 wrote into Pdst, whereas
CPU1 reads from Psrc. But based on x86 memory model, userspace might not
expect this scenario to be possible, hence leading to bugs.






Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ