Message-ID: <a3a8a428-17d3-e3cb-913c-b44de12db9e4@intel.com>
Date:   Fri, 13 Mar 2020 09:59:50 -0700
From:   Dave Hansen <dave.hansen@...el.com>
To:     Minchan Kim <minchan@...nel.org>
Cc:     Michal Hocko <mhocko@...nel.org>, Jann Horn <jannh@...gle.com>,
        Linux-MM <linux-mm@...ck.org>,
        kernel list <linux-kernel@...r.kernel.org>,
        Daniel Colascione <dancol@...gle.com>,
        "Joel Fernandes (Google)" <joel@...lfernandes.org>,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: interaction of MADV_PAGEOUT with CoW anonymous mappings?

On 3/12/20 7:00 PM, Minchan Kim wrote:
> On Thu, Mar 12, 2020 at 02:41:07PM -0700, Dave Hansen wrote:
>> One other fun thing.  I have a "victim" thread sitting in a loop doing:
>>
>> 	sleep(1)
>> 	memcpy(&garbage, buffer, sz);
>>
>> The "attacker" is doing
>>
>> 	madvise(buffer, sz, MADV_PAGEOUT);
>>
>> in a loop.  That, oddly enough, doesn't cause the victim to page fault.
>> But, if I do:
>>
>> 	memcpy(&garbage, buffer, sz);
>> 	madvise(buffer, sz, MADV_PAGEOUT);
>>
>> It *does* cause the memory to get paged out.  The MADV_PAGEOUT code
>> actually has a !pte_present() check.  It will punt on a PTE if it sees
>> it.  In other words, if a page is in the swap cache but not mapped by a
>> pte_present() PTE, MADV_PAGEOUT won't touch it.
>>
>> Shouldn't MADV_PAGEOUT be able to find and reclaim those pages?  Patch
>> attached.
> 
>>
>>
>> ---
>>
>>  b/mm/madvise.c |   38 +++++++++++++++++++++++++++++++-------
>>  1 file changed, 31 insertions(+), 7 deletions(-)
>>
>> diff -puN mm/madvise.c~madv-pageout-find-swap-cache mm/madvise.c
>> --- a/mm/madvise.c~madv-pageout-find-swap-cache	2020-03-12 14:24:45.178775035 -0700
>> +++ b/mm/madvise.c	2020-03-12 14:35:49.706773378 -0700
>> @@ -248,6 +248,36 @@ static void force_shm_swapin_readahead(s
>>  #endif		/* CONFIG_SWAP */
>>  
>>  /*
>> + * Given a PTE, find the corresponding 'struct page'.  Also handles
>> + * non-present swap PTEs.
>> + */
>> +struct page *pte_to_reclaim_page(struct vm_area_struct *vma,
>> +				 unsigned long addr, pte_t ptent)
>> +{
>> +	swp_entry_t entry;
>> +
>> +	/* Totally empty PTE: */
>> +	if (pte_none(ptent))
>> +		return NULL;
>> +
>> +	/* A normal, present page is mapped: */
>> +	if (pte_present(ptent))
>> +		return vm_normal_page(vma, addr, ptent);
>> +
> 
> Please check is_swap_pte first.

Why?

is_swap_pte() duplicates the first two checks.  But, I need an explicit
pte_present() check somewhere because I need to call vm_normal_page()
only on present PTEs.

I guess the pte_present() check could be:

	if (!is_swap_pte(ptent))
		return vm_normal_page(...);

*after* the pte_none() check.
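
To make the ordering concrete, here is a rough userspace model of that
check sequence.  It is only a sketch: the enums and the model_*() names
are made up for illustration and are not kernel code.  It assumes
is_swap_pte() means "neither pte_none() nor pte_present()", which is why
it duplicates the first two checks.

```c
/* Hypothetical userspace stand-ins for the kernel's PTE states. */
enum pte_kind {
	PTE_NONE,	/* pte_none(): totally empty */
	PTE_PRESENT,	/* pte_present(): a page is mapped */
	PTE_SWAP,	/* a true swap entry */
	PTE_NON_SWAP,	/* a "swap PTE" that is not really swap */
};

/* What the real function would go on to do for each kind of PTE. */
enum lookup {
	LOOKUP_NOTHING,		/* return NULL */
	LOOKUP_NORMAL_PAGE,	/* where vm_normal_page() would run */
	LOOKUP_SWAP_CACHE,	/* where lookup_swap_cache() would run */
};

/* Assumed model of is_swap_pte(): neither empty nor present. */
static int model_is_swap_pte(enum pte_kind k)
{
	return k != PTE_NONE && k != PTE_PRESENT;
}

static enum lookup model_pte_to_reclaim_page(enum pte_kind k)
{
	/* Totally empty PTE: */
	if (k == PTE_NONE)
		return LOOKUP_NOTHING;

	/* Not empty and not swap, so it must be present: */
	if (!model_is_swap_pte(k))
		return LOOKUP_NORMAL_PAGE;

	/* One of the "swap PTEs" that's not really swap: */
	if (k == PTE_NON_SWAP)
		return LOOKUP_NOTHING;

	/* A true swap entry; the page may be in the swap cache: */
	return LOOKUP_SWAP_CACHE;
}
```

Either way, vm_normal_page() is only ever reached for present PTEs.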

>> +	entry = pte_to_swp_entry(ptent);
>> +	/* Is it one of the "swap PTEs" that's not really swap? */
>> +	if (non_swap_entry(entry))
>> +		return NULL;
>> +
>> +	/*
>> +	 * The PTE was a true swap entry.  The page may be in the
>> +	 * swap cache.  If so, find it and return it so it may be
>> +	 * reclaimed.
>> +	 */
>> +	return lookup_swap_cache(entry, vma, addr);
> 
> If we go with handling only exclusively owned pages for anon,
> I think we should apply the rule to swap cache, too.

I'm going back and forth on it.  If we're just trying to avoid causing
faults in other processes, we could add a mapcount>0 check here in
addition to the mapcount>1 checks that were added in the other patch.

But, if we want a check for true exclusivity: no other swap entries or
mappings, we need to check swap_count() too.  It's getting quite a bit
uglier as I add that in, but I guess we'll see how it looks in the end.
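
Roughly, the two policies differ like this.  This is a userspace sketch
only: struct page_refs and both helpers are hypothetical, and the exact
counts are my assumption.  A page reached through a present PTE already
has one mapping that is ours, while a page reached through a swap PTE
already has one swap reference that is ours.

```c
#include <stdbool.h>

/* Hypothetical model of the references to one anonymous page. */
struct page_refs {
	int mapcount;	/* present PTEs mapping the page */
	int swap_count;	/* swap PTEs referring to its swap slot */
};

/* Weaker rule: skip the page only if reclaim would fault someone else. */
static bool skip_to_avoid_faults(struct page_refs r, bool via_present_pte)
{
	/* Our own present PTE accounts for one mapping. */
	if (via_present_pte)
		return r.mapcount > 1;

	/* We hold only a swap PTE, so any mapping belongs to someone else. */
	return r.mapcount > 0;
}

/* Stricter rule: no other mappings and no other swap entries. */
static bool is_exclusive(struct page_refs r, bool via_present_pte)
{
	if (via_present_pte)
		return r.mapcount == 1 && r.swap_count == 0;

	return r.mapcount == 0 && r.swap_count == 1;
}
```

The second helper is where the extra swap_count() check would come in.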

> Do you mind posting it as a formal patch?

Yeah, I'll send something out.
