lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 27 Nov 2023 21:42:17 +1300
From:   Barry Song <21cnbao@...il.com>
To:     david@...hat.com
Cc:     akpm@...ux-foundation.org, andreyknvl@...il.com,
        anshuman.khandual@....com, ardb@...nel.org,
        catalin.marinas@....com, dvyukov@...gle.com, glider@...gle.com,
        james.morse@....com, jhubbard@...dia.com,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, mark.rutland@....com, maz@...nel.org,
        oliver.upton@...ux.dev, ryabinin.a.a@...il.com,
        ryan.roberts@....com, suzuki.poulose@....com,
        vincenzo.frascino@....com, wangkefeng.wang@...wei.com,
        will@...nel.org, willy@...radead.org, yuzenghui@...wei.com,
        yuzhao@...gle.com, ziy@...dia.com
Subject: Re: Re: [PATCH v2 01/14] mm: Batch-copy PTE ranges during fork()

>> +		for (i = 0; i < nr; i++, page++) {
>> +			if (anon) {
>> +				/*
>> +				 * If this page may have been pinned by the
>> +				 * parent process, copy the page immediately for
>> +				 * the child so that we'll always guarantee the
>> +				 * pinned page won't be randomly replaced in the
>> +				 * future.
>> +				 */
>> +				if (unlikely(page_try_dup_anon_rmap(
>> +						page, false, src_vma))) {
>> +					if (i != 0)
>> +						break;
>> +					/* Page may be pinned, we have to copy. */
>> +					return copy_present_page(
>> +						dst_vma, src_vma, dst_pte,
>> +						src_pte, addr, rss, prealloc,
>> +						page);
>> +				}
>> +				rss[MM_ANONPAGES]++;
>> +				VM_BUG_ON(PageAnonExclusive(page));
>> +			} else {
>> +				page_dup_file_rmap(page, false);
>> +				rss[mm_counter_file(page)]++;
>> +			}
>>   		}
>> -		rss[MM_ANONPAGES]++;
>> -	} else if (page) {
>> -		folio_get(folio);
>> -		page_dup_file_rmap(page, false);
>> -		rss[mm_counter_file(page)]++;
>> +
>> +		nr = i;
>> +		folio_ref_add(folio, nr);
> 
> You're changing the order of mapcount vs. refcount increment. Don't. 
> Make sure your refcount >= mapcount.
> 
> You can do that easily by doing the folio_ref_add(folio, nr) first and 
> then decrementing in case of error accordingly. Errors due to pinned 
> pages are the corner case.
> 
> I'll note that it will make a lot of sense to have batch variants of 
> page_try_dup_anon_rmap() and page_dup_file_rmap().
> 

i still don't understand why it is not a entire map+1, but an increment
in each basepage.

as long as it is a CONTPTE large folio, there is no much difference with
PMD-mapped large folio. it has all the chance to be DoubleMap and need
split.

When A and B share a CONTPTE large folio, we do madvise(DONTNEED) or any
similar things on a part of the large folio in process A,

this large folio will have partially mapped subpage in A (all CONTPE bits
in all subpages need to be removed though we only unmap a part of the
large folioas HW requires consistent CONTPTEs); and it has entire map in
process B(all PTEs are still CONPTES in process B).

isn't it more sensible for this large folios to have entire_map = 0(for
process B), and subpages which are still mapped in process A has map_count
=0? (start from -1).

> Especially, the batch variant of page_try_dup_anon_rmap() would only 
> check once if the folio maybe pinned, and in that case, you can simply 
> drop all references again. So you either have all or no ptes to process, 
> which makes that code easier.
> 
> But that can be added on top, and I'll happily do that.
> 
> -- 
> Cheers,
> 
> David / dhildenb

Thanks
Barry

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ