Message-ID: <aa34f8da-a592-48b2-9e86-32a9433d55ce@suse.cz>
Date: Fri, 18 Jan 2019 09:57:38 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: Mel Gorman <mgorman@...hsingularity.net>
Cc: Linux-MM <linux-mm@...ck.org>,
David Rientjes <rientjes@...gle.com>,
Andrea Arcangeli <aarcange@...hat.com>, ying.huang@...el.com,
kirill@...temov.name, Andrew Morton <akpm@...ux-foundation.org>,
Linux List Kernel Mailing <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 15/25] mm, compaction: Finish pageblock scanning on
contention
On 1/17/19 6:11 PM, Mel Gorman wrote:
> On Thu, Jan 17, 2019 at 05:38:36PM +0100, Vlastimil Babka wrote:
>> > rate but also by the fact that the scanners do not meet for longer when
>> > pageblocks are actually used. Overall this is justified and completing
>> > a pageblock scan is very important for later patches.
>> >
>> > Signed-off-by: Mel Gorman <mgorman@...hsingularity.net>
>>
>> Acked-by: Vlastimil Babka <vbabka@...e.cz>
>>
>> Some comments below.
>>
>
> Thanks
>
>> > @@ -538,18 +535,8 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
>> > * recheck as well.
>> > */
>> > if (!locked) {
>> > - /*
>> > - * The zone lock must be held to isolate freepages.
>> > - * Unfortunately this is a very coarse lock and can be
>> > - * heavily contended if there are parallel allocations
>> > - * or parallel compactions. For async compaction do not
>> > - * spin on the lock and we acquire the lock as late as
>> > - * possible.
>> > - */
>> > - locked = compact_trylock_irqsave(&cc->zone->lock,
>> > + locked = compact_lock_irqsave(&cc->zone->lock,
>> > &flags, cc);
>> > - if (!locked)
>> > - break;
>>
>> Seems a bit dangerous to keep compact_lock_irqsave() returning bool when it
>> now always returns true, while also removing the safety checks that test the
>> result. It would be easy for somebody in the future to reintroduce a 'return
>> false' condition (even though the name now says lock and not trylock) and
>> start crashing. I would either change it to return void, or leave the checks
>> in place.
>>
>
> I considered changing it from bool at the same time as "Rework
> compact_should_abort as compact_check_resched". It turned out to be a
> bit clumsy because the locked state must then be explicitly updated in
> the caller. e.g.
>
> locked = compact_lock_irqsave(...)
>
> becomes
>
> compact_lock_irqsave(...)
> locked = true
>
> I didn't think the result looked that great to be honest but maybe it's
> worth revisiting as a cleanup patch like "Rework compact_should_abort as
> compact_check_resched" on top.
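For concreteness, the two calling conventions under discussion can be put side by side with a stub lock. This is a sketch only, not the kernel code: struct stub_lock and demo() are invented stand-ins for the zone spinlock and the call site in isolate_freepages_block().

```c
#include <stdbool.h>

/* Sketch, not kernel code: a stub "zone lock" so the two calling
 * conventions can be compared. */
struct stub_lock { int held; };

/* Shape after the patch as merged: the function still returns bool,
 * but it can now only ever return true -- the value is vestigial. */
static bool compact_lock_sketch(struct stub_lock *lock)
{
	lock->held = 1;		/* stands in for spin_lock_irqsave() */
	return true;
}

/* Suggested cleanup: return void; the caller must then update
 * 'locked' itself, which is the clumsiness Mel describes. */
static void compact_lock_void_sketch(struct stub_lock *lock)
{
	lock->held = 1;
}

static bool demo(void)
{
	struct stub_lock zone_lock = { 0 };
	bool locked = false;

	/* as merged: one line at the call site */
	locked = compact_lock_sketch(&zone_lock);

	/* after the cleanup: two lines at the call site */
	locked = false;
	compact_lock_void_sketch(&zone_lock);
	locked = true;		/* explicit update in the caller */

	return locked && zone_lock.held;
}
```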
>
>> >
>> > @@ -1411,12 +1395,8 @@ static void isolate_freepages(struct compact_control *cc)
>> > isolate_freepages_block(cc, &isolate_start_pfn, block_end_pfn,
>> > freelist, false);
>> >
>> > - /*
>> > - * If we isolated enough freepages, or aborted due to lock
>> > - * contention, terminate.
>> > - */
>> > - if ((cc->nr_freepages >= cc->nr_migratepages)
>> > - || cc->contended) {
>>
>> Does it really make sense to continue in the case of the free scanner, when
>> we know we will just return the extra pages in the end? release_freepages()
>> will update the cached pfns, but the pageblock skip bit will stay set, so we
>> just leave those pages behind. Unless finishing the block is important for
>> the later patches (as the changelog mentions) even in the case of the free
>> scanner; but then we could just skip the rest of it, as actually scanning it
>> can't really help anything?
>>
>
> That finishing is important for later patches is one factor, but not the
> only one. While we eventually return all pages, we do not know at this
> point in time how many free pages are needed. Remember the migration
> source isolates COMPACT_CLUSTER_MAX pages and then looks for migration
> targets. If the source isolates 32 pages, the free scanner might isolate
> more from one pageblock, but that's ok as the migration source may need
> more free pages in the immediate future. It's less wasteful than it looks at first
> glance (or second or even third glance).
>
> However, if we isolated exactly enough targets, and the pageblock gets
> marked skipped, then each COMPACT_CLUSTER_MAX isolation from the target
> could potentially mark one new pageblock unnecessarily and increase
> scanning+resets overall. That would be bad.
>
> There still can be waste because we do not know in advance exactly how
> many migration sources there will be -- sure, we could calculate it but
> that involves scanning the source pageblock twice which is wasteful.
> I did try estimating it based on the remaining number of pages in the
> pageblock but the additional complexity did not appear to help.
>
> Does that make sense?
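The tradeoff above can be put in rough numbers with a toy model. This is illustrative only: PAGES_PER_BLOCK assumes a 2MB pageblock with 4KB pages, the batch count is made up, and the functions are not kernel code; only the batch size mirrors COMPACT_CLUSTER_MAX.

```c
/* Toy model, not kernel code. PAGES_PER_BLOCK assumes a 2MB pageblock
 * with 4KB pages; CLUSTER_MAX mirrors COMPACT_CLUSTER_MAX. */
#define PAGES_PER_BLOCK 512
#define CLUSTER_MAX      32

/* Pageblocks newly scanned if each batch stops at exactly CLUSTER_MAX
 * free pages and the partially scanned block gets marked skipped:
 * every batch has to open a fresh pageblock. */
static int blocks_if_abort_early(int batches)
{
	return batches;
}

/* Pageblocks newly scanned if each block is drained before moving on:
 * one scan is amortized across many batches. */
static int blocks_if_finish(int batches)
{
	int pages_needed = batches * CLUSTER_MAX;

	return (pages_needed + PAGES_PER_BLOCK - 1) / PAGES_PER_BLOCK;
}
```

With 16 batches of 32 pages (512 pages total), aborting early can touch 16 pageblocks where finishing them needs only 1.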
OK, thanks.