linux-kernel - Re: [PATCH v5 02/14] mm, compaction: defer each zone individually instead of preferred zone

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <53D91BC9.7080506@suse.cz>
Date:	Wed, 30 Jul 2014 18:22:33 +0200
From:	Vlastimil Babka <vbabka@...e.cz>
To:	Joonsoo Kim <iamjoonsoo.kim@....com>
CC:	Andrew Morton <akpm@...ux-foundation.org>,
	David Rientjes <rientjes@...gle.com>,
	linux-kernel@...r.kernel.org, linux-mm@...r.kernel.org,
	Michal Nazarewicz <mina86@...a86.com>,
	Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
	Christoph Lameter <cl@...ux.com>,
	Rik van Riel <riel@...hat.com>, Mel Gorman <mgorman@...e.de>,
	Minchan Kim <minchan@...nel.org>,
	Zhang Yanfei <zhangyanfei@...fujitsu.com>
Subject: Re: [PATCH v5 02/14] mm, compaction: defer each zone individually
 instead of preferred zone

On 07/29/2014 11:12 AM, Vlastimil Babka wrote:
> On 07/29/2014 08:38 AM, Joonsoo Kim wrote:
>>>    /* Return values for compact_zone() and try_to_compact_pages() */
>>> +/* compaction didn't start as it was deferred due to past failures */
>>> +#define COMPACT_DEFERRED	0
>>>    /* compaction didn't start as it was not possible or direct reclaim was more suitable */
>>> -#define COMPACT_SKIPPED		0
>>> +#define COMPACT_SKIPPED		1
>>
>> Hello,
>>
>> This change makes some users of compaction_suitable() failed
>> unintentionally, because they assume that COMPACT_SKIPPED is 0.
>> Please fix them according to this change.
>
> Oops, good catch. Thanks!
>
>>> @@ -2324,27 +2327,31 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
>>>    				order, zonelist, high_zoneidx,
>>>    				alloc_flags & ~ALLOC_NO_WATERMARKS,
>>>    				preferred_zone, classzone_idx, migratetype);
>>> +
>>>    		if (page) {
>>> -			preferred_zone->compact_blockskip_flush = false;
>>> -			compaction_defer_reset(preferred_zone, order, true);
>>> +			struct zone *zone = page_zone(page);
>>> +
>>> +			zone->compact_blockskip_flush = false;
>>> +			compaction_defer_reset(zone, order, true);
>>>    			count_vm_event(COMPACTSUCCESS);
>>>    			return page;
>>>    		}
>>>
>>>    		/*
>>> +		 * last_compact_zone is where try_to_compact_pages thought
>>> +		 * allocation should succeed, so it did not defer compaction.
>>> +		 * But now we know that it didn't succeed, so we do the defer.
>>> +		 */
>>> +		if (last_compact_zone && mode != MIGRATE_ASYNC)
>>> +			defer_compaction(last_compact_zone, order);
>>
>> I still don't understand why defer_compaction() is needed here.
>> defer_compaction() is intended for not struggling doing compaction on
>> the zone where we already have tried compaction and found that it
>> isn't suitable for compaction. Allocation failure doesn't tell us
>> that we have tried compaction for all the zone range so we shouldn't
>> make a decision here to defer compaction on this zone carelessly.
>
> OK I can remove that, it should make the code nicer anyway.

Weird, that removal of this defer_compaction() call seems ho have 
quadrupled compact_stall and compact_fail counts. The scanner pages 
counters however increased by only 10% so that could indicate the 
problem is occuring only in a small zone such as DMA. Could be another 
case of mismatch between watermark checking in compaction and 
allocation? Perhaps the lack of proper classzone_idx in the compaction 
check? Sigh.

> I also agree
> with the argument "for all the zone range" and I also realized that it's
> not (both before and after this patch) really the case. I planned to fix
> that in the future, but I can probably do it now.
> The plan is to call defer_compaction() only when compaction returned
> COMPACT_COMPLETE (and not COMPACT_PARTIAL) as it means the whole zone
> was scanned. Otherwise there will be bias towards the beginning of the
> zone in the migration scanner - compaction will be deferred half-way and
> then cached pfn's might be reset when it restarts, and the rest of the
> zone won't be scanned at all.

Hm despite expectations, this didn't seem to make much difference. But 
maybe there will be once I have some idea what happened to those stalls.

>> Thanks.
>>
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/