lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5371DDE2.4050506@suse.cz>
Date:	Tue, 13 May 2014 10:54:58 +0200
From:	Vlastimil Babka <vbabka@...e.cz>
To:	Joonsoo Kim <iamjoonsoo.kim@....com>
CC:	Andrew Morton <akpm@...ux-foundation.org>,
	David Rientjes <rientjes@...gle.com>,
	Hugh Dickins <hughd@...gle.com>,
	Greg Thelen <gthelen@...gle.com>, linux-kernel@...r.kernel.org,
	linux-mm@...ck.org, Minchan Kim <minchan@...nel.org>,
	Mel Gorman <mgorman@...e.de>,
	Bartlomiej Zolnierkiewicz <b.zolnierkie@...sung.com>,
	Michal Nazarewicz <mina86@...a86.com>,
	Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
	Christoph Lameter <cl@...ux.com>,
	Rik van Riel <riel@...hat.com>
Subject: Re: [PATCH] mm, compaction: properly signal and act upon lock and
 need_sched() contention

On 05/13/2014 02:44 AM, Joonsoo Kim wrote:
> On Mon, May 12, 2014 at 04:15:11PM +0200, Vlastimil Babka wrote:
>> Compaction uses compact_checklock_irqsave() function to periodically check for
>> lock contention and need_resched() to either abort async compaction, or to
>> free the lock, schedule and retake the lock. When aborting, cc->contended is
>> set to signal the contended state to the caller. Two problems have been
>> identified in this mechanism.
>>
>> First, compaction also calls directly cond_resched() in both scanners when no
>> lock is yet taken. This call either does not abort async compaction, or set
>> cc->contended appropriately. This patch introduces a new
>> compact_check_resched() function to achieve both.
>>
>> Second, isolate_freepages() does not check if isolate_freepages_block()
>> aborted due to contention, and advances to the next pageblock. This violates
>> the principle of aborting on contention, and might result in pageblocks not
>> being scanned completely, since the scanning cursor is advanced. This patch
>> makes isolate_freepages_block() check the cc->contended flag and abort.
>>
>> Reported-by: Joonsoo Kim <iamjoonsoo.kim@....com>
>> Signed-off-by: Vlastimil Babka <vbabka@...e.cz>
>> Cc: Minchan Kim <minchan@...nel.org>
>> Cc: Mel Gorman <mgorman@...e.de>
>> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@...sung.com>
>> Cc: Michal Nazarewicz <mina86@...a86.com>
>> Cc: Naoya Horiguchi <n-horiguchi@...jp.nec.com>
>> Cc: Christoph Lameter <cl@...ux.com>
>> Cc: Rik van Riel <riel@...hat.com>
>> ---
>>   mm/compaction.c | 40 +++++++++++++++++++++++++++++++++-------
>>   1 file changed, 33 insertions(+), 7 deletions(-)
>>
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index 83ca6f9..b34ab7c 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -222,6 +222,27 @@ static bool compact_checklock_irqsave(spinlock_t *lock, unsigned long *flags,
>>   	return true;
>>   }
>>
>> +/*
>> + * Similar to compact_checklock_irqsave() (see its comment) for places where
>> + * a zone lock is not concerned.
>> + *
>> + * Returns false when compaction should abort.
>> + */
>> +static inline bool compact_check_resched(struct compact_control *cc)
>> +{
>> +	/* async compaction aborts if contended */
>> +	if (need_resched()) {
>> +		if (cc->mode == MIGRATE_ASYNC) {
>> +			cc->contended = true;
>> +			return false;
>> +		}
>> +
>> +		cond_resched();
>> +	}
>> +
>> +	return true;
>> +}
>> +
>>   /* Returns true if the page is within a block suitable for migration to */
>>   static bool suitable_migration_target(struct page *page)
>>   {
>> @@ -491,11 +512,8 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
>>   			return 0;
>>   	}
>>
>> -	if (cond_resched()) {
>> -		/* Async terminates prematurely on need_resched() */
>> -		if (cc->mode == MIGRATE_ASYNC)
>> -			return 0;
>> -	}
>> +	if (!compact_check_resched(cc))
>> +		return 0;
>>
>>   	/* Time to isolate some pages for migration */
>>   	for (; low_pfn < end_pfn; low_pfn++) {
>> @@ -718,9 +736,10 @@ static void isolate_freepages(struct zone *zone,
>>   		/*
>>   		 * This can iterate a massively long zone without finding any
>>   		 * suitable migration targets, so periodically check if we need
>> -		 * to schedule.
>> +		 * to schedule, or even abort async compaction.
>>   		 */
>> -		cond_resched();
>> +		if (!compact_check_resched(cc))
>> +			break;
>>
>>   		if (!pfn_valid(block_start_pfn))
>>   			continue;
>> @@ -758,6 +777,13 @@ static void isolate_freepages(struct zone *zone,
>>   		 */
>>   		if (isolated)
>>   			cc->finished_update_free = true;
>> +
>> +		/*
>> +		 * isolate_freepages_block() might have aborted due to async
>> +		 * compaction being contended
>> +		 */
>> +		if (cc->contended)
>> +			break;
>>   	}
>
> Hello,
>
> I think that we can do further.
>
> The problem is that this cc->contended is checked only in
> isolate_migratepages() to break out the compaction. So if there are
> free pages we are already taken, compaction wouldn't stopped
> immediately and isolate_freepages() could be invoked again on next
> compaction_alloc(). If there is no contention at this time, we would try
> to get free pages from one pageblock because cc->contended checking is
> on bottom of the loop in isolate_migratepages() and will continue to
> run compaction. AFAIK, we want to stop the compaction in this case.
>
> Moreover, if this isolate_freepages() don't stop the compaction,
> next isolate_migratepages() will be invoked and it would be stopped
> by checking cc->contended after isolating some pages for migration.
> This is useless overhead so should be removed.

Good catch again, thanks! So that means checking the flag also in 
compaction_alloc(). But what to do if we managed isolated something and 
then found out about being contended? Put all pages back and go home, or 
try to migrate what we have?

I'm becoming worried that all these changes will mean that async 
compaction will have near zero probability of finishing anything before 
hitting a contention. And then everything it did until the contention 
would be a wasted work.

> Thanks.
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ