lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <54606F02.5070808@suse.cz>
Date:	Mon, 10 Nov 2014 08:53:38 +0100
From:	Vlastimil Babka <vbabka@...e.cz>
To:	Joonsoo Kim <iamjoonsoo.kim@....com>
CC:	"P. Christeas" <xrg@...ux.gr>, linux-mm@...ck.org,
	lkml <linux-kernel@...r.kernel.org>,
	David Rientjes <rientjes@...gle.com>,
	Norbert Preining <preining@...ic.at>,
	Markus Trippelsdorf <markus@...ppelsdorf.de>,
	Pavel Machek <pavel@....cz>
Subject: Re: Early test: hangs in mm/compact.c w. Linus's 12d7aacab56e9ef185c

On 11/10/2014 07:07 AM, Joonsoo Kim wrote:
> On Sat, Nov 08, 2014 at 11:18:37PM +0100, Vlastimil Babka wrote:
>> On 11/08/2014 02:11 PM, P. Christeas wrote:
>>
>> Hi,
>>
>> I think I finally found the cause by staring into the code... CCing
>> people from all 4 separate threads I know about this issue.
>> The problem with finding the cause was that the first report I got from
>> Markus was about isolate_freepages_block() overhead, and later Norbert
>> reported that reverting a patch for isolate_freepages* helped. But the
>> problem seems to be that although the loop in isolate_migratepages exits
>> because the scanners almost meet (they are within same pageblock), they
>> don't truly meet, therefore compact_finished() decides to continue, but
>> isolate_migratepages() exits immediately... boom! But indeed e14c720efdd7
>> made this situation possible, as free scaner pfn can now point to a
>> middle of pageblock.
>
> Indeed.
>
>>
>> So I hope the attached patch will fix the soft-lockup issues in
>> compact_zone. Please apply on 3.18-rc3 or later without any other reverts,
>> and test. It probably won't help Markus and his isolate_freepages_block()
>> overhead though...
>
> Yes, I found this bug too, but, it can't explain
> isolate_freepages_block() overhead. Anyway, I can't find another bug
> related to isolate_freepages_block(). :/

Thanks for checking.

>> Thanks,
>> Vlastimil
>>
>> ------8<------
>> >From fbf8eb0bcd2897090312e23da6a31bad9cc6b337 Mon Sep 17 00:00:00 2001
>> From: Vlastimil Babka <vbabka@...e.cz>
>> Date: Sat, 8 Nov 2014 22:20:43 +0100
>> Subject: [PATCH] mm, compaction: prevent endless loop in migrate scanner
>>
>> ---
>>   mm/compaction.c | 8 ++++++--
>>   1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index ec74cf0..1b7a1be 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -1029,8 +1029,12 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
>>   	}
>>
>>   	acct_isolated(zone, cc);
>> -	/* Record where migration scanner will be restarted */
>> -	cc->migrate_pfn = low_pfn;
>> +	/*
>> +	 * Record where migration scanner will be restarted. If we end up in
>> +	 * the same pageblock as the free scanner, make the scanners fully
>> +	 * meet so that compact_finished() terminates compaction.
>> +	 */
>> +	cc->migrate_pfn = (end_pfn <= cc->free_pfn) ? low_pfn : cc->free_pfn;
>>
>>   	return cc->nr_migratepages ? ISOLATE_SUCCESS : ISOLATE_NONE;
>>   }
>
> IMHO, proper fix is not to change this logic, but, to change decision
> logic in compact_finished() and in compact_zone(). Maybe helper
> function would be good for readability.

OK but I would think that to fix 3.18 ASAP and not introduce more 
regressions, go with the patch above first, as it is the minimal fix and 
people already test it. Then we can implement your suggestion later as a 
cleanup for 3.19?

Vlastimil

> Thanks.
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ