lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:	Fri, 14 Sep 2012 10:15:50 +0900
From:	Minchan Kim <minchan@...nel.org>
To:	Bartlomiej Zolnierkiewicz <b.zolnierkie@...sung.com>
Cc:	Kamezawa Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	Yasuaki Ishimatsu <isimatu.yasuaki@...fujitsu.com>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	Andrew Morton <akpm@...ux-foundation.org>,
	Michal Nazarewicz <mina86@...a86.com>,
	Mel Gorman <mel@....ul.ie>,
	Wen Congyang <wency@...fujitsu.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
	Marek Szyprowski <m.szyprowski@...sung.com>
Subject: Re: [RFC v2] memory-hotplug: remove MIGRATE_ISOLATE from
 free_area->free_list

Hi Bart,

On Thu, Sep 13, 2012 at 04:21:42PM +0200, Bartlomiej Zolnierkiewicz wrote:
> On Thursday 06 September 2012 18:34:35 Bartlomiej Zolnierkiewicz wrote:
> > 
> > Hi,
> > 
> > On Thursday 06 September 2012 04:53:38 Minchan Kim wrote:
> > > Normally, MIGRATE_ISOLATE type is used for memory-hotplug.
> > > But it's irony type because the pages isolated would exist
> > > as free page in free_area->free_list[MIGRATE_ISOLATE] so people
> > > can think of it as allocatable pages but it is *never* allocatable.
> > > It ends up confusing NR_FREE_PAGES vmstat so it would be
> > > totally not accurate so some of place which depend on such vmstat
> > > could reach wrong decision by the context.
> > > 
> > > There were already report about it.[1]
> > > [1] 702d1a6e, memory-hotplug: fix kswapd looping forever problem
> > > 
> > > Then, there was other report which is other problem.[2]
> > > [2] http://www.spinics.net/lists/linux-mm/msg41251.html
> > > 
> > > I believe it can make problems in future, too.
> > > So I hope removing such irony type by another design.
> > > 
> > > I hope this patch solves it and let's revert [1] and doesn't need [2].
> 
> For our needs (CMA) patch [2] is much simpler / less intrusive way
> to have correct NR_FREE_PAGES counter than this patch and currently
> I would prefer to have it merged upstream instead of this one.

I agree my patch could be somewhat big change so I will take things easy.
Of course, it shouldn't prevent your patch merge if yours make sense.
Afterward, if this patch solves all issues and better than other band-aid,
then, we can revert.
Shortly, I will review yours.

> 
> > > * Changelog v1
> > >  * Fix from Michal's many suggestion
> > > 
> > > Cc: Michal Nazarewicz <mina86@...a86.com>
> > > Cc: Mel Gorman <mel@....ul.ie>
> > > Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
> > > Cc: Yasuaki Ishimatsu <isimatu.yasuaki@...fujitsu.com>
> > > Cc: Wen Congyang <wency@...fujitsu.com>
> > > Cc: Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
> > > Signed-off-by: Minchan Kim <minchan@...nel.org>
> > > ---
> > > It's very early version which show the concept so I still marked it with RFC.
> > > I just tested it with simple test and works.
> > > This patch is needed indepth review from memory-hotplug guys from fujitsu
> > > because I saw there are lots of patches recenlty they sent to about
> > > memory-hotplug change. Please take a look at this patch.
> > 
> > [...]
> > 
> > > @@ -948,8 +954,13 @@ static int move_freepages(struct zone *zone,
> > >  		}
> > >  
> > >  		order = page_order(page);
> > > -		list_move(&page->lru,
> > > -			  &zone->free_area[order].free_list[migratetype]);
> > > +		if (migratetype != MIGRATE_ISOLATE) {
> > > +			list_move(&page->lru,
> > > +				&zone->free_area[order].free_list[migratetype]);
> > > +		} else {
> > > +			list_del(&page->lru);
> > > +			isolate_free_page(page, order);
> > > +		}
> > >  		page += 1 << order;
> > >  		pages_moved += 1 << order;
> > >  	}
> > 
> > Shouldn't NR_FREE_PAGES counter be decreased somewhere above?
> > 
> > [ I can see that it is not modified in __free_pages_ok() and
> >   free_hot_cold_page() because page is still counted as non-free one but
> >   here situation is different AFAICS. ]
> > 
> > I tested the patch locally here with CONFIG_CMA=y and it causes some
> > major problems for CMA (multiple errors from dma_alloc_from_contiguous()
> > about memory ranges being busy and allocation failures).
> > 
> > [ I'm sorry that I don't know more details yet but the issue should be
> >   easily reproducible. ]
> 
> We spent some more time on the issue and it seems that the approach
> taken in the patch (removal of MIGRATE_ISOLATE free_list) is currently
> incompatible with CMA.
> 
> In alloc_contig_range() we have:
> 
> 	order = 0;
> 	outer_start = start;
> 	while (!PageBuddy(pfn_to_page(outer_start))) {
> 		if (++order >= MAX_ORDER) {
> 			ret = -EBUSY;
> 			goto done;
> 		}
> 		outer_start &= ~0UL << order;
> 	}
> 
> for handling cases when the CMA area begins inside the higher order
> page from buddy (that got already isolated).  Unfortunately this code
> no longer works as isolated pages are no longer hold in buddy allocator
> (isolate_free_page() clears buddy bit).
> 
> The other part of code that is probably affected by your patch is:
> 
> 	/* Grab isolated pages from freelists. */
> 	outer_end = isolate_freepages_range(outer_start, end);
> 	if (!outer_end) {
> 		ret = -EBUSY;
> 		goto done;
> 	}
> 
> also in alloc_contig_range().  isolate_freepages_range() calls
> isolate_freepages_block() which assume that free pages (in isolated
> pageblock) are in buddy allocator:
> 
> 		if (!PageBuddy(page)) {
> 			if (strict)
> 				return 0;
> 			continue;
> 		}
> 
> (which is no longer true) and also calls split_free_page() that
> attempts to remove page from the free_list & buddy:
> 
> 	/* Remove page from free list */
> 	list_del(&page->lru);
> 	zone->free_area[order].nr_free--;
> 	rmv_page_order(page);
> 
> (the isolated page is on the isolated_pages list instead).

Thanks for detailed pointing out, Bart!
I will revisit this issues after I will fisnish other jobs.

-- 
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ