[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20120914011550.GE5085@bbox>
Date: Fri, 14 Sep 2012 10:15:50 +0900
From: Minchan Kim <minchan@...nel.org>
To: Bartlomiej Zolnierkiewicz <b.zolnierkie@...sung.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
Yasuaki Ishimatsu <isimatu.yasuaki@...fujitsu.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Andrew Morton <akpm@...ux-foundation.org>,
Michal Nazarewicz <mina86@...a86.com>,
Mel Gorman <mel@....ul.ie>,
Wen Congyang <wency@...fujitsu.com>,
Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
Marek Szyprowski <m.szyprowski@...sung.com>
Subject: Re: [RFC v2] memory-hotplug: remove MIGRATE_ISOLATE from
free_area->free_list
Hi Bart,
On Thu, Sep 13, 2012 at 04:21:42PM +0200, Bartlomiej Zolnierkiewicz wrote:
> On Thursday 06 September 2012 18:34:35 Bartlomiej Zolnierkiewicz wrote:
> >
> > Hi,
> >
> > On Thursday 06 September 2012 04:53:38 Minchan Kim wrote:
> > > Normally, MIGRATE_ISOLATE type is used for memory-hotplug.
> > > But it's irony type because the pages isolated would exist
> > > as free page in free_area->free_list[MIGRATE_ISOLATE] so people
> > > can think of it as allocatable pages but it is *never* allocatable.
> > > It ends up confusing NR_FREE_PAGES vmstat so it would be
> > > totally not accurate so some of place which depend on such vmstat
> > > could reach wrong decision by the context.
> > >
> > > There were already report about it.[1]
> > > [1] 702d1a6e, memory-hotplug: fix kswapd looping forever problem
> > >
> > > Then, there was other report which is other problem.[2]
> > > [2] http://www.spinics.net/lists/linux-mm/msg41251.html
> > >
> > > I believe it can make problems in future, too.
> > > So I hope removing such irony type by another design.
> > >
> > > I hope this patch solves it and let's revert [1] and doesn't need [2].
>
> For our needs (CMA) patch [2] is much simpler / less intrusive way
> to have correct NR_FREE_PAGES counter than this patch and currently
> I would prefer to have it merged upstream instead of this one.
I agree my patch could be somewhat big change so I will take things easy.
Of course, it shouldn't prevent your patch merge if yours make sense.
Afterward, if this patch solves all issues and better than other band-aid,
then, we can revert.
Shortly, I will review yours.
>
> > > * Changelog v1
> > > * Fix from Michal's many suggestion
> > >
> > > Cc: Michal Nazarewicz <mina86@...a86.com>
> > > Cc: Mel Gorman <mel@....ul.ie>
> > > Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
> > > Cc: Yasuaki Ishimatsu <isimatu.yasuaki@...fujitsu.com>
> > > Cc: Wen Congyang <wency@...fujitsu.com>
> > > Cc: Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
> > > Signed-off-by: Minchan Kim <minchan@...nel.org>
> > > ---
> > > It's very early version which show the concept so I still marked it with RFC.
> > > I just tested it with simple test and works.
> > > This patch is needed indepth review from memory-hotplug guys from fujitsu
> > > because I saw there are lots of patches recenlty they sent to about
> > > memory-hotplug change. Please take a look at this patch.
> >
> > [...]
> >
> > > @@ -948,8 +954,13 @@ static int move_freepages(struct zone *zone,
> > > }
> > >
> > > order = page_order(page);
> > > - list_move(&page->lru,
> > > - &zone->free_area[order].free_list[migratetype]);
> > > + if (migratetype != MIGRATE_ISOLATE) {
> > > + list_move(&page->lru,
> > > + &zone->free_area[order].free_list[migratetype]);
> > > + } else {
> > > + list_del(&page->lru);
> > > + isolate_free_page(page, order);
> > > + }
> > > page += 1 << order;
> > > pages_moved += 1 << order;
> > > }
> >
> > Shouldn't NR_FREE_PAGES counter be decreased somewhere above?
> >
> > [ I can see that it is not modified in __free_pages_ok() and
> > free_hot_cold_page() because page is still counted as non-free one but
> > here situation is different AFAICS. ]
> >
> > I tested the patch locally here with CONFIG_CMA=y and it causes some
> > major problems for CMA (multiple errors from dma_alloc_from_contiguous()
> > about memory ranges being busy and allocation failures).
> >
> > [ I'm sorry that I don't know more details yet but the issue should be
> > easily reproducible. ]
>
> We spent some more time on the issue and it seems that the approach
> taken in the patch (removal of MIGRATE_ISOLATE free_list) is currently
> incompatible with CMA.
>
> In alloc_contig_range() we have:
>
> order = 0;
> outer_start = start;
> while (!PageBuddy(pfn_to_page(outer_start))) {
> if (++order >= MAX_ORDER) {
> ret = -EBUSY;
> goto done;
> }
> outer_start &= ~0UL << order;
> }
>
> for handling cases when the CMA area begins inside the higher order
> page from buddy (that got already isolated). Unfortunately this code
> no longer works as isolated pages are no longer hold in buddy allocator
> (isolate_free_page() clears buddy bit).
>
> The other part of code that is probably affected by your patch is:
>
> /* Grab isolated pages from freelists. */
> outer_end = isolate_freepages_range(outer_start, end);
> if (!outer_end) {
> ret = -EBUSY;
> goto done;
> }
>
> also in alloc_contig_range(). isolate_freepages_range() calls
> isolate_freepages_block() which assume that free pages (in isolated
> pageblock) are in buddy allocator:
>
> if (!PageBuddy(page)) {
> if (strict)
> return 0;
> continue;
> }
>
> (which is no longer true) and also calls split_free_page() that
> attempts to remove page from the free_list & buddy:
>
> /* Remove page from free list */
> list_del(&page->lru);
> zone->free_area[order].nr_free--;
> rmv_page_order(page);
>
> (the isolated page is on the isolated_pages list instead).
Thanks for detailed pointing out, Bart!
I will revisit this issues after I will fisnish other jobs.
--
Kind regards,
Minchan Kim
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists