Message-ID: <53CDB8A6.80801@lge.com>
Date: Tue, 22 Jul 2014 10:04:38 +0900
From: Gioh Kim <gioh.kim@....com>
To: Minchan Kim <minchan@...nel.org>, Mel Gorman <mel@....ul.ie>
CC: Andrew Morton <akpm@...ux-foundation.org>,
	Joonsoo Kim <iamjoonsoo.kim@....com>,
Laura Abbott <lauraa@...eaurora.org>,
Michal Nazarewicz <mina86@...a86.com>,
Marek Szyprowski <m.szyprowski@...sung.com>,
Alexander Viro <viro@...iv.linux.org.uk>,
Johannes Weiner <hannes@...xchg.org>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	Gunho Lee <gunho.lee@....com>, 'Chanho Min' <chanho.min@....com>,
linux-fsdevel@...r.kernel.org
Subject: Re: [PATCH] CMA/HOTPLUG: clear buffer-head lru before page migration
On 2014-07-22 9:15 AM, Minchan Kim wrote:
> Hello Mel,
>
> On Mon, Jul 21, 2014 at 02:01:46PM +0100, Mel Gorman wrote:
>> On Mon, Jul 21, 2014 at 04:36:51PM +0900, Minchan Kim wrote:
>>
>> I'm not reviewing this in detail at all, I didn't even look at the patch,
>> but two things popped out at me during the discussion.
>>
>>>>> Anyway, why can't CMA bear the cost without affecting other subsystems?
>>>>> I mean, it's okay for CMA to spend more time shooting out the bh
>>>>> instead of simply invalidating the whole bh_lru, because big-order
>>>>> allocation is a slow path in the VM; everybody already knows that, and it
>>>>> even fails sometimes, so it's okay to add more code to that extremely slow path.
>>>>
>>>> There are 2 reasons to invalidate the entire bh_lru.
>>>> 1. I think CMA allocation is very rare, so invalidating the bh_lru affects the system little.
>>>> What do you think? My platform does not perform CMA allocation often.
>>>> Are CMA allocation or memory hotplug performed often?
>>>
>>> It depends on the use case, and you can't assume anything because we can't
>>> ask everyone in the world, "Please tell us whenever you try to use CMA."
>>>
>>> The key point is how maintainable the patch is.
>>> If it's too complicated to maintain, maybe we should go with the simple solution,
>>> but if it's not too complicated, we can go with something smarter that considers
>>> other cases in the future. Why not?
>>>
>>> Another point is how a user can detect where a regression comes from.
>>> If we cannot notice the regression, it's not a good idea to go with the
>>> simple version.
>>>
>>
>> The buffer LRU avoids a lookup of the radix tree. If the LRU hit rate is
>> low then the performance penalty of repeated radix tree lookups is
>> severe, but the cost of missing one hot lookup because CMA invalidated it
>> is not.
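
(For reference, what Mel describes is the per-CPU bh_lru fast path in
fs/buffer.c. Very roughly, and simplified from memory rather than copied
from the source, the lookup looks like this:)

#define BH_LRU_SIZE	16

struct bh_lru {
	struct buffer_head *bhs[BH_LRU_SIZE];
};
static DEFINE_PER_CPU(struct bh_lru, bh_lrus);

struct buffer_head *__find_get_block(struct block_device *bdev,
				     sector_t block, unsigned size)
{
	/* fast path: a small per-CPU cache of recently used buffer heads */
	struct buffer_head *bh = lookup_bh_lru(bdev, block, size);

	if (bh == NULL) {
		/* slow path: look the buffer up via the page cache (radix tree) */
		bh = __find_get_block_slow(bdev, block);
		if (bh)
			bh_lru_install(bh);
	}
	return bh;
}
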
>>
>> The real cost to be concerned with is the cost of performing the
>> invalidation, not the fact that a lookup in the LRU was missed. It is
>> because the cost of invalidation is high that this is being pushed to CMA:
>> for CMA, an allocation failure can be a functional failure and not just a
>> performance problem.
>>
>>>>
>>>> 2. Adding code in drop_buffers() can affect the system more than adding code in alloc_contig_range(),
>>>> because drop_buffers() has no way to distinguish the migrate type.
>>>> The lmbench results show that it has almost the same performance,
>>>> but I am afraid that could change.
>>>> As you said, if the bh_lru size changes it would matter more than it does now.
>>>> So I do not want to touch non-CMA-related code.
>>>
>>> I'm not saying to add a hook in drop_buffers.
>>> What I suggest is to handle failures caused by the bh_lrus in migrate_pages,
>>> because it's not a problem only for CMA.
>>
>> No, please do not insert a global IPI to invalidate buffer heads in the
>> general migration case. It's too expensive for either THP allocations or
>> automatic NUMA migrations. The global IPI cost is justified for rare events
>> where a failure to migrate causes functional problems -- CMA, memory
>> hot-remove, memory poisoning, etc.
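
(Again for reference: the invalidation in question is invalidate_bh_lrus(),
which has to interrupt every CPU so that each one drops its own per-CPU
entries. Roughly, simplified from fs/buffer.c:)

static void invalidate_bh_lru(void *arg)
{
	struct bh_lru *b = &get_cpu_var(bh_lrus);
	int i;

	for (i = 0; i < BH_LRU_SIZE; i++) {
		brelse(b->bhs[i]);	/* drop this CPU's cached reference */
		b->bhs[i] = NULL;
	}
	put_cpu_var(bh_lrus);
}

void invalidate_bh_lrus(void)
{
	/* run the callback on every CPU and wait: one IPI per CPU */
	on_each_cpu(invalidate_bh_lru, NULL, 1);
}

The per-CPU walk itself is cheap; the cross-CPU interrupts are what Mel is
objecting to for the common migration paths.
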
>
> I didn't want to add that flushing to migrate_pages *unconditionally*.
> Please look at this patch: it fixes only CMA, although it's an issue
> for others too. Worse, it depends on retry logic in the upper layer of
> alloc_contig_range, but even cma_alloc (i.e., the upper layer of
> alloc_contig_range) doesn't have retry logic. :(
> That's why I suggested doing it in migrate_pages.
>
> Actually, I'd like to make migrate_pages's users blind to the pcp-draining
> stuff by squeezing it inside migrate_pages.
> IOW, current users of migrate_pages would not need to be aware of per-cpu
> draining. All they should know is that they should use MIGRATE_SYNC
> for a best-effort but costly operation.
>
> For the implementation, we could use the retry logic in migrate_pages:
>
> int migrate_pages(xxx)
> {
>         for (pass = 0; pass < 10 && retry; pass++) {
>                 ...
>                 /* only pay for the flush once the cheap passes have failed */
>                 if (retry && pass > 2 && mode == MIGRATE_SYNC)
>                         flush_all_of_percpu_stuff();
>         }
> }
>
> migrate_pages has the migrate_mode and retry logic with 'pass', and even the
> reason if we want to filter on MR_CMA|MR_MEMORY_HOTPLUG|MR_MEMORY_FAILURE,
> so we could handle all of this inside migrate_pages.
>
> Normally MIGRATE_SYNC is an expensive operation and is mostly
> used for CMA, memory hotplug and memory poisoning, so THP and
> automatic NUMA balancing would not be affected. I believe adding the IPI
> in such a troubled condition (ie, retry && pass > 2) is not a big problem.
I agree with Minchan's point.
I am just not sure it is OK to touch common code such as migrate_pages().
If Mel agrees, I will send another patch adding flush_all_of_percpu_stuff(), something like the following:
void flush_all_of_percpu_stuff(void)
{
	drop_only_bh_of_migrating_page();	/* new helper: drop only the bh_lru entries for pages being migrated */
	lru_add_drain_all();
	drain_all_pages();
}
and remove the lru_add_drain_all() and drain_all_pages() calls from the CMA/HOTPLUG code paths.
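
Just to illustrate the intent (this is only a sketch, not the actual patch;
the argument list is abbreviated and the real alloc_contig_range() does more
work around its migration loop):

	/* today: every caller drains the per-cpu state itself */
	lru_add_drain_all();
	drain_all_pages();
	ret = migrate_pages(&cc->migratepages, ..., MIGRATE_SYNC, MR_CMA);

	/* with the change: migrate_pages() calls flush_all_of_percpu_stuff()
	 * itself once a MIGRATE_SYNC pass keeps failing, so the caller only
	 * needs the migrate_pages() call */
	ret = migrate_pages(&cc->migratepages, ..., MIGRATE_SYNC, MR_CMA);
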
>
>>
>> --
>> Mel Gorman
>> SUSE Labs
>>
>