[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <548FF2BE.4060601@suse.cz>
Date: Tue, 16 Dec 2014 09:52:14 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: Joonsoo Kim <iamjoonsoo.kim@....com>
CC: Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org,
"Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
David Rientjes <rientjes@...gle.com>,
Johannes Weiner <hannes@...xchg.org>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Mel Gorman <mgorman@...e.de>, Michal Hocko <mhocko@...e.cz>,
Minchan Kim <minchan@...nel.org>,
Rik van Riel <riel@...hat.com>,
Zhang Yanfei <zhangyanfei@...fujitsu.com>
Subject: Re: [PATCH v2 0/3] page stealing tweaks
On 12/16/2014 03:54 AM, Joonsoo Kim wrote:
> On Mon, Dec 15, 2014 at 10:05:22AM +0100, Vlastimil Babka wrote:
>> On 12/15/2014 08:50 AM, Joonsoo Kim wrote:
>>> On Fri, Dec 12, 2014 at 05:01:22PM +0100, Vlastimil Babka wrote:
>>>> Changes since v1:
>>>> o Reorder patch 2 and 3, Cc stable for patch 1
>>>> o Fix tracepoint in patch 1 (Joonsoo Kim)
>>>> o Cleanup in patch 2 (suggested by Minchan Kim)
>>>> o Improved comments and changelogs per Minchan and Mel.
>>>> o Considered /proc/pagetypeinfo in evaluation with 3.18 as baseline
>>>>
>>>> When studying page stealing, I noticed some weird looking decisions in
>>>> try_to_steal_freepages(). The first I assume is a bug (Patch 1), the following
>>>> two patches were driven by evaluation.
>>>>
>>>> Testing was done with stress-highalloc of mmtests, using the
>>>> mm_page_alloc_extfrag tracepoint and postprocessing to get counts of how often
>>>> page stealing occurs for individual migratetypes, and what migratetypes are
>>>> used for fallbacks. Arguably, the worst case of page stealing is when
>>>> UNMOVABLE allocation steals from MOVABLE pageblock. RECLAIMABLE allocation
>>>> stealing from MOVABLE allocation is also not ideal, so the goal is to minimize
>>>> these two cases.
>>>>
>>>> For some reason, the first patch increased the number of page stealing events
>>>> for MOVABLE allocations in the former evaluation with 3.17-rc7 + compaction
>>>> patches. In theory these events are not as bad, and the second patch does more
>>>> than just to correct this. In v2 evaluation based on 3.18, the weird result
>>>> was gone completely.
>>>>
>>>> In v2 I also checked if /proc/pagetypeinfo has shown an increase of the number
>>>> of unmovable/reclaimable pageblocks during and after the test, and it didn't.
>>>> The test was repeated 25 times with reboot only after each 5 to show
>>>> longer-term differences in the state of the system, which also wasn't the case.
>>>>
>>>> Extfrag events summed over first iteration after reboot (5 repeats)
>>>> 3.18 3.18 3.18 3.18
>>>> 0-nothp-1 1-nothp-1 2-nothp-1 3-nothp-1
>>>> Page alloc extfrag event 4547160 4593415 2343438 2198189
>>>> Extfrag fragmenting 4546361 4592610 2342595 2196611
>>>> Extfrag fragmenting for unmovable 5725 9196 5720 1093
>>>> Extfrag fragmenting unmovable placed with movable 3877 4091 1330 859
>>>> Extfrag fragmenting for reclaimable 770 628 511 616
>>>> Extfrag fragmenting reclaimable placed with movable 679 520 407 492
>>>> Extfrag fragmenting for movable 4539866 4582786 2336364 2194902
>>>>
>>>> Compared to v1 this looks like a regression for patch 1 wrt unmovable events,
>>>> but I blame noise and less repeats (it was 10 in v1). On the other hand, the
>>>> the mysterious increase in movable allocation events in v1 is gone (due to
>>>> different baseline?)
>>>
>>> Hmm... the result on patch 2 looks odd.
>>> Because you reorder patches, patch 2 have some effects on unmovable
>>> stealing and I expect that 'Extfrag fragmenting for unmovable' decreases.
>>> But, the result looks not. Is there any reason you think?
>>
>> Hm, I don't see any obvious reason.
>>
>>> And, could you share compaction success rate and allocation success
>>> rate on each iteration? In fact, reducing Extfrag event isn't our goal.
>>> It is natural result of this patchset because we steal pages more
>>> aggressively. Our utimate goal is to make the system less fragmented
>>> and to get more high order freepage, so I'd like to know this results.
>>
>> I don't think there's much significant difference. Could be a limitation
>> of the benchmark. But even if there's no difference, it means the reduction
>> of fragmenting events at least saves time on allocations.
>
> Hmm... Allocation success rate of 3-nothp-N on phase 1,2 shows minor degradation
> from 2-nothp-N and compaction success rate also decreases. Isn't it?
> I think that allocation success rate on phase 1 is important because
> workload in phase 1 mostly resemble real world scenario. Do you have
> any idea why this happens?
It could be just noise, keep in mind that each 3-nothp-N is averaged
from just from 5 repeats. And the iterations without reboot (N) are not
independent, so if there's some "bad luck" upon boot, it will carry to
all N of 3-nothp-N.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists