linux-kernel - Re: [PATCH 4/5] mm, compaction: always update cached scanner positions

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20141104002850.GA8412@js1304-P5Q-DELUXE>
Date:	Tue, 4 Nov 2014 09:28:50 +0900
From:	Joonsoo Kim <iamjoonsoo.kim@....com>
To:	Vlastimil Babka <vbabka@...e.cz>
Cc:	Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, Minchan Kim <minchan@...nel.org>,
	Mel Gorman <mgorman@...e.de>,
	Michal Nazarewicz <mina86@...a86.com>,
	Naoya Horiguchi <n-horiguchi@...jp.nec.com>,
	Christoph Lameter <cl@...ux.com>,
	Rik van Riel <riel@...hat.com>,
	David Rientjes <rientjes@...gle.com>
Subject: Re: [PATCH 4/5] mm, compaction: always update cached scanner
 positions

On Fri, Oct 31, 2014 at 04:53:44PM +0100, Vlastimil Babka wrote:
> On 10/28/2014 08:08 AM, Joonsoo Kim wrote:
> >>
> >>>And, I guess that pageblock skip feature effectively disable pageblock
> >>>rescanning if there is no freepage during rescan.
> >>
> >>If there's no freepage during rescan, then the cached free_pfn also
> >>won't be pointed to the pageblock anymore. Regardless of pageblock skip
> >>being set, there will not be second rescan. But there will still be the
> >>first rescan to determine there are no freepages.
> >
> >Yes, What I'd like to say is that these would work well. Just decreasing
> >few percent of scanning page doesn't look good to me to validate this
> >patch, because there is some facilities to reduce rescan overhead and
> 
> The mechanisms have a tradeoff, while this patch didn't seem to have
> negative consequences.
> 
> >compaction is fundamentally time-consuming process. Moreover, failure of
> >compaction could cause serious system crash in some cases.
> 
> Relying on successful high-order allocation for not crashing is
> dangerous, success is never guaranteed. Such critical allocation
> should try harder than fail due to a single compaction attempt. With
> this argument you could aim to remove all the overhead reducing
> heuristics.
> 
> >>>This patch would
> >>>eliminate effect of pageblock skip feature.
> >>
> >>I don't think so (as explained above). Also if free pages were isolated
> >>(and then returned and skipped over), the pageblock should remain
> >>without skip bit, so after scanners meet and positions reset (which
> >>doesn't go hand in hand with skip bit reset), the next round will skip
> >>over the blocks without freepages and find quickly the blocks where free
> >>pages were skipped in the previous round.
> >>
> >>>IIUC, compaction logic assume that there are many temporary failure
> >>>conditions. Retrying from others would reduce effect of this temporary
> >>>failure so implementation looks as is.
> >>
> >>The implementation of pfn caching was written at time when we did not
> >>keep isolated free pages between migration attempts in a single
> >>compaction run. And the idea of async compaction is to try with minimal
> >>effort (thus latency), and if there's a failure, try somewhere else.
> >>Making sure we don't skip anything doesn't seem productive.
> >
> >free_pfn is shared by async/sync compaction and unconditional updating
> >causes sync compaction to stop prematurely, too.
> >
> >And, if this patch makes migrate/freepage scanner meet more frequently,
> >there is one problematic scenario.
> 
> OK, so you don't find a problem with how this patch changes
> migration scanner caching, just the free scanner, right?
> So how about making release_freepages() return the highest freepage
> pfn it encountered (could perhaps do without comparing individual
> pfn's, the list should be ordered so it could be just the pfn of
> first or last page in the list, but need to check that) and updating
> cached free pfn with that? That should ensure rescanning only when
> needed.

Hello,

Updating cached free pfn in release_freepages() looks good to me.

In fact, I guess that migration scanner also has similar problems, but,
it's just my guess. I admit your following arguments in patch description.

  However, the downside is that potentially many pages are rescanned without
  successful isolation. At worst, there might be a page where isolation from LRU 
  succeeds but migration fails (potentially always).

So, I'm okay if you update cached free pfn in release_freepages().

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/