Date:   Fri, 13 Jan 2023 18:33:45 +0100
From:   Vlastimil Babka <vbabka@...e.cz>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     patches@...ts.linux.dev, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, regressions@...mhuis.info,
        Jiri Slaby <jirislaby@...nel.org>,
        Maxim Levitsky <mlevitsk@...hat.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Michal Hocko <mhocko@...nel.org>,
        Pedro Falcato <pedro.falcato@...il.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Chuyi Zhou <zhouchuyi@...edance.com>,
        Vlastimil Babka <vbabka@...e.cz>, stable@...r.kernel.org
Subject: [PATCH for 6.1 regression] Revert "mm/compaction: fix set skip in fast_find_migrateblock"

This reverts commit 7efc3b7261030da79001c00d92bc3392fd6c664c.

We have received openSUSE reports (Link 1) of khugepaged stalling a CPU
for long periods of time on the 6.1 kernel. Investigation of tracepoint
data shows that compaction is stuck repeating fast_find_migrateblock()
based migrate page isolation, and then failing to migrate all of the
isolated pages. Commit 7efc3b726103 ("mm/compaction: fix set skip in
fast_find_migrateblock") was suspected because it was merged in 6.1 and
in theory can indeed remove a termination condition for
fast_find_migrateblock() under certain conditions, as it removes a place
that always marks a scanned pageblock as skipped so that it is not
re-scanned. There are other such places, but those can be bypassed under
certain conditions, which seems to match the tracepoint data.

Testing of the revert also appears to have resolved the issue, so revert
the commit until a more robust solution for the original problem is
developed.

It's also likely that this will fix the qemu stalls with the 6.1 kernel
reported in Link 2, but that is not yet confirmed.

Link: https://bugzilla.suse.com/show_bug.cgi?id=1206848
Link: https://lore.kernel.org/kvm/b8017e09-f336-3035-8344-c549086c2340@kernel.org/
Fixes: 7efc3b726103 ("mm/compaction: fix set skip in fast_find_migrateblock")
Cc: <stable@...r.kernel.org>
---
 mm/compaction.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/compaction.c b/mm/compaction.c
index ca1603524bbe..8238e83385a7 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1839,6 +1839,7 @@ static unsigned long fast_find_migrateblock(struct compact_control *cc)
 					pfn = cc->zone->zone_start_pfn;
 				cc->fast_search_fail = 0;
 				found_block = true;
+				set_pageblock_skip(freepage);
 				break;
 			}
 		}
-- 
2.39.0
