Date:   Sat, 19 Feb 2022 09:49:40 -0800
From:   Suren Baghdasaryan <surenb@...gle.com>
To:     akpm@...ux-foundation.org
Cc:     hannes@...xchg.org, mhocko@...e.com, peterz@...radead.org,
        guro@...com, shakeelb@...gle.com, minchan@...nel.org,
        timmurray@...gle.com, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, kernel-team@...roid.com,
        surenb@...gle.com
Subject: [PATCH 1/1] mm: count time in drain_all_pages during direct reclaim
 as memory pressure

When page allocation in the direct reclaim path fails, the system makes
one attempt to shrink per-cpu page lists and free pages from high alloc
reserves. Draining per-cpu pages into the buddy allocator can be a very
slow operation because it is done using workqueues, and the task in
direct reclaim waits for all of them to finish before proceeding.
Currently this time is not accounted as a psi memory stall.

While testing mobile devices under extreme memory pressure, when
allocations were failing during direct reclaim, we noticed that psi
events which would be expected in such conditions were not triggered.
Profiling these cases showed that the events were missing because a
large share of the time spent in direct reclaim is not accounted as
memory stall, so psi does not reach the levels at which an event is
generated. Further investigation revealed that the bulk of that
unaccounted time was spent inside the drain_all_pages call.
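
For context, the psi events mentioned above are typically consumed from
userspace by registering a trigger on /proc/pressure/memory and polling
for POLLPRI, as described in the kernel's psi documentation. The sketch
below is illustrative only and is not the monitor used in our testing;
the helper names format_psi_trigger() and wait_for_memory_psi_event()
and the example thresholds are assumptions made for this illustration.

```c
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: build a trigger line of the form
 * "<some|full> <stall_us> <window_us>".
 * Returns the string length, or -1 if the buffer is too small.
 */
static int format_psi_trigger(char *buf, size_t len, const char *kind,
			      unsigned int stall_us, unsigned int window_us)
{
	int n = snprintf(buf, len, "%s %u %u", kind, stall_us, window_us);

	return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Hypothetical helper: register the trigger and wait for one event.
 * Returns 1 if the trigger fired, 0 on timeout, -1 on error.
 */
static int wait_for_memory_psi_event(const char *trigger, int timeout_ms)
{
	struct pollfd pfd;
	int ret;

	pfd.fd = open("/proc/pressure/memory", O_RDWR | O_NONBLOCK);
	if (pfd.fd < 0)
		return -1;

	/* The trigger is written including its terminating NUL. */
	if (write(pfd.fd, trigger, strlen(trigger) + 1) < 0) {
		close(pfd.fd);
		return -1;
	}

	pfd.events = POLLPRI;
	pfd.revents = 0;
	ret = poll(&pfd, 1, timeout_ms);
	if (ret > 0)
		ret = (pfd.revents & POLLERR) ? -1 :
		      (pfd.revents & POLLPRI) ? 1 : 0;

	close(pfd.fd);
	return ret;
}
```

A caller would format a trigger such as "some 150000 1000000" (150ms of
stall within a 1s window) and pass it to wait_for_memory_psi_event().
If stall time is not accounted, the threshold is never crossed and the
poll simply times out, which matches the behavior we observed.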

Annotate drain_all_pages and unreserve_highatomic_pageblock during
page allocation failure in the direct reclaim path so that delays
caused by these calls are accounted as memory stall.
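
The enter/leave pair takes a flags pointer so that annotated sections
can nest without the stall time being counted twice. The following is a
simplified userspace model of that idea, not the kernel implementation:
struct task_model, the model_* functions and the explicit now_ns
argument are all hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-task stall state. */
struct task_model {
	bool in_memstall;
	uint64_t stall_start_ns;
	uint64_t total_stall_ns;
};

/* Start a stall section unless an outer section is already open. */
static void model_memstall_enter(struct task_model *t,
				 unsigned long *flags, uint64_t now_ns)
{
	*flags = t->in_memstall;
	if (*flags)
		return;		/* nested: the outer section keeps the clock */

	t->in_memstall = true;
	t->stall_start_ns = now_ns;
}

/* Close the section; only the outermost leave accounts the time. */
static void model_memstall_leave(struct task_model *t,
				 unsigned long *flags, uint64_t now_ns)
{
	if (*flags)
		return;		/* nested: nothing to account here */

	t->in_memstall = false;
	t->total_stall_ns += now_ns - t->stall_start_ns;
}
```

In this model, an inner enter/leave pair inside an already open section
leaves the total unchanged, and the time is added once when the
outermost section is closed.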

Reported-by: Tim Murray <timmurray@...gle.com>
Signed-off-by: Suren Baghdasaryan <surenb@...gle.com>
---
 mm/page_alloc.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3589febc6d31..7fd0d392b39b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4639,8 +4639,12 @@ __alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
 	 * Shrink them and try again
 	 */
 	if (!page && !drained) {
+		unsigned long pflags;
+
+		psi_memstall_enter(&pflags);
 		unreserve_highatomic_pageblock(ac, false);
 		drain_all_pages(NULL);
+		psi_memstall_leave(&pflags);
 		drained = true;
 		goto retry;
 	}
-- 
2.35.1.473.g83b2b277ed-goog
