Message-Id: <1638523180-4884-1-git-send-email-huangzhaoyang@gmail.com>
Date:   Fri,  3 Dec 2021 17:19:40 +0800
From:   Huangzhaoyang <huangzhaoyang@...il.com>
To:     Nitin Gupta <ngupta@...are.org>,
        Sergey Senozhatsky <senozhatsky@...omium.org>,
        Jens Axboe <axboe@...nel.dk>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Minchan Kim <minchan@...nel.org>,
        Michal Hocko <mhocko@...nel.org>,
        Zhaoyang Huang <zhaoyang.huang@...soc.com>,
        Ingo Molnar <mingo@...hat.com>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: [RFC PATCH] mm: count swap_writepage into PSI_IO STALL

From: Zhaoyang Huang <zhaoyang.huang@...soc.com>

We would like to count the time spent in swap_writepage() into the
PSI_IO stall time. There are three reasons for doing so:
1. swap_writepage() introduces non-productive time, especially with a
   RAM-based swap device.
2. A high swappiness value causes more anon pages to be swapped out.
3. The reported IO pressure is inconsistent with PGSWPOUT.

Signed-off-by: Zhaoyang Huang <zhaoyang.huang@...soc.com>
---
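One way to observe reason 3 from userspace is to sample the cumulative
"some" stall time from /proc/pressure/io together with the swap-out
counter from /proc/vmstat, and compare the deltas. The helpers below are
only a minimal sketch and not part of the patch: the function names are
hypothetical, and the sketch assumes that PGSWPOUT corresponds to the
"pswpout" line in /proc/vmstat and that PSI "total=" values are in
microseconds.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse one line of /proc/pressure/io, e.g.
 *   "some avg10=0.00 avg60=0.00 avg300=0.00 total=12345"
 * Returns 0 and stores the cumulative stall time (microseconds) in
 * *total_us when the line starts with `kind` ("some" or "full");
 * returns -1 otherwise. Hypothetical helper, for illustration only.
 */
static int psi_parse_total(const char *line, const char *kind,
			   unsigned long long *total_us)
{
	size_t n;
	const char *p;

	assert(line && kind && total_us);

	n = strlen(kind);
	if (strncmp(line, kind, n) != 0 || line[n] != ' ')
		return -1;

	p = strstr(line + n, " total=");
	if (!p)
		return -1;

	return sscanf(p, " total=%llu", total_us) == 1 ? 0 : -1;
}

/*
 * Parse one line of /proc/vmstat and extract the swap-out page count.
 * Assumption: PGSWPOUT in the changelog maps to the "pswpout" line.
 * Returns 0 on a match, -1 for any other line.
 */
static int vmstat_parse_pswpout(const char *line, unsigned long long *pages)
{
	assert(line && pages);

	return sscanf(line, "pswpout %llu", pages) == 1 ? 0 : -1;
}

/*
 * Average IO stall time (microseconds) attributed per swapped-out page
 * between two samples. Returns 0.0 when no page was swapped out.
 */
static double stall_us_per_page(unsigned long long total0,
				unsigned long long total1,
				unsigned long long out0,
				unsigned long long out1)
{
	if (out1 <= out0)
		return 0.0;

	return (double)(total1 - total0) / (double)(out1 - out0);
}
```

Feeding these helpers with lines read via fgets() from the two proc
files, once before and once after a swap-heavy workload, gives a rough
stall-per-page figure. On a RAM-based swap device without this patch,
one would expect pswpout to grow while the IO "some" total barely moves.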
 include/linux/psi.h |  6 ++++++
 kernel/sched/psi.c  | 15 +++++++++++++++
 mm/vmscan.c         | 10 ++++++++++
 3 files changed, 31 insertions(+)

diff --git a/include/linux/psi.h b/include/linux/psi.h
index 65eb147..6eb3a6f 100644
--- a/include/linux/psi.h
+++ b/include/linux/psi.h
@@ -23,6 +23,9 @@ void psi_task_switch(struct task_struct *prev, struct task_struct *next,
 void psi_memstall_enter(unsigned long *flags);
 void psi_memstall_leave(unsigned long *flags);
 
+void psi_iostall_enter(void);
+void psi_iostall_leave(void);
+
 int psi_show(struct seq_file *s, struct psi_group *group, enum psi_res res);
 
 #ifdef CONFIG_CGROUPS
@@ -45,6 +48,9 @@ static inline void psi_init(void) {}
 static inline void psi_memstall_enter(unsigned long *flags) {}
 static inline void psi_memstall_leave(unsigned long *flags) {}
 
+static inline void psi_iostall_enter(void) {}
+static inline void psi_iostall_leave(void) {}
+
 #ifdef CONFIG_CGROUPS
 static inline int psi_cgroup_alloc(struct cgroup *cgrp)
 {
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 923a0d6..643b48c 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -958,6 +958,21 @@ void psi_memstall_leave(unsigned long *flags)
 	rq_unlock_irq(rq, &rf);
 }
 
+void psi_iostall_enter(void)
+{
+	if (static_branch_likely(&psi_disabled))
+		return;
+
+	psi_task_change(current, 0, TSK_IOWAIT);
+}
+
+void psi_iostall_leave(void)
+{
+	if (static_branch_likely(&psi_disabled))
+		return;
+
+	psi_task_change(current, TSK_IOWAIT, 0);
+}
 #ifdef CONFIG_CGROUPS
 int psi_cgroup_alloc(struct cgroup *cgroup)
 {
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 74296c2..798907b 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1072,7 +1072,17 @@ static pageout_t pageout(struct page *page, struct address_space *mapping)
 		};
 
 		SetPageReclaim(page);
+
+		/*
+		 * With a RAM-based swap device, the reclaim context never
+		 * sleeps on congested IO, yet the write still introduces
+		 * non-productive time. So count this period into PSI_IO.
+		 * File pages are counted as well, since they are much less
+		 * likely to reach this path.
+		 */
+		psi_iostall_enter();
 		res = mapping->a_ops->writepage(page, &wbc);
+		psi_iostall_leave();
 		if (res < 0)
 			handle_write_error(mapping, page, res);
 		if (res == AOP_WRITEPAGE_ACTIVATE) {
-- 
1.9.1
