Message-ID: <lsq.1510009387.317911469@decadent.org.uk>
Date:   Mon, 06 Nov 2017 23:03:07 +0000
From:   Ben Hutchings <ben@...adent.org.uk>
To:     linux-kernel@...r.kernel.org, stable@...r.kernel.org
CC:     akpm@...ux-foundation.org,
        "Mel Gorman" <mgorman@...hsingularity.net>,
        "Michal Hocko" <mhocko@...e.com>,
        "Linus Torvalds" <torvalds@...ux-foundation.org>,
        "Dan Williams" <dan.j.williams@...el.com>,
        "Jan Filipcewicz" <jan.filipcewicz@...el.com>,
        "Vlastimil Babka" <vbabka@...e.cz>,
        "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
        "Len Brown" <lenb@...nel.org>, "Chen Yu" <yu.c.chen@...el.com>
Subject: [PATCH 3.2 048/147] PM/hibernate: touch NMI watchdog when
 creating snapshot

3.2.95-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Chen Yu <yu.c.chen@...el.com>

commit 556b969a1cfe2686aae149137fa1dfcac0eefe54 upstream.

Counting the pages when creating the hibernation snapshot can take a
significant amount of time, especially on systems with large memory.
Since the counting job is performed with IRQs disabled, this might lead
to an NMI lockup.  The following warning was seen on a system with
1.5 TB of DRAM:

  Freezing user space processes ... (elapsed 0.002 seconds) done.
  OOM killer disabled.
  PM: Preallocating image memory...
  NMI watchdog: Watchdog detected hard LOCKUP on cpu 27
  CPU: 27 PID: 3128 Comm: systemd-sleep Not tainted 4.13.0-0.rc2.git0.1.fc27.x86_64 #1
  task: ffff9f01971ac000 task.stack: ffffb1a3f325c000
  RIP: 0010:memory_bm_find_bit+0xf4/0x100
  Call Trace:
   swsusp_set_page_free+0x2b/0x30
   mark_free_pages+0x147/0x1c0
   count_data_pages+0x41/0xa0
   hibernate_preallocate_memory+0x80/0x450
   hibernation_snapshot+0x58/0x410
   hibernate+0x17c/0x310
   state_store+0xdf/0xf0
   kobj_attr_store+0xf/0x20
   sysfs_kf_write+0x37/0x40
   kernfs_fop_write+0x11c/0x1a0
   __vfs_write+0x37/0x170
   vfs_write+0xb1/0x1a0
   SyS_write+0x55/0xc0
   entry_SYSCALL_64_fastpath+0x1a/0xa5
  ...
  done (allocated 6590003 pages)
  PM: Allocated 26360012 kbytes in 19.89 seconds (1325.28 MB/s)

It took nearly 20 seconds (on a 2.10 GHz CPU), which is why the NMI lockup
was triggered.  Assuming the timeout of the NMI watchdog is set to 1
second, a safe interval would in theory be 6590003/20 ≈ 330k pages.
However, some platforms may be running at a lower frequency, so feed the
watchdog every 100k pages.
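
As a back-of-the-envelope check of the interval arithmetic above, here is
a stand-alone user-space sketch (not part of the patch); it simply reuses
the figures from the log and the assumed 1-second watchdog timeout, plus
the 128k interval ultimately chosen (see the notes below):

  #include <stdio.h>

  int main(void)
  {
      const double pages      = 6590003.0;      /* pages counted in the log above */
      const double seconds    = 19.89;          /* elapsed time from the log above */
      const double wd_timeout = 1.0;            /* assumed NMI watchdog timeout */
      const double interval   = 128.0 * 1024.0; /* WD_PAGE_COUNT in the patch */

      printf("pages per second:          %.0f\n", pages / seconds);
      printf("pages per watchdog period: %.0f\n", pages / seconds * wd_timeout);
      printf("seconds between touches:   %.2f\n", interval / (pages / seconds));
      return 0;
  }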

[yu.c.chen@...el.com: simplification]
  Link: http://lkml.kernel.org/r/1503460079-29721-1-git-send-email-yu.c.chen@intel.com
[yu.c.chen@...el.com: use interval of 128k instead of 100k to avoid modulus]
  Link: http://lkml.kernel.org/r/1503328098-5120-1-git-send-email-yu.c.chen@intel.com
Signed-off-by: Chen Yu <yu.c.chen@...el.com>
Reported-by: Jan Filipcewicz <jan.filipcewicz@...el.com>
Suggested-by: Michal Hocko <mhocko@...e.com>
Reviewed-by: Michal Hocko <mhocko@...e.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
Cc: Mel Gorman <mgorman@...hsingularity.net>
Cc: Vlastimil Babka <vbabka@...e.cz>
Cc: Len Brown <lenb@...nel.org>
Cc: Dan Williams <dan.j.williams@...el.com>
Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@...ux-foundation.org>
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings <ben@...adent.org.uk>
---
 mm/page_alloc.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -57,6 +57,7 @@
 #include <linux/ftrace_event.h>
 #include <linux/memcontrol.h>
 #include <linux/prefetch.h>
+#include <linux/nmi.h>
 
 #include <asm/tlbflush.h>
 #include <asm/div64.h>
@@ -1124,9 +1125,14 @@ void drain_all_pages(void)
 
 #ifdef CONFIG_HIBERNATION
 
+/*
+ * Touch the watchdog for every WD_PAGE_COUNT pages.
+ */
+#define WD_PAGE_COUNT	(128*1024)
+
 void mark_free_pages(struct zone *zone)
 {
-	unsigned long pfn, max_zone_pfn;
+	unsigned long pfn, max_zone_pfn, page_count = WD_PAGE_COUNT;
 	unsigned long flags;
 	int order, t;
 	struct list_head *curr;
@@ -1141,6 +1147,11 @@ void mark_free_pages(struct zone *zone)
 		if (pfn_valid(pfn)) {
 			struct page *page = pfn_to_page(pfn);
 
+			if (!--page_count) {
+				touch_nmi_watchdog();
+				page_count = WD_PAGE_COUNT;
+			}
+
 			if (!swsusp_page_is_forbidden(page))
 				swsusp_unset_page_free(page);
 		}
@@ -1150,8 +1161,13 @@ void mark_free_pages(struct zone *zone)
 			unsigned long i;
 
 			pfn = page_to_pfn(list_entry(curr, struct page, lru));
-			for (i = 0; i < (1UL << order); i++)
+			for (i = 0; i < (1UL << order); i++) {
+				if (!--page_count) {
+					touch_nmi_watchdog();
+					page_count = WD_PAGE_COUNT;
+				}
 				swsusp_set_page_free(pfn_to_page(pfn + i));
+			}
 		}
 	}
 	spin_unlock_irqrestore(&zone->lock, flags);
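
For reference, a minimal stand-alone sketch (ordinary user-space C, not
kernel code) of the decrement-and-reload pattern the hunks above add:
touch_nmi_watchdog() is stubbed out, and walk_pages() is a hypothetical
stand-in for the loops in mark_free_pages().

  #include <stdio.h>

  #define WD_PAGE_COUNT (128 * 1024)   /* same interval as the patch */

  /* Stub standing in for the kernel's touch_nmi_watchdog(), which resets
   * the hard-lockup detector for the current CPU. */
  static void touch_nmi_watchdog(void)
  {
  }

  /* Walk nr_pages "pages", feeding the watchdog every WD_PAGE_COUNT
   * iterations.  The decrementing counter keeps the per-page cost to a
   * single test and avoids any modulus in the loop. */
  static void walk_pages(unsigned long nr_pages)
  {
      unsigned long page_count = WD_PAGE_COUNT;
      unsigned long pfn;

      for (pfn = 0; pfn < nr_pages; pfn++) {
          if (!--page_count) {
              touch_nmi_watchdog();
              page_count = WD_PAGE_COUNT;
          }
          /* per-page work (swsusp_set_page_free() etc. in the real code) */
      }
  }

  int main(void)
  {
      walk_pages(6590003UL);           /* page count from the log above */
      printf("walked all pages\n");
      return 0;
  }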
