linux-kernel - [PATCH -v3 0/5] OOM vs PM freezer fixes

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <1420801555-22659-1-git-send-email-mhocko@suse.cz>
Date:	Fri,  9 Jan 2015 12:05:50 +0100
From:	Michal Hocko <mhocko@...e.cz>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Tejun Heo <tj@...nel.org>,
	"\\\"Rafael J. Wysocki\\\"" <rjw@...ysocki.net>,
	David Rientjes <rientjes@...gle.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Oleg Nesterov <oleg@...hat.com>,
	Cong Wang <xiyou.wangcong@...il.com>, <linux-mm@...ck.org>,
	LKML <linux-kernel@...r.kernel.org>, linux-pm@...r.kernel.org
Subject: [PATCH -v3 0/5] OOM vs PM freezer fixes

Hi,
this is an updated version of the patchset previous posted here:
http://marc.info/?l=linux-mm&m=141779771518056&w=2. 
Changes since then are:
- cleanups, doc and function renames as per Tejun
- __thaw_task moved to mark_tsk_oom_victim and frozen() check removed
  as it would be racy and it is not necessary anyway - per Tejun
- obvious typo in wait_event condition
- oom_killer_enable moved to thaw_processes before user tasks are thawed
  rather than thaw_kernel_threads which is even not called from s2ram resume
  path - per Tejun
- oom_killer_disable moved to freeze_processes to be more in sync with
  the enable.

I have tested the series in KVM with 100M RAM:
- many small tasks (20M anon mmap) which are triggering OOM continually
- s2ram which resumes automatically is triggered in a loop
	echo processors > /sys/power/pm_test
	while true
	do
		echo mem > /sys/power/state
		sleep 1s
	done
- simple module which allocates and frees 20M in 8K chunks. If it sees
  freezing(current) then it tries another round of allocation before calling
  try_to_freeze
- debugging messages of PM stages and OOM killer enable/disable/fail added
  and unmark_oom_victim is delayed by 1s after it clears TIF_MEMDIE and before
  it wakes up waiters.
- rebased on top of the current mmotm which means some necessary updates
  in mm/oom_kill.c. mark_tsk_oom_victim is now called under task_lock but
  I think this should be OK because __thaw_task shouldn't interfere with any
  locking down wake_up_process. Oleg?

As expected there are no OOM killed tasks after oom is disabled and
allocations requested by the kernel thread are failing after all the
tasks are frozen and OOM disabled. I wasn't able to catch a race where
oom_killer_disable would really have to wait but I kinda expected the
race is really unlikely.

[  242.609330] Killed process 2992 (mem_eater) total-vm:24412kB, anon-rss:2164kB, file-rss:4kB
[  243.628071] Unmarking 2992 OOM victim. oom_victims: 1
[  243.636072] (elapsed 2.837 seconds) done.
[  243.641985] Trying to disable OOM killer
[  243.643032] Waiting for concurent OOM victims
[  243.644342] OOM killer disabled
[  243.645447] Freezing remaining freezable tasks ... (elapsed 0.005 seconds) done.
[  243.652983] Suspending console(s) (use no_console_suspend to debug)
[  243.903299] kmem_eater: page allocation failure: order:1, mode:0x204010
[...]
[  243.992600] PM: suspend of devices complete after 336.667 msecs
[  243.993264] PM: late suspend of devices complete after 0.660 msecs
[  243.994713] PM: noirq suspend of devices complete after 1.446 msecs
[  243.994717] ACPI: Preparing to enter system sleep state S3
[  243.994795] PM: Saving platform NVS memory
[  243.994796] Disabling non-boot CPUs ...

The first 2 patches are simple cleanups for OOM. They should go in
regardless the rest IMO.
Patches 3 and 4 are trivial printk -> pr_info conversion and they should
go in ditto.
The main patch is the last one and I would appreciate acks from Tejun
and Rafael. I think the OOM part should be OK (except for __thaw_task
vs. task_lock where a look from Oleg would appreciated) but I am not
so sure I haven't screwed anything in the freezer code. I have found
several surprises there.

The patchset is based on the current mmotm tree (mmotm-2015-01-07-17-07).
I think it make more sense if it is routed via Andrew due to dependences on
other OOM killer patches.

Shortlog says:
Michal Hocko (5):
      oom: add helpers for setting and clearing TIF_MEMDIE
      oom: thaw the OOM victim if it is frozen
      PM: convert printk to pr_* equivalent
      sysrq: convert printk to pr_* equivalent
      oom, PM: make OOM detection in the freezer path raceless

And diffstat:
 drivers/staging/android/lowmemorykiller.c |   7 +-
 drivers/tty/sysrq.c                       |  23 ++---
 include/linux/oom.h                       |  18 ++--
 kernel/exit.c                             |   3 +-
 kernel/power/process.c                    |  76 +++++----------
 mm/memcontrol.c                           |   4 +-
 mm/oom_kill.c                             | 149 ++++++++++++++++++++++++++----
 mm/page_alloc.c                           |  17 +---
 8 files changed, 185 insertions(+), 112 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/