Message-Id: <201309281618.JAG82364.VFMtSOOQOLFHJF@I-love.SAKURA.ne.jp>
Date:	Sat, 28 Sep 2013 16:18:47 +0900
From:	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To:	rientjes@...gle.com
Cc:	akpm@...ux-foundation.org, oleg@...hat.com, security@...nel.org,
	linux-kernel@...r.kernel.org
Subject: Re: kthread: Make kthread_create() killable.

David Rientjes wrote:
> There may not be any eligible processes left and then the machine panics.  

Some enterprise users might prefer "kernel panic followed by kdump and
automatic reboot" to "a system that is unresponsive for an unpredictable
period", because the panic helps collect information for analyzing which
process caused the freeze. Well, can they use the "Panic (Reboot) On Soft
Lockups" option?

> These time-based delays also have caused a complete depletion of memory 
> reserves if more than one process is chosen and each consumes a
> non-negligible amount of memory which would then cause livelock.  We used to 
> have a jiffies-based rekill in 2.6.18 internally and we finally could 
> remove it when mm->mmap_sem issues were fixed (mostly by checking for 
> fatal_signal_pending() and aborting when necessary).

So, you've already tried that.

Currently the OOM killer kills a process after

  blocking_notifier_call_chain(&oom_notify_list, 0, &freed);

in out_of_memory() has released all reclaimable memory. This call helps reduce
the chance of killing a process if the bad process no longer asks for more
memory. But if the bad process keeps asking for more memory and the chosen task
is in TASK_UNINTERRUPTIBLE state, this call contributes to the OOM killer
staying disabled for an unpredictable period. Therefore, releasing all
reclaimable memory before the OOM killer kills a process might be considered
bad.

Then, what about an approach described below?

(1) Introduce a kernel thread which reserves (e.g.) 1 percent of kernel memory
    (this amount should be configurable via sysctl) upon startup.

(2) The kernel thread sleeps using wait_event(memory_reservoir_wait) and
    releases PAGE_SIZE bytes from the reserved memory upon each wakeup.

(3) The OOM killer calls wake_up() like

     	if (test_tsk_thread_flag(task, TIF_MEMDIE)) {
     		if (unlikely(frozen(task)))
     			__thaw_task(task);
    +		/* Let the memory reservoir release memory if the chosen process cannot die. */
    +		if (time_after(jiffies, task->memdie_stamp) &&
    +		    task->state == TASK_UNINTERRUPTIBLE)
    +			wake_up(&memory_reservoir_wait);
     		if (!force_kill)
     			return OOM_SCAN_ABORT;
     	}

    in oom_scan_process_thread().

(4) When a task for which test_tsk_thread_flag(task, TIF_MEMDIE) is true has
    terminated and the memory it used has been reclaimed, the kernel thread
    reserves memory again, up to 1 percent of kernel memory.

In this way, we could shorten the period during which the OOM killer is
disabled, unless the reserved memory is not enough to let the chosen process
terminate.