linux-kernel - Re: [PATCH v2 2/4] driver core: enable drivers to use deferred probe from init


Message-Id: <201407300222.s6U2MElj035655@www262.sakura.ne.jp>
Date:	Wed, 30 Jul 2014 11:22:14 +0900
From:	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To:	mcgrof@...e.com
Cc:	hare@...e.de, gregkh@...uxfoundation.org, santosh@...lsio.com,
	hariprasad@...lsio.com, tiwai@...e.de,
	linux-kernel@...r.kernel.org, joseph.salisbury@...onical.com,
	kay@...y.org, gnomes@...rguk.ukuu.org.uk,
	tim.gardner@...onical.com, pierre-fersing@...rref.org,
	akpm@...ux-foundation.org, oleg@...hat.com,
	nagalakshmi.nandigama@...gotech.com,
	praveen.krishnamoorthy@...gotech.com,
	sreekanth.reddy@...gotech.com, abhijit.mahajan@...gotech.com,
	MPT-FusionLinux.pdl@...gotech.com, linux-scsi@...r.kernel.org,
	netdev@...r.kernel.org, bpoirier@...e.de
Subject: Re: [PATCH v2 2/4] driver core: enable drivers to use deferred probe from init

Luis R. Rodriguez wrote:
> Tetsuo is it possible / desirable to allow tasks to not kill unless the
> reason is OOM ? Its unclear if this was discussed before, sorry if it was,
> have just been a bit busy today to review the archive / discussions on this.

Are we aware that the 10-second timeout after SIGKILL does not cover the time
between the beginning of module loading and the end of kthread_create(), but
only the time spent waiting for kthreadd to create a new kernel thread?

If kthreadd is unable to create a new kernel thread within 10 seconds,
something very bad is happening. For example, a memory allocation deadlock
like the sequence shown below might be happening.

 (1) process1 takes a mutex using mutex_lock().
 (2) process1 calls kthread_create() and enters a killable wait state
     at wait_for_completion_killable().
 (3) kthreadd calls kernel_thread() and enters an oom-killable busy loop
     due to out of memory at alloc_pages_nodemask().
 (4) process2 is chosen by the OOM killer, but process2 is unable to
     terminate because process2 is waiting in an unkillable state at
     mutex_lock() for the mutex that process1 has held since (1).
 (5) kthreadd continues its busy loop because process2 does not release
     memory and the OOM killer does not kill more processes.
 (6) process1 continues waiting in its oom-killable state because process1
     is not chosen by the OOM killer.

See? The system will remain unresponsive unless somebody releases enough
memory for kthreadd to complete. We cannot teach process1 that
process1 needs to give up waiting for kthreadd and call mutex_unlock()
in order to allow process2 to terminate. Also, we cannot teach the OOM
killer that process1 needs to be oom-killed after process2 is oom-killed.
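
The circular wait in steps (1) to (6) can be made explicit with a small
userspace model. This is only an illustrative sketch, not kernel code: the
`WAITS_FOR` table and the `find_wait_cycle()` helper are hypothetical names
introduced here, and "free memory" is treated as a node purely for
illustration.

```python
def find_wait_cycle(waits_for, start):
    """Follow 'waits for' edges from start; return the cycle as a list, or None."""
    path = []
    seen = {}  # node -> index in path where it was first visited
    node = start
    while node in waits_for:
        if node in seen:
            # We came back to a node already on the path: that tail is the cycle.
            return path[seen[node]:]
        seen[node] = len(path)
        path.append(node)
        node = waits_for[node]
    return None  # reached a node that waits for nothing, so no cycle


# Hypothetical encoding of the sequence above (an assumption of this sketch).
WAITS_FOR = {
    "process1": "kthreadd",     # (2) waits for kthreadd to create the thread
    "kthreadd": "free memory",  # (3) loops until an allocation succeeds
    "free memory": "process2",  # (5) memory is released only when the OOM victim exits
    "process2": "process1",     # (4) blocked unkillably on the mutex held by process1
}

if __name__ == "__main__":
    cycle = find_wait_cycle(WAITS_FOR, "process1")
    print(" -> ".join(cycle + cycle[:1]))
```

In this model, removing the "process1" entry (process1 gives up and calls
mutex_unlock()) is the only change that breaks the cycle without adding
memory from outside.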

Making the 10-second timeout after SIGKILL longer is safe.
Changing it to no-timeout-unless-oom-killed is unsafe.
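
One way to read this conclusion is as a comparison of two waiting policies
for process1 in the sequence above. The following toy simulation is a sketch
under stated assumptions, not the kernel implementation: time advances in
abstract ticks, `simulate()` and `Outcome` are hypothetical names, and it is
assumed that something other than the OOM killer sends SIGKILL to process1 at
tick `sigkill_at`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Outcome:
    recovered: bool  # did kthreadd eventually get its memory?
    ticks: int       # tick at which recovery was observed (or max_ticks)


def simulate(sigkill_at: Optional[int],
             timeout_after_sigkill: Optional[int],
             max_ticks: int = 100) -> Outcome:
    """Toy model of the deadlock sequence (all parameters are assumptions).

    sigkill_at: tick at which a non-OOM SIGKILL reaches process1 (None = never).
    timeout_after_sigkill: ticks process1 keeps waiting after SIGKILL before
        giving up; None models "no timeout unless oom-killed".
    """
    mutex_held_by_process1 = True   # (1)
    process1_waiting = True         # (2) killable wait for kthreadd
    sigkilled_at = None
    process2_alive = True           # (4) OOM victim stuck on the mutex
    memory_free = False             # (3) kthreadd loops until this is True

    for tick in range(max_ticks):
        if memory_free:
            return Outcome(True, tick)  # kthreadd can finally complete
        if sigkill_at is not None and tick == sigkill_at:
            sigkilled_at = tick
        # Policy: give up only after SIGKILL plus a finite timeout.
        if (process1_waiting and sigkilled_at is not None
                and timeout_after_sigkill is not None
                and tick - sigkilled_at >= timeout_after_sigkill):
            process1_waiting = False
            mutex_held_by_process1 = False  # process1 calls mutex_unlock()
        # Once the mutex is free, the OOM victim can exit and release memory.
        if process2_alive and not mutex_held_by_process1:
            process2_alive = False
            memory_free = True
    return Outcome(False, max_ticks)


if __name__ == "__main__":
    print(simulate(sigkill_at=5, timeout_after_sigkill=10))    # finite timeout
    print(simulate(sigkill_at=5, timeout_after_sigkill=None))  # no timeout unless oom-killed
```

In this model a longer finite timeout only delays recovery, while the
no-timeout policy never recovers, because process1 is not the task chosen by
the OOM killer. Without any SIGKILL at all, neither policy recovers, which
matches step (6).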