linux-kernel - Re: [PATCH 9/12] ksm: fix oom deadlock

Message-ID: <20090825174730.GR14722@random.random>
Date:	Tue, 25 Aug 2009 19:47:30 +0200
From:	Andrea Arcangeli <aarcange@...hat.com>
To:	Hugh Dickins <hugh.dickins@...cali.co.uk>
Cc:	Izik Eidus <ieidus@...hat.com>, Rik van Riel <riel@...hat.com>,
	Chris Wright <chrisw@...hat.com>,
	Nick Piggin <nickpiggin@...oo.com.au>,
	Andrew Morton <akpm@...ux-foundation.org>,
	"Justin M. Forbes" <jmforbes@...uxtx.org>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH 9/12] ksm: fix oom deadlock

On Tue, Aug 25, 2009 at 06:35:56PM +0100, Hugh Dickins wrote:
> "make munlock fast when mlock is canceled by sigkill".  It's just
> idiotic that munlock (in this case, munlocking pages on exit) should
> be trying to fault in pages, and that causes its own problems when

I also considered addressing this by fixing the automatic munlock,
but just as it's asking for trouble to cause page faults with
mm_users == 0 in munlock, it's also asking for trouble to cause page
faults with mm_users == 0 in ksm. So if munlock is wrong, ksm was
wrong too, and I tried to fix ksm not to do that, while leaving the
munlock fix for later/others.. ;)

> I have now made a patch with munlock_vma_pages_range() doing a
> follow_page() loop instead of faulting in; but I've not yet tested

That is a separate problem in my view.

> I'd prefer not to have them too, but haven't yet worked out how to
> get along safely without them.

ok.

> But the mmap_sem is not enough to exclude the mm exiting
> (until __ksm_exit does its little down_write,up_write dance):
> break_cow etc. do the ksm_test_exit check on mm_users before
> proceeding any further, but that's just not enough to prevent
> break_ksm's handle_pte_fault racing with exit_mmap - hence the
> ksm_test_exits in mm/memory.c, to stop ptes being instantiated
> after the final zap thinks it's wiped the pagetables.
> 
> Let's look at your actual patch...

I tried to work out how to get along safely without them. In short, my
patch makes the mmap_sem + ksm_test_exit check on mm_users before
proceeding any further "enough" (while still allowing the ksm loop to
bail out if mm_users suddenly reaches zero because of the oom killer).

Furthermore, the mmap_sem is already guaranteed to be L1-hot and
exclusive because we wrote to it a few nanoseconds before calling mmput
(to be fair, locked ops are not cheap, but I'd rather add two locked ops
to the last exit syscall of a thread group than a new branch to every
single page fault, as there are tons more page faults than exit syscalls).
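
To make the ordering argument concrete, here is a toy, single-threaded
model of the scheme as described above. It is only a sketch under
assumptions, not the patch itself: struct model_mm and the model_*
functions are hypothetical names, the rwsem is reduced to a reader
counter, and the exit side is assumed to do its down_write/up_write
pair before the final zap.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names are real kernel APIs. */
struct model_mm {
    int mm_users;   /* stands in for the atomic mm->mm_users */
    int readers;    /* holders of the modelled mmap_sem in read mode */
    int ptes;       /* ptes currently instantiated */
};

/* Mirrors the idea of ksm_test_exit(): has the last user gone away? */
static bool model_ksm_test_exit(const struct model_mm *mm)
{
    return mm->mm_users == 0;
}

/* KSM side: take mmap_sem for read, check mm_users, only then fault. */
static bool model_break_cow(struct model_mm *mm)
{
    bool faulted = false;

    mm->readers++;                  /* down_read(&mm->mmap_sem) */
    if (!model_ksm_test_exit(mm)) {
        mm->ptes++;                 /* the fault instantiates a pte */
        faulted = true;
    }
    mm->readers--;                  /* up_read(&mm->mmap_sem) */
    return faulted;
}

/* Exit side: drop the user count; the last user drains readers, then zaps. */
static void model_mmput(struct model_mm *mm)
{
    if (--mm->mm_users > 0)
        return;
    /* down_write(); up_write(): a real rwsem would wait here for readers. */
    assert(mm->readers == 0);
    mm->ptes = 0;                   /* final zap of the pagetables */
}
```

In this model the page fault path needs no extra check: any reader that
passed the mm_users test has dropped mmap_sem before the zap runs, and
any reader arriving later sees mm_users == 0 and bails out.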