Message-ID: <CANN689FfWVV4MyTUPKZQgQAWW9Dfdw9f0fqx98kc+USKj9g7TA@mail.gmail.com>
Date:	Mon, 3 Dec 2012 16:35:01 -0800
From:	Michel Lespinasse <walken@...gle.com>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	linux-mm@...ck.org, Rik van Riel <riel@...hat.com>,
	Hugh Dickins <hughd@...gle.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: protect against concurrent vma expansion

On Mon, Dec 3, 2012 at 3:01 PM, Andrew Morton <akpm@...ux-foundation.org> wrote:
> On Fri, 30 Nov 2012 22:56:27 -0800
> Michel Lespinasse <walken@...gle.com> wrote:
>
>> expand_stack() runs with a shared mmap_sem lock. Because of this, there
>> could be multiple concurrent stack expansions in the same mm, which may
>> cause problems in the vma gap update code.
>>
>> I propose to solve this by taking the mm->page_table_lock around such vma
>> expansions, in order to avoid the concurrency issue. We only have to worry
>> about concurrent expand_stack() calls here, since we hold a shared mmap_sem
>> lock and all vma modifications other than expand_stack() are done under
>> an exclusive mmap_sem lock.
>>
>> I previously tried to achieve the same effect by making sure all
>> growable vmas in a given mm would share the same anon_vma, which we
>> already lock here. However this turned out to be difficult - all of the
>> schemes I tried for refcounting the growable anon_vma and clearing
>> turned out ugly. So, I'm now proposing only the minimal fix.
>
> I think I don't understand the problem fully.  Let me demonstrate:
>
> a) vma_lock_anon_vma() doesn't take a lock which is specific to
>    "this" anon_vma.  It takes anon_vma->root->mutex.  That mutex is
>    shared with vma->vm_next, yes?  If so, we have no problem here?
>    (which makes me suspect that the race lies somewhere other than
>    where I think it lies).

So, the first thing I need to mention is that this fix is NOT for any
problem that has been reported (and in particular, not for Sasha's
trinity fuzzing issue). It's just me looking at the code and noticing
I haven't gotten locking right for the case of concurrent stack
expansion.

Regarding vma and vma->vm_next sharing the same root anon_vma mutex -
this will often be the case, but not always. find_mergeable_anon_vma()
will try to make it so, but it could fail if there was another vma
in-between at the time the stack's anon_vmas got assigned (either a
non-stack vma that later gets unmapped, or another stack vma that
didn't get its own anon_vma assigned yet).

> b) I can see why a broader lock is needed in expand_upwards(): it
>    plays with a different vma: vma->vm_next.  But expand_downwards()
>    doesn't do that - it only alters "this" vma.  So I'd have thought
>    that vma_lock_anon_vma("this" vma) would be sufficient.

The issue there is that vma_gap_update() accesses vma->vm_prev, so the
problem is actually symmetrical with expand_upwards().
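
For illustration only, here is a minimal userspace sketch of that
symmetry. It is not the kernel code: every toy_* name is hypothetical,
a flat "gap" field stands in for the gap bookkeeping, and a C11
atomic_flag spinlock stands in for mm->page_table_lock. It only shows
that growing a vma in either direction reads or writes a neighbouring
vma, which is why one lock per mm is taken around both paths.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stddef.h>

/* Simplified stand-ins, NOT the kernel's struct definitions. */
struct toy_vma {
	unsigned long vm_start, vm_end;
	struct toy_vma *vm_prev, *vm_next;
	unsigned long gap;	/* free space between vm_prev->vm_end and vm_start */
};

struct toy_mm {
	atomic_flag page_table_lock;	/* stand-in for mm->page_table_lock */
};

static void toy_lock(struct toy_mm *mm)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&mm->page_table_lock,
						 memory_order_acquire))
		;
}

static void toy_unlock(struct toy_mm *mm)
{
	atomic_flag_clear_explicit(&mm->page_table_lock, memory_order_release);
}

/* Recompute vma's gap; note that this reads vma->vm_prev. */
static void toy_gap_update(struct toy_vma *vma)
{
	unsigned long prev_end = vma->vm_prev ? vma->vm_prev->vm_end : 0;

	assert(vma->vm_start >= prev_end);
	vma->gap = vma->vm_start - prev_end;
}

/* Grow vma down to 'address'. Returns 0 on success, -1 on failure. */
int toy_expand_downwards(struct toy_mm *mm, struct toy_vma *vma,
			 unsigned long address)
{
	int ret = -1;

	toy_lock(mm);
	unsigned long prev_end = vma->vm_prev ? vma->vm_prev->vm_end : 0;
	if (address < vma->vm_start && address >= prev_end) {
		vma->vm_start = address;
		toy_gap_update(vma);		/* depends on vm_prev */
		ret = 0;
	}
	toy_unlock(mm);
	return ret;
}

/* Grow vma up so it ends at 'address'. Returns 0 on success, -1 on failure. */
int toy_expand_upwards(struct toy_mm *mm, struct toy_vma *vma,
		       unsigned long address)
{
	int ret = -1;

	toy_lock(mm);
	unsigned long next_start = vma->vm_next ? vma->vm_next->vm_start
						: ULONG_MAX;
	if (address > vma->vm_end && address <= next_start) {
		vma->vm_end = address;
		if (vma->vm_next)
			toy_gap_update(vma->vm_next);	/* writes vm_next */
		ret = 0;
	}
	toy_unlock(mm);
	return ret;
}
```

Without the shared lock, an upward expansion of one vma and a downward
expansion of its vm_next could both recompute the same gap from
inconsistent vm_start/vm_end values.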

> What are the performance costs of this change?

It's expected to be small. glibc doesn't use expandable stacks for the
threads it creates, so having multiple growable stacks is actually
uncommon (another reason why the problem hasn't been observed in
practice). Because of this, I don't expect the page table lock to get
bounced between threads, so the cost of taking it should be small
(compared to the cost of delivering the #PF, let alone handling it in
software).

But yes, the initial idea of forcing all growable vmas in an mm to
share the same root anon_vma sounded much more appealing at first.
Unfortunately I haven't been able to make that work in a simple enough
way to be comfortable submitting it this late in the release cycle :/

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.