linux-kernel - Re: [PATCH 2/2] kvm: arm/arm64: Fix use after free of stage2 page table

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <c9c972f3-817a-fc1b-e4f3-a039778732f4@arm.com>
Date:   Mon, 15 May 2017 14:38:08 +0100
From:   Suzuki K Poulose <Suzuki.Poulose@....com>
To:     Christoffer Dall <cdall@...aro.org>
Cc:     christoffer.dall@...aro.org, agraf@...e.de, andreyknvl@...gle.com,
        marc.zyngier@....com, mark.rutland@....com, pbonzini@...hat.com,
        rkrcmar@...hat.com, linux-kernel@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.cs.columbia.edu,
        kvm@...r.kernel.org
Subject: Re: [PATCH 2/2] kvm: arm/arm64: Fix use after free of stage2 page
 table

On 15/05/17 14:36, Suzuki K Poulose wrote:
> On 15/05/17 11:00, Christoffer Dall wrote:
>> Hi Suzuki,
>>
>> On Wed, May 03, 2017 at 03:17:52PM +0100, Suzuki K Poulose wrote:
>>> We yield the kvm->mmu_lock occassionaly while performing an operation
>>> (e.g, unmap or permission changes) on a large area of stage2 mappings.
>>> However this could possibly cause another thread to clear and free up
>>> the stage2 page tables while we were waiting for regaining the lock and
>>> thus the original thread could end up in accessing memory that was
>>> freed. This patch fixes the problem by making sure that the stage2
>>> pagetable is still valid after we regain the lock. The fact that
>>> mmu_notifer->release() could be called twice (via __mmu_notifier_release
>>> and mmu_notifier_unregsister) enhances the possibility of hitting
>>> this race where there are two threads trying to unmap the entire guest
>>> shadow pages.
>>>
>>> While at it, cleanup the redudant checks around cond_resched_lock in
>>> stage2_wp_range(), as cond_resched_lock already does the same checks.
>>>
>>> Cc: Mark Rutland <mark.rutland@....com>
>>> Cc: Radim Krčmář <rkrcmar@...hat.com>
>>> Cc: andreyknvl@...gle.com
>>> Cc: Christoffer Dall <christoffer.dall@...aro.org>
>>> Cc: Marc Zyngier <marc.zyngier@....com>
>>> Cc: Paolo Bonzini <pbonzini@...hat.com>
>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@....com>
>>> ---
>>>  arch/arm/kvm/mmu.c | 17 ++++++++++++-----
>>>  1 file changed, 12 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>>> index 909a1a7..5b3e0db 100644
>>> --- a/arch/arm/kvm/mmu.c
>>> +++ b/arch/arm/kvm/mmu.c
>>> @@ -301,9 +301,14 @@ static void unmap_stage2_range(struct kvm *kvm, phys_addr_t start, u64 size)
>>>          /*
>>>           * If the range is too large, release the kvm->mmu_lock
>>>           * to prevent starvation and lockup detector warnings.
>>> +         * Make sure the page table is still active when we regain
>>> +         * the lock.
>>>           */
>>> -        if (next != end)
>>> +        if (next != end) {
>>>              cond_resched_lock(&kvm->mmu_lock);
>>> +            if (!READ_ONCE(kvm->arch.pgd))
>>> +                break;
>>> +        }
>>
>> So I don't think this change is wrong, but I wonder if it's sufficient.
>> For example, I can see that this function is called from
>>
>> stage2_unmsp_vm
>>  -> stage2_unmap_memslot
>>    -> unmap_stage2_range
>>
>> and
>>
>> kvm_arch_flush_shadow_memslot
>>  -> unmap_stage2_range
>>
>> which never check if the pgd pointer is valid,
>
> You are right. Those two callers do not check it. We could fix all of this by simply
> moving the check to the beginning of the loop.
> i.e, something like this :
>
> @@ -295,6 +295,12 @@ static void unmap_stage2_range(struct kvm *kvm, phys_addr_t start, u64 size)
>     assert_spin_locked(&kvm->mmu_lock);
>     pgd = kvm->arch.pgd + stage2_pgd_index(addr);
>     do {
> +        /*
> +         * Make sure the page table is still active, as we could
> +         * another thread could have possibly freed the page table.
> +         */

The comment portion has been fixed locally.