Message-ID: <0dd97243-db9b-4d22-970e-489d0f491851@redhat.com>
Date:   Fri, 24 Mar 2017 09:55:15 +0100
From:   David Hildenbrand <david@...hat.com>
To:     Dmitry Vyukov <dvyukov@...gle.com>,
        Marcelo Tosatti <mtosatti@...hat.com>
Cc:     KVM list <kvm@...r.kernel.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Radim Krčmář <rkrcmar@...hat.com>,
        stable <stable@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Cornelia Huck <cornelia.huck@...ibm.com>
Subject: Re: [PATCH v2] KVM: kvm_io_bus_unregister_dev() should never fail


>>> -             return r;
>>> +     if (i == bus->dev_count)
>>> +             return;
>>>
>>>       new_bus = kmalloc(sizeof(*bus) + ((bus->dev_count - 1) *
>>>                         sizeof(struct kvm_io_range)), GFP_KERNEL);
>>> -     if (!new_bus)
>>> -             return -ENOMEM;
>>> +     if (!new_bus)  {
>>> +             pr_err("kvm: failed to shrink bus, removing it completely\n");
>>> +             goto broken;
>>
>> The guest will fail in mysterious ways, if you do this (and
>> io_bus_unregister_dev can be called during runtime): in-kernel device
>> accesses will fail with unknown behaviour in the guest.

Actually, the next access to the bus should result in -ENOMEM, and the
error message should be enough to figure out what went wrong. However,
hitting this scenario at all seems very unlikely, so I would like to
avoid advanced allocation schemes.
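
To illustrate the behaviour I mean, here is a rough userspace sketch, not
the actual kernel code. All names in it (struct vm, bus_alloc(),
bus_unregister_dev(), bus_access(), fail_next_alloc) are made up for the
example, and malloc() stands in for kmalloc(). It assumes that a bus
pointer of NULL means "bus removed" and that accesses check for that:

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace model of an I/O bus; not the kernel structures. */
struct io_range { int dev_id; };

struct io_bus {
	int dev_count;
	struct io_range range[];
};

struct vm { struct io_bus *bus; };

/* Test hook (assumption): lets a caller simulate one allocation failure. */
static int fail_next_alloc;

static void *bus_alloc(size_t size)
{
	if (fail_next_alloc) {
		fail_next_alloc = 0;
		return NULL;
	}
	return malloc(size);
}

/* Never fails: if the bus cannot be shrunk, it is removed completely. */
static void bus_unregister_dev(struct vm *vm, int dev_id)
{
	struct io_bus *bus = vm->bus, *new_bus;
	int i;

	if (!bus)
		return;

	for (i = 0; i < bus->dev_count; i++)
		if (bus->range[i].dev_id == dev_id)
			break;

	if (i == bus->dev_count)
		return;

	new_bus = bus_alloc(sizeof(*bus) +
			    (bus->dev_count - 1) * sizeof(struct io_range));
	if (!new_bus) {
		fprintf(stderr, "failed to shrink bus, removing it completely\n");
		goto broken;
	}

	/* Copy every range except the one at index i. */
	new_bus->dev_count = bus->dev_count - 1;
	memcpy(new_bus->range, bus->range, i * sizeof(struct io_range));
	memcpy(new_bus->range + i, bus->range + i + 1,
	       (new_bus->dev_count - i) * sizeof(struct io_range));

broken:
	vm->bus = new_bus;	/* NULL when the allocation failed */
	free(bus);
}

/* Any access to a removed bus reports -ENOMEM. */
static int bus_access(struct vm *vm, int dev_id)
{
	struct io_bus *bus = vm->bus;
	int i;

	if (!bus)
		return -ENOMEM;

	for (i = 0; i < bus->dev_count; i++)
		if (bus->range[i].dev_id == dev_id)
			return 0;

	return -EOPNOTSUPP;
}
```

With this model, a failed shrink leaves vm->bus == NULL, every later
bus_access() returns -ENOMEM, and a second bus_unregister_dev() on the
removed bus is a no-op.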

>>
>> Can't you retry a handful of times with GFP_KERNEL before switching to GFP_ATOMIC?
>> (If that fails too, the machine is likely to crash soon anyway.)
> 
> The process can run in a cgroup, then kmalloc failure has nothing to
> do with overall memory consumption. Machine can be perfectly fine.
> Also, this very process can be chosen as an OOM kill target, then it
> needs to gracefully deal with kmalloc failure and proceed to a
> termination point.
> Generally retrying something in a loop does not look like a solid plan
> to deal with errors.
> 

I agree, looping on memory allocations never feels like the right thing
to do.

-- 

Thanks,

David