linux-kernel - Re: [PATCH RFC v4 00/13] virtio-mem: paravirtualized memory

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <178a5e94-f1f1-130c-9a28-2b8dd2be2abe@redhat.com>
Date:   Mon, 16 Dec 2019 12:03:21 +0100
From:   David Hildenbrand <david@...hat.com>
To:     Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
Cc:     linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        virtio-dev@...ts.oasis-open.org,
        virtualization@...ts.linux-foundation.org, kvm@...r.kernel.org,
        Michal Hocko <mhocko@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        "Michael S . Tsirkin" <mst@...hat.com>,
        Sebastien Boeuf <sebastien.boeuf@...el.com>,
        Samuel Ortiz <samuel.ortiz@...el.com>,
        Robert Bradford <robert.bradford@...el.com>,
        Luiz Capitulino <lcapitulino@...hat.com>,
        Alexander Duyck <alexander.h.duyck@...ux.intel.com>,
        Alexander Potapenko <glider@...gle.com>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Anshuman Khandual <anshuman.khandual@....com>,
        Anthony Yznaga <anthony.yznaga@...cle.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Dave Young <dyoung@...hat.com>,
        Igor Mammedov <imammedo@...hat.com>,
        Jason Wang <jasowang@...hat.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Juergen Gross <jgross@...e.com>, Len Brown <lenb@...nel.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Michal Hocko <mhocko@...e.com>,
        Mike Rapoport <rppt@...ux.ibm.com>,
        Oscar Salvador <osalvador@...e.com>,
        Oscar Salvador <osalvador@...e.de>,
        Pavel Tatashin <pasha.tatashin@...een.com>,
        Pavel Tatashin <pavel.tatashin@...rosoft.com>,
        Pingfan Liu <kernelfans@...il.com>, Qian Cai <cai@....pw>,
        "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Stefan Hajnoczi <stefanha@...hat.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Wei Yang <richard.weiyang@...il.com>
Subject: Re: [PATCH RFC v4 00/13] virtio-mem: paravirtualized memory

On 13.12.19 21:15, Konrad Rzeszutek Wilk wrote:
> On Thu, Dec 12, 2019 at 06:11:24PM +0100, David Hildenbrand wrote:
>> This series is based on latest linux-next. The patches are located at:
>>     https://github.com/davidhildenbrand/linux.git virtio-mem-rfc-v4
> Heya!

Hi Konrad!

> 
> Would there be by any chance a virtio-spec git tree somewhere?

I haven't started working on a spec yet - it's on my todo list but has
low priority (one-man-team). I'll focus on the QEMU pieces next, once
the kernel part is in an acceptable state.

The uapi file contains quite some documentation - if somebody wants to
start hacking on an alternative hypervisor implementation, I'm happy to
answer questions until I have a spec ready.

> 
> ..snip..
>> --------------------------------------------------------------------------
>> 5. Future work
>> --------------------------------------------------------------------------
>>
>> The separate patches contain a lot of future work items. One of the next
>> steps is to make memory unplug more likely to succeed - currently, there
>> are no guarantees on how much memory can get unplugged again. I have
> 
> 
> Or perhaps tell the caller why we can't and let them sort it out?
> For example: "Application XYZ is mlocked. Can't offload'.

Yes, it might in general be interesting for the guest to indicate
persistent errors, both when hotplugging and hotunplugging memory.
Indicating why unplugging is not able to succeed in that detail is,
however, non-trivial.

The hypervisor sets the requested size can can watch over the actual
size of a virtio-mem device. Right now, after it updated the requested
size, it can wait some time (e.g., 1-5 minutes). If the requested size
was not reached after that time, it knows there is a persistent issue
limiting plug/unplug. In the future, this could be extended by a rough
or detailed root cause indication. In the worst case, the guest crashed
and is no longer able to respond (not even with an error indication).

One interesting piece of the current hypervisor (QEMU) design is that
the maximum memory size a VM can consume is always known and QEMU will
send QMP events to upper layers whenever that size changes. This means
that you can e.g., reliably charge a customer how much memory a VM is
actually able to consume over time (independent of hotplug/unplug
errors). But yeah, the QEMU bits are still in a very early stage.

-- 
Thanks,

David / dhildenb