lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5B1FA8EF.4030409@intel.com>
Date:   Tue, 12 Jun 2018 19:05:19 +0800
From:   Wei Wang <wei.w.wang@...el.com>
To:     Linus Torvalds <torvalds@...ux-foundation.org>,
        "Michael S. Tsirkin" <mst@...hat.com>
CC:     KVM list <kvm@...r.kernel.org>,
        virtualization <virtualization@...ts.linux-foundation.org>,
        Network Development <netdev@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Bjorn Andersson <bjorn.andersson@...aro.org>
Subject: Re: [PULL] vhost: cleanups and fixes

On 06/12/2018 09:59 AM, Linus Torvalds wrote:
> On Mon, Jun 11, 2018 at 6:36 PM Michael S. Tsirkin <mst@...hat.com> wrote:
>> Maybe it will help to have GFP_NONE which will make any allocation
>> fail if attempted. Linus, would this address your comment?
> It would definitely have helped me initially overlook that call chain.
>
> But then when I started looking at the whole dma_map_page() thing, it
> just raised my hackles again.
>
> I would seriously suggest having a much simpler version for the "no
> allocation, no dma mapping" case, so that it's *obvious* that that
> never happens.
>
> So instead of having virtio_balloon_send_free_pages() call a really
> generic complex chain of functions that in _some_ cases can do memory
> allocation, why isn't there a short-circuited "vitruque_add_datum()"
> that is guaranteed to never do anything like that?
>
> Honestly, I look at "add_one_sg()" and it really doesn't make me
> happy. It looks hacky as hell. If I read the code right, you're really
> trying to just queue up a simple tuple of <pfn,len>, except you encode
> it as a page pointer in order to play games with the SG logic, and
> then you hmap that to the ring, except in this case it's all a fake
> ring that just adds the cpu-physical address instead.
>
> And to figuer that out, it's like five layers of indirection through
> different helper functions that *can* do more generic things but in
> this case don't.
>
> And you do all of this from a core VM callback function with some
> _really_ core VM locks held.
>
> That makes no sense to me.
>
> How about this:
>
>   - get rid of all that code
>
>   - make the core VM callback save the "these are the free memory
> regions" in a fixed and limited array. One that DOES JUST THAT. No
> crazy "SG IO dma-mapping function crap". Just a plain array of a fixed
> size, pre-allocated for that virtio instance.
>
>   - make it obvious that what you do in that sequence is ten
> instructions and no allocations ("Look ma, I wrote a value to an array
> and incremented the array idex, and I'M DONE")
>
>   - then in that workqueue entry that you start *anyway*, you empty the
> array and do all the crazy virtio stuff.
>
> In fact, while at it, just simplify the VM interface too. Instead of
> traversing a random number of buddy lists, just trraverse *one* - the
> top-level one. Are you seriously ever going to shrink or mark
> read-only anythin *but* something big enough to be in the maximum
> order?
>
> MAX_ORDER is what, 11? So we're talking 8MB blocks. Do you *really*
> want the balloon code to work on smaller things, particularly since
> the whole interface is fundamentally racy and opportunistic to begin
> with?

OK, I will implement a new version based on the suggestions. Thanks.

Best,
Wei

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ