Message-ID: <CA+2MQi9Qb5srEcx4qKNVWdphBGP0=HHV_h0hWghDMFKFmCOTMg@mail.gmail.com>
Date:   Tue, 5 Jan 2021 18:22:03 +0800
From:   Liang Li <liliang324@...il.com>
To:     David Hildenbrand <david@...hat.com>
Cc:     Alexander Duyck <alexander.h.duyck@...ux.intel.com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Andrea Arcangeli <aarcange@...hat.com>,
        Dan Williams <dan.j.williams@...el.com>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        Jason Wang <jasowang@...hat.com>,
        Dave Hansen <dave.hansen@...el.com>,
        Michal Hocko <mhocko@...e.com>,
        Liang Li <liliangleo@...iglobal.com>,
        linux-mm <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>,
        virtualization@...ts.linux-foundation.org
Subject: Re: [RFC v2 PATCH 0/4] speed up page allocation for __GFP_ZERO

> >> That's mostly already existing scheduling logic, no? (How many VMs can I put onto a specific machine eventually?)
> >
> > It depends on how the scheduling component is designed. Yes, you can put
> > 10 VMs with 4C8G (4 CPUs, 8G RAM) on a host and 20 VMs with 2C4G on
> > another one. But if one type of them, e.g. 4C8G, is sold out, customers
> > can't buy more 4C8G VMs even while there are some free 2C4G VMs, although
> > the resources reserved for those could in principle be provided as 4C8G VMs.
> >
>
> 1. You can, just the startup time will be a little slower? E.g., grow
> the pre-allocated 4G file to 8G.
>
> 2. Or let's be creative: teach QEMU to construct a single
> RAMBlock/MemoryRegion out of multiple tmpfs files. Works as long as you
> don't go crazy on different VM sizes / size differences.
>
> 3. In your example above, you can dynamically rebalance as VMs are
> getting sold, to make sure you always have "big ones" lying around that
> you can shrink on demand.
>
Yes, we can always come up with some ways to make things work,
but it will drive the developers of the upper-layer components crazy :)
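For what it's worth, options 1 and 2 above are indeed simple on the
userspace side. A minimal sketch of both, with memfds standing in for
the tmpfs files (all names and sizes are made up for illustration, and
error handling is omitted):

/*
 * Sketch only, not QEMU code.
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define CHUNK (4UL << 30)            /* 4G per backing file */

int main(void)
{
    /* Option 1: a file pre-allocated at 4G is simply grown to 8G
     * when the bigger VM is sold; only startup pays the extra cost. */
    int grow_fd = memfd_create("guest-ram-grow", 0);
    ftruncate(grow_fd, CHUNK);       /* pre-allocated at 4G */
    ftruncate(grow_fd, 2 * CHUNK);   /* grown to 8G on demand */

    /* Option 2: one contiguous "RAMBlock" backed by two 4G files.
     * Reserve the address space first, then map each file over it. */
    int fd0 = memfd_create("guest-ram-0", 0);
    int fd1 = memfd_create("guest-ram-1", 0);
    ftruncate(fd0, CHUNK);
    ftruncate(fd1, CHUNK);

    char *base = mmap(NULL, 2 * CHUNK, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    mmap(base, CHUNK, PROT_READ | PROT_WRITE,
         MAP_SHARED | MAP_FIXED, fd0, 0);
    mmap(base + CHUNK, CHUNK, PROT_READ | PROT_WRITE,
         MAP_SHARED | MAP_FIXED, fd1, 0);

    base[0] = 1;                     /* faults in from fd0 */
    base[CHUNK] = 1;                 /* faults in from fd1 */
    printf("8G region at %p backed by two 4G files\n", (void *)base);
    return 0;
}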

> >
> > You must know there are a lot of functions in the kernel which could
> > be done in userspace, e.g. some of the device emulation like the APIC,
> > or the vhost-net backend, which has a userspace implementation. :)
> > Whether that is bad or not depends on the benefits the solution brings.
> > From the viewpoint of a userspace application, the kernel should
> > provide a high-performance memory management service. That's why
> > I think it should be done in the kernel.
>
> As I expressed a couple of times already, I don't see why using
> hugetlbfs and implementing some sort of pre-zeroing there isn't sufficient.

Did I miss something before? I thought you doubted the need for
hugetlbfs free page pre-zeroing. Hugetlbfs is a good choice and is
sufficient.
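Even without kernel changes, part of the cost can already be moved out
of the guest's runtime by pre-faulting the hugetlbfs backing before
boot. A rough userspace sketch (the hugetlbfs path is made up, and this
only shifts the existing first-fault zeroing to startup; it is not the
background pre-zeroing itself):

#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    size_t size = 8UL << 30;         /* 8G of guest RAM */
    /* Hypothetical backing file on a hugetlbfs mount. */
    int fd = open("/dev/hugepages/guest-ram", O_CREAT | O_RDWR, 0600);
    ftruncate(fd, size);

    /* MAP_POPULATE makes the kernel fault in -- and hence zero --
     * every huge page now rather than at first guest access. */
    void *ram = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_POPULATE, fd, 0);
    (void)ram;
    /* ... hand the file/mapping on to the VM ... */
    return 0;
}

The value of doing it in hugetlbfs proper would be zeroing freed huge
pages in the background, so the next VM launch doesn't pay the cost at all.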

> We really don't *want* complicated things deep down in the mm core if
> there are reasonable alternatives.
>
I understand your concern; we should have a sufficient reason to add a
new feature to the kernel. For this one, its main value is making the
application's life easier, and implementing it in hugetlbfs avoids
adding more complexity to the core MM.
I will send out a new revision and drop the 'buddy free pages pre zero
out' part. Thanks for your suggestion!

Liang
