Message-ID: <f517bfbe-18b8-6962-5c57-545f6ef47ad0@intel.com>
Date: Thu, 15 Dec 2016 17:09:10 -0800
From: Dave Hansen <dave.hansen@...el.com>
To: "Li, Liang Z" <liang.z.li@...el.com>,
Andrea Arcangeli <aarcange@...hat.com>
Cc: David Hildenbrand <david@...hat.com>,
"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"mhocko@...e.com" <mhocko@...e.com>,
"mst@...hat.com" <mst@...hat.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"qemu-devel@...gnu.org" <qemu-devel@...gnu.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"dgilbert@...hat.com" <dgilbert@...hat.com>,
"pbonzini@...hat.com" <pbonzini@...hat.com>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"virtualization@...ts.linux-foundation.org"
<virtualization@...ts.linux-foundation.org>,
"kirill.shutemov@...ux.intel.com" <kirill.shutemov@...ux.intel.com>
Subject: Re: [Qemu-devel] [PATCH kernel v5 0/5] Extend virtio-balloon for fast
(de)inflating & fast live migration
On 12/15/2016 04:48 PM, Li, Liang Z wrote:
>>> It seems we leave too many bits for the pfn, and the bits left for the
>>> length are not enough. How about keeping 45 bits for the pfn and 19 bits
>>> for the length? 45 bits of pfn can cover a 57-bit physical address, which
>>> should be enough for the near future.
>>> What's your opinion?
>> I still think 'order' makes a lot of sense. But, as you say, 57 bits is enough for
>> x86 for a while. Other architectures.... who knows?
Thinking about this some more... There are really only two cases that
matter: 4k pages and "much bigger" ones.
Squeezing each 4k page into 8 bytes of metadata helps guarantee that
this scheme won't regress over the old scheme in any case. For bigger
ranges, 8 vs 16 bytes means *nothing*. And 16 bytes will be as good as or
better than the old scheme for everything which is >4k.
How about this:
* 52 bits of 'pfn', 5 bits of 'order', 7 bits of 'length'
* One special 'length' value to mean "actual length in next 8 bytes"
That should be pretty simple to produce and decode. We have two record
sizes, but I think it is manageable.
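To make the two record sizes concrete, here is a rough sketch of what
producing and decoding such a record might look like. Nothing below is
from the patch set: the field order (pfn in the high bits, length in the
low bits), the choice of the all-ones length value as the "extended"
marker, and the names range_encode()/range_decode() are all assumptions
made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field widths from the proposal: 52 + 5 + 7 = 64 bits. */
#define PFN_BITS     52
#define ORDER_BITS   5
#define LEN_BITS     7

/* Assumption: the all-ones length (127) means "real length is in the next 8 bytes". */
#define LEN_EXTENDED ((1u << LEN_BITS) - 1)

/*
 * Encode one range into out[]. Returns the number of 64-bit words
 * written: 1 for a short record, 2 for an extended one.
 * Assumed layout: [ pfn:52 | order:5 | length:7 ], pfn in the high bits.
 */
static size_t range_encode(uint64_t out[2], uint64_t pfn,
                           unsigned order, uint64_t length)
{
    assert(pfn < (UINT64_C(1) << PFN_BITS));
    assert(order < (1u << ORDER_BITS));

    uint64_t head = (pfn << (ORDER_BITS + LEN_BITS)) |
                    ((uint64_t)order << LEN_BITS);

    if (length < LEN_EXTENDED) {
        out[0] = head | length;        /* fits in the 7-bit field */
        return 1;
    }
    out[0] = head | LEN_EXTENDED;      /* marker: length follows */
    out[1] = length;
    return 2;
}

/*
 * Decode one range starting at in[0]. Returns the number of 64-bit
 * words consumed (1 or 2) so the caller can step to the next record.
 */
static size_t range_decode(const uint64_t *in, uint64_t *pfn,
                           unsigned *order, uint64_t *length)
{
    uint64_t len = in[0] & LEN_EXTENDED;

    *pfn = in[0] >> (ORDER_BITS + LEN_BITS);
    *order = (unsigned)((in[0] >> LEN_BITS) & ((1u << ORDER_BITS) - 1));

    if (len != LEN_EXTENDED) {
        *length = len;
        return 1;
    }
    *length = in[1];
    return 2;
}
```

With this layout a single 4k page (order 0, length 1) costs 8 bytes, and
only ranges whose length does not fit in the 7-bit field pay for the
second word.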