Message-ID: <819e45c5-6ae3-1dff-3f1d-c0411b6e2e1d@redhat.com>
Date: Thu, 24 May 2018 23:07:23 +0200
From: David Hildenbrand <david@...hat.com>
To: Michal Hocko <mhocko@...nel.org>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Alexander Potapenko <glider@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Andrey Ryabinin <aryabinin@...tuozzo.com>,
Balbir Singh <bsingharora@...il.com>,
Baoquan He <bhe@...hat.com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Boris Ostrovsky <boris.ostrovsky@...cle.com>,
Dan Williams <dan.j.williams@...el.com>,
Dave Young <dyoung@...hat.com>,
Dmitry Vyukov <dvyukov@...gle.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Hari Bathini <hbathini@...ux.vnet.ibm.com>,
Huang Ying <ying.huang@...el.com>,
Hugh Dickins <hughd@...gle.com>,
Ingo Molnar <mingo@...nel.org>,
Jaewon Kim <jaewon31.kim@...sung.com>, Jan Kara <jack@...e.cz>,
Jérôme Glisse <jglisse@...hat.com>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
Juergen Gross <jgross@...e.com>,
Kate Stewart <kstewart@...uxfoundation.org>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Matthew Wilcox <mawilcox@...rosoft.com>,
Mel Gorman <mgorman@...e.de>,
Michael Ellerman <mpe@...erman.id.au>,
Miles Chen <miles.chen@...iatek.com>,
Oscar Salvador <osalvador@...hadventures.net>,
Paul Mackerras <paulus@...ba.org>,
Pavel Tatashin <pasha.tatashin@...cle.com>,
Philippe Ombredanne <pombredanne@...b.com>,
Rashmica Gupta <rashmica.g@...il.com>,
Reza Arbab <arbab@...ux.vnet.ibm.com>,
Souptick Joarder <jrdr.linux@...il.com>,
Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>,
Thomas Gleixner <tglx@...utronix.de>,
Vlastimil Babka <vbabka@...e.cz>
Subject: Re: [PATCH v1 00/10] mm: online/offline 4MB chunks controlled by
device driver
On 24.05.2018 16:22, Michal Hocko wrote:
> I will go over the rest of the email later I just wanted to make this
> point clear because I suspect we are talking past each other.
It sounds like we are now talking about how to solve the problem. I like
that :)
>
> On Thu 24-05-18 16:04:38, David Hildenbrand wrote:
> [...]
>> The point I was making is: I cannot allocate 8MB/128MB using the buddy
>> allocator. All I want to do is manage the memory a virtio-mem device
>> provides as flexibly as possible.
>
> I didn't mean to use the page allocator to isolate pages from it. We do
> have other means. Have a look at the page isolation framework and have a
> look how the current memory hotplug (ab)uses it. In short you mark the
> desired physical memory range as isolated (nobody can allocate from it)
> and then simply remove it from the page allocator. And you are done with
> it. Your particular range is gone, nobody will ever use it. If you mark
> those struct pages reserved then pfn walkers should already ignore them.
> If you keep those pages with ref count 0 then even hotplug should work
> seamlessly (I would have to double check).
>
> So all I am arguing is that whatever your driver wants to do can be
> handled without touching the hotplug code much. You would still need
> to add new ranges in the mem section units and manage on top of that.
> You need to do that anyway to keep track of what parts are in use or
> offlined anyway right? Now the mem sections. You have to do that anyway
> for memmaps. Our sparse memory model simply works in those units. Even
> if you make a part of that range unavailable then the section will still
> be there.
>
> Do I make at least some sense or am I completely missing your point?
>
I think we're heading somewhere. I understand that you want to separate
this "semi" offline part from the general offlining code. If so, we
should definitely enforce segment alignment for online_pages/offline_pages.
Importantly, what I need is:

1. Indicate and prepare memory sections to be used for adding memory
   chunks (right now add_memory())
2. Make memory chunks of a section available to the system (right now
   online_pages())
3. Remove memory chunks of a section from the system (right now
   offline_pages())
4. Remove memory sections from the system (right now remove_memory())
5. Hinder dumping tools from reading memory chunks that are logically
   offline (right now PageOffline())
6. For 3. find removable memory chunks in a certain memory range with a
   variable size.

In an ideal world, 2. would never fail (in contrast to online_pages()
right now). This might make some further developments I have in mind
easier :) So if we can come up with an approach that can guarantee that,
extra points.
So what I think you are talking about is the following.

For 1. Use add_memory() followed by online_pages(). Don't actually
       online the pages, keep them reserved (like XEN balloon). Fixup
       stats.
For 2. Expose reserved pages to Buddy allocator. Clear reserved bit.
       Fixup stats. This can never fail. (yay)
For 3. Isolate pages, try to move everything away (basically but not
       completely offlining code). Set reserved flag. Fixup flags.
For 4. offline_pages() followed by remove_memory().
       -> Q: How to distinguish reserved offline from other reserved
          pages? offline_pages() has to be able to deal with that
For 5. I don't think we can use reserved flag here.
       -> Q: What else to use?
For 6. Scan for movable ranges. The use
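
To make the flow for 1.-4. concrete, here is a rough userspace model of the
chunk states. It is only a sketch: every name in it (section_model,
model_online_chunk(), ...) is made up for illustration and is not a kernel
API, the "busy" flag merely stands in for pages that cannot be migrated
away, and the 32 chunks per section assume 128MB sections with 4MB chunks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these names exist in the kernel. */
enum chunk_state {
	CHUNK_ABSENT,	/* section not added (or already removed) */
	CHUNK_RESERVED,	/* memmap exists, pages reserved, not in the buddy */
	CHUNK_ONLINE,	/* pages exposed to the allocator */
};

#define MODEL_CHUNKS_PER_SECTION 32	/* assumption: 128MB / 4MB */

struct section_model {
	enum chunk_state state[MODEL_CHUNKS_PER_SECTION];
	bool busy[MODEL_CHUNKS_PER_SECTION];	/* stand-in for unmovable data */
};

/* For 1.: add the section, all chunks start out reserved. */
void model_add_section(struct section_model *s)
{
	for (size_t i = 0; i < MODEL_CHUNKS_PER_SECTION; i++) {
		s->state[i] = CHUNK_RESERVED;
		s->busy[i] = false;
	}
}

/* For 2.: expose a reserved chunk. Only a state flip, so no failure path. */
void model_online_chunk(struct section_model *s, size_t i)
{
	assert(i < MODEL_CHUNKS_PER_SECTION && s->state[i] == CHUNK_RESERVED);
	s->state[i] = CHUNK_ONLINE;
}

/* For 3.: take a chunk back. Fails if its contents cannot be moved away. */
bool model_offline_chunk(struct section_model *s, size_t i)
{
	if (i >= MODEL_CHUNKS_PER_SECTION || s->state[i] != CHUNK_ONLINE ||
	    s->busy[i])
		return false;
	s->state[i] = CHUNK_RESERVED;
	return true;
}

/* For 4.: remove the section only once every chunk is reserved again. */
bool model_remove_section(struct section_model *s)
{
	for (size_t i = 0; i < MODEL_CHUNKS_PER_SECTION; i++)
		if (s->state[i] != CHUNK_RESERVED)
			return false;
	for (size_t i = 0; i < MODEL_CHUNKS_PER_SECTION; i++)
		s->state[i] = CHUNK_ABSENT;
	return true;
}
```

In this model, onlining (2.) is the only step that cannot fail. Offlining
(3.) and section removal (4.) both return false as long as something is
still in the way. The model does not answer the open questions for 4. and
5.: it has no way to tell "reserved because logically offline" from any
other reserved page.
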
"You need to do that anyway to keep track of what parts are in use or
offlined anyway right?"
I would manually track which chunks of a section are logically offline (I
do that right now already).
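
As an illustration of what such tracking could look like (not the actual
virtio-mem code), a per-section bitmap is enough. The sketch below assumes
32 chunks per section (128MB sections, 4MB chunks), and all names are
hypothetical. The run search covers the "variable size" part of 6., but it
only finds candidates that are logically online; whether they are really
removable would still have to be decided by isolating and migrating pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CHUNKS 32u	/* assumption: 128MB section / 4MB chunks */

/* Hypothetical per-section bookkeeping, one bit per chunk. */
struct section_chunks {
	uint32_t offline;	/* bit i set => chunk i is logically offline */
};

void chunk_set_offline(struct section_chunks *s, unsigned int i)
{
	s->offline |= UINT32_C(1) << i;
}

void chunk_set_online(struct section_chunks *s, unsigned int i)
{
	s->offline &= ~(UINT32_C(1) << i);
}

bool chunk_is_offline(const struct section_chunks *s, unsigned int i)
{
	return (s->offline >> i) & 1u;
}

/*
 * Find the first run of 'count' consecutive online chunks.
 * Returns the index of the first chunk of the run, or -1 if none exists.
 */
int find_online_run(const struct section_chunks *s, unsigned int count)
{
	unsigned int run = 0;

	if (count == 0 || count > NR_CHUNKS)
		return -1;
	for (unsigned int i = 0; i < NR_CHUNKS; i++) {
		run = chunk_is_offline(s, i) ? 0 : run + 1;
		if (run == count)
			return (int)(i + 1 - count);
	}
	return -1;
}
```

A caller wanting to unplug 8MB would ask for a run of 2 chunks and then
try to offline exactly that range, falling back to the next candidate if
offlining fails.
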
Is that what you had in mind? If not, where does your idea differ?
How could we solve 4/5? Of course, PageOffline() is again an option.
Thanks!
--
Thanks,
David / dhildenb