Message-ID: <9d19a844-9ae0-9520-c32a-0a4491f8de43@redhat.com>
Date: Thu, 15 Nov 2018 13:01:17 +0100
From: David Hildenbrand <david@...hat.com>
To: Borislav Petkov <bp@...en8.de>
Cc: Dave Young <dyoung@...hat.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
devel@...uxdriverproject.org, linux-fsdevel@...r.kernel.org,
linux-pm@...r.kernel.org, xen-devel@...ts.xenproject.org,
Andrew Morton <akpm@...ux-foundation.org>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Baoquan He <bhe@...hat.com>, Omar Sandoval <osandov@...com>,
Arnd Bergmann <arnd@...db.de>,
Matthew Wilcox <willy@...radead.org>,
Michal Hocko <mhocko@...e.com>,
Lianbo Jiang <lijiang@...hat.com>,
"Michael S. Tsirkin" <mst@...hat.com>
Subject: Re: [PATCH RFC 3/6] kexec: export PG_offline to VMCOREINFO
On 15.11.18 12:52, Borislav Petkov wrote:
> On Thu, Nov 15, 2018 at 12:20:40PM +0100, David Hildenbrand wrote:
>> Sorry to say, but that is the current practice without which
>> makedumpfile would not be able to work at all. (exclude user pages,
>> exclude page cache, exclude buddy pages). Let's not reinvent the wheel
>> here. This is how dumping works forever.
>
> Sorry, but "we've always done this in the past" doesn't make it better.
Just saying that "I'm not the first to do it, don't hit me with a stick" :)
>
>> I don't see how there should be "set of pages which do not have
>> PG_offline".
>
> It doesn't have to be a set of pages. Think a (mmconfig perhaps) region
> which the kdump kernel should completely skip because poking in it in
> the kdump kernel, causes all kinds of havoc like machine checks. etc.
> We've had and still have one issue like that.
Indeed. And we would still have such issues without makedumpfile. I think
you are aware of this, but I'll explain it just for completeness:
PG_hwpoison.
At some point we detect a HW error and mark a page as PG_hwpoison.
makedumpfile knows how to treat that flag and can exclude the page from
the dump (== not access it). No crash.
kdump itself has no clue about old "struct pages". Especially:
a) Where they are located in memory (e.g. SPARSE)
b) What their format is ("where are the flags")
c) What the meaning of flags is ("what does bit X mean")
In order to know all of that, kdump would have to parse quite a lot of
information inside the kernel. Basically what makedumpfile does right
now. Is this feasible? I don't think so.
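
To make a) - c) a bit more concrete, here is a rough sketch (not
makedumpfile's actual code) of what a dump tool has to do with the
exported information before it can decide to skip a page. The
SIZE()/OFFSET()/NUMBER() line format is the one VMCOREINFO uses; the
numeric values in the sample, the helper names, and the assumption of a
little-endian 64-bit flags word are made up for illustration.

```python
import struct

# Hypothetical VMCOREINFO excerpt; the numeric values are invented for
# illustration and do not describe any particular kernel build.
SAMPLE_VMCOREINFO = """\
SIZE(page)=64
OFFSET(page.flags)=0
NUMBER(PG_hwpoison)=23
"""


def parse_vmcoreinfo(text):
    """Return a dict mapping each KEY of a KEY=VALUE line to an int."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        # Keep only integer-valued entries; other lines are ignored here.
        if sep and value.strip().lstrip("-").isdigit():
            info[key.strip()] = int(value)
    return info


def page_has_flag(raw_page, info, flag_name):
    """Test one flag bit in the raw bytes of an old-kernel struct page."""
    # b) where the flags live inside struct page
    offset = info["OFFSET(page.flags)"]
    # Assumption: flags is a little-endian 64-bit unsigned long.
    (flags,) = struct.unpack_from("<Q", raw_page, offset)
    # c) which bit number the flag has in the crashed kernel
    return bool(flags & (1 << info["NUMBER(" + flag_name + ")"]))


def should_skip(raw_page, info):
    """Exclude (== never read) pages marked PG_hwpoison."""
    return page_has_flag(raw_page, info, "PG_hwpoison")
```

Point a), locating the struct pages in the first place, is left out
entirely here, and that is the part that depends on the memory model
(e.g. SPARSE).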
So we would need another approach to communicate such information as you
said. I can't think of any, but if anybody reading this has an idea,
please speak up. I am interested.
The *only* ways we would have right now to handle such scenarios:
1. If we get a machine check while dumping memory, fake reading a zero
page instead of crashing.
2. If we get a fault while dumping memory, fake reading a zero page
instead of crashing.
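
Both options boil down to the same control flow, which can be modelled
roughly as below. This is only an illustration in user-space terms: the
exception class, the reader callback, and the page size are
hypothetical, and recovering from a real machine check or fault inside
the kdump kernel is a much harder problem than catching an exception.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class PageReadError(Exception):
    """Hypothetical stand-in for a fault or machine check on old memory."""


def read_page_or_zero(read_page, pfn):
    """Read one page via read_page(pfn); on failure return a zero page."""
    try:
        return read_page(pfn)
    except PageReadError:
        # Pretend the page was all zeroes instead of taking the crash.
        return bytes(PAGE_SIZE)
```

The obvious downside is that the dump then contains zero pages that are
indistinguishable from pages that really were zero.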
>
> But let me clarify my note: I don't want to be discussing with you the
> design of makedumpfile and how it should or should not work - that ship
> has already sailed. Apparently there are valid reasons to do it this
> way.
Indeed, and the basic design is to export these flags. (let's say
"unfortunately", being able to handle such stuff in kdump directly would
be the dream).
> I was *simply* stating that it feels wrong to export mm flags like that.
>
> But as I said already, that is mm guys' call and looking at how we're
> already exporting a bunch of stuff in the vmcoreinfo - including other
> mm flags - I guess one more flag doesn't matter anymore.
Fair enough, noted. If you have an idea how to handle this in kdump,
please let me know.
--
Thanks,
David / dhildenb