Date:   Mon, 28 May 2018 16:28:46 +0800
From:   Dave Young <dyoung@...hat.com>
To:     David Hildenbrand <david@...hat.com>
Cc:     Michal Hocko <mhocko@...nel.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org,
        Alexander Potapenko <glider@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Andrey Ryabinin <aryabinin@...tuozzo.com>,
        Balbir Singh <bsingharora@...il.com>,
        Baoquan He <bhe@...hat.com>,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        Dan Williams <dan.j.williams@...el.com>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Hari Bathini <hbathini@...ux.vnet.ibm.com>,
        Huang Ying <ying.huang@...el.com>,
        Hugh Dickins <hughd@...gle.com>,
        Ingo Molnar <mingo@...nel.org>,
        Jaewon Kim <jaewon31.kim@...sung.com>, Jan Kara <jack@...e.cz>,
        Jérôme Glisse <jglisse@...hat.com>,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        Juergen Gross <jgross@...e.com>,
        Kate Stewart <kstewart@...uxfoundation.org>,
        "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
        Matthew Wilcox <mawilcox@...rosoft.com>,
        Mel Gorman <mgorman@...e.de>,
        Michael Ellerman <mpe@...erman.id.au>,
        Miles Chen <miles.chen@...iatek.com>,
        Oscar Salvador <osalvador@...hadventures.net>,
        Paul Mackerras <paulus@...ba.org>,
        Pavel Tatashin <pasha.tatashin@...cle.com>,
        Philippe Ombredanne <pombredanne@...b.com>,
        Rashmica Gupta <rashmica.g@...il.com>,
        Reza Arbab <arbab@...ux.vnet.ibm.com>,
        Souptick Joarder <jrdr.linux@...il.com>,
        Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
        Thomas Gleixner <tglx@...utronix.de>,
        Vlastimil Babka <vbabka@...e.cz>
Subject: Re: [PATCH v1 00/10] mm: online/offline 4MB chunks controlled by
 device driver

On 05/24/18 at 11:14am, David Hildenbrand wrote:
> On 24.05.2018 10:56, Dave Young wrote:
> > Hi,
> > 
> > [snip]
> >>>
> >>>> For kdump and onlining/offlining code, we
> >>>> have to mark pages as offline before a new segment is visible to the system
> >>>> (e.g. as these pages might not be backed by real memory in the hypervisor).
> >>>
> >>> Please expand on the kdump part. That is really confusing because
> >>> hotplug should simply not depend on kdump at all. Moreover why don't you
> >>> simply mark those pages reserved and pull them out from the page
> >>> allocator?
> >>
> >> 1. "hotplug should simply not depend on kdump at all"
> >>
> >> In theory yes. In the current state we already have to trigger kdump to
> >> reload whenever we add/remove a memory block.
> >>
> >>
> >> 2. kdump part
> >>
> >> Whenever we offline a page and tell the hypervisor about it ("unplug"),
> >> we should not assume that we can read that page again. Now, if dumping
> >> tools assume they can read all memory that is offline, we are in trouble.
> >>
> >> It is the same thing as we already have with PG_hwpoison. Just a
> >> different meaning - "don't touch this page, it is offline" compared to
> >> "don't touch this page, hw is broken".
> > 
> > Does that mean that in the offline case there is no kdump reload, as
> > mentioned in 1)?
> > 
> > If we have the offline event and reload kdump, I assume the memory state
> > is refreshed so kdump will not read the memory offlined, am I missing
> > something?
> 
> If a whole section is offline: yes. (ACPI hotplug)
> 
> If pages are online but broken ("logically offline" - hwpoison): no
> 
> If single pages are logically offline: no. (Balloon inflation - let's
> call it unplug as that's what some people refer to)
> 
> If only subsections (4MB chunks) are offline: no.
> 
> Exporting memory ranges in a smaller granularity to kdump than section
> size would a) be heavily complicated b) introduce a lot of overhead for
> this tracking data c) make us retrigger kdump way too often.
> 
> So simply marking pages offline in the struct pages and telling kdump
> about it is the straightforward thing to do. And it is fairly easy to
> add and implement as we have the exact same thing in place for hwpoison.

Ok, it is clear enough.  In the fine-grained case, an offline page is
like a hwpoison page, so a userspace patch for makedumpfile is needed to
exclude such pages when copying the vmcore.
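
To make the idea concrete, here is a minimal sketch of the kind of
filtering a dump tool could do, assuming it can see a per-page "offline"
bit next to the hwpoison bit. All names here (EX_PAGE_HWPOISON,
EX_PAGE_OFFLINE, struct ex_page, ex_filter_dumpable) are hypothetical
and only for illustration; this is not the kernel's page-flag layout and
not makedumpfile's actual code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-page state bits as a dump tool might see them.
 * Names and values are illustrative only, not the real kernel layout. */
#define EX_PAGE_HWPOISON (1u << 0) /* "don't touch, hw is broken" */
#define EX_PAGE_OFFLINE  (1u << 1) /* "don't touch, page is offline" */

struct ex_page {
    uint64_t pfn;   /* page frame number */
    uint32_t flags; /* combination of the EX_PAGE_* bits above */
};

/* A page may be read only if it is neither poisoned nor offline. */
bool ex_page_is_dumpable(const struct ex_page *p)
{
    return (p->flags & (EX_PAGE_HWPOISON | EX_PAGE_OFFLINE)) == 0;
}

/* Copy the PFNs of all dumpable pages into out[] and return how many
 * were kept. out[] must have room for n entries. */
size_t ex_filter_dumpable(const struct ex_page *pages, size_t n,
                          uint64_t *out)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        if (ex_page_is_dumpable(&pages[i]))
            out[kept++] = pages[i].pfn;
    }
    return kept;
}
```

The point of the sketch is only that the offline check can sit right
beside the existing hwpoison check, so the dump side needs no finer
grained memory ranges than it already gets.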

> 
> > 
> >>
> >> Balloon drivers solve this problem by always allowing to read unplugged
> >> memory. In virtio-mem, this cannot and should even not be guaranteed.
> >>
> > 
> > Hmm, that sounds like a bug..
> 
> I can give you a simple example why reading such unplugged (or balloon
> inflated) memory is problematic: Huge page backed guests.
> 
> There is no zero page for huge pages. So if we allow the guest to read
> that memory any time, we cannot guarantee that we actually consume less
> memory in the hypervisor. This is absolutely to be avoided.
> 
> Existing balloon drivers don't support huge page backed guests. (well
> you can inflate, but the hypervisor cannot madvise() 4k on a huge page,
> resulting in no action being performed). This scenario is to be
> supported with virtio-mem.
> 
> 
> So yes, this is actually a bug in e.g. virtio-balloon implementations:
> 
> With "VIRTIO_BALLOON_F_MUST_TELL_HOST" we have to tell the hypervisor
> before we access a page again. kdump cannot do this and does not care,
> so this page is silently accessed and dumped. This is one of the main
> reasons why extending virtio-balloon hypervisor implementations to
> support host-enforced R/W protection is impossible.

I'm not sure I got all of the virt-related background, but thank you
for the detailed explanation anyway.  This is the first time I have heard
about this; nobody complained before :(

> 
> > 
> >> And what we have to do to make this work is actually pretty simple: Just
> >> like PG_hwpoison, track per page if it is online and provide this
> >> information to kdump.
> >>
> >>
> > 
> > Thanks
> > Dave
> > 
> 
> 
> -- 
> 
> Thanks,
> 
> David / dhildenb

Thanks
Dave
