[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YI/CWg6PrMxcCT2D@kernel.org>
Date: Mon, 3 May 2021 12:28:58 +0300
From: Mike Rapoport <rppt@...nel.org>
To: David Hildenbrand <david@...hat.com>
Cc: linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
"Michael S. Tsirkin" <mst@...hat.com>,
Jason Wang <jasowang@...hat.com>,
Alexey Dobriyan <adobriyan@...il.com>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>,
Oscar Salvador <osalvador@...e.de>,
Michal Hocko <mhocko@...e.com>, Roman Gushchin <guro@...com>,
Alex Shi <alex.shi@...ux.alibaba.com>,
Steven Price <steven.price@....com>,
Mike Kravetz <mike.kravetz@...cle.com>,
Aili Yao <yaoaili@...gsoft.com>, Jiri Bohac <jbohac@...e.cz>,
"K. Y. Srinivasan" <kys@...rosoft.com>,
Haiyang Zhang <haiyangz@...rosoft.com>,
Stephen Hemminger <sthemmin@...rosoft.com>,
Wei Liu <wei.liu@...nel.org>,
Naoya Horiguchi <naoya.horiguchi@....com>,
linux-hyperv@...r.kernel.org,
virtualization@...ts.linux-foundation.org,
linux-fsdevel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v1 7/7] fs/proc/kcore: use page_offline_(freeze|unfreeze)
On Mon, May 03, 2021 at 10:28:36AM +0200, David Hildenbrand wrote:
> On 02.05.21 08:34, Mike Rapoport wrote:
> > On Thu, Apr 29, 2021 at 02:25:19PM +0200, David Hildenbrand wrote:
> > > Let's properly synchronize with drivers that set PageOffline(). Unfreeze
> > > every now and then, so drivers that want to set PageOffline() can make
> > > progress.
> > >
> > > Signed-off-by: David Hildenbrand <david@...hat.com>
> > > ---
> > > fs/proc/kcore.c | 15 +++++++++++++++
> > > 1 file changed, 15 insertions(+)
> > >
> > > diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
> > > index 92ff1e4436cb..3d7531f47389 100644
> > > --- a/fs/proc/kcore.c
> > > +++ b/fs/proc/kcore.c
> > > @@ -311,6 +311,7 @@ static void append_kcore_note(char *notes, size_t *i, const char *name,
> > > static ssize_t
> > > read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
> > > {
> > > + size_t page_offline_frozen = 0;
> > > char *buf = file->private_data;
> > > size_t phdrs_offset, notes_offset, data_offset;
> > > size_t phdrs_len, notes_len;
> > > @@ -509,6 +510,18 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
> > > pfn = __pa(start) >> PAGE_SHIFT;
> > > page = pfn_to_online_page(pfn);
> >
> > Can't this race with page offlining for the first time we get here?
>
>
> To clarify, we have three types of offline pages in the kernel ...
>
> a) Pages part of an offline memory section; the memap is stale and not
> trustworthy. pfn_to_online_page() checks that. We *can* protect against
> memory offlining using get_online_mems()/put_online_mems(), but usually
> avoid doing so as the race window is very small (and a problem all over the
> kernel we basically never hit) and locking is rather expensive. In the
> future, we might switch to rcu to handle that more efficiently and avoiding
> these possible races.
>
> b) PageOffline(): logically offline pages contained in an online memory
> section with a sane memmap. virtio-mem calls these pages "fake offline";
> something like a "temporary" memory hole. The new mechanism I propose will
> be used to handle synchronization as races can be more severe, e.g., when
> reading actual page content here.
>
> c) Soft offline pages: hwpoisoned pages that are not actually harmful yet,
> but could become harmful in the future. So we better try to remove the page
> from the page allcoator and try to migrate away existing users.
>
>
> So page_offline_* handle "b) PageOffline()" only. There is a tiny race
> between pfn_to_online_page(pfn) and looking at the memmap as we have in many
> cases already throughout the kernel, to be tackled in the future.
Right, but here you anyway add locking, so why exclude the first iteration?
> (A better name for PageOffline() might make sense; PageSoftOffline() would
> be catchy but interferes with c). PageLogicallyOffline() is ugly;
> PageFakeOffline() might do)
>
> > > + /*
> > > + * Don't race against drivers that set PageOffline()
> > > + * and expect no further page access.
> > > + */
> > > + if (page_offline_frozen == MAX_ORDER_NR_PAGES) {
> > > + page_offline_unfreeze();
> > > + page_offline_frozen = 0;
> > > + cond_resched();
> > > + }
> > > + if (!page_offline_frozen++)
> > > + page_offline_freeze();
> > > +
BTW, did you consider something like
if (page_offline_frozen++ % MAX_ORDER_NR_PAGES == 0) {
page_offline_unfreeze();
cond_resched();
page_offline_freeze();
}
We don't seem to care about page_offline_frozen overflows here, do we?
--
Sincerely yours,
Mike.
Powered by blists - more mailing lists