Message-Id: <20230627112220.229240-1-david@redhat.com>
Date: Tue, 27 Jun 2023 13:22:15 +0200
From: David Hildenbrand <david@...hat.com>
To: linux-kernel@...r.kernel.org
Cc: linux-mm@...ck.org, virtualization@...ts.linux-foundation.org,
David Hildenbrand <david@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
"Michael S. Tsirkin" <mst@...hat.com>,
John Hubbard <jhubbard@...dia.com>,
Oscar Salvador <osalvador@...e.de>,
Michal Hocko <mhocko@...e.com>,
Jason Wang <jasowang@...hat.com>,
Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
Subject: [PATCH v1 0/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals
As raised by John Hubbard [1], offline_and_remove_memory() failing on
fatal signals can be sub-optimal for out-of-tree drivers: a dying user space
process might be the last one holding a device node open.
As that device node gets closed, the driver might unplug the device
and trigger offline_and_remove_memory() to unplug previously
hotplugged device memory. This, however, will fail reliably when fatal
signals are pending on the dying process, leaving the device unusable until
the machine gets rebooted.
That can be optimized easily by ignoring fatal signals. In fact, checking
for fatal signals in the case of offline_and_remove_memory() doesn't
make too much sense; the check makes sense when offlining is triggered
directly via sysfs. However, we actually do want a way to not end up
stuck in offline_and_remove_memory() forever.
What offline_and_remove_memory() users actually want is to fail after some
given timeout and not care about fatal signals.
So let's implement that, optimizing virtio-mem along the way.
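To illustrate the intended semantics, here is a minimal user-space sketch.
It is not kernel code and is not taken from the patches: struct sim,
sim_try_offline() and sim_offline_and_remove() are hypothetical names, the
clock is simulated, and returning -ETIMEDOUT once the budget is exhausted is
an assumption made for illustration. The point is only that the retry loop
is bounded by a deadline and never looks at pending signals.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simulation state; none of this is a kernel structure. */
struct sim {
	uint64_t now_ms;            /* simulated clock */
	uint64_t ready_at_ms;       /* when offlining would start to succeed */
	bool fatal_signal_pending;  /* deliberately never read below */
};

/* One simulated offlining attempt; each attempt costs 100 ms. */
int sim_try_offline(struct sim *s)
{
	s->now_ms += 100;
	return s->now_ms >= s->ready_at_ms ? 0 : -EAGAIN;
}

/*
 * Retry until the attempt succeeds or the time budget is used up.
 * A pending fatal signal does not abort the loop; only the deadline does.
 * The -ETIMEDOUT return value is an assumption of this sketch.
 */
int sim_offline_and_remove(struct sim *s, uint64_t timeout_ms)
{
	const uint64_t deadline = s->now_ms + timeout_ms;

	for (;;) {
		int rc = sim_try_offline(s);

		if (rc != -EAGAIN)
			return rc;
		if (s->now_ms >= deadline)
			return -ETIMEDOUT;
	}
}
```

With a 10 second budget (10000 ms here), a caller whose memory becomes
offlinable after 500 ms succeeds even though a fatal signal is pending,
while a caller that would need 60 seconds gets an error back instead of
being stuck forever.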
Cc: Andrew Morton <akpm@...ux-foundation.org>
Cc: "Michael S. Tsirkin" <mst@...hat.com>
Cc: John Hubbard <jhubbard@...dia.com>
Cc: Oscar Salvador <osalvador@...e.de>
Cc: Michal Hocko <mhocko@...e.com>
Cc: Jason Wang <jasowang@...hat.com>
Cc: Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
[1] https://lkml.kernel.org/r/20230620011719.155379-1-jhubbard@nvidia.com
David Hildenbrand (5):
  mm/memory_hotplug: check for fatal signals only in offline_pages()
  virtio-mem: convert most offline_and_remove_memory() errors to -EBUSY
  mm/memory_hotplug: make offline_and_remove_memory() timeout instead of
    failing on fatal signals
  virtio-mem: set the timeout for offline_and_remove_memory() to 10
    seconds
  virtio-mem: check if the config changed before (fake) offlining memory

 drivers/virtio/virtio_mem.c    | 22 +++++++++++++--
 include/linux/memory_hotplug.h |  2 +-
 mm/memory_hotplug.c            | 50 ++++++++++++++++++++++++++++++++--
 3 files changed, 68 insertions(+), 6 deletions(-)
base-commit: 6995e2de6891c724bfeb2db33d7b87775f913ad1
--
2.40.1