Message-ID: <ZJrtO105xKpzbR9g@dhcp22.suse.cz>
Date: Tue, 27 Jun 2023 16:07:55 +0200
From: Michal Hocko <mhocko@...e.com>
To: David Hildenbrand <david@...hat.com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
virtualization@...ts.linux-foundation.org,
Andrew Morton <akpm@...ux-foundation.org>,
"Michael S. Tsirkin" <mst@...hat.com>,
John Hubbard <jhubbard@...dia.com>,
Oscar Salvador <osalvador@...e.de>,
Jason Wang <jasowang@...hat.com>,
Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
Subject: Re: [PATCH v1 1/5] mm/memory_hotplug: check for fatal signals only
in offline_pages()
On Tue 27-06-23 15:28:29, David Hildenbrand wrote:
> On 27.06.23 14:34, Michal Hocko wrote:
> > On Tue 27-06-23 13:22:16, David Hildenbrand wrote:
> > > Let's check for fatal signals only. That looks cleaner and still keeps
> > > the documented use case for manual user-space triggered memory offlining
> > > working. From Documentation/admin-guide/mm/memory-hotplug.rst:
> > >
> > > % timeout $TIMEOUT offline_block | failure_handling
> > >
> > > In fact, we even document there: "the offlining context can be terminated
> > > by sending a fatal signal".
> >
> > We should be fixing documentation instead. This could break users who do
> > have a SIGALRM signal handler installed.
>
> You mean because timeout will send a SIGALRM, which is not considered fatal
> in case a signal handler is installed?
Correct.
> At least the "traditional" tools I am aware of don't set a timeout at all
> (crossing fingers that they never end up stuck):
> * chmem
> * QEMU guest agent
> * powerpc-utils
>
> libdaxctl also doesn't seem to implement an easy-to-spot timeout for memory
> offlining, but it also doesn't configure SIGALRM.
>
>
> Of course, that doesn't mean that there isn't a program somewhere that
> does that; I merely assume that it would be pretty unlikely to find such
> a program.
>
> But no strong opinion: we can also keep it like that, update the doc and add
> a comment why this one here is different than most other signal backoff
> checks.
Well, the existing signal handling approach has been there for way too
long to be sure that nothing depends on it. I personally would prefer
fatal_signal_pending() as that better reflects what we do elsewhere, but
here we are. Historical baggage...
--
Michal Hocko
SUSE Labs