Message-ID: <20210415142720.7bebde2a@slime>
Date: Thu, 15 Apr 2021 14:27:20 +0800
From: xiaojun.zhao141@...il.com
To: Josef Bacik <josef@...icpanda.com>
Cc: xiaojun.zhao141@...il.com, Miroslav Benes <mbenes@...e.cz>,
linux-kernel@...r.kernel.org, live-patching@...r.kernel.org
Subject: Re: the qemu-nbd process automatically exit with the commit
43347d56c 'livepatch: send a fake signal to all blocking tasks'
On Wed, 14 Apr 2021 13:21:37 -0400
Josef Bacik <josef@...icpanda.com> wrote:
> On 4/14/21 11:21 AM, xiaojun.zhao141@...il.com wrote:
> > On Wed, 14 Apr 2021 13:27:43 +0200 (CEST)
> > Miroslav Benes <mbenes@...e.cz> wrote:
> >
> >> Hi,
> >>
> >> On Wed, 14 Apr 2021, xiaojun.zhao141@...il.com wrote:
> >>
> >>> I found that the qemu-nbd process (started with qemu-nbd -t -c
> >>> /dev/nbd0 nbd.qcow2) exits automatically when I livepatch
> >>> functions of the nbd driver.
> >>>
> >>> The relevant nbd source:
> >>> static int nbd_start_device_ioctl(struct nbd_device *nbd,
> >>> 				  struct block_device *bdev)
> >>> {
> >>> 	struct nbd_config *config = nbd->config;
> >>> 	int ret;
> >>>
> >>> 	ret = nbd_start_device(nbd);
> >>> 	if (ret)
> >>> 		return ret;
> >>>
> >>> 	if (max_part)
> >>> 		bdev->bd_invalidated = 1;
> >>> 	mutex_unlock(&nbd->config_lock);
> >>> 	ret = wait_event_interruptible(config->recv_wq,
> >>> 			atomic_read(&config->recv_threads) == 0);
> >>> 	if (ret)
> >>> 		sock_shutdown(nbd);
> >>> 	flush_workqueue(nbd->recv_workq);
> >>>
> >>> 	mutex_lock(&nbd->config_lock);
> >>> 	nbd_bdev_reset(bdev);
> >>> 	/* user requested, ignore socket errors */
> >>> 	if (test_bit(NBD_RT_DISCONNECT_REQUESTED, &config->runtime_flags))
> >>> 		ret = 0;
> >>> 	if (test_bit(NBD_RT_TIMEDOUT, &config->runtime_flags))
> >>> 		ret = -ETIMEDOUT;
> >>> 	return ret;
> >>> }
> >>
> >> So my understanding is that nbd spawns a number
> >> (config->recv_threads) of workqueue jobs and then waits for them to
> >> finish. It waits interruptibly. Now, any signal would make
> >> wait_event_interruptible() return -ERESTARTSYS. The livepatch fake
> >> signal is no exception there. The error is then propagated back to
> >> userspace, unless the user requested a disconnection or a timeout
> >> is set. How does userspace then react to it? Is _interruptible
> >> there because userspace sends a signal when
> >> NBD_RT_DISCONNECT_REQUESTED is set? How does userspace handle
> >> ordinary signals? This all sounds a bit strange, but I may easily
> >> be missing something.
> >>
> >>> While nbd waits for atomic_read(&config->recv_threads) == 0,
> >>> klp sends a fake signal to it and the qemu-nbd process then
> >>> exits. The sysfs signal attribute that controlled this behaviour
> >>> was removed in commit 10b3d52790e 'livepatch: Remove signal sysfs
> >>> attribute'. Are there other ways to control this behaviour? How?
> >>
> >> No, there is no way currently. We send a fake signal automatically.
> >>
> >> Regards
> >> Miroslav
> > An I/O error occurs on the nbd device when I livepatch nbd, and I
> > suspect that a livepatch to any other kernel code may cause the
> > same I/O error. For now I have decided to work around the problem
> > by adding a livepatch that disables klp's automatic fake signal.
>
> Would wait_event_killable() fix this problem? I'm not sure any
> client implementations depend on being able to send other signals to
> the client process, so it should be safe from that standpoint. Not
> sure if the livepatch thing would still get an error at that point
> though. Thanks,
> Josef
Yes, I tested it: wait_event_killable() fixes this problem.
Thanks.
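
For anyone following the thread, here is a rough userspace analogy of the
difference discussed above. It is not kernel code and not the nbd patch
itself: nanosleep() stands in for the wait, SIGALRM stands in for a
non-fatal signal, and the helper names (wait_interruptible,
wait_ignoring_signals, run_with_signal) are made up for illustration. The
first helper gives up as soon as a signal arrives, much as
wait_event_interruptible() returns -ERESTARTSYS; the second keeps waiting,
which is roughly what a killable wait does for non-fatal wakeups.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

/* Empty handler: its only job is to make SIGALRM interrupt a blocking call. */
static void on_alarm(int sig) { (void)sig; }

/* Convert a non-negative millisecond count to a timespec. */
static struct timespec ms_to_ts(long ms)
{
	struct timespec ts;

	assert(ms >= 0);
	ts.tv_sec = ms / 1000;
	ts.tv_nsec = (ms % 1000) * 1000000L;
	return ts;
}

/* Analogue of an interruptible wait: a handled signal aborts the sleep. */
static int wait_interruptible(long ms)
{
	struct timespec ts = ms_to_ts(ms);

	return nanosleep(&ts, NULL) == 0 ? 0 : -errno;
}

/* Analogue of a wait that ignores non-fatal signals: resume with the rest. */
static int wait_ignoring_signals(long ms)
{
	struct timespec ts = ms_to_ts(ms), rem;

	while (nanosleep(&ts, &rem) != 0) {
		if (errno != EINTR)
			return -errno;
		ts = rem;	/* interrupted: sleep for the remaining time */
	}
	return 0;
}

/* Arm a one-shot SIGALRM after sig_ms, then run waitfn(wait_ms). */
static int run_with_signal(int (*waitfn)(long), long wait_ms, long sig_ms)
{
	struct sigaction sa;
	struct itimerval it;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGALRM, &sa, NULL) != 0)
		return -errno;

	memset(&it, 0, sizeof(it));
	it.it_value.tv_sec = sig_ms / 1000;
	it.it_value.tv_usec = (sig_ms % 1000) * 1000;
	if (setitimer(ITIMER_REAL, &it, NULL) != 0)
		return -errno;

	return waitfn(wait_ms);
}
```

Calling run_with_signal(wait_interruptible, 300, 50) should come back early
with -EINTR, while run_with_signal(wait_ignoring_signals, 300, 50) should
sleep the full time and return 0. The analogy is loose: the livepatch fake
signal is not a real signal delivered to a handler, so this only shows the
general shape of the two waiting styles.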