[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID:
<SN6PR02MB4157630C523459A75C83C498D44B2@SN6PR02MB4157.namprd02.prod.outlook.com>
Date: Tue, 29 Oct 2024 23:44:44 +0000
From: Michael Kelley <mhklinux@...look.com>
To: Dexuan Cui <decui@...rosoft.com>, "K. Y. Srinivasan" <kys@...rosoft.com>,
Haiyang Zhang <haiyangz@...rosoft.com>, Wei Liu <wei.liu@...nel.org>, Vitaly
Kuznetsov <vkuznets@...hat.com>, Greg Kroah-Hartman
<gregkh@...uxfoundation.org>, "open list:Hyper-V/Azure CORE AND DRIVERS"
<linux-hyperv@...r.kernel.org>, open list <linux-kernel@...r.kernel.org>
CC: "stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: RE: [PATCH] Drivers: hv: kvp/vss: Avoid accessing a ringbuffer not
initialized yet
From: Dexuan Cui <decui@...rosoft.com> Sent: Monday, September 9, 2024 9:47 AM
>
> If the KVP (or VSS) daemon starts before the VMBus channel's ringbuffer is
> fully initialized, we can hit the panic below:
>
> hv_utils: Registering HyperV Utility Driver
> hv_vmbus: registering driver hv_utils
> ...
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> CPU: 44 UID: 0 PID: 2552 Comm: hv_kvp_daemon Tainted: G E 6.11.0-rc3+ #1
> RIP: 0010:hv_pkt_iter_first+0x12/0xd0
> Call Trace:
> ...
> vmbus_recvpacket
> hv_kvp_onchannelcallback
> vmbus_on_event
> tasklet_action_common
> tasklet_action
> handle_softirqs
> irq_exit_rcu
> sysvec_hyperv_stimer0
> </IRQ>
> <TASK>
> asm_sysvec_hyperv_stimer0
> ...
> kvp_register_done
> hvt_op_read
> vfs_read
> ksys_read
> __x64_sys_read
>
> This can happen because the KVP/VSS channel callback can be invoked
> even before the channel is fully opened:
> 1) as soon as hv_kvp_init() -> hvutil_transport_init() creates
> /dev/vmbus/hv_kvp, the kvp daemon can open the device file immediately and
> register itself to the driver by writing a message KVP_OP_REGISTER1 to the
> file (which is handled by kvp_on_msg() ->kvp_handle_handshake()) and
> reading the file for the driver's response, which is handled by
> hvt_op_read(), which calls hvt->on_read(), i.e. kvp_register_done().
>
> 2) the problem with kvp_register_done() is that it can cause the
> channel callback to be called even before the channel is fully opened,
> and when the channel callback is starting to run, util_probe()->
> vmbus_open() may have not initialized the ringbuffer yet, so the
> callback can hit the panic of NULL pointer dereference.
>
> To reproduce the panic consistently, we can add a "ssleep(10)" for KVP in
> __vmbus_open(), just before the first hv_ringbuffer_init(), and then we
> unload and reload the driver hv_utils, and run the daemon manually within
> the 10 seconds.
>
> Fix the panic by checking the channel state in the channel callback.
> To avoid the race condition with __vmbus_open(), we disable and enable
> the channel callback temporarily in __vmbus_open().
>
> The channel callbacks of the other VMBus devices don't need to check
> the channel state since they can't run before the channels are fully
> initialized.
>
> Note: we would also need to fix the fcopy driver code, but that has
> been removed in commit ec314f61e4fc ("Drivers: hv: Remove fcopy driver") in
> the mainline kernel since v6.10. For old 6.x LTS kernels, and the 5.x
> and 4.x LTS kernels, the fcopy driver needs to be fixed when the
> fix is backported to the stable kernel branches.
An alternate approach occurs to me. util_probe() does these three
things in order:
1) Allocates the receive buffer
2) Calls the util_init() function, which for KVP and VSS creates the char dev
3) Sets up the VMBus channel, including calling vmbus_open()
What if the order of #2 and #3 were swapped in util_probe()? I
don't immediately see any interdependency between #2 and #3
for KVP and VSS, nor for Shutdown and Timesync. With the swap,
the VMBus channel would be fully open by the time the /dev entry
appears and the user space daemon can do anything.
I haven't though too deeply about this, so maybe there's a problem
somewhere. But if not, it seems a lot cleaner.
Michael
>
> Fixes: e0fa3e5e7df6 ("Drivers: hv: utils: fix a race on userspace daemons registration")
> Cc: stable@...r.kernel.org
> Signed-off-by: Dexuan Cui <decui@...rosoft.com>
> ---
> drivers/hv/channel.c | 11 +++++++++++
> drivers/hv/hv_kvp.c | 3 +++
> drivers/hv/hv_snapshot.c | 3 +++
> 3 files changed, 17 insertions(+)
>
> diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
> index fb8cd8469328..685e407a3fdf 100644
> --- a/drivers/hv/channel.c
> +++ b/drivers/hv/channel.c
> @@ -657,6 +657,14 @@ static int __vmbus_open(struct vmbus_channel
> *newchannel,
> return -ENOMEM;
> }
>
> + /*
> + * The channel callbacks of KVP/VSS may run before __vmbus_open()
> + * finishes (see kvp_register_done() -> ... -> kvp_poll_wrapper()), so
> + * they check newchannel->state to tell the ringbuffer has been fully
> + * initialized or not. Disable and enable the tasklet to avoid the race.
> + */
> + tasklet_disable(&newchannel->callback_event);
> +
> newchannel->state = CHANNEL_OPENING_STATE;
> newchannel->onchannel_callback = onchannelcallback;
> newchannel->channel_callback_context = context;
> @@ -750,6 +758,8 @@ static int __vmbus_open(struct vmbus_channel *newchannel,
> }
>
> newchannel->state = CHANNEL_OPENED_STATE;
> + tasklet_enable(&newchannel->callback_event);
> +
> kfree(open_info);
> return 0;
>
> @@ -766,6 +776,7 @@ static int __vmbus_open(struct vmbus_channel *newchannel,
> hv_ringbuffer_cleanup(&newchannel->inbound);
> vmbus_free_requestor(&newchannel->requestor);
> newchannel->state = CHANNEL_OPEN_STATE;
> + tasklet_enable(&newchannel->callback_event);
> return err;
> }
>
> diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c
> index d35b60c06114..ec098067e579 100644
> --- a/drivers/hv/hv_kvp.c
> +++ b/drivers/hv/hv_kvp.c
> @@ -662,6 +662,9 @@ void hv_kvp_onchannelcallback(void *context)
> if (kvp_transaction.state > HVUTIL_READY)
> return;
>
> + if (channel->state != CHANNEL_OPENED_STATE)
> + return;
> +
> if (vmbus_recvpacket(channel, recv_buffer, HV_HYP_PAGE_SIZE * 4, &recvlen, &requestid)) {
> pr_err_ratelimited("KVP request received. Could not read into recv buf\n");
> return;
> diff --git a/drivers/hv/hv_snapshot.c b/drivers/hv/hv_snapshot.c
> index 0d2184be1691..f7924c2fc62e 100644
> --- a/drivers/hv/hv_snapshot.c
> +++ b/drivers/hv/hv_snapshot.c
> @@ -301,6 +301,9 @@ void hv_vss_onchannelcallback(void *context)
> if (vss_transaction.state > HVUTIL_READY)
> return;
>
> + if (channel->state != CHANNEL_OPENED_STATE)
> + return;
> +
> if (vmbus_recvpacket(channel, recv_buffer, VSS_MAX_PKT_SIZE, &recvlen, &requestid)) {
> pr_err_ratelimited("VSS request received. Could not read into recv buf\n");
> return;
> --
> 2.25.1
Powered by blists - more mailing lists