Message-ID: <DM5PR21MB17495741194946995BA2F7D5CAA79@DM5PR21MB1749.namprd21.prod.outlook.com>
Date: Thu, 9 Jun 2022 13:59:02 +0000
From: Haiyang Zhang <haiyangz@...rosoft.com>
To: "Michael Kelley (LINUX)" <mikelley@...rosoft.com>,
Saurabh Sengar <ssengar@...ux.microsoft.com>,
KY Srinivasan <kys@...rosoft.com>,
Stephen Hemminger <sthemmin@...rosoft.com>,
"wei.liu@...nel.org" <wei.liu@...nel.org>,
Dexuan Cui <decui@...rosoft.com>,
"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Saurabh Singh Sengar <ssengar@...rosoft.com>
Subject: RE: [PATCH] Drivers: hv: vmbus: Add cpu read lock
> -----Original Message-----
> From: Michael Kelley (LINUX) <mikelley@...rosoft.com>
> Sent: Thursday, June 9, 2022 9:51 AM
> To: Saurabh Sengar <ssengar@...ux.microsoft.com>; KY Srinivasan
> <kys@...rosoft.com>; Haiyang Zhang <haiyangz@...rosoft.com>; Stephen
> Hemminger <sthemmin@...rosoft.com>; wei.liu@...nel.org; Dexuan Cui
> <decui@...rosoft.com>; linux-hyperv@...r.kernel.org; linux-
> kernel@...r.kernel.org; Saurabh Singh Sengar <ssengar@...rosoft.com>
> Subject: RE: [PATCH] Drivers: hv: vmbus: Add cpu read lock
>
> From: Saurabh Sengar <ssengar@...ux.microsoft.com> Sent: Wednesday, June
> 8, 2022 10:27 PM
> >
> > Add cpus_read_lock to prevent CPUs from going offline between querying
> > the cpumask and actually using it. cpumask_of_node is queried first and
> > the result is used later; if any CPU goes offline between these two
> > events, it can potentially cause an infinite loop of retries.
> >
> > Signed-off-by: Saurabh Sengar <ssengar@...ux.microsoft.com>
> > ---
> > drivers/hv/channel_mgmt.c | 4 ++++
> > 1 file changed, 4 insertions(+)
> >
> > diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> > index 85a2142..6a88b7e 100644
> > --- a/drivers/hv/channel_mgmt.c
> > +++ b/drivers/hv/channel_mgmt.c
> > @@ -749,6 +749,9 @@ static void init_vp_index(struct vmbus_channel *channel)
> >  		return;
> >  	}
> >
> > +	/* No CPUs should come up or down during this. */
> > +	cpus_read_lock();
> > +
> >  	for (i = 1; i <= ncpu + 1; i++) {
> >  		while (true) {
> >  			numa_node = next_numa_node_id++;
> > @@ -781,6 +784,7 @@ static void init_vp_index(struct vmbus_channel *channel)
> >  			break;
> >  	}
> >
> > +	cpus_read_unlock();
> >  	channel->target_cpu = target_cpu;
> >
> >  	free_cpumask_var(available_mask);
> > --
> > 1.8.3.1
>
> This patch was motivated by a potential issue I suggested here during a
> separate conversation with Saurabh, but it turns out I was wrong. :-(
>
> init_vp_index() is only called from vmbus_process_offer(), and the
> cpus_read_lock() is already held when init_vp_index() is called. So the
> issue doesn't exist, and this patch isn't needed.
>
> However, looking at vmbus_process_offer(), there appears to be a
> different problem in that cpus_read_unlock() is not called when taking
> the error return because the sub_channel_index is zero.
>
> Michael
>
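For context, the calling path Michael describes looks roughly like this: a
simplified sketch of vmbus_process_offer() with unrelated code elided
(placement and comments are from memory, not verbatim source):

static void vmbus_process_offer(struct vmbus_channel *newchannel)
{
	...
	/* Taken near the top; no CPU can go offline until the unlock below. */
	cpus_read_lock();

	/* Serializes chn_list and next_numa_node_id accesses in init_vp_index(). */
	mutex_lock(&vmbus_connection.channel_mutex);
	...
	init_vp_index(newchannel);	/* runs with cpus_read_lock() already held */
	...
	mutex_unlock(&vmbus_connection.channel_mutex);
	cpus_read_unlock();
}

The error return Michael mentions sits inside that locked region; the
relevant branch is: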
	} else {
		/*
		 * Check to see if this is a valid sub-channel.
		 */
		if (newchannel->offermsg.offer.sub_channel_index == 0) {
			mutex_unlock(&vmbus_connection.channel_mutex);
			/*
			 * Don't call free_channel(), because newchannel->kobj
			 * is not initialized yet.
			 */
			kfree(newchannel);
			WARN_ON_ONCE(1);
			return;
		}
If this happens, it should be a host bug. Yes, I also think the cpus_read_unlock()
is missing in this error path.
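A minimal sketch of how that error path could release the lock, assuming the
fix simply adds cpus_read_unlock() next to the existing mutex_unlock() (the
exact ordering is up to the actual patch):

		if (newchannel->offermsg.offer.sub_channel_index == 0) {
			mutex_unlock(&vmbus_connection.channel_mutex);
			/* Drop the CPU hotplug read lock taken at function entry. */
			cpus_read_unlock();
			/*
			 * Don't call free_channel(), because newchannel->kobj
			 * is not initialized yet.
			 */
			kfree(newchannel);
			WARN_ON_ONCE(1);
			return;
		}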
Thanks,
- Haiyang