[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <PU1P153MB0169FE9BF7EC648490AB19F6BFBC0@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM>
Date: Sat, 31 Aug 2019 02:54:51 +0000
From: Dexuan Cui <decui@...rosoft.com>
To: Michael Kelley <mikelley@...rosoft.com>,
"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
Stephen Hemminger <sthemmin@...rosoft.com>,
Sasha Levin <Alexander.Levin@...rosoft.com>,
"sashal@...nel.org" <sashal@...nel.org>,
Haiyang Zhang <haiyangz@...rosoft.com>,
KY Srinivasan <kys@...rosoft.com>,
"tglx@...utronix.de" <tglx@...utronix.de>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH v3 11/12] Drivers: hv: vmbus: Suspend after cleaning up
hv_sock and sub channels
> From: Michael Kelley <mikelley@...rosoft.com>
> Sent: Friday, August 23, 2019 1:17 PM
>
> From: Dexuan Cui Sent: Monday, August 19, 2019 6:52 PM
> >
> > diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> > @@ -499,6 +499,8 @@ static void vmbus_add_channel_work(struct
> work_struct *work)
> > return;
> >
> > err_deq_chan:
> > + WARN_ON_ONCE(1);
> > +
>
> Why this change? I was thinking maybe it's a debug statement that got
> left in.
This is indeed a debug statement. :-)
I was not so sure if the error handling is absolutely correct there.
I think I can remove this WARN_ON_ONCE() since almost all of the possible
error paths have a pr_err(). We can make an extra patch to fix any bug in
the existing error handling code. BTW, it should be pretty unlikely to reach
the "err_deq_chan label" in practice.
> > @@ -1021,6 +1044,11 @@ static void vmbus_onoffer_rescind(struct
> > vmbus_channel_message_header *hdr)
> > }
> > mutex_unlock(&vmbus_connection.channel_mutex);
> > }
> > +
> > + /* The "channel" may have been freed. Do not access it any longer. */
> > +
> > + if (clean_up_chan_for_suspend)
> > + check_ready_for_suspend_event();
> > }
>
> Having to add the above lines twice is a bit clumsy, but the problem is
> the overall structure of the vmbus_onoffer_rescind. The early return in
> the case of a rescind_callback function is a bit weird, but I guess it makes
> sense since from what I can tell, only uio and hv_sock have rescind callback
> functions. Some minor restructuring might be warranted, but I don't feel
> strongly about it.
I agree. We should restructure vmbus_onoffer_rescind(), but I think we can
do it in a separate patch. Here in this patch I'm focused on the minimal
required change for hibernation.
> > + /*
> > + * Wait until all the sub-channels and hv_sock channels have been
> > + * cleaned up. Sub-channels should be destroyed upon suspend,
> otherwise
> > + * they would conflict with the new sub-channels that will be created
> > + * in the resume path. hv_sock channels should also be destroyed, but
> > + * a hv_sock channel of an established hv_sock connection can not be
> > + * really destroyed since it may still be referenced by the userspace
> > + * application, so we just force the hv_sock channel to be rescinded
> > + * by vmbus_force_channel_rescinded(), and the userspace application
> > + * will thoroughly destroy the channel after hibernation.
> > + */
> > + if (atomic_read(&vmbus_connection.nr_chan_close_on_suspend) > 0)
>
> At first glance, the above line seemed useless to me. You could just do the
> wait_for_completion() unconditionally. But is the intent to handle the case
> where the VM never had any sub-channels or hv_sock channels, and so
> nr_chan_close_on_suspend never went above 0?
Yes.
> If so, a comment might be helpful.
> > +wait_for_completion(&vmbus_connection.ready_for_suspend_event);
Ok. I'll add a commnt for this in v4.
Thanks,
-- Dexuan
Powered by blists - more mailing lists