Message-ID: <CY4PR2101MB0804C7E84D48FCDFB61F7569A0D00@CY4PR2101MB0804.namprd21.prod.outlook.com>
Date: Tue, 27 Nov 2018 05:22:20 +0000
From: KY Srinivasan <kys@...rosoft.com>
To: Greg KH <gregkh@...uxfoundation.org>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"devel@...uxdriverproject.org" <devel@...uxdriverproject.org>,
"olaf@...fle.de" <olaf@...fle.de>,
"apw@...onical.com" <apw@...onical.com>,
"jasowang@...hat.com" <jasowang@...hat.com>,
Stephen Hemminger <sthemmin@...rosoft.com>,
Michael Kelley <mikelley@...rosoft.com>,
vkuznets <vkuznets@...hat.com>,
Haiyang Zhang <haiyangz@...rosoft.com>,
"stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: RE: [PATCH 2/2] Drivers: hv: vmbus: offload the handling of channels to two workqueues

> -----Original Message-----
> From: Greg KH <gregkh@...uxfoundation.org>
> Sent: Monday, November 26, 2018 11:35 AM
> To: KY Srinivasan <kys@...rosoft.com>
> Cc: linux-kernel@...r.kernel.org; devel@...uxdriverproject.org;
> olaf@...fle.de; apw@...onical.com; jasowang@...hat.com; Stephen
> Hemminger <sthemmin@...rosoft.com>; Michael Kelley
> <mikelley@...rosoft.com>; vkuznets <vkuznets@...hat.com>; Haiyang
> Zhang <haiyangz@...rosoft.com>; stable@...r.kernel.org
> Subject: Re: [PATCH 2/2] Drivers: hv: vmbus: offload the handling of channels to two workqueues
>
> On Mon, Nov 26, 2018 at 02:29:57AM +0000, kys@...uxonhyperv.com wrote:
> > From: Dexuan Cui <decui@...rosoft.com>
> >
> > vmbus_process_offer() mustn't call channel->sc_creation_callback()
> > directly for sub-channels, because sc_creation_callback() ->
> > vmbus_open() may never get the host's response to the
> > OPEN_CHANNEL message (the host may rescind a channel at any time,
> > e.g. in the case of hot removing a NIC), and vmbus_onoffer_rescind()
> > may not wake up the vmbus_open() as it's blocked due to a non-zero
> > vmbus_connection.offer_in_progress, and finally we have a deadlock.
> >
> > The above is also true for primary channels, if the related device
> > drivers use sync probing mode by default.
> >
> > And, usually the handling of primary channels and sub-channels can
> > depend on each other, so we should offload them to different
> > workqueues to avoid possible deadlock, e.g. in sync-probing mode,
> > NIC1's netvsc_subchan_work() can race with NIC2's netvsc_probe() ->
> > rtnl_lock(), and causes deadlock: the former gets the rtnl_lock
> > and waits for all the sub-channels to appear, but the latter
> > can't get the rtnl_lock and this blocks the handling of sub-channels.
> >
> > The patch can fix the multiple-NIC deadlock described above for
> > v3.x kernels (e.g. RHEL 7.x) which don't support async-probing
> > of devices, and v4.4, v4.9, v4.14 and v4.18 which support async-probing
> > but don't enable async-probing for Hyper-V drivers (yet).
> >
> > The patch can also fix the hang issue in sub-channel's handling described
> > above for all versions of kernels, including v4.19 and v4.20-rc3.
> >
> > So the patch should be applied to all the existing kernels.
> >
> > Fixes: 8195b1396ec8 ("hv_netvsc: fix deadlock on hotplug")
> > Cc: stable@...r.kernel.org
> > Cc: Stephen Hemminger <sthemmin@...rosoft.com>
> > Cc: K. Y. Srinivasan <kys@...rosoft.com>
> > Cc: Haiyang Zhang <haiyangz@...rosoft.com>
> > Signed-off-by: Dexuan Cui <decui@...rosoft.com>
> > Signed-off-by: K. Y. Srinivasan <kys@...rosoft.com>
> > ---
> > drivers/hv/channel_mgmt.c | 188 +++++++++++++++++++++++++-------------
> > drivers/hv/connection.c | 24 ++++-
> > drivers/hv/hyperv_vmbus.h | 7 ++
> > include/linux/hyperv.h | 7 ++
> > 4 files changed, 161 insertions(+), 65 deletions(-)
>
> As Sasha pointed out, this patch does not even apply :(

Sorry about that. These patches applied cleanly on my tree (misc-next).
This series is meant to be applied on top of the patch
0001-Drivers-hv-vmbus-Remove-the-useless-API-vmbus_get_ou.patch.
That patch has been committed to the char-misc-testing branch, but it is
not in the misc-linus branch, which is why the series does not apply there.

Regards,

K. Y