[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CY4PR21MB06290283219FC78C45688DC4D74B0@CY4PR21MB0629.namprd21.prod.outlook.com>
Date:   Sun, 24 Nov 2019 22:19:46 +0000
From:   Michael Kelley <mikelley@...rosoft.com>
To:     Lorenzo Pieralisi <lorenzo.pieralisi@....com>,
        Dexuan Cui <decui@...rosoft.com>
CC:     KY Srinivasan <kys@...rosoft.com>,
        Haiyang Zhang <haiyangz@...rosoft.com>,
        Stephen Hemminger <sthemmin@...rosoft.com>,
        "sashal@...nel.org" <sashal@...nel.org>,
        "bhelgaas@...gle.com" <bhelgaas@...gle.com>,
        "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
        "linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Sasha Levin <Alexander.Levin@...rosoft.com>
Subject: RE: [PATCH v2 2/4] PCI: hv: Add the support of hibernation
From: Lorenzo Pieralisi <lorenzo.pieralisi@....com>  Sent: Thursday, November 21, 2019 3:44 AM
> 
> On Thu, Nov 21, 2019 at 12:50:17AM +0000, Dexuan Cui wrote:
> > > From: Lorenzo Pieralisi <lorenzo.pieralisi@....com>
> > > Sent: Wednesday, November 20, 2019 9:20 AM
> > >
> > > On Tue, Nov 19, 2019 at 11:16:56PM -0800, Dexuan Cui wrote:
> > > > Implement the suspend/resume callbacks.
> > > >
> > > > We must make sure there is no pending work items before we call
> > > > vmbus_close().
> > >
> > > Where ? Why ? Imagine a developer reading this log to try to understand
> > > why you made this change, do you really think this commit log is
> > > informative in its current form ?
> > >
> > > I am not asking a book but this is a significant feature please make
> > > an effort to explain it (I can update the log for you but please
> > > write one and I shall do it).
> > >
> > > Lorenzo
> >
> > Sorry for being sloppy on this patch's changelog! Can you please use the
> > below? I can also post v3 with the new changelog if that's better.
> 
> As you wish but more importantly get hyper-V maintainers to ACK these
> changes since time is running out for v5.5.
> 
> Lorenzo
> 
> > PCI: hv: Add the support of hibernation
> >
> > hv_pci_suspend() runs in a process context as a callback in dpm_suspend().
> > When it starts to run, the channel callback hv_pci_onchannelcallback(),
> > which runs in a tasklet context, can be still running concurrently and
> > scheduling new work items onto hbus->wq in hv_pci_devices_present() and
> > hv_pci_eject_device(), and the work item handlers can access the vmbus
> > channel, which can be being closed by hv_pci_suspend(), e.g. the work item
> > handler pci_devices_present_work() -> new_pcichild_device() writes to
> > the vmbus channel.
> >
> > To eliminate the race, hv_pci_suspend() disables the channel callback
> > tasklet, sets hbus->state to hv_pcibus_removing, and re-enables the tasklet.
> >
> > This way, when hv_pci_suspend() proceeds, it knows that no new work item
> > can be scheduled, and then it flushes hbus->wq and safely closes the vmbus
> > channel.
> >
> > Thanks,
> > -- Dexuan
FWIW, I'd like to see the above level of detail also as comments in the code
Itself so that whoever next looks at the code sees the explanation directly
without having to review the commit logs.
Also, the commit message doesn't say what the commit actually does and
why.  I'd suggest the commit message along these lines:
Add suspend() and resume() functions so that Hyper-V virtual PCI devices are
handled properly when the VM hibernates and resumes from hibernation.
Note that the suspend() function must make sure there are no pending work
items before calling vmbus_close(), since it runs in a process context as a
callback in dpm_suspend().  When it starts to run, the channel callback
hv_pci_onchannelcallback(), which runs in a tasklet context, can be still running
concurrently and scheduling new work items onto hbus->wq in
hv_pci_devices_present() and hv_pci_eject_device(), and the work item 
handlers can access the vmbus channel, which can be being closed by
hv_pci_suspend(), e.g. the work item handler pci_devices_present_work() ->
new_pcichild_device() writes to the vmbus channel.
To eliminate the race, hv_pci_suspend() disables the channel callback
tasklet, sets hbus->state to hv_pcibus_removing, and re-enables the tasklet.
This way, when hv_pci_suspend() proceeds, it knows that no new work item
can be scheduled, and then it flushes hbus->wq and safely closes the vmbus
channel.
Michael
Powered by blists - more mailing lists
 
