Message-ID: <CACGkMEvc+eA7KdJJAtjNPwqve8CwLZYzAmMhf0RWwQ-GwonaUw@mail.gmail.com>
Date: Tue, 5 Nov 2024 11:09:41 +0800
From: Jason Wang <jasowang@...hat.com>
To: Qiang Zhang <qiang4.zhang@...ux.intel.com>
Cc: "Michael S. Tsirkin" <mst@...hat.com>, Paolo Bonzini <pbonzini@...hat.com>,
Stefan Hajnoczi <stefanha@...hat.com>, Eugenio Pérez <eperezma@...hat.com>,
Xuan Zhuo <xuanzhuo@...ux.alibaba.com>, Jens Axboe <axboe@...nel.dk>,
Olivia Mackall <olivia@...enic.com>, Herbert Xu <herbert@...dor.apana.org.au>,
Amit Shah <amit@...nel.org>, Arnd Bergmann <arnd@...db.de>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>, Gonglei <arei.gonglei@...wei.com>,
"David S. Miller" <davem@...emloft.net>, Viresh Kumar <viresh.kumar@...aro.org>,
"Chen, Jian Jun" <jian.jun.chen@...el.com>, Andi Shyti <andi.shyti@...nel.org>,
Andrew Lunn <andrew+netdev@...n.ch>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
"James E.J. Bottomley" <James.Bottomley@...senpartnership.com>,
"Martin K. Petersen" <martin.petersen@...cle.com>, David Hildenbrand <david@...hat.com>,
Gerd Hoffmann <kraxel@...hat.com>, Anton Yakovlev <anton.yakovlev@...nsynergy.com>,
Jaroslav Kysela <perex@...ex.cz>, Takashi Iwai <tiwai@...e.com>, Qiang Zhang <qiang4.zhang@...el.com>,
virtualization@...ts.linux.dev, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-crypto@...r.kernel.org,
linux-i2c@...r.kernel.org, netdev@...r.kernel.org, linux-scsi@...r.kernel.org,
linux-sound@...r.kernel.org
Subject: Re: [PATCH v2] virtio: only reset device and restore status if needed
in device resume
On Fri, Nov 1, 2024 at 1:23 PM Qiang Zhang <qiang4.zhang@...ux.intel.com> wrote:
>
> On Fri, Nov 01, 2024 at 10:11:11AM +0800, Jason Wang wrote:
> > On Fri, Nov 1, 2024 at 9:54 AM <qiang4.zhang@...ux.intel.com> wrote:
> > >
> > > From: Qiang Zhang <qiang4.zhang@...el.com>
> > >
> > > Virtio core unconditionally resets and restores status for all virtio
> > > devices before calling the restore method. This breaks some virtio drivers
> > > which don't need to do anything in suspend and resume because they
> > > just want to keep device state retained.
> >
> > The challenge is how the driver can know the device doesn't need a reset.
>
> Hi,
>
> Per my understanding of PM, in the suspend flow, device drivers need to
> 1. First manage/stop accesses from upper level software and
> 2. Store the volatile context into in-memory data structures.
> 3. Put devices into some low power (suspended) state.
> The resume process does the reverse.
> If a device context won't be lost after entering some low power state
> (optional), it's OK to skip step 2.
>
> For virtio devices, the spec doesn't define whether their state will be lost
> after the platform enters a suspended state.
This is exactly what the suspend patch tries to define.
> So to work with different
> hypervisors, virtio drivers typically trigger a reset in suspend/resume
> flow. This works fine for virtio devices if following conditions are met:
> - Device state can be totally recoverable.
> - There isn't any working behaviour expected in suspended state, i.e. the
> suspended state should be sub-state of reset.
> However, the first point may be hard to implement from driver side for some
> devices. The second point may be unacceptable for some kind of devices.
>
> For your question, for devices whose suspended state is like the reset state,
> the hypervisor has the flexibility to retain its state or not, and the kernel
> driver can unconditionally reset it with proper re-initialization to
> achieve better compatibility. For others, the hypervisor *must* retain
> device state and the driver just keeps using it.
Right, so my question is: how does the driver know the behaviour of a
device? We usually do that via a feature bit.
Note that what matters here is migration compatibility.
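
To illustrate the feature-bit idea, here is a rough sketch only. The
bit HYP_F_RETAINS_STATE, its value, and the hyp_* types and functions are
hypothetical; none of them are defined by the virtio spec or the kernel.
The point is just that the resume path keys the reset off a negotiated bit
rather than off what the driver guesses about the hypervisor:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bit (arbitrary value, not in the virtio spec):
 * the device promises to retain its state across suspend. */
#define HYP_F_RETAINS_STATE 62

struct hyp_device {
	uint64_t negotiated;  /* features accepted by both device and driver */
	unsigned int resets;  /* resets issued so far, for illustration only */
};

static bool hyp_has_feature(const struct hyp_device *dev, unsigned int bit)
{
	assert(dev != NULL && bit < 64);
	return (dev->negotiated >> bit) & 1;
}

/* Resume path: reset only when the device did not promise retention.
 * Returns true if a reset was performed. */
static bool hyp_resume(struct hyp_device *dev)
{
	if (hyp_has_feature(dev, HYP_F_RETAINS_STATE))
		return false;  /* state retained, keep using the device */
	dev->resets++;         /* stand-in for reset + re-initialization */
	return true;
}
```

With something like this, the decision follows what was negotiated, so it
stays the same whichever hypervisor or device implementation is behind it.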
>
> >
> > For example, PCI has no_soft_reset which has been done in the commit
> > "virtio: Add support for no-reset virtio PCI PM".
> >
> > And there's an ongoing long discussion of adding suspend support in the
> > virtio spec, so that the driver knows it's safe to suspend/resume without
> > reset.
>
> That's great! Hopefully it can fill the gap.
> Currently, I think we can safely move the reset to drivers' freeze methods,
> virtio core has no reason to take it as a common action required by all
> devices. And the reset operation can be optionally skipped if the driver has
> hints from the device that it can retain state.
The problem here is that whether the device can be resumed without a "soft
reset" seems to be a general feature, knowledge of which could belong to either
1) virtio core (a feature bit or not)
or
2) transport layer (like PCI)
>
> >
> > >
> > > Virtio GPIO is a typical example. GPIO states should be kept unchanged
> > > after suspend and resume (e.g. output pins keep driving the output) and
> > > Virtio GPIO driver does nothing in freeze and restore methods. But the
> > > reset operation in virtio_device_restore breaks this.
> >
> > Is this mandated by GPIO or virtio spec? If yes, let's quote the relevant part.
>
> No. But in actual hardware design (e.g. Intel PCH GPIO), or from the
> requirement perspective, GPIO pin state can be (should support) retained
> in suspended state.
> If Virtio GPIO is used to let VM operate such physical GPIO chip indirectly,
> it can't be reset in suspend and resume. Meanwhile the hypervisor will
> retain pin states after suspension.
>
> >
> > >
> > > Since some devices need reset in suspend and resume while some needn't,
> > > create a new helper function for the original reset and status restore
> > > logic so that virtio drivers can invoke it in their restore method
> > > if necessary.
> >
> > How are those drivers classified?
>
> I think this depends on whether the hypervisor will keep device state in the
> platform suspend process.
So the problem is that the actual implementation (hypervisor, physical
device or mediation) is transparent to the driver. The driver needs a
general way to know whether it's safe (or not) to reset during
suspend/resume.
> I think hypervisor should because suspend and reset are
> conceptually two different things.
Probably, but reset plus software state load/save is common
practice for devices that will lose their state during PM.
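
As a minimal sketch of that save/reset/reload pattern (a toy model, not
virtio code: the hyp_statedev struct, its register array, and the function
names are all made up for illustration):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HYP_NREGS 4

/* Hypothetical device model used only for this illustration. */
struct hyp_statedev {
	uint32_t regs[HYP_NREGS];   /* volatile device context */
	uint32_t saved[HYP_NREGS];  /* in-memory copy kept by the driver */
};

/* Suspend: stash the device context in driver memory. */
static void hyp_freeze(struct hyp_statedev *dev)
{
	assert(dev != NULL);
	memcpy(dev->saved, dev->regs, sizeof(dev->saved));
}

/* Models a reset (or state loss while suspended): context is cleared. */
static void hyp_reset(struct hyp_statedev *dev)
{
	assert(dev != NULL);
	memset(dev->regs, 0, sizeof(dev->regs));
}

/* Resume: reset to a known state, then reload the saved context. */
static void hyp_restore(struct hyp_statedev *dev)
{
	assert(dev != NULL);
	hyp_reset(dev);
	memcpy(dev->regs, dev->saved, sizeof(dev->regs));
}
```

This only works when the device state is fully recoverable from what the
driver saved, which is the first condition listed earlier in the thread.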
Thanks
>
>
> Thanks
> Qiang
>