Message-ID: <1317668908.11991.20.camel@dagon.hellion.org.uk>
Date:	Mon, 3 Oct 2011 20:08:28 +0100
From:	Ian Campbell <Ian.Campbell@...citrix.com>
To:	Thomas Gleixner <tglx@...utronix.de>
CC:	Jeremy Fitzhardinge <Jeremy.Fitzhardinge@...rix.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
	xen-devel <xen-devel@...ts.xensource.com>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	"Rafael J. Wysocki" <rjw@...k.pl>
Subject: Re: xen: IPI interrupts not resumed early enough on suspend/resume

On Mon, 2011-10-03 at 19:42 +0100, Thomas Gleixner wrote:
> On Mon, 3 Oct 2011, Ian Campbell wrote:
> > I can see a few options for how I might go about solving this in a
> > non-hacky way, which approach do you think would be preferable:
> 
> The question is whether you need to disable the IPI interrupt at
> all. If not, we have a flag for that.

We already use that flag for these (I think that is even why it was
added). The issue is that on the other side, in the resuming domain, the
event channels all start off masked and something needs to unmask them.

> >       * Add "IRQF_RESUME_EARLY", driven from syscore_resume, and use it
> >         for these interrupts.
> 
> That's the preferable solution, as we could use that for PPC as well,
> unless we can move stuff around, so we disable stuff later.

OK
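
To make the intended ordering concrete, here is a rough userspace sketch
of the idea, not kernel code. All toy_* names and TOY_IRQF_RESUME_EARLY
are made up for illustration; the real flag and its hookup into
syscore_resume would still need to be worked out in the genirq code. It
assumes every interrupt starts masked after resume, and that an early
stage unmasks only the flagged ones before the remaining interrupts are
resumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag for this toy model; not the kernel's definition. */
#define TOY_IRQF_RESUME_EARLY 0x1u

struct toy_irq {
	const char *name;
	unsigned int flags;
	bool masked;
};

/* Models the state after a Xen PV resume: everything starts masked. */
static void toy_mask_all(struct toy_irq *irqs, size_t n)
{
	assert(irqs != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		irqs[i].masked = true;
}

/* Stage 1, standing in for a syscore_resume hook: unmask flagged IRQs only. */
static void toy_syscore_resume(struct toy_irq *irqs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (irqs[i].flags & TOY_IRQF_RESUME_EARLY)
			irqs[i].masked = false;
}

/* Stage 2, standing in for the later device IRQ resume: unmask the rest. */
static void toy_resume_device_irqs(struct toy_irq *irqs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		irqs[i].masked = false;
}
```

With an IPI marked TOY_IRQF_RESUME_EARLY and an ordinary device
interrupt left unflagged, the IPI is usable after stage 1 while the
device interrupt stays masked until stage 2.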

> >       * register syscore ops for the Xen event channel subsystem to
> >         unmask the IPIs earlier (would probably look a lot like the code
> >         removed by 676dc3cf5bc3).
> 
> I'd like to avoid that.

Sure.

> >       * add syscore_ops to Xen smp subsystem to unmask the specific IPIs
> >         (which it binds at start of day) earlier.
> >       * push dpm_(suspend|resume)_noirq down into stop machine region
> 
> Where is stomp machine used?

It is used by the xen PV suspend handler which runs in that context in
order to quiesce non-boot CPUs (which Xen does not unplug like native
does).

> >       * use something other than stop_machine to quiesce system and move
> >         to cpu0 for suspend (doesn't seem sensible to reproduce that
> >         functionality).
> 
> We already shut down the nonboot cpus on suspend. We could do that
> _before_ we disable devices and the interrupts.

Xen PV suspend uses many of the PM/suspend core code paths but it does
not have the bit which shuts down non-boot CPUs.

It was a while ago but IIRC Xen used to unplug the secondary processors
and it was found to lead to larger latencies in the migration and
checkpointing cases (which at their core are a suspend/resume). The
disaster recovery folks in particular care about this latency since they
want to do rolling checkpoints many times a second.

Ian.

>  
> Raphael ?
> 
> Thanks,
> 
> 	tglx

