Message-ID: <20210520145708.GK4441@paulmck-ThinkPad-P17-Gen-1>
Date:   Thu, 20 May 2021 07:57:08 -0700
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Sergey Senozhatsky <senozhatsky@...omium.org>
Cc:     Josh Triplett <josh@...htriplett.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Lai Jiangshan <jiangshanlai@...il.com>,
        Joel Fernandes <joel@...lfernandes.org>,
        Suleiman Souhlal <suleiman@...gle.com>, rcu@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] rcu/tree: consider time a VM was suspended

On Thu, May 20, 2021 at 02:50:49PM +0900, Sergey Senozhatsky wrote:
> On (21/05/18 16:15), Paul E. McKenney wrote:
> > 
> > In the shorter term...  PVCLOCK_GUEST_STOPPED is mostly for things like
> > guest migration and debugger breakpoints, correct?
> 
> Our use case is a bit different. We suspend VM when user puts the host
> system into sleep (which can happen multiple times a day).

OK, that is an interesting use case that I don't see.

> > Either way, I am wondering if rcu_cpu_stall_reset() should take a lighter
> > touch.  Right now, it effectively disables all stalls for the current grace
> > period. Why not make it restart the stall timeout when the stoppage is detected?
> 
> Sounds good. I can cook a patch and run some tests.
> Or do you want to send a patch?

Given that you have the test setup, things might go faster if you do
the patch, especially taking timezones into consideration.  Of course,
if you run into difficulties, you know where to find me.
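
[Editorial illustration, not part of the original message.] The two
behaviors discussed above (disabling stall warnings for the rest of the
grace period versus restarting the timeout when the stoppage is detected)
can be modeled in userspace C. This is only a sketch under stated
assumptions: the struct and every model_* name are hypothetical, and the
"disable" variant assumes the deadline is pushed ULONG_MAX / 2 jiffies into
the future, which is believed to be how rcu_cpu_stall_reset() worked at the
time of this thread. It is not the patch under discussion.

```c
#include <assert.h>
#include <limits.h>

/* Userspace model only; these are illustrative names, not kernel symbols. */
struct stall_state {
	unsigned long jiffies_stall;	/* counter value at which a stall is reported */
};

/* Wraparound-safe "a is at or after b", in the style of time_after_eq(). */
static int model_time_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/*
 * Heavy touch (assumed current behavior): move the deadline half the
 * counter space away, so no stall is reported for this grace period.
 */
static void model_stall_disable(struct stall_state *s, unsigned long now)
{
	s->jiffies_stall = now + ULONG_MAX / 2;
}

/*
 * Lighter touch (the suggestion above): restart the timeout from the
 * moment the stoppage is detected, so a genuine stall is still caught.
 */
static void model_stall_restart(struct stall_state *s, unsigned long now,
				unsigned long timeout)
{
	/* The signed comparison only works for deadlines under half the range. */
	assert(timeout <= ULONG_MAX / 2);
	s->jiffies_stall = now + timeout;
}

/* Nonzero once the counter has reached the recorded deadline. */
static int model_stall_due(const struct stall_state *s, unsigned long now)
{
	return model_time_after_eq(now, s->jiffies_stall);
}
```

With the restart variant, a stall that persists for a full timeout after
the guest resumes is still reported; with the disable variant it is not
reported until a later grace period sets a new deadline.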

> > The strange thing is that unless something is updating the jiffies counter
> > to make it look like the system was up during the stoppage time interval,
> > there should be no reason to tell RCU anything.  Is the jiffies counter
> > updated in this manner?  (Not seeing it right offhand, but I don't claim
> > to be familiar with this code.)
> 
> VCPUs are not resumed all at once. It's up to the host to schedule VCPUs
> for execution. So, for example, when we resume VCPU-3 and it discovers
> this_cpu PVCLOCK_GUEST_STOPPED, other VCPUs, e.g. VCPU-0, can already be
> resumed, up and running processing timer interrupts and adding ticks to
> jiffies.
> 
> I can reproduce it.
> While VCPU-2 has PVCLOCK_GUEST_STOPPED set (resuming) and is in
> check_cpu_stall(), the VCPU-3 is executing:
> 
> 	apic_timer_interrupt()
> 	 tick_irq_enter()
> 	  tick_do_update_jiffies64()
> 	   do_timer()

OK, but the normal grace period time is way less than one second, and
the stall timeout in mainline is 21 seconds, so that would be a -lot-
of jiffies of skew.  Or does the restarting really take that long a time?

							Thanx, Paul
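
[Editorial illustration, not part of the original message.] To put a number
on "a -lot- of jiffies of skew": the tick rate HZ is a build-time kernel
setting, so the count depends on configuration. The sketch below converts
the 21-second stall timeout mentioned above into jiffies for a given HZ.
The function name is hypothetical, and the HZ values in the usage note are
assumed examples rather than anything stated in the thread.

```c
/*
 * Number of jiffies that elapse in 'secs' seconds at a tick rate of 'hz'.
 * Hypothetical helper for illustration; not a kernel function.
 */
static unsigned long model_secs_to_jiffies(unsigned long secs, unsigned long hz)
{
	return secs * hz;
}
```

For example, model_secs_to_jiffies(21, 1000) is 21000 and
model_secs_to_jiffies(21, 250) is 5250, so the other VCPUs would have to
advance jiffies by thousands of ticks before the stall timeout could expire
on skew alone.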
