Date: Sat, 9 Aug 2014 00:04:24 +0530
From: Amit Shah <amit.shah@...hat.com>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: linux-kernel@...r.kernel.org, riel@...hat.com, mingo@...nel.org,
	laijs@...fujitsu.com, dipankar@...ibm.com, akpm@...ux-foundation.org,
	mathieu.desnoyers@...icios.com, josh@...htriplett.org, niv@...ibm.com,
	tglx@...utronix.de, peterz@...radead.org, rostedt@...dmis.org,
	dhowells@...hat.com, edumazet@...gle.com, dvhart@...ux.intel.com,
	fweisbec@...il.com, oleg@...hat.com, sbw@....edu
Subject: Re: [PATCH tip/core/rcu 1/2] rcu: Parallelize and economize NOCB kthread wakeups

On (Fri) 08 Aug 2014 [11:18:35], Paul E. McKenney wrote:
> On Fri, Aug 08, 2014 at 11:07:10PM +0530, Amit Shah wrote:
> > On (Fri) 08 Aug 2014 [09:25:02], Paul E. McKenney wrote:
> > > On Fri, Aug 08, 2014 at 02:10:56PM +0530, Amit Shah wrote:
> > > > On Friday 11 July 2014 07:05 PM, Paul E. McKenney wrote:
> > > > >From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
> > > > >
> > > > >An 80-CPU system with a context-switch-heavy workload can require so
> > > > >many NOCB kthread wakeups that the RCU grace-period kthreads spend several
> > > > >tens of percent of a CPU just awakening things. This clearly will not
> > > > >scale well: If you add enough CPUs, the RCU grace-period kthreads would
> > > > >get behind, increasing grace-period latency.
> > > > >
> > > > >To avoid this problem, this commit divides the NOCB kthreads into leaders
> > > > >and followers, where the grace-period kthreads awaken the leaders each of
> > > > >whom in turn awakens its followers. By default, the number of groups of
> > > > >kthreads is the square root of the number of CPUs, but this default may
> > > > >be overridden using the rcutree.rcu_nocb_leader_stride boot parameter.
> > > > >This reduces the number of wakeups done per grace period by the RCU
> > > > >grace-period kthread by the square root of the number of CPUs, but of
> > > > >course by shifting those wakeups to the leaders. In addition, because
> > > > >the leaders do grace periods on behalf of their respective followers,
> > > > >the number of wakeups of the followers decreases by up to a factor of two.
> > > > >Instead of being awakened once when new callbacks arrive and again
> > > > >at the end of the grace period, the followers are awakened only at
> > > > >the end of the grace period.
> > > > >
> > > > >For a numerical example, in a 4096-CPU system, the grace-period kthread
> > > > >would awaken 64 leaders, each of which would awaken its 63 followers
> > > > >at the end of the grace period. This compares favorably with the 79
> > > > >wakeups for the grace-period kthread on an 80-CPU system.
> > > > >
> > > > >Reported-by: Rik van Riel <riel@...hat.com>
> > > > >Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
> > > >
> > > > This patch causes KVM guest boot to not proceed after a while.
> > > > .config is attached, and boot messages are appended. This commit
> > > > was pointed to by bisect, and reverting on current master (while
> > > > addressing a trivial conflict) makes the boot work again.
> > > >
> > > > The qemu cmdline is
> > > >
> > > > ./x86_64-softmmu/qemu-system-x86_64 -m 512 -smp 2 -cpu
> > > > host,+kvmclock,+x2apic -enable-kvm -kernel
> > > > ~/src/linux/arch/x86/boot/bzImage /guests/f11-auto.qcow2 -append
> > > > 'root=/dev/sda2 console=ttyS0 console=tty0' -snapshot -serial stdio
> > >
> > > I cannot reproduce this. I am at commit a7d7a143d0b4c, in case that
> > > makes a difference.
> >
> > Yea; I'm at that commit too. And the version of qemu doesn't matter;
> > happens on F20's qemu-kvm-1.6.2-7.fc20.x86_64 as well as qemu.git
> > compiled locally.
> >
> > > There are some things in your dmesg that look quite strange to me, though.
> > >
> > > You have "--smp 2" above, but in your dmesg I see the following:
> > >
> > > [ 0.000000] setup_percpu: NR_CPUS:4 nr_cpumask_bits:4
> > > nr_cpu_ids:1 nr_node_ids:1
> > >
> > > So your run somehow only has one CPU. RCU agrees that there is only
> > > one CPU:
> >
> > Yea; indeed. There are MTRR warnings too; attaching the boot log of
> > failed run and diff to the successful run (rcu-good-notime.txt).
>
> My qemu runs don't have those MTRR warnings, for whatever that is worth.
>
> > The failed run is on commit a7d7a143d0b4cb1914705884ca5c25e322dba693
> > and the successful run has these reverted on top:
> >
> > 187497fa5e9e9383820d33e48b87f8200a747c2a
> > b58cc46c5f6b57f1c814e374dbc47176e6b4938e
> > fbce7497ee5af800a1c350c73f3c3f103cb27a15
>
> OK. Strange set of commits.

The last one is the one that causes the failure, the above two are
just the context fixups needed for a clean revert of the last one.

> > That is rcu-bad-notime.txt.
> >
> > > [ 0.000000] Preemptible hierarchical RCU implementation.
> > > [ 0.000000] RCU debugfs-based tracing is enabled.
> > > [ 0.000000] RCU lockdep checking is enabled.
> > > [ 0.000000] Additional per-CPU info printed with stalls.
> > > [ 0.000000] RCU restricting CPUs from NR_CPUS=4 to nr_cpu_ids=1.
> > > [ 0.000000] Offload RCU callbacks from all CPUs
> > > [ 0.000000] Offload RCU callbacks from CPUs: 0.
> > > [ 0.000000] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
> > > [ 0.000000] NO_HZ: Full dynticks CPUs: 1-3.
> > >
> > > But NO_HZ thinks that there are four. This appears to be due to NO_HZ
> > > looking at the compile-time constants, and I doubt that this would cause
> > > a problem. But if there really is a CPU 1 that RCU doesn't know about,
> > > and it queues a callback, that callback will never be invoked, and you
> > > could easily see hangs.
> > >
> > > Given that your .config says CONFIG_NR_CPUS=4 and your qemu says "--smp 2",
> > > why does nr_cpu_ids think that there is only one CPU? Are you running
> > > this on a non-x86_64 CPU so that qemu only does UP or some such?
> >
> > No; this is "Intel(R) Core(TM) i7-2640M CPU @ 2.80GHz" on a ThinkPad
> > T420s.
>
> Running in 64-bit mode, right?

Yep. 3.15.7-200.fc20.x86_64 on the host.

> > In my attached boot logs, RCU does detect two cpus. Here's the diff
> > between them. I recompiled to remove the timing info so the diffs are
> > comparable:

<snip>

> > mtrr: your CPUs had inconsistent MTRRdefType settings
> > mtrr: probably your BIOS does not setup all CPUs.
> > mtrr: corrected configuration.
> > +ACPI: Added _OSI(Module Device)
> > +ACPI: Added _OSI(Processor Device)
> > +ACPI: Added _OSI(3.0 _SCP Extensions)
> > +ACPI: Added _OSI(Processor Aggregator Device)
> > +ACPI: Interpreter enabled
> > +ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S1_] (20140724/hwxface-580)
> > +ACPI Exception: AE_NOT_FOUND, While evaluating Sleep State [\_S2_] (20140724/hwxface-580)
> > +ACPI: (supports S0 S3 S4 S5)
> > +ACPI: Using IOAPIC for interrupt routing
> > +PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
> > +ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
> > +acpi PNP0A03:00: _OSC: OS supports [Segments MSI]
> > +acpi PNP0A03:00: _OSC failed (AE_NOT_FOUND); disabling ASPM
> >
> > <followed by more bootup messages>
>
> Hmmm... What happens if you boot a7d7a143d0b4cb1914705884ca5c25e322dba693
> with the kernel parameter "acpi=off"?

That doesn't change anything - still hangs.

I intend to look at this more on Monday, though - turning in for
today. In the meantime, if there's anything else you'd like me to
try, please let me know.

Thanks,

		Amit