Date:	Tue, 14 Jun 2011 16:33:08 +0800
From:	"Alex,Shi" <alex.shi@...el.com>
To:	"Li, Shaohua" <shaohua.li@...el.com>
Cc:	"paulmck@...ux.vnet.ibm.com" <paulmck@...ux.vnet.ibm.com>,
	Ingo Molnar <mingo@...e.hu>,
	lkml <linux-kernel@...r.kernel.org>,
	"Chen, Tim C" <tim.c.chen@...el.com>
Subject: Re: rcu: performance regression

On Tue, 2011-06-14 at 13:26 +0800, Li, Shaohua wrote:
> Commit a26ac2455ffcf3 (rcu: move TREE_RCU from softirq to kthread)
> introduced a performance regression. In our AIM7 test, this commit caused
> about a 40% regression.
> The commit runs rcu callbacks in a kthread instead of in softirq. We
> observed a high rate of context switches caused by this. Our test
> system has 64 CPUs and HZ is 1000, so we saw more than 64k context
> switches per second caused by the rcu thread.
> I also traced this and found that when the rcu thread is woken up, most of
> the time it doesn't actually handle any callbacks; it just initializes a
> new gp, ends a gp, or similar.
> From my understanding, the purpose of running rcu in a kthread is to
> speed up rcu callback processing (with the help of rtmutex PI), not ending
> a gp and so on, which runs pretty fast and doesn't need boosting.
> To verify my findings, I applied the debug patch below. It still handles
> rcu callbacks in the kthread if there are any pending callbacks, but
> everything else still runs in softirq. This completely solved our
> regression. I think this can still boost callback processing, but I'm not
> an expert in this area, so please help.

This commit also causes a performance drop in hackbench process mode, and
Shaohua's patch does recover it. But in the hackbench test, vmstat
shows that context switches are somewhat reduced. And perf shows
contention on root_domain->cpupri->prio_to_cpu[]->lock with the commit.


   11.53%        hackbench  [kernel]                   [k] 
                  |
                  --- _raw_spin_lock_irqsave
                      cpupri_set
                      __enqueue_rt_entity
                      enqueue_rt_entity
                      enqueue_task_rt
                      enqueue_task
                      activate_task
                      ttwu_activate
                      ttwu_do_activate.clone.3
                      try_to_wake_up
                      wake_up_process
                      invoke_rcu_cpu_kthread
                      rcu_check_callbacks
                      update_process_times
                      tick_sched_timer
                      __run_hrtimer
