Message-ID: <20150915135354.GA2905@mtj.duckdns.org>
Date:	Tue, 15 Sep 2015 09:53:54 -0400
From:	Tejun Heo <tj@...nel.org>
To:	Christian Borntraeger <borntraeger@...ibm.com>
Cc:	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>,
	"linux-kernel@...r.kernel.org >> Linux Kernel Mailing List" 
	<linux-kernel@...r.kernel.org>, KVM list <kvm@...r.kernel.org>,
	Oleg Nesterov <oleg@...hat.com>,
	Paul McKenney <paulmck@...ux.vnet.ibm.com>
Subject: Re: [4.2] commit d59cfc09c32 (sched, cgroup: replace
 signal_struct->group_rwsem with a global percpu_rwsem) causes regression for
 libvirt/kvm

Hello,

On Tue, Sep 15, 2015 at 03:36:34PM +0200, Christian Borntraeger wrote:
> >> The problem seems to be that the newly used percpu_rwsem does a
> >> rcu_synchronize_sched_expedited for all write downs/ups.
> > 
> > Can you try:
> > 
> > git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev.2015.09.11ab
> 
> yes, dev.2015.09.11a seems to help, thanks. Getting rid of the expedited hammer was
> really helpful - I guess.

Ah, that's nice.  I mentioned this in the original patchset:
percpu_rwsem as previously implemented could be too heavy on the
writer side for this path, and I was planning to implement an
rwsem-based lglock if this blew up.  That said, if Oleg's changes make
the issue go away, all the better.

> > those include Oleg's rework of the percpu rwsem which should hopefully
> > improve things somewhat.
> > 
> > But yes, pounding a global lock on a big machine will always suck.
> 
> By hacking out the fast path I actually degraded percpu rwsem to a real global lock, but
> things were still a lot faster. 
> I am wondering why the old code behaved in such fatal ways. Is there some interaction 
> between waiting for a reschedule in the synchronize_sched writer and some fork code 
> actually waiting for the read side to get the lock together with some rescheduling going
> on waiting for a lock that fork holds? lockdep does not give me any hints so I have no clue :-(

percpu_rwsem is a global lock.  My rough suspicion is that the writer
locking path was taking too long (especially if the kernel has
preemption disabled), causing the writers to back up badly and starve
the readers.
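
To put rough numbers on that suspicion, here is a back-of-the-envelope
model in plain userspace C.  It is only a sketch, not kernel code and
not a measurement: the function names, the strictly serialized writers
and the fixed per-writer grace-period cost are all assumptions made up
for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model (assumption, not kernel code): each write-side acquisition
 * pays a fixed grace-period wait plus the time spent in its critical
 * section, and writers are served strictly one after another.
 */
uint64_t writer_hold_us(uint64_t grace_us, uint64_t work_us)
{
	assert(grace_us <= UINT64_MAX - work_us);	/* reject overflow */
	return grace_us + work_us;
}

/*
 * Time a reader stalls if it arrives behind 'queued_writers' writers
 * and has to wait for all of them to finish.
 */
uint64_t reader_stall_us(uint64_t queued_writers, uint64_t grace_us,
			 uint64_t work_us)
{
	uint64_t hold = writer_hold_us(grace_us, work_us);

	assert(hold == 0 || queued_writers <= UINT64_MAX / hold);
	return queued_writers * hold;
}
```

Under these made-up inputs, reader_stall_us(100, 10000, 50) comes to
1005000 us, i.e. about one second of reader stall, while the same 100
writers without any grace-period wait, reader_stall_us(100, 0, 50),
cost only 5000 us.  In this model the per-writer wait, not the lock
being global, dominates.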

Thanks.

-- 
tejun