Date: Tue, 19 Mar 2019 19:29:00 -0700
From: Subhra Mazumdar <subhra.mazumdar@...cle.com>
To: Julien Desfossez <jdesfossez@...italocean.com>,
	Peter Zijlstra <peterz@...radead.org>, mingo@...nel.org,
	tglx@...utronix.de, pjt@...gle.com, tim.c.chen@...ux.intel.com,
	torvalds@...ux-foundation.org
Cc: linux-kernel@...r.kernel.org, fweisbec@...il.com, keescook@...omium.org,
	kerrnel@...gle.com, Vineeth Pillai <vpillai@...italocean.com>,
	Nishanth Aravamudan <naravamudan@...italocean.com>
Subject: Re: [RFC][PATCH 03/16] sched: Wrap rq::lock access

On 3/18/19 8:41 AM, Julien Desfossez wrote:
> The case where we try to acquire the lock on 2 runqueues belonging to 2
> different cores requires the rq_lockp wrapper as well otherwise we
> frequently deadlock in there.
>
> This fixes the crash reported in
> 1552577311-8218-1-git-send-email-jdesfossez@...italocean.com
>
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 76fee56..71bb71f 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -2078,7 +2078,7 @@ static inline void double_rq_lock(struct rq *rq1, struct rq *rq2)
>  		raw_spin_lock(rq_lockp(rq1));
>  		__acquire(rq2->lock);	/* Fake it out ;) */
>  	} else {
> -		if (rq1 < rq2) {
> +		if (rq_lockp(rq1) < rq_lockp(rq2)) {
>  			raw_spin_lock(rq_lockp(rq1));
>  			raw_spin_lock_nested(rq_lockp(rq2), SINGLE_DEPTH_NESTING);
>  		} else {

With this fix and my previous NULL pointer fix my stress tests are
surviving. I re-ran my 2 DB instance setup on 44 core 2 socket system by
putting each DB instance in separate core scheduling group. The numbers
look much worse now.

users  baseline  %stdev  %idle  core_sched  %stdev  %idle
16     1         0.3     66     -73.4%      136.8   82
24     1         1.6     54     -95.8%      133.2   81
32     1         1.5     42     -97.5%      124.3   89

I also notice that if I enable a bunch of debug configs related to mutexes,
spin locks, lockdep etc. (which I did earlier to debug the dead lock), it
opens up a can of worms with multiple crashes.
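
For readers following the quoted one-line change, here is a minimal user-space sketch of why the ordering test has to compare the lock pointers rather than the runqueue pointers once several runqueues can share one lock. It is not the kernel code: `toy_rq`, `toy_rq_lockp`, `toy_first_lock`, `toy_double_rq_lock` and `toy_double_rq_unlock` are hypothetical names, pthread mutexes stand in for raw spinlocks, and treating "same lock" as the equality case is an assumption of this sketch. With shared locks, two CPUs can order the same pair of locks differently if they sort by runqueue address, which is the classic ABBA deadlock. Sorting by lock address gives every caller the same order.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rq: several runqueues may point at the
 * same lock, as sibling runqueues of one core do under core scheduling. */
struct toy_rq {
	pthread_mutex_t *lock;
};

/* Hypothetical analogue of rq_lockp(): the lock that protects this rq. */
static pthread_mutex_t *toy_rq_lockp(struct toy_rq *rq)
{
	assert(rq && rq->lock);
	return rq->lock;
}

/* Return the lock to take first. The order depends only on the lock
 * addresses, never on the rq addresses, so all callers agree on it.
 * Casting to uintptr_t avoids relational comparison of unrelated pointers. */
static pthread_mutex_t *toy_first_lock(struct toy_rq *rq1, struct toy_rq *rq2)
{
	pthread_mutex_t *l1 = toy_rq_lockp(rq1);
	pthread_mutex_t *l2 = toy_rq_lockp(rq2);

	return ((uintptr_t)l1 <= (uintptr_t)l2) ? l1 : l2;
}

static void toy_double_rq_lock(struct toy_rq *rq1, struct toy_rq *rq2)
{
	pthread_mutex_t *l1 = toy_rq_lockp(rq1);
	pthread_mutex_t *l2 = toy_rq_lockp(rq2);
	pthread_mutex_t *first = toy_first_lock(rq1, rq2);

	pthread_mutex_lock(first);
	if (l1 != l2)	/* shared lock: take it only once */
		pthread_mutex_lock(first == l1 ? l2 : l1);
}

static void toy_double_rq_unlock(struct toy_rq *rq1, struct toy_rq *rq2)
{
	pthread_mutex_t *l1 = toy_rq_lockp(rq1);
	pthread_mutex_t *l2 = toy_rq_lockp(rq2);

	pthread_mutex_unlock(l1);
	if (l1 != l2)
		pthread_mutex_unlock(l2);
}
```

In this sketch, two pairs of runqueues whose rq addresses sort in opposite directions still take the two shared locks in the same order, because only the lock addresses take part in the comparison.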