linux-kernel - Re: [RFC][PATCH 03/16] sched: Wrap rq::lock access


Message-ID: <15f3f7e6-5dce-6bbf-30af-7cffbd7bb0c3@oracle.com>
Date:   Tue, 19 Mar 2019 19:29:00 -0700
From:   Subhra Mazumdar <subhra.mazumdar@...cle.com>
To:     Julien Desfossez <jdesfossez@...italocean.com>,
        Peter Zijlstra <peterz@...radead.org>, mingo@...nel.org,
        tglx@...utronix.de, pjt@...gle.com, tim.c.chen@...ux.intel.com,
        torvalds@...ux-foundation.org
Cc:     linux-kernel@...r.kernel.org, fweisbec@...il.com,
        keescook@...omium.org, kerrnel@...gle.com,
        Vineeth Pillai <vpillai@...italocean.com>,
        Nishanth Aravamudan <naravamudan@...italocean.com>
Subject: Re: [RFC][PATCH 03/16] sched: Wrap rq::lock access


On 3/18/19 8:41 AM, Julien Desfossez wrote:
> The case where we try to acquire the lock on 2 runqueues belonging to 2
> different cores requires the rq_lockp wrapper as well otherwise we
> frequently deadlock in there.
>
> This fixes the crash reported in
> 1552577311-8218-1-git-send-email-jdesfossez@...italocean.com
>
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 76fee56..71bb71f 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -2078,7 +2078,7 @@ static inline void double_rq_lock(struct rq *rq1, struct rq *rq2)
>   		raw_spin_lock(rq_lockp(rq1));
>   		__acquire(rq2->lock);	/* Fake it out ;) */
>   	} else {
> -		if (rq1 < rq2) {
> +		if (rq_lockp(rq1) < rq_lockp(rq2)) {
>   			raw_spin_lock(rq_lockp(rq1));
>   			raw_spin_lock_nested(rq_lockp(rq2), SINGLE_DEPTH_NESTING);
>   		} else {
With this fix and my previous NULL pointer fix, my stress tests are
surviving. I re-ran my 2 DB instance setup on a 44-core, 2-socket system,
putting each DB instance in a separate core scheduling group. The numbers
look much worse now.

users  baseline  %stdev  %idle  core_sched  %stdev  %idle
16     1         0.3     66     -73.4%      136.8   82
24     1         1.6     54     -95.8%      133.2   81
32     1         1.5     42     -97.5%      124.3   89

I also notice that if I enable a bunch of debug configs related to
mutexes, spin locks, lockdep etc. (which I did earlier to debug the
deadlock), it opens up a can of worms with multiple crashes.
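
As an aside on the quoted double_rq_lock() change, here is a minimal
userspace sketch of why ordering by rq address can deadlock once several
runqueues map to one lock. It assumes, as I understand the series, that
rq_lockp() returns a lock shared by the SMT siblings of a core. The struct
layouts and the two first_lock_by_*() helpers are hypothetical and only
model which lock would be taken first; this is not kernel code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model types, not the kernel's definitions. */
struct lock { int unused; };
struct rq {
	struct lock *core_lock;	/* assumed shared by all siblings of a core */
};

/* Stand-in for rq_lockp(): returns the (possibly shared) lock of a rq. */
static struct lock *rq_lockp(struct rq *rq)
{
	return rq->core_lock;
}

/* Old rule: order by rq address. Returns the lock acquired first. */
static struct lock *first_lock_by_rq(struct rq *rq1, struct rq *rq2)
{
	assert(rq1 && rq2);
	return (uintptr_t)rq1 < (uintptr_t)rq2 ? rq_lockp(rq1) : rq_lockp(rq2);
}

/* Patched rule: order by lock address. Returns the lock acquired first. */
static struct lock *first_lock_by_lockp(struct rq *rq1, struct rq *rq2)
{
	struct lock *l1, *l2;

	assert(rq1 && rq2);
	l1 = rq_lockp(rq1);
	l2 = rq_lockp(rq2);
	return (uintptr_t)l1 < (uintptr_t)l2 ? l1 : l2;
}
```

Take three runqueues at increasing addresses, where the first and third
share core lock X and the second uses core lock Y. Under the old rule,
locking the pair (first, second) takes X then Y, while locking the pair
(second, third) takes Y then X, which is the classic ABBA pattern. Under
the patched rule both pairs take the lower-addressed lock first, so the
order is consistent.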