linux-kernel - Re: [RFC PATCH 14/16] irq: Add support for core-wide protection of IRQ and softirq

Message-ID: <ed837e01-043b-e19b-293c-30d44df6f3a8@linux.intel.com>
Date:   Fri, 10 Jul 2020 20:19:24 +0800
From:   "Li, Aubrey" <aubrey.li@...ux.intel.com>
To:     Vineeth Remanan Pillai <vpillai@...italocean.com>,
        Nishanth Aravamudan <naravamudan@...italocean.com>,
        Julien Desfossez <jdesfossez@...italocean.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Tim Chen <tim.c.chen@...ux.intel.com>, mingo@...nel.org,
        tglx@...utronix.de, pjt@...gle.com, torvalds@...ux-foundation.org
Cc:     "Joel Fernandes (Google)" <joel@...lfernandes.org>,
        linux-kernel@...r.kernel.org, subhra.mazumdar@...cle.com,
        fweisbec@...il.com, keescook@...omium.org, kerrnel@...gle.com,
        Phil Auld <pauld@...hat.com>, Aaron Lu <aaron.lwe@...il.com>,
        Aubrey Li <aubrey.intel@...il.com>,
        Valentin Schneider <valentin.schneider@....com>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Pawan Gupta <pawan.kumar.gupta@...ux.intel.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Joel Fernandes <joelaf@...gle.com>, vineethrp@...il.com,
        Chen Yu <yu.c.chen@...el.com>,
        Christian Brauner <christian.brauner@...ntu.com>,
        Tim Chen <tim.c.chen@...el.com>,
        "Paul E . McKenney" <paulmck@...nel.org>
Subject: Re: [RFC PATCH 14/16] irq: Add support for core-wide protection of
 IRQ and softirq

Hi Joel/Vineeth,

On 2020/7/1 5:32, Vineeth Remanan Pillai wrote:
> From: "Joel Fernandes (Google)" <joel@...lfernandes.org>
> 
> With the current core scheduling patchset, non-threaded IRQ and softirq
> victims can leak data from their hyperthread to a sibling hyperthread
> running an attacker.
> 
> For MDS, it is possible for the IRQ and softirq handlers to leak data to
> either host or guest attackers. For L1TF, it is possible to leak to
> guest attackers. No mitigation based on flushing buffers can avoid
> this, since the attacker and the victims execute concurrently on 2 or
> more HTs.
> 
> The solution in this patch is to monitor the outer-most core-wide
> irq_enter() and irq_exit() executed by any sibling. In between these
> two, we mark the core to be in a special core-wide IRQ state.
> 
> In the IRQ entry, if we detect that the sibling is running untrusted
> code, we send a reschedule IPI so that the sibling transitions through
> the sibling's irq_exit() to do any waiting there, till the IRQ being
> protected finishes.
> 
> We also monitor the per-CPU outer-most irq_exit(). If during the per-cpu
> outer-most irq_exit(), the core is still in the special core-wide IRQ
> state, we perform a busy-wait till the core exits this state. This
> combination of per-cpu and core-wide IRQ states helps to handle any
> combination of irq_enter()s and irq_exit()s happening on all of the
> siblings of the core in any order.
> 
> Lastly, we also check in the schedule loop if we are about to schedule
> an untrusted process while the core is in such a state. This is possible
> if a trusted thread enters the scheduler by way of yielding CPU. This
> would involve no transitions through the irq_exit() point to do any
> waiting, so we have to explicitly do the waiting there.
> 
> Every attempt is made to prevent a busy-wait unnecessarily, and in
> testing on real-world ChromeOS usecases, it has not shown a performance
> drop. In ChromeOS, with this and the rest of the core scheduling
> patchset, we see around a 300% improvement in key press latencies into
> Google Docs when Camera streaming is running simultaneously (90th
> percentile latency of ~150ms drops to ~50ms).
> 
> This feature is controlled by the build-time config option
> CONFIG_SCHED_CORE_IRQ_PAUSE and is enabled by default. There is also a
> kernel boot parameter 'sched_core_irq_pause' to enable/disable the
> feature at boot time. Default is enabled at boot time.
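
As a reading aid, the entry/exit accounting described in the quoted
changelog can be modelled roughly as below. This is a minimal
single-threaded userspace sketch under stated assumptions, not code from
the patch: struct core_model, model_irq_enter() and model_irq_exit() are
hypothetical names, the reschedule IPI is reduced to a counter, and the
busy-wait is reduced to a boolean "would have to wait" result.

```c
#include <assert.h>
#include <stdbool.h>

#define SIBLINGS 2

/* Hypothetical model of one core; field names are illustrative only. */
struct core_model {
	int cpu_irq_nest[SIBLINGS];	/* per-CPU IRQ nesting depth */
	int core_irq_nest;		/* siblings inside an outermost IRQ */
	bool untrusted[SIBLINGS];	/* sibling is running untrusted code */
	int ipis_sent;			/* stand-in for reschedule IPIs */
};

/*
 * Outermost irq_enter() on a CPU puts the core into the core-wide IRQ
 * state and notifies every sibling that is running untrusted code.
 */
static void model_irq_enter(struct core_model *c, int cpu)
{
	if (c->cpu_irq_nest[cpu]++ > 0)
		return;			/* nested entry, nothing to do */

	c->core_irq_nest++;
	for (int i = 0; i < SIBLINGS; i++)
		if (i != cpu && c->untrusted[i])
			c->ipis_sent++;
}

/*
 * Outermost irq_exit() on a CPU drops its contribution to the core-wide
 * state. Returns true if the core is still in that state, i.e. the real
 * code would busy-wait here until the other sibling finishes.
 */
static bool model_irq_exit(struct core_model *c, int cpu)
{
	if (--c->cpu_irq_nest[cpu] > 0)
		return false;		/* still nested on this CPU */

	c->core_irq_nest--;
	return c->core_irq_nest > 0;
}
```

In this model, if CPU 0 takes an IRQ while CPU 1 runs untrusted code,
CPU 1 is sent an IPI, and its own model_irq_exit() reports that it has to
wait until CPU 0 leaves its handler. A wait that never terminates at that
point would look like the lockups reported below, but the sketch does not
attempt to reproduce them.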

We saw a lot of soft lockups on the screen when we tested v6.

[  186.527883] watchdog: BUG: soft lockup - CPU#86 stuck for 22s! [uperf:5551]
[  186.535884] watchdog: BUG: soft lockup - CPU#87 stuck for 22s! [uperf:5444]
[  186.555883] watchdog: BUG: soft lockup - CPU#89 stuck for 22s! [uperf:5547]
[  187.547884] rcu: INFO: rcu_sched self-detected stall on CPU
[  187.553760] rcu: 	40-....: (14997 ticks this GP) idle=49a/1/0x4000000000000002 softirq=1711/1711 fqs=7279 
[  187.564685] NMI watchdog: Watchdog detected hard LOCKUP on cpu 14
[  187.564723] NMI watchdog: Watchdog detected hard LOCKUP on cpu 38

The problem goes away when we revert this patch. We are running multiple
uperf threads (as many as there are CPUs) in a cgroup with coresched enabled.
This is 100% reproducible on our side.

Just wondering whether this is already a known issue before we dig into it.

Thanks,
-Aubrey