linux-kernel - Re: Why is the ARM SMMU v1/v2 put into bypass mode on kexec?


Message-ID: <67afde12-3fed-4298-9c5e-fbb4819c52a8@arm.com>
Date: Tue, 2 Apr 2024 17:32:49 +0100
From: Robin Murphy <robin.murphy@....com>
To: Will Deacon <will@...nel.org>
Cc: Tyler Hicks <code@...icks.com>, Jason Gunthorpe <jgg@...pe.ca>,
 Jerry Snitselaar <jsnitsel@...hat.com>,
 linux-arm-kernel@...ts.infradead.org, iommu@...ts.linux.dev,
 linux-kernel@...r.kernel.org, Dexuan Cui <decui@...rosoft.com>,
 Easwar Hariharan <eahariha@...ux.microsoft.com>
Subject: Re: Why is the ARM SMMU v1/v2 put into bypass mode on kexec?

On 2024-03-22 3:51 pm, Will Deacon wrote:
> On Tue, Mar 19, 2024 at 06:17:39PM +0000, Robin Murphy wrote:
>> In terms of the shutdown behaviour, I think it actually works out as-is. For
>> the normal case we haven't touched GBPA, so we are truly returning to the
>> boot-time condition; in the unexpected case where SMMUEN was already enabled
> >> then we'll go into an explicit GBPA abort state, but that seems a
>> not-unreasonable compromise for not preserving the entire boot-time Stream
>> Table etc., whose presence kind of implies it wouldn't have been bypassing
>> everything anyway.
>>
>> The more I look at the remaining aspect of disable_bypass for controlling
>> broken-DT behaviour the more I suspect it can't actually be useful either
>> way, especially not since default domains. I have no memory of what my
>> original reasoning might have been, so I'm inclined to just rip that all out
>> and let probe fail. I see no reason these days not to expect a broken DT to
> >> lead to a broken system, especially not now with DTSchema validation.
> 
> That sounds reasonable to me, although we may end up having to back it
> out if we regress systems with borked firmware :(
> 
>> Then there's just the kdump warning it suppresses, of which I also have no
>> idea why it's there either, but apparently that one's on you :P
> 
> I think _that_ one is because the previous (crashed) kernel won't have
> torn anything down, so there could be active DMA using translations in
> the SMMU. In that case, the crashkernel (which is running from some
> carveout) may find the SMMU enabled, but it really can't stick it into
> bypass mode because that's likely to corrupt random memory. So in that
> case, we do stick it into abort before we reinitialise it and then we
> disable fault reporting altogether to avoid the log spam:
> 
> 	if (is_kdump_kernel())
> 		enables &= ~(CR0_EVTQEN | CR0_PRIQEN);
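
As a standalone illustration of the quoted snippet, here is a minimal userspace sketch of how the CR0 enable mask might be computed. The helper name smmu_cr0_enables() and the bit positions are assumptions made for this example, not copied from the driver; check the driver header for the real definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions assumed for illustration only. */
#define CR0_SMMUEN	(1u << 0)
#define CR0_PRIQEN	(1u << 1)
#define CR0_EVTQEN	(1u << 2)
#define CR0_CMDQEN	(1u << 3)

/*
 * Hypothetical helper: return the CR0 enable bits to program once the
 * SMMU has been reinitialised. 'kdump' stands in for is_kdump_kernel().
 */
static uint32_t smmu_cr0_enables(bool kdump)
{
	uint32_t enables = CR0_SMMUEN | CR0_CMDQEN | CR0_EVTQEN | CR0_PRIQEN;

	/* In a crash kernel, leave the event and PRI queues off to avoid log spam. */
	if (kdump)
		enables &= ~(CR0_EVTQEN | CR0_PRIQEN);

	return enables;
}

/* Usage example: translation stays enabled, only fault reporting is dropped. */
static void smmu_cr0_enables_selfcheck(void)
{
	assert(smmu_cr0_enables(false) & CR0_EVTQEN);
	assert(smmu_cr0_enables(true) & CR0_SMMUEN);
	assert(!(smmu_cr0_enables(true) & (CR0_EVTQEN | CR0_PRIQEN)));
}
```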

Oh, I know why we do what we do for the kdump situation in general - it 
was merely the matter of why we chose to demand that the user explicitly 
tells us to do what we know is the right thing (and scream at them if 
they don't), rather than to just go ahead and do the right thing anyway.

(the significance of disable_bypass for kdump is after we turn the SMMU 
back on from GBPA Abort state - we don't want any ongoing traffic being 
able to inadvertently bypass via an STE config either)
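
To make the "just do the right thing anyway" alternative concrete, here is a rough sketch of one way it could look. This is not what the driver does today, and default_ste_cfg() and the enum are hypothetical names invented for this example (the enum values are not the hardware STE.Config encodings).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for this sketch; not the hardware encodings. */
enum ste_cfg {
	STE_CFG_ABORT,
	STE_CFG_BYPASS,
};

/*
 * Hypothetical policy: pick the config for a stream with no domain attached.
 * In a kdump kernel, abort regardless of what the user asked for, so that
 * DMA left running by the crashed kernel cannot bypass translation.
 */
static enum ste_cfg default_ste_cfg(bool disable_bypass, bool kdump)
{
	if (kdump || disable_bypass)
		return STE_CFG_ABORT;

	return STE_CFG_BYPASS;
}

/* Usage example: the user's setting only matters outside of kdump. */
static void default_ste_cfg_selfcheck(void)
{
	assert(default_ste_cfg(false, true) == STE_CFG_ABORT);
	assert(default_ste_cfg(false, false) == STE_CFG_BYPASS);
}
```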

Cheers,
Robin.