Message-ID: <20241004180405.555194-1-yang@os.amperecomputing.com>
Date: Fri, 4 Oct 2024 11:04:05 -0700
From: Yang Shi <yang@...amperecomputing.com>
To: jgg@...pe.ca,
nicolinc@...dia.com,
james.morse@....com,
will@...nel.org,
robin.murphy@....com
Cc: yang@...amperecomputing.com,
linux-arm-kernel@...ts.infradead.org,
iommu@...ts.linux.dev,
linux-kernel@...r.kernel.org
Subject: [v3 PATCH] iommu/arm-smmu-v3: Fix L1 stream table index calculation for 32-bit sid size
Commit ce410410f1a7 ("iommu/arm-smmu-v3: Add arm_smmu_strtab_l1/2_idx()")
calculated the last index of the L1 stream table using 1 << smmu->sid_bits,
where the constant 1 is a 32-bit int.
However, some platforms, for example AmpereOne and platforms with the
ARM MMU-700, have a 32-bit stream ID size. This results in an out-of-bounds
shift.
The disassembly of the shift is:
ldr w2, [x19, 828] //, smmu_7(D)->sid_bits
mov w20, 1
lsl w20, w20, w2
According to the ARM spec, if the registers are 32-bit, the instruction
actually computes:
dest = src << (shift % 32)
So with a shift amount of 32 the value was actually shifted by zero bits.
An out-of-bounds shift is also undefined behavior according to the C
language standard.
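As a rough illustration of the behavior described above, the following
sketch models the register-shift semantics in plain C. The helper names
lsl_w() and lsl_x() are made up for this example and are not kernel
functions. The shift amount is reduced explicitly so that the C code itself
stays well-defined.

```c
#include <stdint.h>

/*
 * Hypothetical model of the 32-bit register shift: the shift amount is
 * taken modulo 32, so a shift by 32 leaves the source unchanged.
 */
static uint32_t lsl_w(uint32_t src, uint32_t shift)
{
	return src << (shift % 32);
}

/*
 * Hypothetical model of the 64-bit register shift: the shift amount is
 * taken modulo 64, so a shift by 32 is carried out in full.
 */
static uint64_t lsl_x(uint64_t src, uint32_t shift)
{
	return src << (shift % 64);
}
```

With sid_bits == 32, lsl_w(1, 32) yields 1 rather than 2^32, so the
computed SID count minus one becomes 0, whereas lsl_x(1, 32) yields the
intended 0x100000000.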
This caused v6.12-rc1 to fail to boot on such platforms.
UBSAN also reported:
UBSAN: shift-out-of-bounds in drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:3628:29
shift exponent 32 is too large for 32-bit type 'int'
Using a 64-bit immediate for the shift solves the problem. The
disassembly after the fix looks like:
ldr w20, [x19, 828] //, smmu_7(D)->sid_bits
mov x0, 1
lsl x0, x0, x20
There are a couple of problematic places, so extract the shift into a helper.
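A minimal userspace sketch of the idea, not the driver code itself: the
sketch_* names are hypothetical, and the value of 256 STEs per L2 table is
an assumption made here only to show how the last L1 index changes once the
shift is done in 64 bits.

```c
#include <stdint.h>

/* Assumption for this sketch: each L2 table covers 256 stream IDs. */
#define SKETCH_L2_STES 256u

/* Number of stream IDs, computed with a 64-bit constant (1ULL). */
static uint64_t sketch_num_sids(unsigned int sid_bits)
{
	return 1ULL << sid_bits;
}

/* L1 index that a given stream ID falls into. */
static uint32_t sketch_l1_idx(uint32_t sid)
{
	return sid / SKETCH_L2_STES;
}

/* L1 index of the last valid stream ID for a given SID size. */
static uint32_t sketch_last_l1_idx(unsigned int sid_bits)
{
	/* num_sids - 1 fits in 32 bits for sid_bits <= 32 */
	return sketch_l1_idx((uint32_t)(sketch_num_sids(sid_bits) - 1));
}
```

Under these assumptions, sketch_last_l1_idx(32) is 0xFFFFFF instead of the
0 that the truncated 32-bit shift produced.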
Fixes: ce410410f1a7 ("iommu/arm-smmu-v3: Add arm_smmu_strtab_l1/2_idx()")
Tested-by: James Morse <james.morse@....com>
Reviewed-by: Jason Gunthorpe <jgg@...dia.com>
Signed-off-by: Yang Shi <yang@...amperecomputing.com>
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 16 +++++++++++-----
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 +++++
2 files changed, 16 insertions(+), 5 deletions(-)
v3: * Some trivial modification to the commit log per Robin Murphy.
* Used "num_sids" instead of "max_sids" per Robin Murphy.
* Returned u64 type for arm_smmu_strtab_num_sids() per Nicolin Chen.
* Checked size in arm_smmu_init_strtab_linear() in order to avoid
overflow per Jason Gunthorpe.
* Collected r-b tag from Jason Gunthorpe.
v2: * Extracted the shift into a helper per Jason Gunthorpe.
* Covered more places per Nicolin Chen and Jason Gunthorpe.
* Used 1ULL instead of 1UL to guarantee 64 bit per Robin Murphy.
* Made the subject more general since this is not an AmpereOne-specific
problem per the report from James Morse.
* Collected t-b tag from James Morse.
* Added Fixes tag in commit log.
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 737c5b882355..9d4fc91d9258 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3624,8 +3624,9 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu)
{
u32 l1size;
struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
+ u64 num_sids = arm_smmu_strtab_num_sids(smmu);
unsigned int last_sid_idx =
- arm_smmu_strtab_l1_idx((1 << smmu->sid_bits) - 1);
+ arm_smmu_strtab_l1_idx(num_sids - 1);
/* Calculate the L1 size, capped to the SIDSIZE. */
cfg->l2.num_l1_ents = min(last_sid_idx + 1, STRTAB_MAX_L1_ENTRIES);
@@ -3655,20 +3656,25 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu)
static int arm_smmu_init_strtab_linear(struct arm_smmu_device *smmu)
{
- u32 size;
+ u64 size;
struct arm_smmu_strtab_cfg *cfg = &smmu->strtab_cfg;
+ u64 num_sids = arm_smmu_strtab_num_sids(smmu);
+
+ size = num_sids * sizeof(struct arm_smmu_ste);
+ /* The max size for dmam_alloc_coherent() is 32-bit */
+ if (size > SIZE_MAX)
+ return -EINVAL;
- size = (1 << smmu->sid_bits) * sizeof(struct arm_smmu_ste);
cfg->linear.table = dmam_alloc_coherent(smmu->dev, size,
&cfg->linear.ste_dma,
GFP_KERNEL);
if (!cfg->linear.table) {
dev_err(smmu->dev,
- "failed to allocate linear stream table (%u bytes)\n",
+ "failed to allocate linear stream table (%llu bytes)\n",
size);
return -ENOMEM;
}
- cfg->linear.num_ents = 1 << smmu->sid_bits;
+ cfg->linear.num_ents = num_sids;
arm_smmu_init_initial_stes(cfg->linear.table, cfg->linear.num_ents);
return 0;
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 1e9952ca989f..c8ceddc5e8ef 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -853,6 +853,11 @@ struct arm_smmu_master_domain {
ioasid_t ssid;
};
+static inline u64 arm_smmu_strtab_num_sids(struct arm_smmu_device *smmu)
+{
+ return (1ULL << smmu->sid_bits);
+}
+
static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
{
return container_of(dom, struct arm_smmu_domain, domain);
--
2.41.0