[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <44b77976-0efe-02d0-3c4e-4e54f91c4fe7@arm.com>
Date: Tue, 6 Nov 2018 20:20:35 +0000
From: Robin Murphy <robin.murphy@....com>
To: Geert Uytterhoeven <geert@...ux-m68k.org>,
Christoph Hellwig <hch@....de>,
Will Deacon <will.deacon@....com>,
Joerg Roedel <joro@...tes.org>,
Magnus Damm <magnus.damm@...il.com>
Cc: Linux-Renesas <linux-renesas-soc@...r.kernel.org>,
Linux ARM <linux-arm-kernel@...ts.infradead.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Linux IOMMU <iommu@...ts.linux-foundation.org>
Subject: Re: IOMMU breakage on arm64
Hi Geert,
On 2018-11-06 7:44 pm, Geert Uytterhoeven wrote:
> Hi Christoph et al,
>
> On Tue, Oct 23, 2018 at 1:40 AM Linux Kernel Mailing List
> <linux-kernel@...r.kernel.org> wrote:
>> Commit: b4ebe6063204da58e48600b810a97c29ae9e5d12
>> Parent: 7d21ee4c719f00896767ce19c4c01a56374c2ced
>> Refname: refs/heads/master
>> Web: https://git.kernel.org/torvalds/c/b4ebe6063204da58e48600b810a97c29ae9e5d12
>> Author: Christoph Hellwig <hch@....de>
>> AuthorDate: Thu Sep 20 14:04:08 2018 +0200
>> Committer: Christoph Hellwig <hch@....de>
>> CommitDate: Mon Oct 1 07:28:03 2018 -0700
>>
>> dma-direct: implement complete bus_dma_mask handling
>>
>> Instead of rejecting devices with a too small bus_dma_mask we can handle
>> by taking the bus dma_mask into account for allocations and bounce
>> buffering decisions.
>>
>> Signed-off-by: Christoph Hellwig <hch@....de>
>
> I have bisected the following crash to the above commit:
I think that commit mostly just changes the presentation of my
underlying cockup - see here for what should fix it:
https://patchwork.kernel.org/patch/10670177/
I have a feeling we've ironed out crash-on-early-domain-free bugs in the
SMMU drivers already - arm-smmu certainly has an early return in
arm_smmu_destroy_domain_context() which should behave exactly like your
diff below, while I think arm-smmu-v3 gets away with it by virtue of
smmu_domain->cfg being unset, but I'll double-check that when I'm fresh
tomorrow (Jean-Philippe reported SMMUv3 hitting the DMA thing to me
internally, but didn't mention any crash).
Thanks for the report,
Robin.
> ipmmu-vmsa e67b0000.mmu: Cannot accommodate DMA translation for
> IOMMU page tables
> sata_rcar ee300000.sata: Unable to initialize IPMMU context
> iommu: Failed to add device ee300000.sata to group 0: -22
> Unable to handle kernel NULL pointer dereference at virtual
> address 0000000000000038
> Mem abort info:
> ESR = 0x96000004
> Exception class = DABT (current EL), IL = 32 bits
> SET = 0, FnV = 0
> EA = 0, S1PTW = 0
> Data abort info:
> ISV = 0, ISS = 0x00000004
> CM = 0, WnR = 0
> [0000000000000038] user address but active_mm is swapper
> Internal error: Oops: 96000004 [#1] PREEMPT SMP
> CPU: 2 PID: 51 Comm: kworker/2:1 Not tainted
> 4.20.0-rc1-arm64-renesas-dirty #74
> Hardware name: Renesas Salvator-X 2nd version board based on
> r8a7795 ES2.0+ (DT)
> Workqueue: events deferred_probe_work_func
> pstate: 60000005 (nZCv daif -PAN -UAO)
> pc : ipmmu_domain_free+0x1c/0xa0
> lr : ipmmu_domain_free+0x14/0xa0
> sp : ffff000009c9b990
> x29: ffff000009c9b990 x28: 0000000000000000
> x27: ffff000008dff000 x26: 0000000000000000
> x25: ffff000008dd9000 x24: 0000000000000014
> x23: ffff8006fffdbb20 x22: ffff000008dff000
> x21: ffff8006f898e680 x20: ffff8006f8fa6c00
> x19: ffff8006f8fa6c08 x18: 0000000000000037
> x17: 0000000000000020 x16: ffff000008a8f780
> x15: ffff8006fb096f10 x14: ffff8006f93731c8
> x13: 0000000000000000 x12: ffff8006fb096f10
> x11: ffff8006f9372fe8 x10: ffff000008dff708
> x9 : ffff8006fb096f50 x8 : ffff000008dff708
> x7 : ffff0000089f1858 x6 : 0000000000000000
> x5 : 0000000000000000 x4 : ffff8006f89a3000
> x3 : 00008006f7131000 x2 : ffff8006fb2b5700
> x1 : 0000000000000000 x0 : 0000000000000028
> Process kworker/2:1 (pid: 51, stack limit = 0x(____ptrval____))
> Call trace:
> ipmmu_domain_free+0x1c/0xa0
> iommu_group_release+0x48/0x68
> kobject_put+0x74/0xe8
> kobject_del.part.0+0x3c/0x50
> kobject_put+0x60/0xe8
> iommu_group_get_for_dev+0xa8/0x1f0
> ipmmu_add_device+0x1c/0x40
> of_iommu_configure+0x118/0x190
> of_dma_configure+0xcc/0x1f0
> platform_dma_configure+0x18/0x28
> really_probe+0x94/0x2a8
> driver_probe_device+0x58/0x100
> __device_attach_driver+0x90/0xd0
> bus_for_each_drv+0x64/0xc8
> __device_attach+0xd8/0x130
> device_initial_probe+0x10/0x18
> bus_probe_device+0x98/0xa0
> deferred_probe_work_func+0x74/0xb0
> process_one_work+0x294/0x6f0
> worker_thread+0x238/0x460
> kthread+0x120/0x128
> ret_from_fork+0x10/0x1c
> Code: aa0003f3 97ffeb1a d1002274 f85f8261 (f9401c20)
> ---[ end trace 4c46c7fd7cd07245 ]---
>
> Reproducing on v4.20-rc1 requires CONFIG_IPMMU_VMSA=y, and whitelisting
> SATA for IOMMU use (https://patchwork.kernel.org/patch/10544059/).
> For older versions you also have to backport commit 3a0832d093693ede ("arm64:
> dts: renesas: salvator-xs: enable SATA").
>
> Actually there are two issues at hand:
>
> 1) drivers/iommu/io-pgtable-arm.c:__arm_lpae_alloc_pages() allocates
> memory (above 4 GiB) and maps it for DMA use, but it is rejected due
> to:
>
> dma_map_single(dev, pages, size, DMA_TO_DEVICE) != virt_to_phys(pages)
>
>> --- a/include/linux/dma-direct.h
>> +++ b/include/linux/dma-direct.h
>> @@ -27,7 +27,8 @@ static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size)
>> if (!dev->dma_mask)
>> return false;
>>
>> - return addr + size - 1 <= *dev->dma_mask;
>> + return addr + size - 1 <=
>> + min_not_zero(*dev->dma_mask, dev->bus_dma_mask);
>
> *dev->dma_mask = 0xffffffffff (40-bit)
> dev->bus_dma_mask = 0xffffffff (32-bit)
>
> Hence before, we had (in __arm_lpae_alloc_pages()):
>
> arm-lpae io-pgtable: pages = ffff8006f88f3000
> arm-lpae io-pgtable: dma = 0x00000007388f3000
> arm-lpae io-pgtable: virt_to_phys(pages) = 0x00000007388f3000
>
> After this change, we have:
>
> arm-lpae io-pgtable: pages = ffff8006f882b000
> arm-lpae io-pgtable: dma = 0x0000000074009000
> arm-lpae io-pgtable: virt_to_phys(pages) = 0x000000073882b000
>
> And SATA runs without using the IOMMU.
>
> 2) The Renesas IPMMU driver doesn't handle the above failure well,
> leading to a NULL pointer dereference.
> This can be fixed using the (gmail-whitespace-damaged) patch below:
>
> --- a/drivers/iommu/ipmmu-vmsa.c
> +++ b/drivers/iommu/ipmmu-vmsa.c
> @@ -635,7 +635,8 @@ static void ipmmu_domain_free(struct iommu_domain
> *io_domain)
> * been detached.
> */
> iommu_put_dma_cookie(io_domain);
> - ipmmu_domain_destroy_context(domain);
> + if (domain->mmu)
> + ipmmu_domain_destroy_context(domain);
> free_io_pgtable_ops(domain->iop);
> kfree(domain);
> }
>
> I expect drivers/iommu/arm-smmu-v3.c and drivers/iommu/arm-smmu.c
> need similar fixes.
> I didn't check all drivers, but e.g. drivers/iommu/amd_iommu.c has
> a similar check.
>
> Does the IOMMU work on other arm64 platforms in v4.20-rc1?
>
> Thanks for your comments!
>
> Gr{oetje,eeting}s,
>
> Geert
>
Powered by blists - more mailing lists