[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c794b22ed0e91356e2c1a39849a4b893@mailhost.ics.forth.gr>
Date: Thu, 12 May 2022 05:32:50 +0300
From: Nick Kossifidis <mick@....forth.gr>
To: Xianting Tian <xianting.tian@...ux.alibaba.com>
Cc: paul.walmsley@...ive.com, palmer@...belt.com,
aou@...s.berkeley.edu, akpm@...ux-foundation.org,
anup@...infault.org, wangkefeng.wang@...wei.com, rppt@...nel.org,
alex@...ti.fr, twd2.me@...il.com, david@...hat.com,
seanjc@...gle.com, petr.pavlu@...e.com, atishp@...osinc.com,
linux-riscv@...ts.infradead.org, linux-kernel@...r.kernel.org,
Nick Kossifidis <mick@....forth.gr>
Subject: Re: [PATCH] RISC-V: Remove IORESOURCE_BUSY flag for no-map reserved
memory
Hello Xianting,
> ---
> arch/riscv/kernel/setup.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> index 834eb652a7b9..71f2966b1474 100644
> --- a/arch/riscv/kernel/setup.c
> +++ b/arch/riscv/kernel/setup.c
> @@ -214,7 +214,7 @@ static void __init init_resources(void)
>
> if (unlikely(memblock_is_nomap(region))) {
> res->name = "Reserved";
> - res->flags = IORESOURCE_MEM | IORESOURCE_BUSY;
> + res->flags = IORESOURCE_MEM;
> } else {
> res->name = "System RAM";
> res->flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
The short story:
This makes sense but we should at least mark them as
IORESOURCE_EXCLUSIVE, and also remove IORESOURCE_BUSY from line 192
above
(https://elixir.bootlin.com/linux/v5.18-rc6/source/arch/riscv/kernel/setup.c#L192).
The long story:
The spec says about no-map:
"
Indicates the operating system must not create a virtual mapping
of the region as part of its standard mapping of system memory,
nor permit speculative access to it under any circumstances other
than under the control of the device driver using the region.
"
It basically says that only the driver that uses the region should be
able to create mappings for it and access it, and even that is not
enough to prevent speculative access to the region by someone other than
the driver. The thing is we can't implement this in a simple way because
-to begin with- we don't have a way to pin those regions to specific
devices/drivers, the memory-region binding doesn't say anything about
that. When using devm_ioremap_resource() the first driver to claim the
resource will mark it as busy so other drivers using the same api won't
be able to use it, however the region can still be mapped in other ways
(e.g. through ioremap directly) so using the resource tree to
track/protect the regions marked with no-map isn't enough. They can even
be accessed from userspace through /dev/mem unless we add them to the
resource tree as IORESOURCE_MEM and enable/set
CONFIG_IO_STRICT_DEVMEM/iomem=strict, but even then in case the
corresponding ioresource isn't busy (e.g. hasn't been claimed by a
driver yet through devm_ioremap_resource) we still have to mark them as
IORESOURCE_EXCLUSIVE for iomem_is_exclusive() to do the trick
(https://elixir.bootlin.com/linux/v5.18-rc6/source/kernel/resource.c#L1739)
and prevent access through /dev/mem.
Finally the definition of no-map and the definition of MEMBLOCK_NOMAP
are not the same, all MEMBLOCK_NOMAP says is "don't add to kernel direct
mapping" so ioremap that uses vmalloc can still be used by everyone, in
general it's a mess. It becomes worse, if you mark a reserved-memory
region with no-map and that region overlaps with /memory,
early_init_dt_reserve_memory_arch() will isolate it, mark it as
MEMBLOCK_NOMAP and won't add it to memblock.reserved, if it doesn't
overlap it will just ignore it and still not add it to
memblock.reserved. So if we want to add a reserved-memory region that
doesn't overlap with /memory (a valid scenario allowed by the binding),
there is no way to mark it with no-map.
When I wrote that code I was looking for a way to prevent access to
reserved regions through /dev/mem and by other drivers, regardless of
being part of /memory or not, and since I couldn't mark them with no-map
to track them because of early_init_dt_reserve_memory_arch(), I marked
them as busy and then used them from the driver with ioremap directly.
It was a temporary measure until I had a better approach (e.g. override
ioremap / devmem_is_allowed like other archs do) but I never got the
time to finish it, sorry for the mess !
Regards,
Nick
Powered by blists - more mailing lists