lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGETcx9ApZFvKjEaxvvgsoHDzOq06ZiROZ5npYt+suNdE4KWDg@mail.gmail.com>
Date:   Mon, 6 Feb 2023 19:32:51 -0800
From:   Saravana Kannan <saravanak@...gle.com>
To:     Sam Protsenko <semen.protsenko@...aro.org>
Cc:     Marek Szyprowski <m.szyprowski@...sung.com>,
        Krzysztof Kozlowski <krzysztof.kozlowski@...aro.org>,
        Joerg Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>,
        Robin Murphy <robin.murphy@....com>,
        Sumit Semwal <sumit.semwal@...aro.org>,
        Alim Akhtar <alim.akhtar@...sung.com>,
        Janghyuck Kim <janghyuck.kim@...sung.com>,
        Cho KyongHo <pullip.cho@...sung.com>,
        Daniel Mentz <danielmentz@...gle.com>,
        David Virag <virag.david003@...il.com>, iommu@...ts.linux.dev,
        linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        linux-samsung-soc@...r.kernel.org, Rob Herring <robh@...nel.org>
Subject: Re: [PATCH v2 0/6] iommu/exynos: Convert to a module

On Fri, Nov 11, 2022 at 5:30 AM Sam Protsenko
<semen.protsenko@...aro.org> wrote:
>
> On Thu, 10 Nov 2022 at 15:36, Marek Szyprowski <m.szyprowski@...sung.com> wrote:
>
> [snip]

Hi Marek and Sam,

I'm replying to both of your comments in this email.

> > I've finally made Exynos IOMMU working as a module on Exynos5433 based
> > TM2e board. It looks that this will be a bit longer journey that I've
> > initially thought. I've posted a simple update of the fix for the driver
> > initialization sequence, but the real problem is in the platform driver
> > framework and OF helpers.
> >
> > Basically to get it working as a module I had to apply the following
> > changes:
> >
> > diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> > index 3dda62503102..f6921f5fcab6 100644
> > --- a/drivers/base/dd.c
> > +++ b/drivers/base/dd.c
> > @@ -257,7 +257,7 @@ static int deferred_devs_show(struct seq_file *s,
> > void *data)
> >   DEFINE_SHOW_ATTRIBUTE(deferred_devs);
> >
> >   #ifdef CONFIG_MODULES
> > -int driver_deferred_probe_timeout = 10;
> > +int driver_deferred_probe_timeout = 30;
> >   #else
> >   int driver_deferred_probe_timeout;
> >   #endif
> > diff --git a/drivers/of/property.c b/drivers/of/property.c
> > index 967f79b59016..e5df6672fee6 100644
> > --- a/drivers/of/property.c
> > +++ b/drivers/of/property.c
> > @@ -1384,7 +1384,7 @@ static struct device_node *parse_interrupts(struct
> > device_node *np,
> >   static const struct supplier_bindings of_supplier_bindings[] = {
> >          { .parse_prop = parse_clocks, },
> >          { .parse_prop = parse_interconnects, },
> > -       { .parse_prop = parse_iommus, .optional = true, },
> > +       { .parse_prop = parse_iommus, },
> >          { .parse_prop = parse_iommu_maps, .optional = true, },
> >          { .parse_prop = parse_mboxes, },
> >          { .parse_prop = parse_io_channels, },
> >
> > Without that a really nasty things happened.

I have a command line option to do this without code changes. Use
fw_devlink.strict=1. That makes all optional properties into mandatory
ones.

I sent out a series[1] that tried to make fw_devlink.strict=1 the
default and then use the timeout behavior (more details) to handle
cases where iommu and dmas (or any other supplier) are optional on a
specific board. The cover letter of [1] should give some more context.

> > Initialization of the built-in drivers and loading modules takes time,
> > so the default 10s deferred probe timeout is not enough to ensure that
> > the built-in driver won't be probed before the Exynos IOMMU driver is
> > loaded.

The 10 second is the minimum delay from the time we hit late_initcall.
If a driver is registered before the 10s expires, then the timer will
be extended by another 10s. This behavior landed sometime around the
end of May 2022. So it should have been in your tree when you tested
this. I'm surprised this isn't sufficient for your case. Is there
really a 10s gap in your boot sequence where no module is being loaded
and then IOMMU modules get loaded later on? I'm kinda surprised by
this. Is it this long because some serial UART is enabled and it's
slowing down boot? Or something else?

I'm not saying your case isn't valid or we shouldn't extend the
timeout. I'm just trying to understand why the current timer behavior
wasn't able to cover your case.

> Yeah, the whole time-based sync looks nasty... I remember coming
> across the slides by Andrzej Hajda called "Deferred Problem" [1], but
> I guess the proposed solution was never applied. Just hope that
> increasing the timeout is upstreamable solution.
>
> [1] https://events19.linuxfoundation.org/wp-content/uploads/2017/12/Deferred-Problem-Issues-With-Complex-Dependencies-Between-Devices-in-Linux-Kernel-Andrzej-Hajda-Samsung.pdf

Sam, I kinda skimmed the slides right now. Looks like it talks about
device links and why they aren't sufficient and makes an alternate
proposal. fw_devlink is a solution that uses device links and I think
addresses a lot of the issues that were raised about device links.
There's still a bunch of TODOs left, but I think the end goal is the
same. I'm hoping to keep chipping away at it. For now, I've tried to
make the timer a bit more smart about detecting when modules are
getting loaded and extending the timer. fw_devlink also enables
something called sync_state() that's invaluable on a fully modular
system (search lore for references to that to get some idea).

The slides talk about a solution that will allow devices to probe with
limited functionality with whatever suppliers are available and then
reprobe as more suppliers are available. I'm not sure how well that'll
work across the board. It's going to be a bit weird if your phone
display goes off and then comes on again because an IOMMU driver got
loaded (and it can now do DRM playback). For now, I'm not going to
focus on that option because there are enough existing issues/TODOs to
work on for fw_devlink.

> > The second change fixes the problem that driver core probes Exynos IOMMU
> > controllers in parallel to probing the master devices, what results in
> > calling exynos_iommu_of_xlate() and exynos_iommu_probe_device() even on
> > the partially initialized IOMMU controllers or initializing the dma_ops
> > under the already probed and working master device. This was easy to
> > observe especially on the master devices with multiple IOMMU
> > controllers. I wasn't able to solve this concurrency/race issues inside
> > the Exynos IOMMU driver.
> >
> > Frankly speaking I don't know what is the rationale for making the
> > 'iommus' property optional, but this simply doesn't work well with IOMMU
> > driver being a module. CCed Saravana and Rob for this.
> >
>
> The patch which makes 'iommus' optional doesn't provide much of
> insight on reasons in commit message either.

This was the commit text:

    Not all DT bindings are mandatory bindings. Add support for optional DT
    bindings and mark iommus, iommu-map, dmas as optional DT bindings.

I thought it was obvious enough, but I guess I could have done better.
Geert convinced me that iommu's aren't always necessary and devices
could work perfectly well without them or dmas. And he has a bunch of
boards like that. So I went with adding optional and then introducing
fw_devlink.strict.

However, at this point in time, I believe none of them should be
marked as optional because technically any property can be optional
depending on what the firmware has set up and what the driver does. We
should figure this out at runtime on a board level -- which is what
[1] is trying to do. Yeah, not very pretty, but there hasn't been a
better solution that's not "have userspace tell us it's done loading
modules" (that's a "kernel depends on userspace to work correctly"
thing that no one likes). I've fixed some of the issues raised in [1]
in a fw_devlink improvement series[2] and I plan on continuing to work
on this until hopefully [1] can land.

> > Without fixing the above issues, I would add a warning that compiling
> > the driver as a module leads to serious issues.
> >
>
> Nice catch! It doesn't reproduce on my platform, alas. Can I expect
> you to submit those patches? If so, I'll probably just wait for those
> to be applied, and then re-send my modularization series on top of it.
> Does that sounds reasonable?

For now, maybe we could add a config to enable fw_devlink.strict=1 by
default and then select it if you make specific iommu drivers into
modules? And then Geert won't set it for his driver, but you can set
it for your driver?

Thanks,
Saravana

[1] - https://lore.kernel.org/lkml/20220601070707.3946847-1-saravanak@google.com/
[2] - https://lore.kernel.org/lkml/20230207014207.1678715-1-saravanak@google.com/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ