linux-kernel - Re: [PATCH v2 2/2] iopoll: Do not use timekeeping in read_poll_timeout

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMuHMdXEP97-hYKx83Bk02=gofsZMJwiHAsz6KzkKeD3Swxf3A@mail.gmail.com>
Date:   Fri, 12 May 2023 10:03:22 +0200
From:   Geert Uytterhoeven <geert@...ux-m68k.org>
To:     Ulf Hansson <ulf.hansson@...aro.org>
Cc:     Stephen Boyd <sboyd@...nel.org>,
        Tomasz Figa <tomasz.figa@...il.com>,
        Sylwester Nawrocki <s.nawrocki@...sung.com>,
        Will Deacon <will@...nel.org>, Arnd Bergmann <arnd@...db.de>,
        Wolfram Sang <wsa+renesas@...g-engineering.com>,
        Dejin Zheng <zhengdejin5@...il.com>,
        Kai-Heng Feng <kai.heng.feng@...onical.com>,
        Nicholas Piggin <npiggin@...il.com>,
        Heiko Carstens <hca@...ux.ibm.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Russell King <linux@...linux.org.uk>,
        John Stultz <jstultz@...gle.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Tony Lindgren <tony@...mide.com>,
        Krzysztof Kozlowski <krzk@...nel.org>,
        Tero Kristo <tero.kristo@...ux.intel.com>,
        "Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        linux-arm-kernel@...ts.infradead.org,
        linux-renesas-soc@...r.kernel.org, linux-pm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 2/2] iopoll: Do not use timekeeping in read_poll_timeout_atomic()

Hi Ulf,

On Fri, May 12, 2023 at 9:54 AM Ulf Hansson <ulf.hansson@...aro.org> wrote:
> On Thu, 11 May 2023 at 14:44, Geert Uytterhoeven <geert@...ux-m68k.org> wrote:
> > On Thu, May 11, 2023 at 12:27 PM Ulf Hansson <ulf.hansson@...aro.org> wrote:
> > > On Wed, 10 May 2023 at 15:23, Geert Uytterhoeven
> > > <geert+renesas@...der.be> wrote:
> > > > read_poll_timeout_atomic() uses ktime_get() to implement the timeout
> > > > feature, just like its non-atomic counterpart.  However, there are
> > > > several issues with this, due to its use in atomic contexts:
> > > >
> > > >   1. When called in the s2ram path (as typically done by clock or PM
> > > >      domain drivers), timekeeping may be suspended, triggering the
> > > >      WARN_ON(timekeeping_suspended) in ktime_get():
> > > >
> > > >         WARNING: CPU: 0 PID: 654 at kernel/time/timekeeping.c:843 ktime_get+0x28/0x78
> > > >
> > > >      Calling ktime_get_mono_fast_ns() instead of ktime_get() would get
> > > >      rid of that warning.  However, that would break timeout handling,
> > > >      as (at least on systems with an ARM architectured timer), the time
> > > >      returned by ktime_get_mono_fast_ns() does not advance while
> > > >      timekeeping is suspended.
> > > >      Interestingly, (on the same ARM systems) the time returned by
> > > >      ktime_get() does advance while timekeeping is suspended, despite
> > > >      the warning.
> > >
> > > Interesting, looks like we should spend some time to further
> > > investigate this behaviour.
> >
> > Probably, I was a bit surprised by this behavior, too.
> >
> > > >   2. Depending on the actual clock source, and especially before a
> > > >      high-resolution clocksource (e.g. the ARM architectured timer)
> > > >      becomes available, time may not advance in atomic contexts, thus
> > > >      breaking timeout handling.
> > > >
> > > > Fix this by abandoning the idea that one can rely on timekeeping to
> > > > implement timeout handling in all atomic contexts, and switch from a
> > > > global time-based to a locally-estimated timeout handling.  In most
> > > > (all?) cases the timeout condition is exceptional and an error
> > > > condition, hence any additional delays due to underestimating wall clock
> > > > time are irrelevant.
> > >
> > > I wonder if this isn't an oversimplification of the situation. Don't
> > > we have timeout-error-conditions that we expected to happen quite
> > > frequently?
> >
> > We may have some.  But they definitely do not happen when time
> > does not advance, or they would have been mitigated long ago
> > (the loop would never terminate).
>
> Right, I was merely thinking of the case when ktime isn't suspended,
> which of course is the most common case.
>
> >
> > > If so, in these cases, we really don't want to continue looping longer
> > > than actually needed, as then we will remain in the atomic context
> > > longer than necessary.
> > >
> > > I guess some information about how big these additional delays could
> > > be, would help to understand better. Of course, it's not entirely easy
> > > to get that data, but did you run some tests to see how this changes?
> >
> > I did some timings (when timekeeping is available), and the differences
> > are rather minor.  The delay and timeout parameters are in µs, and
> > 1 µs is already a few orders of magnitude larger than the cycle time
> > of a contemporary CPU.
>
> Ohh, I was certainly expecting a bigger spread. If it's in that
> ballpark we should certainly be fine.
>
> I will run some tests at my side too, as I am curious to see the
> behaviour. I will let you know, whatever the result is, of course.
>
> >
> > Under-estimates are due to the time spent in op() (depends on the
> > user, typical use is a hardware device register read), udelay()
> > (architecture/platform-dependent accuracy), and general loop overhead.
>
> Yes, you are right. My main concern is the accuracy of the udelay, but
> I may be totally wrong here.
>
> >
> > > > Signed-off-by: Geert Uytterhoeven <geert+renesas@...der.be>
> > > > ---
> > > > Alternatively, one could use a mixed approach (use both
> > > > ktime_get_mono_fast_ns() and a local (under)estimate, and timeout on the
> > > > earliest occasion), but I think that would complicate things without
> > > > much gain.
> > >
> > > Another option could be to provide two different polling APIs for the
> > > atomic use-case.
> > >
> > > One that keeps using ktime, which is more accurate and generally
> > > favourable - and another, along the lines of what you propose, that
> > > should be used by those that can't rely on timekeeping.
> >
> > At the risk of people picking the wrong one, leading to hard to
> > find bugs?
>
> I agree, If we don't need two APIs, it's certainly better to stick with one.
>
> My main point is that we should not sacrifice "performance" for the
> most common case, just to keep things simple, right?

Most of these loops run just 1 or 2 cycles.
Performance mostly kicks in when timing out, but note that not
calling ktime_get() also reduces loop overhead...

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@...ux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds