[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200122103553.GN25745@shell.armlinux.org.uk>
Date: Wed, 22 Jan 2020 10:35:53 +0000
From: Russell King - ARM Linux admin <linux@...linux.org.uk>
To: Guido Günther <agx@...xcpu.org>
Cc: Arnd Bergmann <arnd@...db.de>,
Lucas Stach <l.stach@...gutronix.de>,
Christian Gmeiner <christian.gmeiner@...il.com>,
David Airlie <airlied@...ux.ie>,
Daniel Vetter <daniel@...ll.ch>,
Philipp Zabel <p.zabel@...gutronix.de>,
Sam Ravnborg <sam@...nborg.org>, Rob Herring <robh@...nel.org>,
Emil Velikov <emil.velikov@...labora.com>,
The etnaviv authors <etnaviv@...ts.freedesktop.org>,
dri-devel <dri-devel@...ts.freedesktop.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] drm/etnaviv: only reject timeouts with tv_nsec >= 2
seconds
On Wed, Jan 22, 2020 at 11:30:34AM +0100, Guido Günther wrote:
> Hi,
> On Tue, Jan 21, 2020 at 08:05:27PM +0100, Arnd Bergmann wrote:
> > On Tue, Jan 21, 2020 at 5:10 PM Lucas Stach <l.stach@...gutronix.de> wrote:
> > >
> > > Hi Guido,
> > >
> > > On Di, 2020-01-21 at 13:55 +0100, Guido Günther wrote:
> > > > Hi,
> > > > On Tue, Jan 21, 2020 at 12:45:25PM +0100, Arnd Bergmann wrote:
> > > > > As Guido Günther reported, get_abs_timeout() in the etnaviv user space
> > > > > sometimes passes timeouts with nanosecond values larger than 1000000000,
> > > > > which gets rejected after my first patch.
> > > > >
> > > > > To avoid breaking this, while also not allowing completely arbitrary
> > > > > values, set the limit to 1999999999 and use set_normalized_timespec64()
> > > > > to get the correct format before comparing it.
> > > >
> > > > I'm seeing values up to 5 seconds so I need
> > > >
> > > > if (args->timeout.tv_nsec > (5 * NSEC_PER_SEC))
> > > >
> > > > to unbreak rendering. Which seems to match what mesa's get_abs_timeout()
> > > > does and how it's invoked.
> > >
> > > I have not tested this myself yet, only looked at the code. From the
> > > code I quoted earlier, I don't see how we end up with 5 * NSEC_PER_SEC
> > > in the tv_nsec member, even if the timeout passed to get_abs_timeout()
> > > is 5 seconds.
> >
> > I can think of two different ways you'd end up with around five seconds here:
> >
> > a) you have a completely arbitrary 32-bit number through truncation,
> > which is up to 4.2 seconds
> > b) you have the same kind of 32-bit number, but add up to another 999999999
> > nanoseconds, so you get up to 5.2 seconds in the 64-bit field.
>
> I've dumped out some values tv_nsec values with current mesa git on arm64:
>
> [ 33.699652] etnaviv_ioctl_gem_cpu_prep: 4990449401
> [ 33.813081] etnaviv_ioctl_gem_cpu_prep: 5103872445
> [ 33.822936] etnaviv_ioctl_gem_cpu_prep: 5113731286
> [ 33.840963] etnaviv_ioctl_gem_cpu_prep: 5131762726
> [ 33.854120] etnaviv_ioctl_gem_cpu_prep: 5144920127
> [ 33.861426] etnaviv_ioctl_gem_cpu_prep: 5152227527
> [ 33.872666] etnaviv_ioctl_gem_cpu_prep: 5163466968
> [ 33.879485] etnaviv_ioctl_gem_cpu_prep: 5170286808
>
> The problem is that in mesa/libdrm
>
> static inline void get_abs_timeout(struct drm_etnaviv_timespec *tv, uint64_t ns)
> {
> struct timespec t;
> uint32_t s = ns / 1000000000;
> clock_gettime(CLOCK_MONOTONIC, &t);
> tv->tv_sec = t.tv_sec + s;
> tv->tv_nsec = t.tv_nsec + ns - (s * 1000000000);
> ^^^^^^^^^^^^^^^
> this overflows (since `s` is `uint_32t` and hence we substract a way
> too small value with ns = 5000000000 which mesa uses in
> etna_bo_cpu_prep.
> }
>
> So with current mesa/libdrm (which needs to be fixed) we'd have a maximum
>
> t.tv_nsec + ns - (s_max * 1000000000)
>
> 999999999 + 5000000000 - 705032704 = 5294967295
>
> Does that make sense? If so that'd be the possible upper bound for the
> kernel. Note that this only applies to etnaviv_ioctl_gem_cpu_prep. While
> etnaviv_ioctl_wait_fence and etnaviv_ioctl_gem_wait are affected too
> i've not yet seen user space passing in larger values.
Except the fact that the calculation being done above is buggy.
Not only do we end up with tv_sec incremented by 5 seconds, but
we also end up with tv_nsec containing around 5 seconds in
nanoseconds, which means we end up with about a 10 second timeout.
I think it would probably be better for the kernel to print a
warning once when noticing over-large nsec values, suggesting a
userspace upgrade is in order, but continue the existing behaviour.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
Powered by blists - more mailing lists