Date:   Wed, 20 May 2020 22:52:22 +0200
From:   Arnd Bergmann <arnd@...db.de>
To:     Rich Felker <dalias@...c.org>
Cc:     Szabolcs Nagy <szabolcs.nagy@....com>,
        Adhemerval Zanella <adhemerval.zanella@...aro.org>,
        Vincenzo Frascino <vincenzo.frascino@....com>,
        Russell King - ARM Linux <linux@...linux.org.uk>,
        Will Deacon <will@...nel.org>,
        Jack Schmidt <jack.schmidt@....edu>,
        Linux ARM <linux-arm-kernel@...ts.infradead.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Stephen Boyd <sboyd@...nel.org>, nd <nd@....com>,
        Florian Fainelli <f.fainelli@...il.com>,
        Mark Rutland <mark.rutland@....com>,
        Marc Zyngier <maz@...nel.org>
Subject: Re: clock_gettime64 vdso bug on 32-bit arm, rpi-4

On Wed, May 20, 2020 at 7:09 PM Rich Felker <dalias@...c.org> wrote:
>
> On Wed, May 20, 2020 at 12:08:10PM -0400, Rich Felker wrote:
> > On Wed, May 20, 2020 at 04:41:29PM +0100, Szabolcs Nagy wrote:
> > > The 05/19/2020 22:31, Arnd Bergmann wrote:
> > > > On Tue, May 19, 2020 at 10:24 PM Adhemerval Zanella
> > > > <adhemerval.zanella@...aro.org> wrote:
> > > > > On 19/05/2020 16:54, Arnd Bergmann wrote:
> > > note: i could not reproduce it in qemu-system with these configs:
> > >
> > > qemu-system-aarch64 + arm64 kernel + compat vdso
> > > qemu-system-aarch64 + kvm accel (on cortex-a72) + 32bit arm kernel
> > > qemu-system-arm + cpu max + 32bit arm kernel
> > >
> > > so i think it's something specific to that user's setup
> > > (maybe rpi hw bug or gcc miscompiled the vdso or something
> > > with that particular linux, i built my own linux 5.6 because
> > > i did not know the exact kernel version where the bug was seen)
> > >
> > > i don't have access to rpi (or other cortex-a53 where i
> > > can install my own kernel) so this is as far as i got.
> >
> > If we have a binary of the kernel that's known to be failing on the
> > hardware, it would be useful to dump its vdso and examine the
> > disassembly to see if it was miscompiled.
>
> OK, OP posted it and I think we've solved this. See
> https://github.com/richfelker/musl-cross-make/issues/96#issuecomment-631604410

Thanks a lot everyone for figuring this out.

> And my analysis:
>
> <@dalias> see what i just found on the tracker
> <@dalias> patch_vdso/vdso_nullpatch_one in arch/arm/kernel/vdso.c patches out the time32 functions in this case
> <@dalias> but not the time64 one
> <@dalias> this looks like a real kernel bug that's not hw-specific except breaking on all hardware where the patching-out is needed
> <@dalias> we could possibly work around it by refusing to use the time64 vdso unless the time32 one is also present
> <@dalias> yep
> <@dalias> so i think we've solved this. the kernel thought it wasnt using vdso anymore because it patched it out
> <@dalias> but it forgot to patch out the time64 one
> <@dalias> so it stopped updating the data needed for vdso to work
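
The libc-side workaround suggested above (refusing to use the time64 vdso
unless the time32 one is also present) could look roughly like the sketch
below. The symbol table type and both helper functions are hypothetical and
only model the lookup; a real libc resolves these names from the ELF image
it finds via AT_SYSINFO_EHDR. Only the symbol names are the real ones
exported on 32-bit arm.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the vdso symbol table (illustration only). */
struct vdso_sym {
    const char *name;
    void *addr;
};

/* Look a symbol up by name; returns NULL when it is absent. */
void *vdso_lookup(const struct vdso_sym *tab, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (tab[i].addr && strcmp(tab[i].name, name) == 0)
            return tab[i].addr;
    return NULL;
}

/*
 * Return the time64 entry point only if the time32 one is present too.
 * Assumption: a missing time32 symbol means the kernel patched the vdso
 * out, so the time64 symbol that was left behind must not be trusted.
 */
void *pick_time64_vdso(const struct vdso_sym *tab, size_t n)
{
    if (!vdso_lookup(tab, n, "__vdso_clock_gettime"))
        return NULL;
    return vdso_lookup(tab, n, "__vdso_clock_gettime64");
}
```

A NULL result here would mean "always use the syscall".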

As you mentioned in the issue tracker, the patching was meant as
an optimization and missing it for clock_gettime64 was a mistake but
should by itself not have caused incorrect data to be returned.

I would assume that there is another bug that leads to clock_gettime64
not entering the syscall fallback path as it should but instead returning
bogus data.
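
To make the expected behaviour concrete, here is a simplified user-space
model of that fallback, not the kernel code itself. All names in it
(struct vdso_model, hres_from_vdso, model_clock_gettime64 and the three
stubs) are made up for illustration. The only assumption carried over is
that an unusable counter is reported as a negative value and must send
the caller to the syscall.

```c
#include <stdint.h>

struct ts64 { int64_t sec; int64_t nsec; };

/* Hypothetical, heavily simplified view of the vdso data page. */
struct vdso_model {
    int64_t  base_sec;    /* seconds at the last update */
    uint64_t base_nsec;   /* nanoseconds at the last update */
    uint64_t cycle_last;  /* counter value at the last update */
    uint32_t mult, shift; /* cycles -> ns conversion */
};

typedef int64_t (*counter_fn)(void);     /* negative: counter unusable */
typedef int (*syscall_fn)(struct ts64 *);

/* High-resolution read; fails instead of computing from a bad counter. */
int hres_from_vdso(const struct vdso_model *vd, counter_fn read_counter,
                   struct ts64 *out)
{
    int64_t cycles = read_counter();
    if (cycles < 0)
        return -1;

    uint64_t delta = (uint64_t)cycles - vd->cycle_last;
    uint64_t ns = vd->base_nsec + ((delta * vd->mult) >> vd->shift);
    out->sec = vd->base_sec + (int64_t)(ns / 1000000000u);
    out->nsec = (int64_t)(ns % 1000000000u);
    return 0;
}

/* Entry point: try the vdso path, otherwise fall back to the syscall. */
int model_clock_gettime64(const struct vdso_model *vd, counter_fn read_counter,
                          syscall_fn fallback, struct ts64 *out)
{
    if (hres_from_vdso(vd, read_counter, out) == 0)
        return 0;
    return fallback(out);
}

/* Demo stubs, purely illustrative. */
int64_t broken_counter(void) { return -1; }
int64_t good_counter(void) { return 1500; }
int fake_syscall(struct ts64 *ts) { ts->sec = 42; ts->nsec = 7; return 0; }
```

The reported symptom corresponds to the early return in hres_from_vdso()
not being taken, so the result is computed from stale or meaningless data.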

Here are some more things I found:

- From reading the linux-5.6 code that was tested, I see that a condition
  that leads to patching out the clock_gettime() vdso should also lead to
  clock_gettime64() falling back to the syscall after
  __arch_get_hw_counter() returns an error, but for some reason that
  does not happen. Presumably the presence of the patching meant that
  this code path was never much exercised.
  A missing 45939ce292b4 ("ARM: 8957/1: VDSO: Match ARMv8 timer in
  cntvct_functional()") would explain the problem, if it happened on
  linux-5.6-rc7 or earlier. The fix was merged in the final v5.6 though.

- The patching may actually be counterproductive because it means that
  clock_gettime(CLOCK_*COARSE, ...) has to go through the system call
  when it could just return the time of the last timer tick regardless of
  the clocksource.

- We may get bitten by errata handling on 32-bit kernels running on 64-bit
  hardware that has errata workarounds in arch/arm64 for compat mode
  but not in native arm kernels. ARM64_ERRATUM_1418040,
  ARM64_ERRATUM_858921 and SUN50I_ERRATUM_UNKNOWN1
  are examples of workarounds that are not used on 32-bit kernels running
  on 64-bit hardware.
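
As a rough illustration of the point about the coarse clocks, the sketch
below models a vdso read that serves a coarse clock from the last tick
without touching the counter, and only asks for the syscall on the
high-resolution path. The types, the enum and NEEDS_SYSCALL are
hypothetical, and it assumes the kernel keeps updating the data page even
when the counter is unusable.

```c
#include <stdint.h>

struct model_ts { int64_t sec; int64_t nsec; };

/* Hypothetical clock ids for this model, not the real CLOCK_* values. */
enum model_clock { MODEL_REALTIME, MODEL_REALTIME_COARSE };

/* Hypothetical data page: last tick time plus a "counter usable" flag. */
struct vdso_page {
    int counter_ok;
    struct model_ts last_tick;
};

#define NEEDS_SYSCALL 1

int model_vdso_read(const struct vdso_page *vd, enum model_clock id,
                    struct model_ts *out)
{
    if (id == MODEL_REALTIME_COARSE) {
        /* Coarse clocks never read the counter. */
        *out = vd->last_tick;
        return 0;
    }
    if (!vd->counter_ok)
        return NEEDS_SYSCALL;
    /* A real implementation would add the counter delta here. */
    *out = vd->last_tick;
    return 0;
}
```

With the vdso patched out entirely, both cases end up in the syscall,
which is the cost described in the second point above.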

         Arnd
