[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250425152733-0ff10421-b716-4a55-9b60-cb0a71769e56@linutronix.de>
Date: Fri, 25 Apr 2025 15:40:55 +0200
From: Thomas Weißschuh <thomas.weissschuh@...utronix.de>
To: Jan Stancek <jstancek@...hat.com>
Cc: Andy Lutomirski <luto@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>, Vincenzo Frascino <vincenzo.frascino@....com>,
Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>,
Anna-Maria Behnsen <anna-maria@...utronix.de>, Frederic Weisbecker <frederic@...nel.org>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org, "H. Peter Anvin" <hpa@...or.com>,
Madhavan Srinivasan <maddy@...ux.ibm.com>, Michael Ellerman <mpe@...erman.id.au>,
Nicholas Piggin <npiggin@...il.com>, Christophe Leroy <christophe.leroy@...roup.eu>,
Naveen N Rao <naveen@...nel.org>, Heiko Carstens <hca@...ux.ibm.com>,
Vasily Gorbik <gor@...ux.ibm.com>, Alexander Gordeev <agordeev@...ux.ibm.com>,
Christian Borntraeger <borntraeger@...ux.ibm.com>, Sven Schnelle <svens@...ux.ibm.com>,
Arnd Bergmann <arnd@...db.de>, linux-kernel@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, linuxppc-dev@...ts.ozlabs.org, linux-s390@...r.kernel.org,
linux-arch@...r.kernel.org, Nam Cao <namcao@...utronix.de>
Subject: Re: [PATCH 08/19] vdso/gettimeofday: Prepare do_hres_timens() for
introduction of struct vdso_clock
On Fri, Apr 25, 2025 at 12:03:14PM +0200, Jan Stancek wrote:
> On Fri, Apr 25, 2025 at 10:58 AM Thomas Weißschuh
> <thomas.weissschuh@...utronix.de> wrote:
> >
> > On Thu, Apr 24, 2025 at 11:57:02PM +0200, Jan Stancek wrote:
> > > On Thu, Apr 24, 2025 at 5:49 PM Thomas Weißschuh
> > > <thomas.weissschuh@...utronix.de> wrote:
> > > >
> > > > On Thu, Apr 24, 2025 at 04:10:04PM +0200, Jan Stancek wrote:
> > > > > On Mon, Mar 03, 2025 at 12:11:10PM +0100, Thomas Weißschuh wrote:
> > > > > > From: Anna-Maria Behnsen <anna-maria@...utronix.de>
> > > > > >
> > > > > > To support multiple PTP clocks, the VDSO data structure needs to be
> > > > > > reworked. All clock specific data will end up in struct vdso_clock and in
> > > > > > struct vdso_time_data there will be array of it. By now, vdso_clock is
> > > > > > simply a define which maps vdso_clock to vdso_time_data.
> > > > > >
> > > > > > Prepare for the rework of these structures by adding struct vdso_clock
> > > > > > pointer argument to do_hres_timens(), and replace the struct vdso_time_data
> > > > > > pointer with the new pointer arugment whenever applicable.
> > > > > >
> > > > > > No functional change.
> > > > > >
> > > > > > Signed-off-by: Anna-Maria Behnsen <anna-maria@...utronix.de>
> > > > > > Signed-off-by: Nam Cao <namcao@...utronix.de>
> > > > > > Signed-off-by: Thomas Weißschuh <thomas.weissschuh@...utronix.de>
> > > > > > ---
> > > > > > lib/vdso/gettimeofday.c | 35 ++++++++++++++++++-----------------
> > > > > > 1 file changed, 18 insertions(+), 17 deletions(-)
> > > > > >
> > > > >
> > > > > starting with this patch, I'm seeing user-space crashes when using clock_gettime():
> > > > > BAD -> 83a2a6b8cfc5 vdso/gettimeofday: Prepare do_hres_timens() for introduction of struct vdso_clock
> > > > > GOOD -> 64c3613ce31a vdso/gettimeofday: Prepare do_hres() for introduction of struct vdso_clock
> > > > >
> > > > > It appears to be unique to aarch64 with 64k pages, and can be reproduced with
> > > > > LTP clock_gettime03 [1]:
> > > > > command: clock_gettime03 tst_kconfig.c:88: TINFO: Parsing kernel config '/lib/modules/6.15.0-0.rc3.20250423gitbc3372351d0c.30.eln147.aarch64+64k/build/.config'
> > > > > tst_test.c:1903: TINFO: LTP version: 20250130-231-gd02c2aea3
> > > > > tst_test.c:1907: TINFO: Tested kernel: 6.15.0-0.rc3.20250423gitbc3372351d0c.30.eln147.aarch64+64k #1 SMP PREEMPT_DYNAMIC Wed Apr 23 23:23:54 UTC 2025 aarch64
> > > > > tst_kconfig.c:88: TINFO: Parsing kernel config '/lib/modules/6.15.0-0.rc3.20250423gitbc3372351d0c.30.eln147.aarch64+64k/build/.config'
> > > > > tst_test.c:1720: TINFO: Overall timeout per run is 0h 05m 24s
> > > > > clock_gettime03.c:121: TINFO: Testing variant: vDSO or syscall with libc spec
> > > > > clock_gettime03.c:76: TPASS: Offset (CLOCK_MONOTONIC) is correct 10000ms
> > > > > clock_gettime03.c:86: TPASS: Offset (CLOCK_MONOTONIC) is correct 0ms
> > > > > clock_gettime03.c:76: TPASS: Offset (CLOCK_BOOTTIME) is correct 10000ms
> > > > > clock_gettime03.c:86: TPASS: Offset (CLOCK_BOOTTIME) is correct 0ms
> > > > > clock_gettime03.c:76: TPASS: Offset (CLOCK_MONOTONIC) is correct -10000ms
> > > > > clock_gettime03.c:86: TPASS: Offset (CLOCK_MONOTONIC) is correct 0ms
> > > > > clock_gettime03.c:76: TPASS: Offset (CLOCK_BOOTTIME) is correct -10000ms
> > > > > clock_gettime03.c:86: TPASS: Offset (CLOCK_BOOTTIME) is correct 0ms
> > > > > tst_test.c:438: TBROK: Child (233649) killed by signal SIGSEGV
> > > > >
> > > > > or with:
> > > > > --------------------- 8< ----------------------
> > > > > #define _GNU_SOURCE
> > > > > #include <sched.h>
> > > > > #include <time.h>
> > > > > #include <unistd.h> #include <sys/wait.h>
> > > > >
> > > > > int main(void)
> > > > > {
> > > > > struct timespec tp;
> > > > > pid_t child;
> > > > > int status;
> > > > >
> > > > > unshare(CLONE_NEWTIME);
> > > > >
> > > > > child = fork();
> > > > > if (child == 0) {
> > > > > clock_gettime(CLOCK_MONOTONIC_RAW, &tp);
> > > > > }
> > > > >
> > > > > wait(&status);
> > > > > return status;
> > > > > }
> > > > >
> > > > > # ./a.out ; echo $?
> > > > > 139
> > > > > --------------------- >8 ----------------------
> > > > >
> > > > > RPMs and configs can be found at Fedora koji, latest build is at [2] (look for kernel-64k).
> > > >
> > > > Hi Jan,
> > > >
> > > > Thanks for the great error report.
> > > >
> > > > Can you try the following change (on top of v6.15-rc1, should also work with current master)?
> > > >
> > > > diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
> > > > index 93ef801a97ef..867ce53cca94 100644
> > > > --- a/lib/vdso/gettimeofday.c
> > > > +++ b/lib/vdso/gettimeofday.c
> > > > @@ -85,14 +85,18 @@ static __always_inline
> > > > int do_hres_timens(const struct vdso_time_data *vdns, const struct vdso_clock *vcns,
> > > > clockid_t clk, struct __kernel_timespec *ts)
> > > > {
> > > > - const struct vdso_time_data *vd = __arch_get_vdso_u_timens_data(vdns);
> > > > const struct timens_offset *offs = &vcns->offset[clk];
> > > > - const struct vdso_clock *vc = vd->clock_data;
> > > > + const struct vdso_time_data *vd;
> > > > + const struct vdso_clock *vc;
> > > > const struct vdso_timestamp *vdso_ts;
> > > > u64 cycles, ns;
> > > > u32 seq;
> > > > s64 sec;
> > > >
> > > > + vd = vdns - (clk == CLOCK_MONOTONIC_RAW ? CS_RAW : CS_HRES_COARSE);
> > > > + vd = __arch_get_vdso_u_timens_data(vd);
> > > > + vc = vd->clock_data;
> > > > +
> > > > if (clk != CLOCK_MONOTONIC_RAW)
> > > > vc = &vc[CS_HRES_COARSE];
> > > > else
> > > >
> > > >
> > > > I'll do some proper testing tomorrow.
> > >
> > > That does seem to work for the 2 reproducers I have.
> >
> > Thanks for testing.
> >
> > > But why is this change needed?
> >
> > So far the only thing that I can say is that this logic was there before the
> > patch and was removed accidentally, so it should be restored.
> > Why the logic was there in the first place I'll have to investigate.
>
> I think it paired with "vd advancing" based on "clock" in original code:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/vdso/gettimeofday.c?h=v6.14#n264
> and to get back to "base", you needed to subtract same value:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/lib/vdso/gettimeofday.c?h=v6.14#n82
>
> After this series, "vd" isn't manipulated this way, so the removal of
> that subtraction seemed deliberate to me.
Indeed. This change "works" because vsdo_clocksource_ok() rejects the clock and
instead falls backs to the syscall.
> > > Isn't 'vdns' here equal to 'vdso_u_time_data'?
> >
> > That is true, but in a time namespace the namespaced time structure is mapped
> > in place of the normal structure and vice-versa.
> > So __arch_get_vdso_u_timens_data() will get the "real" time datastructure based
> > on a namespaced one.
> >
> > I can't explain the special logic for CLOCK_MONOTONIC_RAW yet.
> > To me it looks wrong to calculate on a 'struct vdso_time_data *' in terms of
> > CS_RAW/CS_HRES_COARSE.
> >
> >
> > Another change that "fixes" the crash for me is:
> >
> > diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
> > index 93ef801a97ef..cdc3988a0ace 100644
> > --- a/lib/vdso/gettimeofday.c
> > +++ b/lib/vdso/gettimeofday.c
> > @@ -93,6 +118,8 @@ int do_hres_timens(const struct vdso_time_data *vdns, const struct vdso_clock *v
> > u32 seq;
> > s64 sec;
> >
> > + OPTIMIZER_HIDE_VAR(vc);
> > +
> > if (clk != CLOCK_MONOTONIC_RAW)
> > vc = &vc[CS_HRES_COARSE];
> > else
> >
> >
> > This is obviously not an actual fix but indicates that something weird is going on.
> > Could you run this second change also through LTP to see if it would pass?
>
> Agreed, this does "fixes" it for me as well.
Some more information:
The crash comes from the address arithmetic in "vc = &vc[CS_RAW]" going wrong.
This should just do "vc = vc + 1", advancing the pointer by sizeof(*vc).
But I get these example values of "vc" before and after the arithmetic:
0x00ffffbc060000 -> 0xfffffffffffd00e8
Which is obviously wrong.
The arithmetic can be fixed with any one of the following changes:
* OPTIMIZER_HIDE_VAR() as above
* Replacement of -mcmodel=tiny with -mcmode=small in arch/arm64/kernel/vdso/Makefile
-mcmodel=tiny is supposed to cover 1MiB programs, the vDSO only needs 300KiB here
* Removal of the (clk != CLOCK_MONOTONIC_RAW) case
Shuffling code around, even without any impact on semantics, sometimes fixes
the issue. Compiling the vDSO with UBSAN didn't show anything.
Thomas
> > > > > [1] https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/clock_gettime/clock_gettime03.c
> > > > > [2] https://koji.fedoraproject.org/koji/buildinfo?buildID=2704401
> > > >
> >
>
Powered by blists - more mailing lists