[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <CNJ1QJAMW2EZ.1K9H3JQS9IKHU@bobo>
Date: Tue, 11 Oct 2022 21:10:29 +1000
From: "Nicholas Piggin" <npiggin@...il.com>
To: "Michael Ellerman" <mpe@...erman.id.au>,
"Jason A. Donenfeld" <Jason@...c4.com>
Cc: "Linus Torvalds" <torvalds@...ux-foundation.org>,
<ajd@...ux.ibm.com>, <aneesh.kumar@...ux.ibm.com>,
<atrajeev@...ux.vnet.ibm.com>, <christophe.leroy@...roup.eu>,
<cuigaosheng1@...wei.com>, <david@...hat.com>,
<farosas@...ux.ibm.com>, <geoff@...radead.org>,
<gustavoars@...nel.org>, <haren@...ux.ibm.com>,
<hbathini@...ux.ibm.com>, <joel@....id.au>, <lihuafei1@...wei.com>,
<linux-kernel@...r.kernel.org>, <linux@...ck-us.net>,
<linuxppc-dev@...ts.ozlabs.org>, <lukas.bulwahn@...il.com>,
<mikey@...ling.org>, <nathan@...nel.org>, <nathanl@...ux.ibm.com>,
<nicholas@...ux.ibm.com>, <pali@...nel.org>, <paul@...l-moore.com>,
<rmclure@...ux.ibm.com>, <ruscur@...sell.cc>, <windhl@....com>,
<wsa+renesas@...g-engineering.com>, <ye.xingchen@....com.cn>,
<yuanjilin@...rlc.com>, <zhengyongjun3@...wei.com>
Subject: Re: [GIT PULL] Please pull powerpc/linux.git powerpc-6.1-1 tag
On Tue Oct 11, 2022 at 7:35 PM AEST, Michael Ellerman wrote:
> "Jason A. Donenfeld" <Jason@...c4.com> writes:
> > On Tue, Oct 11, 2022 at 12:53:17PM +1100, Michael Ellerman wrote:
> >> "Jason A. Donenfeld" <Jason@...c4.com> writes:
> >> > On Mon, Oct 10, 2022 at 01:25:25PM -0600, Jason A. Donenfeld wrote:
> >> >> Hi Michael,
> >> >>
> >> >> On Sun, Oct 09, 2022 at 10:01:39PM +1100, Michael Ellerman wrote:
> >> >> > powerpc updates for 6.1
> >> >> >
> >> >> > - Remove our now never-true definitions for pgd_huge() and p4d_leaf().
> >> >> >
> >> >> > - Add pte_needs_flush() and huge_pmd_needs_flush() for 64-bit.
> >> >> >
> >> >> > - Add support for syscall wrappers.
> >> >> >
> >> >> > - Add support for KFENCE on 64-bit.
> >> >> >
> >> >> > - Update 64-bit HV KVM to use the new guest state entry/exit accounting API.
> >> >> >
> >> >> > - Support execute-only memory when using the Radix MMU (P9 or later).
> >> >> >
> >> >> > - Implement CONFIG_PARAVIRT_TIME_ACCOUNTING for pseries guests.
> >> >> >
> >> >> > - Updates to our linker script to move more data into read-only sections.
> >> >> >
> >> >> > - Allow the VDSO to be randomised on 32-bit.
> >> >> >
> >> >> > - Many other small features and fixes.
> >> >>
> >> >> FYI, something in here broke the wireguard test suite, which runs the
> >> >> iperf3 networking utility. The full log is here [1], but the relevant part
> >> >> is:
> >> >>
> >> >> [+] NS1: iperf3 -Z -t 3 -c 192.168.241.2
> >> >> Connecting to host 192.168.241.2, port 5201
> >> >> iperf3: error - failed to read /dev/urandom: Bad address
> >> >>
> >> >> I'll see if I can narrow it down a bit more and bisect. But just FYI, in
> >> >> case you have an intuition.
> >> >
> >> > Huh. From iov_iter.c:
> >> >
> >> > static int copyout(void __user *to, const void *from, size_t n)
> >> > {
> >> > size_t before = n;
> >> > if (should_fail_usercopy())
> >> > return n;
> >> > if (access_ok(to, n)) {
> >> > instrument_copy_to_user(to, from, n);
> >> > n = raw_copy_to_user(to, from, n);
> >> > if (n == before)
> >> > pr_err("SARU n still %zu pointer is %lx\n", n, (unsigned long)to);
> >> > }
> >> > return n;
> >> > }
> >> >
> >> > I added the pr_err() there to catch the failure:
> >> > [ 3.443506] SARU n still 64 pointer is b78db000
> >> >
> >> > Also I managed to extract the failing portion of iperf3 into something
> >> > smaller:
> >> >
> >> > int temp;
> >> > char *x;
> >> > ssize_t l;
> >> > FILE *f;
> >> > char template[] = "/blah-XXXXXX";
> >> >
> >> > temp = mkstemp(template);
> >> > if (temp < 0)
> >> > panic("mkstemp");
> >> > if (unlink(template) < 0)
> >> > panic("unlink");
> >> > if (ftruncate(temp, 0x20000) < 0)
> >> > panic("ftruncate");
> >> > x = mmap(NULL, 0x20000, PROT_READ|PROT_WRITE, MAP_PRIVATE, temp, 0);
> >> > if (x == MAP_FAILED)
> >> > panic("mmap");
> >> > f = fopen("/dev/urandom", "rb");
> >> > if (!f)
> >> > panic("fopen");
> >> > setbuf(f, NULL);
> >> > if (fread(x, 1, 0x20000, f) != 0x20000)
> >> > panic("fread");
> >>
> >> Does that fail for you reliably?
> >>
> >> It succeeds for me running under qemu ppce500, though I'm not using your
> >> kernel config yet.
> >
> > Yes, every time without fail, across two systems and two qemu builds.
>
> OK. Joel worked out that it only fails when built with musl, so that's
> why it's succeeding for me (built with glibc).
This was independently discovered by several, but we worked out it's
because musl uses ftruncate64 here, while glibc doesn't seem to. And
ftruncate64 got broken by the syscall wrappers patch on ppc32. The
kernel is seeing a 0 length ftruncate call, so the user access sigbuses
and can't copy anything.
This quick hack gets the test program working again. Only very lightly
tested so far...
Thanks,
Nick
---
diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h
index 9840d572da55..9578cc5e4f84 100644
--- a/arch/powerpc/include/asm/syscalls.h
+++ b/arch/powerpc/include/asm/syscalls.h
@@ -89,6 +89,27 @@ long compat_sys_rt_sigreturn(void);
* responsible for combining parameter pairs.
*/
+#ifdef CONFIG_PPC32
+long sys_ppc_pread64(unsigned int fd,
+ char __user *ubuf, compat_size_t count,
+ u32 reg6, u32 pos1, u32 pos2);
+long sys_ppc_pwrite64(unsigned int fd,
+ const char __user *ubuf, compat_size_t count,
+ u32 reg6, u32 pos1, u32 pos2);
+long sys_ppc_readahead(int fd, u32 r4,
+ u32 offset1, u32 offset2, u32 count);
+long sys_ppc_truncate64(const char __user *path, u32 reg4,
+ unsigned long len1, unsigned long len2);
+long sys_ppc_ftruncate64(unsigned int fd, u32 reg4,
+ unsigned long len1, unsigned long len2);
+long sys_ppc32_fadvise64(int fd, u32 unused, u32 offset1, u32 offset2,
+ size_t len, int advice);
+long sys_ppc_sync_file_range2(int fd, unsigned int flags,
+ unsigned int offset1,
+ unsigned int offset2,
+ unsigned int nbytes1,
+ unsigned int nbytes2);
+#endif
#ifdef CONFIG_COMPAT
long compat_sys_mmap2(unsigned long addr, size_t len,
unsigned long prot, unsigned long flags,
diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile
index 1f121c188805..d382564034a7 100644
--- a/arch/powerpc/kernel/Makefile
+++ b/arch/powerpc/kernel/Makefile
@@ -73,6 +73,7 @@ obj-y := cputable.o syscalls.o \
obj-y += ptrace/
obj-$(CONFIG_PPC64) += setup_64.o irq_64.o\
paca.o nvram_64.o note.o
+obj-$(CONFIG_PPC32) += sys_ppc32.o
obj-$(CONFIG_COMPAT) += sys_ppc32.o signal_32.o
obj-$(CONFIG_VDSO32) += vdso32_wrapper.o
obj-$(CONFIG_PPC_WATCHDOG) += watchdog.o
diff --git a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c
index dcc3c9fd4cfd..f9ce13e6c5eb 100644
--- a/arch/powerpc/kernel/sys_ppc32.c
+++ b/arch/powerpc/kernel/sys_ppc32.c
@@ -47,7 +47,17 @@
#include <asm/syscalls.h>
#include <asm/switch_to.h>
-COMPAT_SYSCALL_DEFINE6(ppc_pread64,
+#ifdef CONFIG_PPC32
+#define PPC32_SYSCALL_DEFINE4 SYSCALL_DEFINE4
+#define PPC32_SYSCALL_DEFINE5 SYSCALL_DEFINE5
+#define PPC32_SYSCALL_DEFINE6 SYSCALL_DEFINE6
+#else
+#define PPC32_SYSCALL_DEFINE4 COMPAT_SYSCALL_DEFINE4
+#define PPC32_SYSCALL_DEFINE5 COMPAT_SYSCALL_DEFINE5
+#define PPC32_SYSCALL_DEFINE6 COMPAT_SYSCALL_DEFINE6
+#endif
+
+PPC32_SYSCALL_DEFINE6(ppc_pread64,
unsigned int, fd,
char __user *, ubuf, compat_size_t, count,
u32, reg6, u32, pos1, u32, pos2)
@@ -55,7 +65,7 @@ COMPAT_SYSCALL_DEFINE6(ppc_pread64,
return ksys_pread64(fd, ubuf, count, merge_64(pos1, pos2));
}
-COMPAT_SYSCALL_DEFINE6(ppc_pwrite64,
+PPC32_SYSCALL_DEFINE6(ppc_pwrite64,
unsigned int, fd,
const char __user *, ubuf, compat_size_t, count,
u32, reg6, u32, pos1, u32, pos2)
@@ -63,28 +73,29 @@ COMPAT_SYSCALL_DEFINE6(ppc_pwrite64,
return ksys_pwrite64(fd, ubuf, count, merge_64(pos1, pos2));
}
-COMPAT_SYSCALL_DEFINE5(ppc_readahead,
+PPC32_SYSCALL_DEFINE5(ppc_readahead,
int, fd, u32, r4,
u32, offset1, u32, offset2, u32, count)
{
return ksys_readahead(fd, merge_64(offset1, offset2), count);
}
-COMPAT_SYSCALL_DEFINE4(ppc_truncate64,
+PPC32_SYSCALL_DEFINE4(ppc_truncate64,
const char __user *, path, u32, reg4,
unsigned long, len1, unsigned long, len2)
{
return ksys_truncate(path, merge_64(len1, len2));
}
-COMPAT_SYSCALL_DEFINE4(ppc_ftruncate64,
+PPC32_SYSCALL_DEFINE4(ppc_ftruncate64,
unsigned int, fd, u32, reg4,
unsigned long, len1, unsigned long, len2)
{
+ printk("ppc_ftruncate64 len1=%lx len2=%lx\n", len1, len2);
return ksys_ftruncate(fd, merge_64(len1, len2));
}
-COMPAT_SYSCALL_DEFINE6(ppc32_fadvise64,
+PPC32_SYSCALL_DEFINE6(ppc32_fadvise64,
int, fd, u32, unused, u32, offset1, u32, offset2,
size_t, len, int, advice)
{
@@ -92,7 +103,7 @@ COMPAT_SYSCALL_DEFINE6(ppc32_fadvise64,
advice);
}
-COMPAT_SYSCALL_DEFINE6(ppc_sync_file_range2,
+PPC32_SYSCALL_DEFINE6(ppc_sync_file_range2,
int, fd, unsigned int, flags,
unsigned int, offset1, unsigned int, offset2,
unsigned int, nbytes1, unsigned int, nbytes2)
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index 2bca64f96164..71d6e38e5a3a 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -228,8 +228,8 @@
176 64 rt_sigtimedwait sys_rt_sigtimedwait
177 nospu rt_sigqueueinfo sys_rt_sigqueueinfo compat_sys_rt_sigqueueinfo
178 nospu rt_sigsuspend sys_rt_sigsuspend compat_sys_rt_sigsuspend
-179 common pread64 sys_pread64 compat_sys_ppc_pread64
-180 common pwrite64 sys_pwrite64 compat_sys_ppc_pwrite64
+179 common pread64 sys_ppc_pread64 compat_sys_ppc_pread64
+180 common pwrite64 sys_ppc_pwrite64 compat_sys_ppc_pwrite64
181 common chown sys_chown
182 common getcwd sys_getcwd
183 common capget sys_capget
@@ -242,10 +242,10 @@
188 common putpmsg sys_ni_syscall
189 nospu vfork sys_vfork
190 common ugetrlimit sys_getrlimit compat_sys_getrlimit
-191 common readahead sys_readahead compat_sys_ppc_readahead
+191 common readahead sys_ppc_readahead compat_sys_ppc_readahead
192 32 mmap2 sys_mmap2 compat_sys_mmap2
-193 32 truncate64 sys_truncate64 compat_sys_ppc_truncate64
-194 32 ftruncate64 sys_ftruncate64 compat_sys_ppc_ftruncate64
+193 32 truncate64 sys_ppc_truncate64 compat_sys_ppc_truncate64
+194 32 ftruncate64 sys_ppc_ftruncate64 compat_sys_ppc_ftruncate64
195 32 stat64 sys_stat64
196 32 lstat64 sys_lstat64
197 32 fstat64 sys_fstat64
@@ -288,7 +288,7 @@
230 common io_submit sys_io_submit compat_sys_io_submit
231 common io_cancel sys_io_cancel
232 nospu set_tid_address sys_set_tid_address
-233 common fadvise64 sys_fadvise64 compat_sys_ppc32_fadvise64
+233 common fadvise64 sys_ppc32_fadvise64 compat_sys_ppc32_fadvise64
234 nospu exit_group sys_exit_group
235 nospu lookup_dcookie sys_lookup_dcookie compat_sys_lookup_dcookie
236 common epoll_create sys_epoll_create
@@ -390,7 +390,7 @@
305 common signalfd sys_signalfd compat_sys_signalfd
306 common timerfd_create sys_timerfd_create
307 common eventfd sys_eventfd
-308 common sync_file_range2 sys_sync_file_range2 compat_sys_ppc_sync_file_range2
+308 common sync_file_range2 sys_ppc_sync_file_range2 compat_sys_ppc_sync_file_range2
309 nospu fallocate sys_fallocate compat_sys_fallocate
310 nospu subpage_prot sys_subpage_prot
311 32 timerfd_settime sys_timerfd_settime32
Powered by blists - more mailing lists