netdev - Re: FW: [PATCH 15/31] nds32: System calls handling

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAJsyPhx49fwmJZmj99ZsqCn7fTemxGB=e=r_-3+rwx_iQux-TA@mail.gmail.com>
Date:   Wed, 22 Nov 2017 11:13:47 +0800
From:   Vincent Chen <deanbo422@...il.com>
To:     Arnd Bergmann <arnd@...db.de>
Cc:     Greentime <greentime@...estech.com>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-arch <linux-arch@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Jason Cooper <jason@...edaemon.net>,
        Marc Zyngier <marc.zyngier@....com>,
        Rob Herring <robh+dt@...nel.org>,
        Networking <netdev@...r.kernel.org>,
        Vincent Chen <vincentc@...estech.com>
Subject: Re: FW: [PATCH 15/31] nds32: System calls handling

2017-11-13 19:42 GMT+08:00 Arnd Bergmann <arnd@...db.de>:
> On Mon, Nov 13, 2017 at 3:51 AM, Vincent Chen <deanbo422@...il.com> wrote:
>>>>On Wed, Nov 8, 2017 at 6:55 AM, Greentime Hu <green.hu@...il.com> wrote:
>>>> From: Greentime Hu <greentime@...estech.com>
>
>>
>>>> +#define __ARCH_WANT_SYS_CLONE
>>>
>>>This seems ok, though it would be nice to have the reverse logic and have architectures opt-out of the generic version when they need to provide their own, rather than having most architectures set it.
>>>
>>
>> Thanks
>> I will provide nds32 SYSCALL_DEFINE_5(clone) in the next version patch.
>
> That's not what I meant, my suggestion is to create a new patch to remove the
> __ARCH_WANT_SYS_CLONE symbol from all architectures, and instead use
> something like __ARCH_HAVE_SYS_CLONE on architectures that do *not*
> want the generic implementation.
>
> nds32 clearly wants the generic implementation here, it just shouldn't have to
> select that symbol to get it.
>
>>>> +/*
>>>> + * Special system call wrappers
>>>> + *
>>>> + * $r0 = syscall number
>>>> + * $r8 = syscall table
>>>> + */
>>>> +       .type   sys_syscall, #function
>>>> +ENTRY(sys_syscall)
>>>> +       addi    $p1, $r0, #-__NR_syscalls
>>>> +       bgtz    $p1, 3f
>>>> +       move    $p1, $r0
>>>> +       move    $r0, $r1
>>>> +       move    $r1, $r2
>>>> +       move    $r2, $r3
>>>> +       move    $r3, $r4
>>>> +       move    $r4, $r5
>>>> +! add for syscall 6 args
>>>> +       lwi     $r5, [$sp + #SP_OFFSET ]
>>>> +       lwi     $r5, [$r5]
>>>> +! ~add for syscall 6 args
>>>> +
>>>> +       lw      $p1, [tbl+$p1<<2]
>>>> .+       jr      $p1
>>>> +3:     b       sys_ni_syscall
>>>> +ENDPROC(sys_syscall)
>>>
>>>Can you explain what this is used for?
>>>
>>
>> This is used to handle syscall(int number, ....).
>> Unlike other architectures,  the system number shall be determined in
>> compile time when issuing system call in nds32.
>> Therefore, we  only can parse the content of syscall(int number, ....)
>> and distribute it to destination handler in kernel space
>> (Other architecture can handle it in user space by glibc's syscall wrapper)
>
> Hmm, I think other architectures that run into this problem use self-modifying
> code for syscall(), but that is also ugly as it requires having a page that
> is both writable and executable.
>
> I think your approach can be tricky for things like seccomp(). It's possible
> that you get all of it right, but if you can come up with a different solution,
> that might be better.
>
> How much would it cost to simply always pass the syscall number
> in a register, and not use the immediate argument at all?
>

After re-evaluating the performance, we decide to use a particular
register to transfer syscall number
instead of immediate argument. So, above sys_syscall will be removed
in the next version patch

Thanks


>>>> --- /dev/null
>>>> +++ b/arch/nds32/kernel/sys_nds32.c
>>>> +
>>>> +long sys_mmap2(unsigned long addr, unsigned long len,
>>>> +              unsigned long prot, unsigned long flags,
>>>> +              unsigned long fd, unsigned long pgoff) {
>>>> +       if (pgoff & (~PAGE_MASK >> 12))
>>>> +               return -EINVAL;
>>>> +
>>>> +       return sys_mmap_pgoff(addr, len, prot, flags, fd,
>>>> +                             pgoff >> (PAGE_SHIFT - 12)); }
>>>> +
>>>> +asmlinkage long sys_fadvise64_64_wrapper(int fd, int advice, loff_t offset,
>>>> +                                        loff_t len) {
>>>> +       return sys_fadvise64_64(fd, offset, len, advice); }
>>>
>>>You should always use SYSCALL_DEFINE*() macros to define entry points for your own syscalls in C code for consistency. I also wonder if we should just move those two into common code, a lot of architectures need the first one in particular.
>>>
>>
>> The sys_fadvise64_64_wrapper is used to reorder the input parameter.
>>
>> In order to solve register alignment problem, we adjust the input
>> parameter order of fadvise64_64 while issuing this syscall.
>> Therefore, we need this wrapper to reorder the input parameter to fit
>> sys_fadvise64_64's API in kernel.
>
> I understand what it's used for, it's just that I would recommend writing it
> differently, as
>
> SYSCALL_DEFINE4(fadvise64_64_wrapper, int, fd, int, advice, loff_t,
> offset, loff_t, len)
> {
>        return sys_fadvise64_64(fd, offset, len, advice);
> }
>
>       Arnd