Message-ID: <20161207103451.GA869@yury-N73SV>
Date:   Wed, 7 Dec 2016 16:04:51 +0530
From:   Yury Norov <ynorov@...iumnetworks.com>
To:     Arnd Bergmann <arnd@...db.de>
CC:     <libc-alpha@...rceware.org>, <linux-arch@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>,
        Catalin Marinas <catalin.marinas@....com>,
        <szabolcs.nagy@....com>, <heiko.carstens@...ibm.com>,
        <cmetcalf@...hip.com>, <philipp.tomsich@...obroma-systems.com>,
        <joseph@...esourcery.com>, <zhouchengming1@...wei.com>,
        <Prasun.Kapoor@...iumnetworks.com>, <agraf@...e.de>,
        <geert@...ux-m68k.org>, <kilobyte@...band.pl>,
        <manuel.montezelo@...il.com>, <pinskia@...il.com>,
        <linyongting@...wei.com>, <klimov.linux@...il.com>,
        <broonie@...nel.org>, <bamvor.zhangjian@...wei.com>,
        <linux-arm-kernel@...ts.infradead.org>,
        <maxim.kuvyrkov@...aro.org>, <Nathan_Lynch@...tor.com>,
        <schwidefsky@...ibm.com>, <davem@...emloft.net>,
        <christoph.muellner@...obroma-systems.com>
Subject: Re: [Question] New mmap64 syscall?

On Tue, Dec 06, 2016 at 10:20:20PM +0100, Arnd Bergmann wrote:
> On Wednesday, December 7, 2016 12:24:40 AM CET Yury Norov wrote:
> > 3. Introduce new mmap64() syscall like this:
> > sys_mmap64(void *addr, size_t len, int prot, int flags, int fd, struct off_pair *off);
> > (The pointer here because otherwise we have 7 args, if simply pass off_hi and
> > off_lo in registers.)
> 
> This wouldn't have to be a pair, just a pointer to a 64-bit number.
> 
> > With new 64-bit interface we can deprecate mmap2(), and generalize all
> > implementations in kernel.
> > 
> > I think we can discuss it because 64-bit is the default size for off_t 
> > in all new 32-bit architectures. So generic solution may take place.
> > 
> > The last question here is how important to support offsets bigger than
> > 2^44 on 32-bit machines in practice? It may be a case for ARM64 servers,
> > which are looking like main aarch64/ilp32 users. If no, we can leave
> > things as is, and just do nothing.
> 
> If there is a use case for larger than 16TB offsets, we should add
> the call on all architectures, probably using your approach 3. I don't
> think that we should treat it as anything special for arm64 though.

From this point of view, 16+TB offsets are a matter of 16+TB storage,
and that is already real. Another reason to add the call is that we
already support 64-bit offsets in syscalls like sys_llseek(), so
mmap64() would simply extend that support.
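
For reference, the 16TB (2^44) limit discussed above follows from
mmap2() taking its offset in 4096-byte units in a 32-bit register.
A minimal sketch of that arithmetic; the macro and function names
are illustrative only and do not come from the kernel or glibc:

```c
#include <stdint.h>

/* mmap2() passes the file offset in 4096-byte units (shift of 12). */
#define MMAP2_UNIT_SHIFT 12

/* First byte offset that mmap2() cannot express on a 32-bit ABI:
 * 2^32 units * 2^12 bytes per unit = 2^44 bytes = 16 TiB. */
uint64_t mmap2_offset_limit(void)
{
    return UINT64_C(1) << (32 + MMAP2_UNIT_SHIFT);
}

/* Nonzero if a byte offset is unit-aligned and below the limit. */
int mmap2_can_encode(uint64_t offset)
{
    uint64_t unit_mask = (UINT64_C(1) << MMAP2_UNIT_SHIFT) - 1;

    return (offset & unit_mask) == 0 && offset < mmap2_offset_limit();
}
```

Any offset at or above that limit needs either a 64-bit offset
argument or a pointer to one, which is what approach 3 provides.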

I can prepare this patch. Some implementation details I'd like to
clarify:
Syscall declaration:
SYSCALL_DEFINE6(mmap64, unsigned long, addr, unsigned long, len,
                unsigned long, prot, unsigned long, flags,
                unsigned long, fd, unsigned long long *, offset);
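
A rough userspace model of what the syscall body would do with the
offset pointer, only to illustrate the idea. model_mmap64_pgoff()
and MODEL_PAGE_SHIFT are made-up names, memcpy() stands in for
copy_from_user(), and a 4K page size is assumed:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE_SHIFT 12   /* assumption: 4K pages */

/* Returns 0 and stores the page offset, or a negative errno value. */
int model_mmap64_pgoff(const void *user_offset, uint64_t *pgoff_out)
{
    uint64_t offset;

    if (user_offset == NULL)
        return -EFAULT;

    /* Stand-in for copy_from_user() on the 64-bit offset. */
    memcpy(&offset, user_offset, sizeof(offset));

    /* The offset must be page-aligned, as for mmap(). */
    if (offset & ((UINT64_C(1) << MODEL_PAGE_SHIFT) - 1))
        return -EINVAL;

    *pgoff_out = offset >> MODEL_PAGE_SHIFT;
    return 0;
}
```

With the offset passed by pointer, an offset such as 2^45 yields a
page offset of 2^33, which no longer fits the 32-bit pgoff argument
of mmap2().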

sys_mmap64() deprecates sys_mmap2(), and __ARCH_WANT_MMAP2 is
introduced to keep sys_mmap2() enabled for all existing architectures.
All modern arches (aarch64/ilp32 is the first candidate) will have
mmap64() only. Precedents are the patches that dropped
set/getrlimit() or renameat() (b0da6d44).
                                
On the GLIBC side, __OFF_T_MATCHES_OFF64_T will wire mmap() from
linux/generic/wordsize32/mmap.c to mmap64() from linux/mmap64.c.

mmap64() will first try __NR_mmap64; if it is not defined, or if ENOSYS
is returned, __NR_mmap2 will be called instead. This lets userspace that
supports both mmap2() and mmap64() get full 64-bit offset support rather
than only 44-bit.

For the __NR_mmap2 case, I'd also add a check for offsets that do not
fit in 44 bits, and set errno to EOVERFLOW in that case.
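
The fallback described above could look roughly like the sketch
below. It is not glibc code: the raw syscalls are modelled as
function pointers so the logic can be run in isolation, a NULL
raw_mmap64 stands for "__NR_mmap64 not defined", and every name here
is hypothetical. The EINVAL alignment check is an assumption on top
of what is described above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the raw syscalls. Like libc syscall
 * wrappers, they return -1 and set errno on failure. */
typedef long (*raw_mmap64_fn)(const uint64_t *offset);
typedef long (*raw_mmap2_fn)(unsigned long pgoff);

#define MMAP2_UNIT_SHIFT  12                      /* 4096-byte units */
#define MMAP2_OFFSET_BITS (32 + MMAP2_UNIT_SHIFT) /* 44 bits         */

long mmap64_with_fallback(raw_mmap64_fn raw_mmap64,
                          raw_mmap2_fn raw_mmap2, uint64_t offset)
{
    /* Prefer the 64-bit syscall when it is available. */
    if (raw_mmap64 != NULL) {
        long ret = raw_mmap64(&offset);
        if (ret != -1 || errno != ENOSYS)
            return ret;
    }

    /* Fallback path: mmap2() needs a unit-aligned offset ... */
    if (offset & ((UINT64_C(1) << MMAP2_UNIT_SHIFT) - 1)) {
        errno = EINVAL;
        return -1;
    }
    /* ... that fits in 44 bits. */
    if (offset >> MMAP2_OFFSET_BITS) {
        errno = EOVERFLOW;
        return -1;
    }
    return raw_mmap2((unsigned long)(offset >> MMAP2_UNIT_SHIFT));
}

/* Demo stubs: a kernel without mmap64, and an mmap2 that echoes pgoff. */
long stub_mmap64_enosys(const uint64_t *offset)
{
    (void)offset;
    errno = ENOSYS;
    return -1;
}

long stub_mmap2_echo(unsigned long pgoff)
{
    return (long)pgoff;
}
```

With these stubs, an offset of 0x3000 goes through the mmap2() path
as page offset 3, while an offset of 2^44 fails with EOVERFLOW
instead of being silently truncated.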

Any thoughts?

Yury.
