lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAKgNAkhNSWR3uYhYYaxx74fZfJ3JrpfAAPVrK0AFk_cAOUsbDg@mail.gmail.com>
Date:   Sun, 22 Nov 2020 23:37:17 +0100
From:   "Michael Kerrisk (man-pages)" <mtk.manpages@...il.com>
To:     "Alejandro Colomar (man-pages)" <alx.manpages@...il.com>
Cc:     linux-man <linux-man@...r.kernel.org>,
        lkml <linux-kernel@...r.kernel.org>,
        "libc-alpha@...rceware.org" <libc-alpha@...rceware.org>,
        Florian Weimer <fw@...eb.enyo.de>
Subject: Re: [PATCH] lseek.2: SYNOPSIS: Use correct types

Hi Alex,

On Sat, 21 Nov 2020 at 18:45, Alejandro Colomar (man-pages)
<alx.manpages@...il.com> wrote:
>
> Hi Michael,
>
> I'm a bit lost in all the *lseek* pages.
>
> You had a good read some months ago, so you may know it better.
> I don't know which of those functions come from the kernel,
> and which come from glibc (if any).

It always takes me too long to remind myself of the details here :-(.

This time, I'll try to write what I (re)learned.

Inside the kernel (5.9 sources), in fs/read_write.c, we have:

[[
SYSCALL_DEFINE3(lseek, unsigned int, fd, off_t, offset, unsigned int, whence)
{
        return ksys_lseek(fd, offset, whence);
}

#ifdef CONFIG_COMPAT
COMPAT_SYSCALL_DEFINE3(lseek, unsigned int, fd, compat_off_t, offset,
unsigned int, whence)
{
        return ksys_lseek(fd, offset, whence);
}
#endif

#if !defined(CONFIG_64BIT) || defined(CONFIG_COMPAT) || \
        defined(__ARCH_WANT_SYS_LLSEEK)
SYSCALL_DEFINE5(llseek, unsigned int, fd, unsigned long, offset_high,
                unsigned long, offset_low, loff_t __user *, result,
                unsigned int, whence)
{
...
}
#endif
]]

The main pieces of interest here are the first and last
SYSCALL_DEFINEn. The first is the "standard" lseek() system call that
exists on 64-bit and 32-bit architectures.

The problem on 32-bit architectures is that the off_t type is a 32-bit
type, but files can be bigger than 2GB (2**32-1). That's why 32-bit
kernels also provide the llseek() system call. It receives the new
offset in two 32-bit pieces (offset_high, offset_low), and returns the
new offset via a 64-bit off_t argument (result). (I forget the
reason why there are 32-bit and 64-bit "offset" args in the syscall.)

One more thing... In arch/x86/entry/syscalls/syscall_32.tbl,
we see the following line:

[[
140     i386    _llseek                 sys_llseek
]]

This is essentially telling us that 'sys_llseek' (the name generated
by SYSCALL_DEFINE5(llseek...)) is exposed to user-space as system call
number 140, and that system call number will (IIUC) be exposed in
autogenerated headers with the name "__NR__llseek" (i.e., "_llseek").
The "i386" is
telling us that this happens in i386 (32-bit Intel). There is nothing
equivalent on x86-64, because 64 bit systems don't need an _llseek
system call.

Now, in ancient times (let's say Linux 2.2), there was a more
transparent situation (but the effect was the same):

#define __NR__llseek            140

and that system call number was tied to the implementation by this definition
linux-2.2.26/arch/i386/kernel/entry.S:

.long SYMBOL_NAME(sys_llseek)           /* 140 */

==

lseek64() is a C library function.  It takes and returns a 64-bit
offset. It exists to support seeking in large (>2GB) files. Its
implementation is in the glibc source file
sysdeps/unix/sysv/linux/lseek64.c, where it calls _llseek(2)

Returning to the <unistd.h> header file, we have:

[[
#ifndef __USE_FILE_OFFSET64
extern __off_t lseek (int __fd, __off_t __offset, int __whence) __THROW;
#else
# ifdef __REDIRECT_NTH
extern __off64_t __REDIRECT_NTH (lseek,
                                 (int __fd, __off64_t __offset, int __whence),
                                 lseek64);
# else
#  define lseek lseek64
# endif
#endif
#ifdef __USE_LARGEFILE64
extern __off64_t lseek64 (int __fd, __off64_t __offset, int __whence)
     __THROW;
#endif
]]

The name "lseek64" is exposed if _LARGEFILE64_SOURCE (which triggers
__USE_LARGEFILE64) is defined. That name was part of the so-called
Transitional Large FIle Systems (LFS) API (see page 105 in my book),
which existed to support the use of 64-bit file offsets on 32 bit
systems. It provided a set of interfaces with names of the form
"xxxxx64()" (e.g., "lseek64")) which provided for 64-bit offsets;
those names coexisted with the traditional 32-bit APIs (e.g.,
"lseek").

Alternatively, the LFS specified a macro, _FILE_OFFSET_BITS=64 (which
triggers __USE_FILE_OFFSET64) as another way of exposing 64-bit-offset
functionality on 32 bit systems. In this case, the traditional API
names (e.g., "lseek") are redirected to the 64-bit implementations
(e.g., "lseek64");

> In the kernel I only found the lseek, llseek, and 32_llseek

I'd ignore 32_llseek -- I guess that's an arch-specific equivalent of
_llseek/llseek.

> (as you can see in the patch).
> So if any other prototype needs to be updated, please do so.
> Especially, have a look at lseek64(3),
> which I suspect needs the same changes I propose in that patch.

I think that no changes to the types are needed in lseek64(3). But
maybe some of the info in this mail should be captured in that manual
page.

Thanks,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ