Message-ID: <CAK8P3a1asMLnJtea=JkduiYzr0dF0BTKQzcx5aQVv1zU5dK2FA@mail.gmail.com>
Date:   Mon, 27 Nov 2017 21:41:54 +0100
From:   Arnd Bergmann <arnd@...db.de>
To:     "Eric W. Biederman" <ebiederm@...ssion.com>
Cc:     Paul Eggert <eggert@...ucla.edu>,
        John Stultz <john.stultz@...aro.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        y2038 Mailman List <y2038@...ts.linaro.org>,
        GNU C Library <libc-alpha@...rceware.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-arch <linux-arch@...r.kernel.org>,
        Linux API <linux-api@...r.kernel.org>,
        Albert ARIBAUD <albert.aribaud@...ev.fr>,
        Richard Henderson <rth@...ddle.net>,
        Ivan Kokshaysky <ink@...assic.park.msu.ru>,
        Matt Turner <mattst88@...il.com>,
        Al Viro <viro@...iv.linux.org.uk>,
        Ingo Molnar <mingo@...nel.org>,
        Frederic Weisbecker <fweisbec@...il.com>,
        Deepa Dinamani <deepa.kernel@...il.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Oleg Nesterov <oleg@...hat.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Kirill Tkhai <ktkhai@...tuozzo.com>,
        linux-alpha@...r.kernel.org
Subject: Re: [PATCH 3/3] y2038: rusage: use __kernel_old_timeval for process times

On Mon, Nov 27, 2017 at 7:49 PM, Eric W. Biederman
<ebiederm@...ssion.com> wrote:
> Paul Eggert <eggert@...ucla.edu> writes:
>
>> On 11/27/2017 09:00 AM, Arnd Bergmann wrote:
>>> b) Extend the approach taken by the x32 ABI, and use the 64-bit
>>>     native structure layout for rusage on all architectures with new
>>>     system calls that is otherwise compatible. A possible problem here
>>>     is that we end up with incompatible definitions of rusage between
>>>     /usr/include/linux/resource.h and /usr/include/bits/resource.h
>>>
>>> c) Change the definition of struct rusage to be independent of
>>>     time_t. This is the easiest change, as it does not involve new system
>>>     call entry points, but it has the risk of introducing compile-time
>>>     incompatibilities with user space sources that rely on the type
>>>     of ru_utime and ru_stime.
>>>
>>> I'm picking approach c) for its simplicity, but I'd like to hear from
>>> others whether they would prefer a different approach.
>>
>> (c) would break programs like GNU Emacs, which copy ru_utime and ru_stime
>> members into struct timeval variables.

Right. I think I originally had the workaround to have glibc convert
between its own structure and the kernel structure in mind, but then
ended up not including that in the text above. I was going back and
forth on whether it would be needed or not.

>> All in all, (b) sounds like it would be better for programs using glibc, as it's
>> more compatible with what POSIX apps expect. Though I'm not sure what problems
>> are meant by "possible ... incompatible definitions"; perhaps you could
>> elaborate.

I meant that you might have an application that includes
linux/resource.h instead of sys/resource.h but calls the glibc
function, or one that includes sys/resource.h and invokes the
system call directly.

> getrusage is posix and I believe the use of struct timeval is posix as
> well.
>
> So getrusage(3) is the libc definition, and that definition must use
> struct timeval or the implementation will be non-conforming and it
> won't be just emacs we need to worry about.
>
> The practical question is what do we provide to userspace so that it can
> implement a conforming getrusage?
>
> A 32bit time_t based struct timeval is good for durations up to 136 years
> or so.  Which strongly suggests the range is large enough, except for
> some crazy massively multi-threaded application.  And anything off the
> charts cpu hungry at this point I expect will be 64bit.
>
> It is possible to get a 128 way system with one thread on each core and
> consume 100% of the core for a bit over a year to max out getrusage.  So
> I do think in the long run we care about increasing the size of time_t
> here.  Last I checked applications doing things like that were 64bit in
> the year 2000.

Agreed, this was also a calculation I did.
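
As a rough check of the figures quoted above, here is a minimal sketch of
the arithmetic. The helper names are made up for illustration, and a year
is assumed to be 365.25 days. Note that the 136-year figure corresponds to
2^32 seconds; a signed 32-bit tv_sec covers about half of that.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: an average year of 365.25 days. */
#define SECONDS_PER_YEAR (365.25 * 86400.0)

/* Years of accumulated CPU time that fit into a tv_sec field with the
   given number of usable value bits (31 for a signed 32-bit time_t,
   32 if the field were treated as unsigned).  */
static double rusage_range_years(unsigned bits)
{
    assert(bits < 64);
    return (double)(UINT64_C(1) << bits) / SECONDS_PER_YEAR;
}

/* Wall-clock years until a process with 'nthreads' threads, each using
   100% of one core, accumulates that much CPU time.  */
static double years_until_overflow(unsigned bits, unsigned nthreads)
{
    assert(nthreads > 0);
    return rusage_range_years(bits) / nthreads;
}
```

With 32 bits and 128 busy threads this comes out slightly above one year,
which matches the "bit over a year" estimate; with 31 bits it is roughly
half a year.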

> Given that userspace is going to be seeing the larger struct rusage in
> any event my inclination for long term maintainability would be to
> introduce the new syscall and have the current one called oldgetrusage
> on 32bit architectures.  Then we won't have to worry about what weird
> things glibc will do when translating the data, and we can handle
> applications with crazy (but possible) runtimes.  Which inclines me to
> (b) as well.

This would actually be the same thing we do for most other syscalls.
Regarding the naming, the existing entry point would become
compat_sys_getrusage() and share its implementation between native
32-bit mode and compat mode on 64-bit architectures, while
sys_getrusage() becomes the function that deals with the 64-bit layout
and would have the same binary format on both 32-bit and 64-bit native
ABIs.

Unfortunately, this opens a new question, as the structure is currently
defined by glibc as:

/* Structure which says how much of each resource has been used.  */

/* The purpose of all the unions is to have the kernel-compatible layout
   while keeping the API type as 'long int', and among machines where
   __syscall_slong_t is not 'long int', this only does the right thing
   for little-endian ones, like x32.  */
struct rusage
  {
    /* Total amount of user time used.  */
    struct timeval ru_utime;
    /* Total amount of system time used.  */
    struct timeval ru_stime;
    /* Maximum resident set size (in kilobytes).  */
    __extension__ union
      {
        long int ru_maxrss;
        __syscall_slong_t __ru_maxrss_word;
      };
    /* Amount of sharing of text segment memory
       with other processes (kilobyte-seconds).  */
    /* Maximum resident set size (in kilobytes).  */
    __extension__ union
      {
        long int ru_ixrss;
        __syscall_slong_t __ru_ixrss_word;
      };
   ...
};

Here, I guess we have to replace __syscall_slong_t with an
'rusage'-specific type that has the same length as time_t, but is
independent of __syscall_slong_t, which is still 32-bit for most 32-bit
architectures.

How would we do the big-endian version of that though?
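
One conceivable answer, purely as a sketch and not something that has
been proposed for glibc, is to put explicit padding next to the API field
and pick its position by byte order. Every name below (rusage_word_t,
api_long_t, union rusage_field) is hypothetical; int32_t stands in for
'long int' on a 32-bit ABI so the sketch behaves the same on any host, and
__BYTE_ORDER__ is the macro predefined by GCC and Clang. The real header
uses anonymous unions so that ru_maxrss stays a direct member; named
members are used here only to keep the example short.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-bit word type used by the kernel-side layout. */
typedef int64_t rusage_word_t;

/* Stand-in for 'long int' on a 32-bit ABI. */
typedef int32_t api_long_t;

union rusage_field {
    struct {
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
        int32_t pad;          /* high half is stored first */
        api_long_t value;     /* low half overlays the API field */
#else
        api_long_t value;     /* low half is stored first */
        int32_t pad;
#endif
    } api;
    rusage_word_t word;       /* what the kernel would fill in */
};

/* Store a value through the 64-bit word and read it back through the
   32-bit API field.  Only meaningful for values that fit in 32 bits.  */
static api_long_t read_api_value(int64_t kernel_value)
{
    union rusage_field f;

    assert(kernel_value >= INT32_MIN && kernel_value <= INT32_MAX);
    f.word = kernel_value;
    return f.api.value;
}
```

The padding position is the only byte-order-dependent part; the overall
size stays at eight bytes either way.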

One argument for using c) plus the emulation in glibc is that glibc
has to do emulation anyway, to allow running user space with 64-bit
time_t on older kernels that don't have the new getrusage system
call.
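
To make the emulation concrete, here is a minimal sketch of the widening
step a C library could perform after calling the old system call. The
structure and function names are invented for this example and are
trimmed to three fields; they are not the actual glibc or kernel
definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the old kernel layout (32-bit fields). */
struct old_timeval32 { int32_t tv_sec; int32_t tv_usec; };
struct kernel_rusage32 {
    struct old_timeval32 ru_utime;
    struct old_timeval32 ru_stime;
    int32_t ru_maxrss;
};

/* Hypothetical stand-ins for the user-visible layout (64-bit fields). */
struct timeval64 { int64_t tv_sec; int64_t tv_usec; };
struct user_rusage64 {
    struct timeval64 ru_utime;
    struct timeval64 ru_stime;
    int64_t ru_maxrss;
};

/* Sign-extend both fields of one timeval. */
static struct timeval64 widen_timeval(struct old_timeval32 in)
{
    struct timeval64 out;

    assert(in.tv_usec >= 0 && in.tv_usec < 1000000);
    out.tv_sec = in.tv_sec;
    out.tv_usec = in.tv_usec;
    return out;
}

/* Copy the old kernel result into the wider user-space structure. */
static void widen_rusage(const struct kernel_rusage32 *in,
                         struct user_rusage64 *out)
{
    out->ru_utime = widen_timeval(in->ru_utime);
    out->ru_stime = widen_timeval(in->ru_stime);
    out->ru_maxrss = in->ru_maxrss;
}
```

A field-by-field copy like this avoids any dependence on padding or byte
order, at the cost of touching every member.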

> As for (a) does anyone have a need for process accounting at nsec
> granularity?  Unless we can get that for free that just seems like
> overpromising and a waste to have so much fine granularity.

The kernel does everything in nanoseconds, so we always spend
a few cycles (a lot of cycles on some of the very low-end architectures)
on dividing it by 1000. Moving the division operation to user space
is essentially free, and using the nanoseconds instead of microseconds
might be slightly cheaper. I don't think anyone really needs it though.
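
For illustration, the conversion in question looks roughly like the
sketch below. The type and function names are hypothetical and this is
not the kernel's actual helper; it only shows the division that would
move to user space if the kernel returned nanoseconds directly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical seconds/microseconds pair. */
struct usec_time { int64_t tv_sec; int64_t tv_usec; };

/* Split a nanosecond count into whole seconds and microseconds,
   truncating the sub-microsecond remainder.  */
static struct usec_time cputime_ns_to_usec_time(uint64_t ns)
{
    struct usec_time out;

    out.tv_sec = (int64_t)(ns / UINT64_C(1000000000));
    out.tv_usec = (int64_t)((ns % UINT64_C(1000000000)) / 1000);
    assert(out.tv_usec >= 0 && out.tv_usec < 1000000);
    return out;
}
```

The 64-bit division and modulo are the part that can be expensive on
32-bit CPUs without a hardware divider.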

      Arnd
