Message-ID: <47614369.3060906@gmail.com>
Date:	Thu, 13 Dec 2007 15:36:25 +0100
From:	Michael Kerrisk <mtk.manpages@...glemail.com>
To:	Davide Libenzi <davidel@...ilserver.org>,
	Andrew Morton <akpm@...ux-foundation.org>
CC:	lkml <linux-kernel@...r.kernel.org>, tytso@...nk.org,
	Thomas Gleixner <tglx@...utronix.de>, Greg KH <gregkh@...e.de>,
	Christoph Hellwig <hch@....de>,
	Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Testing of / bugs in new timerfd API

Davide, Andrew,

I applied Davide's v3 patchset (sent to LKML on 25 Nov) against
2.6.24-rc3, and did various tests (all on x86).  Several tests
were done using the program at the foot of this mail.  Various others
were done by cobbling together bits of code that I haven't included
here.

In covering the tests I ran, and as a kind of proof of concept,
I'll treat the draft man page as the test spec.

I think I've found two bugs (look for the string "BUG" below), one
of which is a significant deviation from expected behavior.  See
the comments below.

Cheers,

Michael


> SYNOPSIS
>     #include <sys/timerfd.h>
>
>     int timerfd_create(int clockid, int flags);
>
>     int timerfd_settime(int fd, int flags,
>                         const struct itimerspec *new_value,
>                         struct itimerspec *curr_value);
>
>     int timerfd_gettime(int fd,
>                          struct itimerspec *curr_value);
>
> DESCRIPTION
>     These system calls create and operate on  a  timer  that
>     delivers  timer  expiration  notifications  via  a  file
>     descriptor.  They provide an alternative to the  use  of
>     setitimer(2) or timer_create(3), with the advantage that
>     the file  descriptor  may  be  monitored  by  select(2),
>     poll(2), and epoll(7).
>
>     The  use of these three system calls is analogous to the
>     use of timer_create(3), timer_settime(3), and timer_get-
>     time(3).   (There  is no analog of timer_gettoverrun(3),
>     since that functionality  is  provided  by  read(2),  as
>     described below.)
>
>   timerfd_create()
>     timerfd_create() creates a new timer object, and returns
>     a file  descriptor  that  refers  to  that  timer.   The
>     clockid  argument  specifies  the  clock that is used to
>     mark the progress of  the  timer,  and  must  be  either
>     CLOCK_REALTIME  or CLOCK_MONOTONIC.

Verified.  I've tried creating various combinations of
CLOCK_REALTIME and CLOCK_MONOTONIC timers, both one shot
(it_interval is zero) and repeating (it_interval is non-zero),
and everything works as I would expect.

>     CLOCK_REALTIME is a
>     settable system-wide clock.  CLOCK_MONOTONIC is  a  non-
>     settable  clock  that  is  not affected by discontinuous
>     changes in the system clock  (e.g.,  manual  changes  to
>     system time).  The current value of each of these clocks
>     can be retrieved using clock_gettime(3).
>
>     The flags argument is reserved for future  use.   As  at
>     Linux 2.6.25, this argument must be specified as zero.

(Verified: see the ERRORS section of the man page, below.)

>   timerfd_settime()
>     timerfd_settime()  arms  (starts) or disarms (stops) the
>     timer referred to by the file descriptor fd.

Disarming a timer works fine in my tests.

>     The new_value argument specifies the initial  expiration
>     and  interval  for the timer.  The itimerspec structure used
>     for this argument contains two fields, each of which  is
>     in turn a structure of type timespec:
>
>       struct timespec {
>           time_t tv_sec;                /* Seconds */
>           long   tv_nsec;               /* Nanoseconds */
>       };
>
>       struct itimerspec {
>           struct timespec it_interval;  /* Interval for
>                                            periodic timer */
>           struct timespec it_value;     /* Initial
>                                            expiration */
>       };
>
>     new_value.it_value  specifies  the initial expiration of
>     the timer, in seconds and nanoseconds.   Setting  either
>     field of new_value.it_value to a non-zero value arms the
>     timer.  Setting both  fields  of  new_value.it_value  to
>     zero disarms the timer.

Verified: setting both  fields  of  new_value.it_value  to
zero disarms the timer.

>     Setting  one  or both fields of new_value.it_interval to
>     non-zero values specifies the  period,  in  seconds  and
>     nanoseconds,  for  repeated  timer expirations after the
>     initial    expiration.

I've tested intervals down to 1 nanosecond, which seems to work
as I would expect.  (But see the BUG reported below.)

>     If     both     fields     of
>     new_value.it_interval  are  zero, the timer expires just
>     once, at the time specified by new_value.it_value.

Verified.

>     The flags argument is either  0,  to  start  a  relative
>     timer  (new_value.it_value  specifies  a  time  relative
>     to the current value of the clock specified by clockid),
>     or   TFD_TIMER_ABSTIME,   to  start  an  absolute  timer
>     (new_value.it_value  specifies  an  absolute  time   for
>     the  clock specified by clockid; that is, the timer will
>     expire when the value of that clock  reaches  the  value
>     specified in new_value.it_value).

Tested: after setting a CLOCK_REALTIME timer with the
TFD_TIMER_ABSTIME flag to expire at some time in the past with
a non-zero interval (e.g., setting 100 seconds in the past, with
a 5 second interval), read() from the file descriptor returns
the correct number of expirations (e.g., 20).

This seems a reasonable thing to do, I suppose.

BUG 1:
However, this test exposed what looks like a bug: if I set a
CLOCK_REALTIME timer to expire in the past, with a very small
interval, then the maximum number of expirations that can be
returned via read() seems to be limited to 32 bits, even though
we have a 64-bit value for returning this information.
I haven't checked the kernel source to determine where this
bug is.

To demonstrate the bug use the program appended to this mail,
as follows:

# The following starts a CLOCK_REALTIME repeating timer with
# an interval of 1 microsecond (1000 nanoseconds), with an initial
# expiration 4290 seconds (just under 2^32 microseconds) in the past.
# The '-a' flag says use TFD_TIMER_ABSTIME for timerfd_settime() calls.
# The 'r' command reads from the timerfd file descriptor.

$ ./timerfd_test -a -- -4290 0 0 1000
Initial setting for settime:   value=1197543860.000, interval=0.000
./timerfd_test> r           <-- type this immediately
Read: 4292190752            <-- nearly 2^32 expirations, as expected

$ ./timerfd_test -a -- -4290 0 0 1000
Initial setting for settime:   value=1197543900.000, interval=0.000
./timerfd_test> r           <-- type this after 5 secs
Read: 2992244               <-- Looks like the counter rolled over
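
The two readings are consistent with a count that wraps at 2^32.  The
sketch below only models that symptom; it assumes the low 32 bits of
the true count are what gets returned, which has not been confirmed
against the kernel source, and truncate_to_32_bits() is a hypothetical
name.

```c
#include <stdint.h>

/* Model of the observed symptom (an assumption, not the kernel code):
   keep only the low 32 bits of the true expiration count. */
static uint64_t
truncate_to_32_bits(uint64_t true_count)
{
    return true_count & UINT32_MAX;
}
```

Under this model the first reading (4292190752, below 2^32) is
unchanged, while a true count of 2^32 + 2992244 would be reported as
2992244, matching the second reading.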


>     The  curr_value  argument returns a structure containing
>     the setting of the timer that was current at the time of
>     the  call; see the description of timerfd_gettime() fol-
>     lowing.

See the bug described under timerfd_gettime().

>   timerfd_gettime()
>     timerfd_gettime() returns, in curr_value, an  itimerspec
>     that  contains the current setting of the timer referred
>     to by the file descriptor fd.
>
>     The it_value field returns the amount of time until  the
>     timer  will  next expire.  If both fields of this struc-
>     ture are zero, then the  timer  is  currently  disarmed.
>     This  field always contains a relative value, regardless
>     of whether the TFD_TIMER_ABSTIME flag was specified when
>     setting the timer.

BUG 2:
The last sentence does not match the implementation.
(Nor is it consistent with the behavior of POSIX timers.
And I *think* things did work correctly in the original
timerfd() implementation, but I have not gone back to check.)

Suppose that we set an absolute timer to expire 100 seconds
in the future.  According to this sentence of the man page,
each subsequent call to timerfd_gettime() should return an
itimerspec structure whose it_value steadily decreases from
100 to 0 (when the timer expires).  (This is the behavior in
the analogous situation with POSIX timers and with
setitimer()/getitimer().)

However, the implementation of timerfd_gettime() always
returns the "time when the timer would next expire", and
this value depends on whether TFD_TIMER_ABSTIME was specified
when setting the timer.

Examples:
$ ./timerfd_test -a -- 100       # -a means TFD_TIMER_ABSTIME
Initial setting for settime:   value=1197550329.140, interval=0.000
./timerfd_test> g                <-- Calls timerfd_gettime()
(elapsed time=  1)
Current value:                 value=1197550329.140, interval=0.000

$ ./timerfd_test -- 100
Initial setting for settime:   value=100.295, interval=0.000
./timerfd_test> g
(elapsed time=  2)
Current value:                 value=18208.440, interval=0.000

In both of the above examples, the "value" retrieved by timerfd_gettime()
should be a number <= 100.


BUG 2a:
The bug described for timerfd_gettime() also applies for the
'curr_value' returned by a call to timerfd_settime().


>     The it_interval field returns the interval of the timer.

Verified: the above was true in all my tests.

>     If both fields of this  structure  are  zero,  then  the
>     timer  is set to expire just once, at the time specified
>     by curr_value.it_value.

Verified.

>   Operating on a timer file descriptor
>     The file descriptor returned  by  timerfd_create()  sup-
>     ports the following operations:
>
>     read(2)
>            If  the  timer  has  already  expired one or more
>            times since its settings were last modified using
>            timerfd_settime(),  or  since the last successful
>            read(2), then the buffer given to read(2) returns
>            an  unsigned 8-byte integer (uint64_t) containing
>            the number of  expirations  that  have  occurred.

Verified: each call to timerfd_settime() resets the "expiration
count" to 0, even if a previous setting of the timer had already
expired.

>            (The  returned value is in host byte order, i.e.,
>            the native byte order for integers  on  the  host
>            machine.)
>
>            If no timer expirations have occurred at the time
>            of the read(2), then the call either blocks until
>            the  next  timer  expiration,

Verified.

>            or  fails with the
>            error EAGAIN if the file descriptor has been made
>            non-blocking (via the use of the fcntl(2) F_SETFL
>            operation to set the O_NONBLOCK flag).

Verified.

>            A read(2) will fail with the error EINVAL if  the
>            size of the supplied buffer is less than 8 bytes.

Verified.

>     poll(2), select(2) (and similar)
>            The file descriptor is  readable  (the  select(2)
>            readfds argument; the poll(2) POLLIN flag) if one
>            or more timer expirations have occurred.

Verified for both poll() and select().

>            The file descriptor also supports the other file-
>            descriptor    multiplexing    APIs:   pselect(2),
>            ppoll(2), and epoll(7).
>
>     close(2)
>            When the file descriptor is no longer required it
>            should  be  closed.   When  all  file descriptors
>            associated with the same timer object  have  been
>            closed,  the  timer is disarmed and its resources
>            are freed by the kernel.

Not verified.  (No easy way to do that from userspace.)

>   fork(2) semantics
>     After a fork(2), the child inherits a copy of  the  file
>     descriptor   created   by  timerfd_create().   The  file
>     descriptor refers to the same underlying timer object as
>     the  corresponding  file  descriptor  in the parent, and
>     read(2)s in the  child  will  return  information  about
>     expirations of the timer.

Verified.  Reads from a dup(2)ed file descriptor, or a file
descriptor duplicated via a fork(2) can be used to read
expiration information.

>   execve(2) semantics
>     A  file  descriptor  created by timerfd_create() is pre-
>     served across execve(2), and continues to generate timer
>     expirations if the timer was armed.

Verified.  A timerfd file descriptor is preserved across an
execve(), and continues to generate timer expirations that
can be read(2).

> RETURN VALUE
>     On success, timerfd_create() returns a new file descrip-
>     tor.  On error, -1 is returned and errno is set to indi-
>     cate the error.

Verified.  (And obvious.)

>     timerfd_settime() and timerfd_gettime() return 0 on suc-
>     cess; on error they return -1, and set errno to indicate
>     the error.

Verified.  (And obvious.)

> ERRORS
>     timerfd_create() can fail with the following errors:
>
>     EINVAL The  clockid  argument is neither CLOCK_MONOTONIC
>            nor CLOCK_REALTIME;

(CLOCK_MONOTONIC is 1, CLOCK_REALTIME is 0)
Tested: clockid == 2, clockid == -1;
        timerfd_create() fails with EINVAL, as expected.

>            or flags is invalid.

Currently, 'flags' must be zero.
Tested: flags == 1;
        timerfd_create() fails with EINVAL, as expected.

>     EMFILE The per-process limit of  open  file  descriptors
>            has been reached.

Not tested.

>     ENFILE The system-wide limit on the total number of open
>            files has been reached.

Not tested.

>     ENODEV Could  not  mount  (internal)  anonymous   i-node
>            device.

Not tested.

>     ENOMEM There  was  insufficient  kernel memory to create
>            the timer.

Not tested.

>     timerfd_settime() and timerfd_gettime()  can  fail  with
>     the following errors:
>
>     EBADF  fd is not a valid file descriptor.

Tested for timerfd_gettime() and timerfd_settime().
      An invalid file descriptor (i.e., a file descriptor that
      is not open) yields the error EBADF, as expected.

>     EINVAL fd  is  not  a  valid  timerfd  file  descriptor.

Tested for timerfd_gettime() and timerfd_settime().
      Passing a file descriptor that refers to an object other
      than a timerfd fails with the error EINVAL, as expected.

>            new_value is not properly initialized (one of the
>            tv_nsec   falls   outside   the   range  zero  to
>            999,999,999).

Tested for timerfd_settime().
     Specifying it_value.tv_nsec == 1000000000 or
     it_interval.tv_nsec == 1000000000 yields the error EINVAL,
     as expected.

--- END MAN PAGE TEXT ---

/* timerfd_test.c */

/* Link with -lrt */

#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>
#include <time.h>
#if defined(__i386__)
#define __NR_timerfd_create 322
#define __NR_timerfd_settime 325
#define __NR_timerfd_gettime 326
#endif

static int
timerfd_create(int clockid, int flags)
{
    return syscall(__NR_timerfd_create, clockid, flags);
}

static int
timerfd_settime(int fd, int flags, struct itimerspec *new_value,
        struct itimerspec *curr_value)
{
    return syscall(__NR_timerfd_settime, fd, flags,
            new_value, curr_value);
}

static int
timerfd_gettime(int fd, struct itimerspec *curr_value)
{
    return syscall(__NR_timerfd_gettime, fd, curr_value);
}

#define TFD_TIMER_ABSTIME (1 << 0)


#define handle_error(msg) \
        do { perror(msg); exit(EXIT_FAILURE); } while (0)

// #include <sys/timerfd.h>
#include <time.h>
#include <sys/times.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <stdint.h>     /* Definition of uint64_t */

static void
usage(const char *pname, const char *msg)
{
    if (msg != NULL)
        fprintf(stderr, "%s", msg);
    fprintf(stderr, "Usage: %s [options] value-sec [value-nsec "
            "[intvl-sec [intvl-nsec]]]\n", pname);
    fprintf(stderr, "Options are:\n");
    fprintf(stderr, "\t-a    Use absolute timer\n");
    fprintf(stderr, "\t-m    Use CLOCK_MONOTONIC "
            "(instead of CLOCK_REALTIME)\n");

    exit(EXIT_FAILURE);
} /* usage  */

static void
display_help(void)
{
    printf("n val-sec val-nsec intvl-sec intvl-nsec\n"
           "                  Reset timer value & interval\n");
    printf("g                 Get timer value\n");
    printf("r                 Read from file descriptor\n");
    printf("t                 Print elapsed time\n");
} /* display_help */


#define MAX_LINE 1024

static void
print_itimerspec(struct itimerspec *its)
{
    printf("value=%ld.%03ld, interval=%ld.%03ld",
            (long) its->it_value.tv_sec,
            (long) its->it_value.tv_nsec / 1000000,
            (long) its->it_interval.tv_sec,
            (long) its->it_interval.tv_nsec / 1000000);
} /* print_itimerspec */


int
main(int argc, char *argv[])
{
    struct itimerspec new_value, curr_value;
    int fd, flags;
    struct timespec now;
    int s, opt, use_abs_timer, use_monotonic;
    long arg1, arg2, arg3, arg4;
    uint64_t exp;
    time_t start;
    char line[MAX_LINE];
    int num_read, clockid;
    char cmd;

    use_abs_timer = 0;
    use_monotonic = 0;

    while ((opt = getopt(argc, argv, "am")) != -1) {
        switch (opt) {
        case 'a':
            use_abs_timer = 1;
            break;

        case 'm':
            use_monotonic = 1;
            break;

        default:
            usage(argv[0], NULL);
            break;
        } /* switch */
    } /* while */

    clockid = (use_monotonic ? CLOCK_MONOTONIC : CLOCK_REALTIME);

    if (optind + 1 > argc)
        usage(argv[0], NULL);

    if (clock_gettime(clockid, &now) == -1)
        handle_error("clock_gettime");

    flags = use_abs_timer ? TFD_TIMER_ABSTIME : 0;

    /* Create a timer with initial expiration and interval
       as specified in command line */

    if (use_abs_timer) {
        new_value.it_value.tv_sec = now.tv_sec + atoi(argv[optind]);
        new_value.it_value.tv_nsec = (argc > optind + 1) ?
                        atol(argv[optind + 1]) : now.tv_nsec;
    } else {
        new_value.it_value.tv_sec = atoi(argv[optind]);
        new_value.it_value.tv_nsec = (argc > optind + 1) ?
                       atol(argv[optind + 1]) : now.tv_nsec;
    }

    new_value.it_interval.tv_sec = (argc > optind + 2) ?
                        atol(argv[optind + 2]) : 0;
    new_value.it_interval.tv_nsec = (argc > optind + 3) ?
                        atol(argv[optind + 3]) : 0;

    fd = timerfd_create(clockid, 0);
    if (fd == -1)
        handle_error("timerfd_create");

    printf("Initial setting for settime:   ");
    print_itimerspec(&new_value);
    printf("\n");

    s = timerfd_settime(fd, flags, &new_value, &curr_value);
    if (s == -1)
        handle_error("timerfd_settime");

    start = time(NULL);

    for ( ; ; ) {
        printf("%s> ", argv[0]);
        fflush(stdout);

        if (fgets(line, MAX_LINE, stdin) == NULL)       /* EOF */
            exit(EXIT_SUCCESS);
        line[strlen(line) - 1] = '\0';      /* Remove trailing '\n' */

        if (*line == '\0')
            continue;                       /* Skip blank lines */

        if (line[0] == '?') {
            display_help();
            continue;
        }

        num_read = sscanf(line, " %c %ld %ld %ld %ld",
                          &cmd, &arg1, &arg2, &arg3, &arg4);
        if (num_read < 1)
            continue;                       /* No command on this line */

        switch (cmd) {
        case 'n':
            if (num_read != 5) {
                printf("Wrong number of arguments\n");
                continue;
            }

            if (use_abs_timer) {
                printf("This is an absolute timer\n");
                if (clock_gettime(clockid, &now) == -1)
                    handle_error("clock_gettime");
                printf("Now:                           ");
                printf("value=%ld.%03ld", (long) now.tv_sec,
                        (long) now.tv_nsec / 1000000);
                printf("\n");

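                new_value.it_value.tv_sec = now.tv_sec + arg1;
                new_value.it_value.tv_nsec = now.tv_nsec + arg2;
                /* Carry into tv_sec if the sum reaches one second */
                if (new_value.it_value.tv_nsec >= 1000000000) {
                    new_value.it_value.tv_sec++;
                    new_value.it_value.tv_nsec -= 1000000000;
                }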

            } else {
                printf("This is a relative timer\n");
                new_value.it_value.tv_sec = arg1;
                new_value.it_value.tv_nsec = arg2;
            }

            new_value.it_interval.tv_sec = arg3;
            new_value.it_interval.tv_nsec = arg4;

            printf("New setting for settime:       ");
            print_itimerspec(&new_value);
            printf("\n");

            s = timerfd_settime(fd, flags, &new_value, &curr_value);
            if (s == -1) {
                perror("timerfd_settime");
                break;
            }
            printf("Previous setting from settime: ");
            print_itimerspec(&curr_value);
            printf("\n");

            break;

        case 't':
            printf("%ld\n", (long) (time(NULL) - start));
            break;

        case 'g':
            s = timerfd_gettime(fd, &curr_value);
            if (s == -1)
                handle_error("timerfd_gettime");
            printf("(elapsed time=%3ld)\n", (long) (time(NULL) - start));
            printf("Current value:                 ");
            print_itimerspec(&curr_value);
            printf("\n");
            break;

        case 'r':
            s = read(fd, &exp, sizeof(uint64_t));
            if (s != sizeof(uint64_t))
                handle_error("read");
            printf("Read: %llu\n", (unsigned long long) exp);
            break;
        } /* switch */
    } /* for */

    exit(EXIT_SUCCESS);
}

