[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160623044828.GA3379@f23x64.localdomain>
Date:	Wed, 22 Jun 2016 21:48:28 -0700
From:	Darren Hart <dvhart@...radead.org>
To:	Matthieu CASTET <matthieu.castet@...rot.com>
Cc:	linux-kernel@...r.kernel.org,
	Michael Kerrisk <mtk.manpages@...il.com>,
	Darren Hart <dvhart@...ux.intel.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Davidlohr Bueso <dave@...olabs.net>,
	Thomas Gleixner <tglx@...utronix.de>,
	Eric Dumazet <dada1@...mosbay.com>
Subject: Re: futex: Allow FUTEX_CLOCK_REALTIME with FUTEX_WAIT op
On Mon, Jun 20, 2016 at 04:26:52PM +0200, Matthieu CASTET wrote:
> Hi,
> 
> the commit 337f13046ff03717a9e99675284a817527440a49 is saying that it
> change to syscall to an equivalent to FUTEX_WAIT_BITSET |
> FUTEX_CLOCK_REALTIME with a bitset of FUTEX_BITSET_MATCH_ANY.
> 
> It seems wrong to me, because in case of FUTEX_WAIT, in
> "SYSCALL_DEFINE6(futex", we convert relative timeout to absolute
> timeout [1].
> 
> So FUTEX_CLOCK_REALTIME | FUTEX_WAIT is expecting a relative timeout
> when FUTEX_WAIT_BITSET take an absolute timeout.
> 
> To make it work you have to use something like the (untested) attached
> patch.
+Eric Dumazet
Thanks for reporting Matthieu,
FUTEX_WAIT traditionally used a relative timeout with CLOCK_MONOTONIC while
FUTEX_WAIT_BITSET could use either ??? based on the FUTEX_CLOCK_ flag used. The
man page is not particularly clear on this:
http://man7.org/linux/man-pages/man2/futex.2.html
"
The FUTEX_WAIT_BITSET operation also interprets the timeout argument
differently from FUTEX_WAIT.  See the discussion of FUTEX_CLOCK_REALTIME,
above.
"
Matthew Kerrisk:
I think this language could be removed now that we support the
FUTEX_CLOCK_REALTIME flag for both futex ops.
As for the intended behavior of the FUTEX_CLOCK_REALTIME flag:
FUTEX_CLOCK_REALTIME (since Linux 2.6.28)
    This option bit can be employed only with the FUTEX_WAIT_BITSET,
    FUTEX_WAIT_REQUEUE_PI, and FUTEX_WAIT (since Linux 4.5) operations.
(NOTE: FUTEX_WAIT was recently added after the patch in question here)
    If this option is set, the kernel treats timeout as an absolute time based
    on CLOCK_REALTIME.
    If this option is not set, the kernel treats timeout as a relative time,
    measured against the CLOCK_MONOTONIC clock.
This supports your argument Matthieu. The assumption of a relative timeout for
FUTEX_WAIT in SYSCALL_DEFINE6 needs to be updated to account for FUTEX_WAIT now
honoring the FUTEX_CLOCK_REALTIME flag, which treats the timeout as absolute.
However, I don't think the patch below is correct. The existing logic
determines the type of timeout based on the futex_op when it should instead
determine the type of timeout based on the FUTEX_CLOCK_REALTIME flag.
My reading of the man page is that FUTEX_WAIT_BITSET abides by the timeout
interpretation defined by the FUTEX_CLOCK_REALTIME attribute, so
SYSCALL_DEFINE6 was misbehaving for FUTEX_WAIT|FUTEX_CLOCK_REALTIME (where the
timeout should have been treated as absolute) as well as for
FUTEX_WAIT_BITSET|FUTEX_CLOCK_MONOTONIC (where the timeout should have been
treated as relative).
Consider the following:
diff --git a/kernel/futex.c b/kernel/futex.c
index 33664f7..fa2af29 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -3230,7 +3230,7 @@ SYSCALL_DEFINE6(futex, u32 __user *, uaddr, int, op, u32, val,
 			return -EINVAL;
 
 		t = timespec_to_ktime(ts);
-		if (cmd == FUTEX_WAIT)
+		if (!(cmd & FUTEX_CLOCK_REALTIME))
 			t = ktime_add_safe(ktime_get(), t);
 		tp = &t;
 	}
The concern for me is whether the code is incorrect, or if the man page is
incorrect. Does existing userspace code expect the FUTEX_WAIT_BITSET op to
always use an absolute timeout, regardless of the CLOCK used?
> 
> Matthieu
> 
> [1]
>         if (cmd == FUTEX_WAIT)
>             t = ktime_add_safe(ktime_get(), t);
> diff --git a/kernel/futex.c b/kernel/futex.c
> index 33664f7..4bee915 100644
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -3230,7 +3230,7 @@ SYSCALL_DEFINE6(futex, u32 __user *, uaddr, int, op, u32, val,
>  			return -EINVAL;
>  
>  		t = timespec_to_ktime(ts);
> -		if (cmd == FUTEX_WAIT)
> +		if (cmd == FUTEX_WAIT && !(op & FUTEX_CLOCK_REALTIME))
>  			t = ktime_add_safe(ktime_get(), t);
>  		tp = &t;
>  	}
-- 
Darren Hart
Intel Open Source Technology Center
Powered by blists - more mailing lists
 
