Message-ID: <20140630144017.2abc48ba@gandalf.local.home>
Date:	Mon, 30 Jun 2014 14:40:17 -0400
From:	Steven Rostedt <rostedt@...dmis.org>
To:	Tony Luck <tony.luck@...el.com>
Cc:	<linux-kernel@...r.kernel.org>, <mingo@...hat.com>,
	<fweisbec@...il.com>,
	"<m.chehab@...sung.com> Xie XiuQi" <xiexiuqi@...wei.com>
Subject: Re: [PATCH] tracing: Fix wraparound problems in "uptime" tracer

On Mon, 30 Jun 2014 11:17:18 -0700
Tony Luck <tony.luck@...el.com> wrote:

> There seem to be no non-racy solutions ... I've been wondering
> about giving up on a generic jiffies_to_nsec() function because
> people might use it in cases where the races might be likely to
> bite them.  For my need, I think that "perfect is the enemy of good":
> 
> 1) The race window is only a few microseconds wide
> 2) It only exists on 32-bit kernels - which are dying out on server
>    systems because they can't handle the amounts of memory on modern
>    machines.
> 3) It opens every 49 days (on a HZ=1000 system)
> 4) I'm logging error events that happen at a "per-month" frequency (or lower)
> 5) If the race does happen - the visible result is that we have a
>    bad time logged against an error event.
> 
> so what about this: ...
> 
> From: Tony Luck <tony.luck@...el.com>
> 
> The "uptime" tracer added in:
>     commit 8aacf017b065a805d27467843490c976835eb4a5
>     tracing: Add "uptime" trace clock that uses jiffies
> has wraparound problems when the system has been up more
> than 1 hour 11 minutes and 34 seconds. It converts jiffies
> to nanoseconds using:
> 	(u64)jiffies_to_usecs(jiffy) * 1000ULL
> but since jiffies_to_usecs() only returns a 32-bit value, it
> truncates at 2^32 microseconds.  An additional problem on 32-bit
> systems is that the argument is "unsigned long", so fixing the
> return value only helps until 2^32 jiffies (49.7 days on a HZ=1000
> system).
> 
> We can't provide a full-featured jiffies_to_nsec() function in
> any safe way (32-bit systems need locking to read the full 64-bit
> jiffies value).  Just do the best we can here and recognise that
> 32-bit systems may see some timestamp anomalies if jiffies_64
> was in the middle of rolling over a 2^32 boundary.
> 
> Signed-off-by: Tony Luck <tony.luck@...el.com>
> ---
>  kernel/timeconst.bc        |  6 ++++++
>  kernel/trace/trace_clock.c | 10 ++++++++--
>  2 files changed, 14 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/timeconst.bc b/kernel/timeconst.bc
> index 511bdf2cafda..a5fef7a7fb27 100644
> --- a/kernel/timeconst.bc
> +++ b/kernel/timeconst.bc
> @@ -100,6 +100,12 @@ define timeconst(hz) {
>  		print "#define USEC_TO_HZ_DEN\t\t", 1000000/cd, "\n"
>  		print "\n"
>  
> +		obase=10
> +		cd=gcd(hz,1000000000)
> +		print "#define HZ_TO_NSEC_NUM\t\t", 1000000000/cd, "\n"
> +		print "#define HZ_TO_NSEC_DEN\t\t", hz/cd, "\n"
> +		print "\n"
> +
>  		print "#endif /* KERNEL_TIMECONST_H */\n"
>  	}
>  	halt
> diff --git a/kernel/trace/trace_clock.c b/kernel/trace/trace_clock.c
> index 26dc348332b7..dc5b11b9f8a4 100644
> --- a/kernel/trace/trace_clock.c
> +++ b/kernel/trace/trace_clock.c
> @@ -59,13 +59,19 @@ u64 notrace trace_clock(void)
>  
>  /*
>   * trace_jiffy_clock(): Simply use jiffies as a clock counter.
> + * This usage of jiffies_64 isn't safe on 32-bit, but we may be
> + * called from NMI context, and we have no safe way to get a timestamp.
>   */
>  u64 notrace trace_clock_jiffies(void)
>  {
> -	u64 jiffy = jiffies - INITIAL_JIFFIES;
> +	u64 jiffy = jiffies_64 - INITIAL_JIFFIES;
>  
>  	/* Return nsecs */
> -	return (u64)jiffies_to_usecs(jiffy) * 1000ULL;
> +#if !(NSEC_PER_SEC % HZ)
> +	return (NSEC_PER_SEC / HZ) * jiffy;
> +#else
> +	return (jiffy * HZ_TO_NSEC_NUM) / HZ_TO_NSEC_DEN;

Won't this break on 32-bit systems? That is, you can't divide 64-bit
integers without using do_div().

-- Steve
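
To illustrate the concern: on 32-bit builds a plain "/" on a u64
normally makes the compiler emit a call to a libgcc helper
(__udivdi3) that the kernel does not link in, which is why do_div()
(or div_u64()) is used instead. Below is a rough userspace sketch of
what such a helper does, using only shifts, compares and subtracts.
All names starting with sketch_ or SKETCH_ are hypothetical, and
HZ=1024 is assumed only so that the "#else" branch of the patch
applies (gcd(1024, 10^9) = 512, giving 1953125/2). This is not the
kernel's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for HZ=1024; the real ones come from timeconst.bc. */
#define SKETCH_HZ_TO_NSEC_NUM 1953125u
#define SKETCH_HZ_TO_NSEC_DEN 2u

/*
 * Divide a 64-bit value by a 32-bit divisor with shift-and-subtract
 * long division, so no 64-bit '/' operator is needed.
 */
uint64_t sketch_div_u64(uint64_t n, uint32_t base)
{
	uint64_t rem = 0, quot = 0;
	int bit;

	assert(base != 0);

	for (bit = 63; bit >= 0; bit--) {
		/* Bring down the next bit of the dividend. */
		rem = (rem << 1) | ((n >> bit) & 1);
		if (rem >= base) {
			rem -= base;
			quot |= (uint64_t)1 << bit;
		}
	}
	return quot;
}

/* The "#else" branch of the patch, with the division routed through the helper. */
uint64_t sketch_jiffies_to_nsec(uint64_t jiffy)
{
	return sketch_div_u64(jiffy * SKETCH_HZ_TO_NSEC_NUM,
			      SKETCH_HZ_TO_NSEC_DEN);
}
```

In the kernel itself the equivalent would presumably be a do_div()
or div_u64() call on the product, since HZ_TO_NSEC_DEN fits in 32
bits.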

> +#endif
>  }
>  
>  /*
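
For reference, the wraparound described in the quoted changelog can
be reproduced in userspace. The sketch below assumes HZ=1000 and
models jiffies_to_usecs() as a multiply truncated to 32 bits; the
sketch_ names are hypothetical stand-ins, not kernel functions.

```c
#include <stdint.h>

#define SKETCH_HZ 1000u	/* assumed tick rate */

/* Stand-in for jiffies_to_usecs(): the result is only 32 bits wide. */
uint32_t sketch_jiffies_to_usecs(uint64_t j)
{
	return (uint32_t)(j * (1000000u / SKETCH_HZ));
}

/* Old conversion: widen to u64 only after the truncation has happened. */
uint64_t sketch_old_clock_ns(uint64_t jiffy)
{
	return (uint64_t)sketch_jiffies_to_usecs(jiffy) * 1000ULL;
}

/* New conversion (the NSEC_PER_SEC % HZ == 0 branch): stays in 64 bits. */
uint64_t sketch_new_clock_ns(uint64_t jiffy)
{
	return (1000000000ULL / SKETCH_HZ) * jiffy;
}
```

With these assumptions the two agree up to jiffy 4294967 (about
1 hour 11 minutes 34 seconds of uptime); at jiffy 4294968 the old
form wraps to 704000 ns while the new one returns 4294968000000 ns.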
