linux-kernel - Re: [PATCH 2/2] printk, Add printk.clock kernel parameter

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <568E867D.8060904@redhat.com>
Date:	Thu, 07 Jan 2016 10:38:37 -0500
From:	Prarit Bhargava <prarit@...hat.com>
To:	Petr Mladek <pmladek@...e.com>
CC:	linux-kernel@...r.kernel.org, John Stultz <john.stultz@...aro.org>,
	Xunlei Pang <pang.xunlei@...aro.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Baolin Wang <baolin.wang@...aro.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Tejun Heo <tj@...nel.org>,
	Peter Hurley <peter@...leysoftware.com>,
	Vasily Averin <vvs@...tuozzo.com>,
	Joe Perches <joe@...ches.com>
Subject: Re: [PATCH 2/2] printk, Add printk.clock kernel parameter



On 01/07/2016 09:57 AM, Petr Mladek wrote:
> On Wed 2016-01-06 08:00:34, Prarit Bhargava wrote:
>> This patch introduces printk.clock=[local|boot|real|tai] allowing a
>> user to specify an adjusted clock to use with printk timestamps.  The
>> hardware clock, or the existing functionality, is preserved by default.
>>
>> If the timekeeper_lock (used to protect writes from reads) is under
>> contention, the last known good time is output again and "??" is used as
>> the last two digits of time.  This avoids contention issues with a CPU
>> panicking while another CPU is writing time data.
>>
>> ---
>>  include/linux/time64.h |    2 +
>>  kernel/printk/printk.c |  105 ++++++++++++++++++++++++++++++++++++++++++++----
>>  2 files changed, 99 insertions(+), 8 deletions(-)
>>
>> diff --git a/include/linux/time64.h b/include/linux/time64.h
>> index 367d5af..f4a35d9 100644
>> --- a/include/linux/time64.h
>> +++ b/include/linux/time64.h
>> @@ -39,6 +39,8 @@ struct itimerspec64 {
>>  #define TIME64_MAX			((s64)~((u64)1 << 63))
>>  #define KTIME_MAX			((s64)~((u64)1 << 63))
>>  #define KTIME_SEC_MAX			(KTIME_MAX / NSEC_PER_SEC)
>> +/* Maximum value returned by ktime_to_ns() */
>> +#define KTIME_MAX_NS 			((u64)((U64_MAX>>1)))
> 
> Please, use KTIME_NS_MAX to use the same naming scheme as
> KTIME_SEC_MAX.
> 
> Well, I would return 0 when the time is not known. See also below.
> 
>>  #if __BITS_PER_LONG == 64
>>  
>> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
>> index 2ce8826..ffc05ee 100644
>> --- a/kernel/printk/printk.c
>> +++ b/kernel/printk/printk.c
>> @@ -419,6 +419,83 @@ static u32 truncate_msg(u16 *text_len, u16 *trunc_msg_len,
>>  	return msg_used_size(*text_len + *trunc_msg_len, 0, pad_len);
>>  }
>>  
>> +static u64 printk_try_real_ns(void)
>> +{
>> +	return ktime_try_real_ns();
>> +}
>> +
>> +static u64 printk_try_boot_ns(void)
>> +{
>> +	return ktime_try_boot_ns();
>> +}
>> +
>> +static u64 printk_try_tai_ns(void)
>> +{
>> +	return ktime_try_tai_ns();
>> +}
> 
> Do we need these wrappers? Why not using ktime_try_xxx_ns() directly?
> You already use local_clock().

Unfortunately ktime_* are inlined functions, and I've mimicked the currently
coding style there.

In any case I'm doing a v2 with different functions.  I'll see if making this
stuff not-inlined is any better.

> 
> 
>> +static int printk_clock_param_set(const char *val,
>> +				  const struct kernel_param *kp)
>> +{
>> +	char *printk_clock_new = strstrip((char *)val);
>> +	int ret;
>> +
>> +	/* check if change is needed */
>> +	if (!strcmp(printk_clock_new, printk_clock))
>> +		return 0;
>> +
>> +	if (!strncmp("local", printk_clock_new, 5)) {
>> +		ret = param_set_charp(printk_clock_new, kp);
>> +		if (ret)
>> +			return ret;
>> +		printk_clock_fn = &local_clock;
>> +	} else if (!strncmp("real", printk_clock_new, 4)) {
>> +		ret = param_set_charp(printk_clock_new, kp);
>> +		if (ret)
>> +			return ret;
>> +		printk_clock_fn = &printk_try_real_ns;
>> +	} else if (!strncmp("boot", printk_clock_new, 4)) {
>> +		ret = param_set_charp(printk_clock_new, kp);
>> +		if (ret)
>> +			return ret;
>> +		printk_clock_fn = &printk_try_boot_ns;
>> +	} else if (!strncmp("tai", printk_clock_new, 3)) {
>> +		ret = param_set_charp(printk_clock_new, kp);
>> +		if (ret)
>> +			return ret;
>> +		printk_clock_fn = &printk_try_tai_ns;
>> +	}
>> +
>> +	/* Did the clock change ? */
>> +	if (strcmp(printk_clock_new, printk_clock))
>> +		return -EINVAL;
> 
> I wonder if we need this extra check. It might be enough to
> print the error message in the final "else" above.

I was trying to catch a local to local change or a real to real change.
But yeah ... I could sort of do that in the final else above.

> 
>> +	pr_info("printk: timestamp set to %s clock.\n", printk_clock);
>> +	return 0;
>> +}
>> +
>> +static struct kernel_param_ops printk_clock_param_ops = {
>> +	.set =		printk_clock_param_set,
>> +	.get =		param_get_charp,
>> +};
>> +
>> +module_param_cb(clock, &printk_clock_param_ops, &printk_clock, 0644);
>> +MODULE_PARM_DESC(clock, "Clock used for printk timestamps.  Can be local (hardware/default), boot, real, or tai.\n");
> 
> I have problems to parse this. It seems that the most common style
> used for possible values is to putthem into brackets. I wonder
> if this is better readable.
> 
> MODULE_PARM_DESC(clock, "Clock used for printk timestamps (local = default, boot, real, tai).");
> 

Ah, thanks! :)  I was wondering if there was a coding style preference there
when I wrote this up.

> 
>> +
>>  /* insert record into the buffer, discard old ones, update heads */
>>  static int log_store(int facility, int level,
>>  		     enum log_flags flags, u64 ts_nsec,
>> @@ -467,7 +544,7 @@ static int log_store(int facility, int level,
>>  	if (ts_nsec > 0)
>>  		msg->ts_nsec = ts_nsec;
>>  	else
>> -		msg->ts_nsec = local_clock();
>> +		msg->ts_nsec = printk_get_timestamp();
> 
> Hmm, one problem is that each clock returns another type of time.
> "local" and "boot" clocks returns the number of ns since the boot.
> While "real" and "tai" clocks returns the number of ns since 1.1.1970
> or so.
> 
> The tools reading the timestamps are confused by this. For example,
> I get this:
> 
> $> echo boot >/sys/module/printk/parameters/clock
> $> echo local >/sys/module/printk/parameters/clock
> $> echo real >/sys/module/printk/parameters/clock
> $> echo tai >/sys/module/printk/parameters/clock
> $> dmesg | tail -6
> [    6.976593] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
> [    6.977168] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> [   77.500483] printk: timestamp set to boot clock.
> [   81.419957] printk: timestamp set to local clock.
> [1452177961.544909] printk: timestamp set to real clock.
> [1452177965.224824] printk: timestamp set to tai clock.
> $> dmesg -T | tail -6
> [Thu Jan  7 15:44:41 2016] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
> [Thu Jan  7 15:44:41 2016] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> [Thu Jan  7 15:45:52 2016] printk: timestamp set to boot clock.
> [Thu Jan  7 15:45:56 2016] printk: timestamp set to local clock.
> [Fri Jan 13 06:30:36 2062] printk: timestamp set to real clock.
> [Fri Jan 13 06:30:40 2062] printk: timestamp set to tai clock.
> 
> Please, note that the last messages looks as they are printed in
> a far far future ;-)

Heh :)  I didn't even think of testing that.  I'll have to think about that a
bit more.  systemd logging must assume boot time/local time.  I'll ping those
guys with a patch if/when the next version of this gets accepted.

Thanks for catching this.

> 
> 
>>  	memset(log_dict(msg) + dict_len, 0, pad_len);
>>  	msg->len = size;
>>  
>> @@ -1039,18 +1116,30 @@ module_param_named(time, printk_time, bool, S_IRUGO | S_IWUSR);
>>  
>>  static size_t print_time(u64 ts, char *buf)
>>  {
>> -	unsigned long rem_nsec;
>> +	unsigned long rem;
>> +	size_t size;
>>  
>>  	if (!printk_time)
>>  		return 0;
>>  
>> -	rem_nsec = do_div(ts, 1000000000);
>> +	if (ts == KTIME_MAX_NS)
>> +		ts = last_time_stamp;
>>  
>> -	if (!buf)
>> -		return snprintf(NULL, 0, "[%5lu.000000] ", (unsigned long)ts);
>> +	if (buf) {
>> +		size = INT_MAX;
>> +		rem = do_div(ts, 1000000000) / 1000;
>> +	} else {
>> +		size = 0;
>> +		rem = 0;
>> +	}
>>  
>> -	return sprintf(buf, "[%5lu.%06lu] ",
>> -		       (unsigned long)ts, rem_nsec / 1000);
>> +	if (ts != KTIME_MAX_NS) {
>> +		return snprintf(buf, size, "[%5lu.%06lu] ",
>> +			       (unsigned long)ts, rem);
>> +	}
>> +	/* return last_time_stamp with ?? warning */
>> +	return snprintf(buf, size, "[%5lu.%04lu??] ",
>> +		       (unsigned long)ts, rem);
> 
> I am not sure if this is worth the effort. All other tools parsing the
> log will be confused by this. Also there are tools that read the ring
> buffer dirrectly, e.g. "crash" reading a crashdump. They will need to
> get updated to handle KTIME_MAX_NS correctly.

Yep -- the thread contains suggestions on how to avoid all of this by using the
NMI safe ktime_* calls (and implementing new ones).  I'm going to do that which
should nicely clean all of this up.

Thanks for the review Petr,

P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/