[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <568E867D.8060904@redhat.com>
Date: Thu, 07 Jan 2016 10:38:37 -0500
From: Prarit Bhargava <prarit@...hat.com>
To: Petr Mladek <pmladek@...e.com>
CC: linux-kernel@...r.kernel.org, John Stultz <john.stultz@...aro.org>,
Xunlei Pang <pang.xunlei@...aro.org>,
Thomas Gleixner <tglx@...utronix.de>,
Baolin Wang <baolin.wang@...aro.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Tejun Heo <tj@...nel.org>,
Peter Hurley <peter@...leysoftware.com>,
Vasily Averin <vvs@...tuozzo.com>,
Joe Perches <joe@...ches.com>
Subject: Re: [PATCH 2/2] printk, Add printk.clock kernel parameter
On 01/07/2016 09:57 AM, Petr Mladek wrote:
> On Wed 2016-01-06 08:00:34, Prarit Bhargava wrote:
>> This patch introduces printk.clock=[local|boot|real|tai] allowing a
>> user to specify an adjusted clock to use with printk timestamps. The
>> hardware clock, or the existing functionality, is preserved by default.
>>
>> If the timekeeper_lock (used to protect writes from reads) is under
>> contention, the last known good time is output again and "??" is used as
>> the last two digits of time. This avoids contention issues with a CPU
>> panicking while another CPU is writing time data.
>>
>> ---
>> include/linux/time64.h | 2 +
>> kernel/printk/printk.c | 105 ++++++++++++++++++++++++++++++++++++++++++++----
>> 2 files changed, 99 insertions(+), 8 deletions(-)
>>
>> diff --git a/include/linux/time64.h b/include/linux/time64.h
>> index 367d5af..f4a35d9 100644
>> --- a/include/linux/time64.h
>> +++ b/include/linux/time64.h
>> @@ -39,6 +39,8 @@ struct itimerspec64 {
>> #define TIME64_MAX ((s64)~((u64)1 << 63))
>> #define KTIME_MAX ((s64)~((u64)1 << 63))
>> #define KTIME_SEC_MAX (KTIME_MAX / NSEC_PER_SEC)
>> +/* Maximum value returned by ktime_to_ns() */
>> +#define KTIME_MAX_NS ((u64)((U64_MAX>>1)))
>
> Please, use KTIME_NS_MAX to use the same naming scheme as
> KTIME_SEC_MAX.
>
> Well, I would return 0 when the time is not known. See also below.
>
>> #if __BITS_PER_LONG == 64
>>
>> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
>> index 2ce8826..ffc05ee 100644
>> --- a/kernel/printk/printk.c
>> +++ b/kernel/printk/printk.c
>> @@ -419,6 +419,83 @@ static u32 truncate_msg(u16 *text_len, u16 *trunc_msg_len,
>> return msg_used_size(*text_len + *trunc_msg_len, 0, pad_len);
>> }
>>
>> +static u64 printk_try_real_ns(void)
>> +{
>> + return ktime_try_real_ns();
>> +}
>> +
>> +static u64 printk_try_boot_ns(void)
>> +{
>> + return ktime_try_boot_ns();
>> +}
>> +
>> +static u64 printk_try_tai_ns(void)
>> +{
>> + return ktime_try_tai_ns();
>> +}
>
> Do we need these wrappers? Why not using ktime_try_xxx_ns() directly?
> You already use local_clock().
Unfortunately ktime_* are inlined functions, and I've mimicked the currently
coding style there.
In any case I'm doing a v2 with different functions. I'll see if making this
stuff not-inlined is any better.
>
>
>> +static int printk_clock_param_set(const char *val,
>> + const struct kernel_param *kp)
>> +{
>> + char *printk_clock_new = strstrip((char *)val);
>> + int ret;
>> +
>> + /* check if change is needed */
>> + if (!strcmp(printk_clock_new, printk_clock))
>> + return 0;
>> +
>> + if (!strncmp("local", printk_clock_new, 5)) {
>> + ret = param_set_charp(printk_clock_new, kp);
>> + if (ret)
>> + return ret;
>> + printk_clock_fn = &local_clock;
>> + } else if (!strncmp("real", printk_clock_new, 4)) {
>> + ret = param_set_charp(printk_clock_new, kp);
>> + if (ret)
>> + return ret;
>> + printk_clock_fn = &printk_try_real_ns;
>> + } else if (!strncmp("boot", printk_clock_new, 4)) {
>> + ret = param_set_charp(printk_clock_new, kp);
>> + if (ret)
>> + return ret;
>> + printk_clock_fn = &printk_try_boot_ns;
>> + } else if (!strncmp("tai", printk_clock_new, 3)) {
>> + ret = param_set_charp(printk_clock_new, kp);
>> + if (ret)
>> + return ret;
>> + printk_clock_fn = &printk_try_tai_ns;
>> + }
>> +
>> + /* Did the clock change ? */
>> + if (strcmp(printk_clock_new, printk_clock))
>> + return -EINVAL;
>
> I wonder if we need this extra check. It might be enough to
> print the error message in the final "else" above.
I was trying to catch a local to local change or a real to real change.
But yeah ... I could sort of do that in the final else above.
>
>> + pr_info("printk: timestamp set to %s clock.\n", printk_clock);
>> + return 0;
>> +}
>> +
>> +static struct kernel_param_ops printk_clock_param_ops = {
>> + .set = printk_clock_param_set,
>> + .get = param_get_charp,
>> +};
>> +
>> +module_param_cb(clock, &printk_clock_param_ops, &printk_clock, 0644);
>> +MODULE_PARM_DESC(clock, "Clock used for printk timestamps. Can be local (hardware/default), boot, real, or tai.\n");
>
> I have problems to parse this. It seems that the most common style
> used for possible values is to putthem into brackets. I wonder
> if this is better readable.
>
> MODULE_PARM_DESC(clock, "Clock used for printk timestamps (local = default, boot, real, tai).");
>
Ah, thanks! :) I was wondering if there was a coding style preference there
when I wrote this up.
>
>> +
>> /* insert record into the buffer, discard old ones, update heads */
>> static int log_store(int facility, int level,
>> enum log_flags flags, u64 ts_nsec,
>> @@ -467,7 +544,7 @@ static int log_store(int facility, int level,
>> if (ts_nsec > 0)
>> msg->ts_nsec = ts_nsec;
>> else
>> - msg->ts_nsec = local_clock();
>> + msg->ts_nsec = printk_get_timestamp();
>
> Hmm, one problem is that each clock returns another type of time.
> "local" and "boot" clocks returns the number of ns since the boot.
> While "real" and "tai" clocks returns the number of ns since 1.1.1970
> or so.
>
> The tools reading the timestamps are confused by this. For example,
> I get this:
>
> $> echo boot >/sys/module/printk/parameters/clock
> $> echo local >/sys/module/printk/parameters/clock
> $> echo real >/sys/module/printk/parameters/clock
> $> echo tai >/sys/module/printk/parameters/clock
> $> dmesg | tail -6
> [ 6.976593] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
> [ 6.977168] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> [ 77.500483] printk: timestamp set to boot clock.
> [ 81.419957] printk: timestamp set to local clock.
> [1452177961.544909] printk: timestamp set to real clock.
> [1452177965.224824] printk: timestamp set to tai clock.
> $> dmesg -T | tail -6
> [Thu Jan 7 15:44:41 2016] e1000: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
> [Thu Jan 7 15:44:41 2016] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
> [Thu Jan 7 15:45:52 2016] printk: timestamp set to boot clock.
> [Thu Jan 7 15:45:56 2016] printk: timestamp set to local clock.
> [Fri Jan 13 06:30:36 2062] printk: timestamp set to real clock.
> [Fri Jan 13 06:30:40 2062] printk: timestamp set to tai clock.
>
> Please, note that the last messages looks as they are printed in
> a far far future ;-)
Heh :) I didn't even think of testing that. I'll have to think about that a
bit more. systemd logging must assume boot time/local time. I'll ping those
guys with a patch if/when the next version of this gets accepted.
Thanks for catching this.
>
>
>> memset(log_dict(msg) + dict_len, 0, pad_len);
>> msg->len = size;
>>
>> @@ -1039,18 +1116,30 @@ module_param_named(time, printk_time, bool, S_IRUGO | S_IWUSR);
>>
>> static size_t print_time(u64 ts, char *buf)
>> {
>> - unsigned long rem_nsec;
>> + unsigned long rem;
>> + size_t size;
>>
>> if (!printk_time)
>> return 0;
>>
>> - rem_nsec = do_div(ts, 1000000000);
>> + if (ts == KTIME_MAX_NS)
>> + ts = last_time_stamp;
>>
>> - if (!buf)
>> - return snprintf(NULL, 0, "[%5lu.000000] ", (unsigned long)ts);
>> + if (buf) {
>> + size = INT_MAX;
>> + rem = do_div(ts, 1000000000) / 1000;
>> + } else {
>> + size = 0;
>> + rem = 0;
>> + }
>>
>> - return sprintf(buf, "[%5lu.%06lu] ",
>> - (unsigned long)ts, rem_nsec / 1000);
>> + if (ts != KTIME_MAX_NS) {
>> + return snprintf(buf, size, "[%5lu.%06lu] ",
>> + (unsigned long)ts, rem);
>> + }
>> + /* return last_time_stamp with ?? warning */
>> + return snprintf(buf, size, "[%5lu.%04lu??] ",
>> + (unsigned long)ts, rem);
>
> I am not sure if this is worth the effort. All other tools parsing the
> log will be confused by this. Also there are tools that read the ring
> buffer dirrectly, e.g. "crash" reading a crashdump. They will need to
> get updated to handle KTIME_MAX_NS correctly.
Yep -- the thread contains suggestions on how to avoid all of this by using the
NMI safe ktime_* calls (and implementing new ones). I'm going to do that which
should nicely clean all of this up.
Thanks for the review Petr,
P.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists