linux-kernel - Re: Re: [RFC PATCH 0/5] Add a hash value for each line in /dev/kmsg

Date:	Mon, 29 Jul 2013 20:54:51 +0900
From:	Hidehiro Kawai <hidehiro.kawai.ez@...achi.com>
To:	Kay Sievers <kay@...y.org>
Cc:	linux-kernel@...r.kernel.org, yrl.pp-manager.tt@...achi.com,
	akpm@...ux-foundation.org, gregkh@...uxfoundation.org,
	davem@...emloft.net, itoukzo@...data.co.jp
Subject: Re: Re: [RFC PATCH 0/5] Add a hash value for each line in /dev/kmsg

Hello,

(2013/07/26 21:43), Kay Sievers wrote:
> On Wed, Jul 3, 2013 at 3:46 AM, Hidehiro Kawai
> <hidehiro.kawai.ez@...achi.com> wrote:
>
>> This patch series adds hash values of printk format strings into
>> each line of /dev/kmsg outputs as follows:
>>
>>         6,154,325061,-,b7db707c@...nel/smp.c:554;Brought up 4 CPUs
>
> /dev/kmsg is to a certain degree a kernel ABI. Having source code
> locations in exported log records might cause people / userspace tools
> to rely on these strings and expect stability here. The kernel though
> cannot make any promises of its source code layout.

The only thing we have to keep stable as kABI is the
<hash>@<filename>:<lineno> format of the 5th field.  I regard the
contents of the 5th field, including the hash, as just a hint; there is
no guarantee that the hash is unique or that filename:lineno stays
unchanged.  Userspace tools can use the hash to identify a message
quickly, but if a hash collision occurs, userspace needs to fall back
to matching the message text in the traditional way.  Please note that
userspace tools can know which hashes collide from a catalog generated
at build time.
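
The lookup described above could be sketched in Python roughly as
follows.  This is only an illustration under stated assumptions, not
code from the patch series: the catalog layout, the hash values, the
file names and the helper names (parse_record, format_to_regex,
identify) are all hypothetical, and the format-string matching is
deliberately simplified (it does not handle "%%" or the kernel's
extended %p specifiers).

```python
import re

# Hypothetical catalog as a build might emit it:
# hash -> list of (location, printk format string).
# Two entries under one hash stand for a hash collision.
CATALOG = {
    "0a1b2c3d": [("drivers/foo.c:42", "foo: link is up at %d Mbps")],
    "deadbeef": [
        ("drivers/bar.c:10", "bar: reset done"),
        ("drivers/baz.c:77", "baz: timeout after %d ms"),
    ],
}


def parse_record(line):
    """Split one record into the 5th-field hint and the message text."""
    header, _, text = line.partition(";")
    fields = header.split(",")
    if len(fields) < 5:
        return None  # record carries no hint field
    hash_, _, location = fields[4].partition("@")
    # location is absent if <filename>:<lineno> output is suppressed
    return {"hash": hash_, "location": location or None, "text": text}


def format_to_regex(fmt):
    """Turn a format string into a regex (simplified conversion rules)."""
    literals = re.split(r"%[-+ #0-9.]*[hlzL]*[a-zA-Z]", fmt)
    return re.compile("^" + ".*?".join(re.escape(p) for p in literals) + "$")


def identify(line, catalog):
    """Return the (location, format) entry for a record, or None."""
    rec = parse_record(line)
    if rec is None:
        return None
    candidates = catalog.get(rec["hash"], [])
    if len(candidates) == 1:
        return candidates[0]  # fast path: the hash alone is enough
    # Collision (or unknown hash): fall back to matching the text.
    for entry in candidates:
        if format_to_regex(entry[1]).match(rec["text"]):
            return entry
    return None
```

The fast path trusts the hash only when the catalog lists a single
entry for it; in every other case the tool is back to ordinary text
matching, just over a much smaller candidate set.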

As for <filename>:<lineno>, it wouldn't be needed in most cases.
So I think I can introduce an option to suppress the output of
<filename>:<lineno> to reduce memory usage.

> The hash is supposed to identify the content of a message, but what if
> someone fixes the string? Maybe someone just fixes a one char typo,
> the hash will change and the message will not be recognizable any
> more.

A catalog file which includes the hash, location info, and message is
generated at build time.  By combining this information with the diff
between two kernel versions, userspace tools will be able to track
where messages moved and which messages changed.  The userspace tool
can then update the message DB it manages.  So I don't think it's a
hard problem.
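
As a rough sketch of such tracking, the Python below compares two
catalogs directly.  Everything in it is an assumption made for
illustration: the catalog is modelled as a dict from hash to
(location, format string), the hashes and paths are made up, and
text similarity from difflib stands in for the source diff between
kernel versions mentioned above.

```python
import difflib


def track_changes(old, new, threshold=0.8):
    """Compare two hypothetical catalogs: hash -> (location, format)."""
    # Same hash in both versions: the text is unchanged, but it may have moved.
    moved = {
        h: (old[h][0], new[h][0])
        for h in old.keys() & new.keys()
        if old[h][0] != new[h][0]
    }
    gone = {h: old[h] for h in old.keys() - new.keys()}
    fresh = {h: new[h] for h in new.keys() - old.keys()}

    # A hash that vanished may belong to a message whose text was edited
    # (e.g. a typo fix). Pair it with the most similar new message.
    changed = {}
    for old_hash, (_, old_fmt) in sorted(gone.items()):
        best, best_ratio = None, threshold
        for new_hash, (_, new_fmt) in sorted(fresh.items()):
            if new_hash in changed.values():
                continue  # already paired with another old message
            ratio = difflib.SequenceMatcher(None, old_fmt, new_fmt).ratio()
            if ratio >= best_ratio:
                best, best_ratio = new_hash, ratio
        if best is not None:
            changed[old_hash] = best

    removed = sorted(set(gone) - set(changed))
    added = sorted(set(fresh) - set(changed.values()))
    return {"moved": moved, "changed": changed,
            "removed": removed, "added": added}


# Made-up catalogs for two kernel versions.
OLD = {
    "aaaa0001": ("kernel/a.c:10", "Brought up %d CPUs"),
    "aaaa0002": ("drivers/x.c:5", "x: recieved bad packet"),
    "aaaa0003": ("drivers/y.c:9", "y: legacy mode enabled"),
}
NEW = {
    "aaaa0001": ("kernel/a.c:12", "Brought up %d CPUs"),
    "bbbb0002": ("drivers/x.c:5", "x: received bad packet"),
    "bbbb0004": ("fs/z.c:1", "z: mounted"),
}
RESULT = track_changes(OLD, NEW)
```

The "changed" map is what a tool would use to carry an existing
message DB entry over to the new hash; the 0.8 threshold is arbitrary
and would need tuning against real catalogs.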

> As much as "automated" hash creation sounds simple; I really think
> adding explicit "manually" created random message ids to the bunch of
> messages that are interesting is the better option long-term. It
> shouldn't be that many messages, most of the printk output is not
> really useful for automated inspection or to trigger specific actions.

Yes, as far as the use case goes, that may be true.  But it has some
drawbacks.  Please also see my reply to Joe Perches in another thread
(I re-sent the patches on July 25th).  Also, I heard about the
discussion at the kernel summit 2 years ago.  According to the LWN
article, it seems that Linus objected to your approach (i.e. adding
random bits as message IDs).  Was any agreement reached on this issue
at the kernel summit?

Regards,
-- 
Hidehiro Kawai
Hitachi, Yokohama Research Laboratory
Linux Technology Center
