lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b6857aad-4cfc-4961-df54-6e658fca7f75@suse.cz>
Date:   Wed, 3 May 2023 11:28:36 +0200
From:   Vlastimil Babka <vbabka@...e.cz>
To:     Dave Chinner <david@...morbit.com>,
        James Bottomley <James.Bottomley@...senpartnership.com>
Cc:     Kent Overstreet <kent.overstreet@...ux.dev>,
        Suren Baghdasaryan <surenb@...gle.com>,
        akpm@...ux-foundation.org, mhocko@...e.com, hannes@...xchg.org,
        roman.gushchin@...ux.dev, mgorman@...e.de, willy@...radead.org,
        liam.howlett@...cle.com, corbet@....net, void@...ifault.com,
        peterz@...radead.org, juri.lelli@...hat.com, ldufour@...ux.ibm.com,
        catalin.marinas@....com, will@...nel.org, arnd@...db.de,
        tglx@...utronix.de, mingo@...hat.com, dave.hansen@...ux.intel.com,
        x86@...nel.org, peterx@...hat.com, david@...hat.com,
        axboe@...nel.dk, mcgrof@...nel.org, masahiroy@...nel.org,
        nathan@...nel.org, dennis@...nel.org, tj@...nel.org,
        muchun.song@...ux.dev, rppt@...nel.org, paulmck@...nel.org,
        pasha.tatashin@...een.com, yosryahmed@...gle.com,
        yuzhao@...gle.com, dhowells@...hat.com, hughd@...gle.com,
        andreyknvl@...il.com, keescook@...omium.org,
        ndesaulniers@...gle.com, gregkh@...uxfoundation.org,
        ebiggers@...gle.com, ytcoode@...il.com, vincent.guittot@...aro.org,
        dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
        bristot@...hat.com, vschneid@...hat.com, cl@...ux.com,
        penberg@...nel.org, iamjoonsoo.kim@....com, 42.hyeyoo@...il.com,
        glider@...gle.com, elver@...gle.com, dvyukov@...gle.com,
        shakeelb@...gle.com, songmuchun@...edance.com, jbaron@...mai.com,
        rientjes@...gle.com, minchan@...gle.com, kaleshsingh@...gle.com,
        kernel-team@...roid.com, linux-doc@...r.kernel.org,
        linux-kernel@...r.kernel.org, iommu@...ts.linux.dev,
        linux-arch@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        linux-mm@...ck.org, linux-modules@...r.kernel.org,
        kasan-dev@...glegroups.com, cgroups@...r.kernel.org,
        Andy Shevchenko <andy@...nel.org>,
        Michael Ellerman <mpe@...erman.id.au>,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Paul Mackerras <paulus@...ba.org>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        Jason Wang <jasowang@...hat.com>,
        Noralf Tr�nnes <noralf@...nnes.org>,
        Andy Shevchenko <andy.shevchenko@...il.com>
Subject: Re: [PATCH 01/40] lib/string_helpers: Drop space in string_get_size's
 output

On 5/3/23 00:50, Dave Chinner wrote:
> On Tue, May 02, 2023 at 07:42:59AM -0400, James Bottomley wrote:
>> On Mon, 2023-05-01 at 23:17 -0400, Kent Overstreet wrote:
>> > On Mon, May 01, 2023 at 10:22:18PM -0400, James Bottomley wrote:
>> > > It is not used just for debug.  It's used all over the kernel for
>> > > printing out device sizes.  The output mostly goes to the kernel
>> > > print buffer, so it's anyone's guess as to what, if any, tools are
>> > > parsing it, but the concern about breaking log parsers seems to be
>> > > a valid one.
>> > 
>> > Ok, there is sd_print_capacity() - but who in their right mind would
>> > be trying to scrape device sizes, in human readable units,
>> 
>> If you bother to google "kernel log parser", you'll discover it's quite
>> an active area which supports a load of company business models.
> 
> That doesn't mean log messages are unchangable ABI. Indeed, we had
> the whole "printk_index_emit()" addition recently to create
> an external index of printk message formats for such applications to
> use. [*]
> 
>> >  from log messages when it's available in sysfs/procfs (actually, is
>> > it in sysfs? if not, that's an oversight) in more reasonable units?
>> 
>> It's not in sysfs, no.  As aren't a lot of things, which is why log
>> parsing for system monitoring is big business.
> 
> And that big business is why printk_index_emit() exists to allow
> them to easily determine how log messages change format and come and
> go across different kernel versions.
> 
>> > Correct me if I'm wrong, but I've yet to hear about kernel log
>> > messages being consider a stable interface, and this seems a bit out
>> > there.
>> 
>> It might not be listed as stable, but when it's known there's a large
>> ecosystem out there consuming it we shouldn't break it just because you
>> feel like it.
> 
> But we've solved this problem already, yes?
> 
> If the userspace applications are not using the kernel printk format
> index to detect such changes between kernel version, then they
> should be. This makes trivial issues like whether we have a space or
> not between units is completely irrelevant because the entry in the
> printk format index for the log output we emit will match whatever
> is output by the kernel....

If I understand that correctly from the commit changelog, this would have
indeed helped, but if the change was reflected in format string. But with
string_get_size() it's always an %s and the change of the helper's or a
switch to another variant of the helper that would omit the space, wouldn't
be reflected in the format string at all? I guess that would be an argument
for Andy's suggestion for adding a new %pt / %pT which would then be
reflected in the format string. And also more concise to use than using the
helper, fwiw.

> Cheers,
> 
> Dave.
> 
> [*]
> commit 337015573718b161891a3473d25f59273f2e626b
> Author: Chris Down <chris@...isdown.name>
> Date:   Tue Jun 15 17:52:53 2021 +0100
> 
>     printk: Userspace format indexing support
>     
>     We have a number of systems industry-wide that have a subset of their
>     functionality that works as follows:
>     
>     1. Receive a message from local kmsg, serial console, or netconsole;
>     2. Apply a set of rules to classify the message;
>     3. Do something based on this classification (like scheduling a
>        remediation for the machine), rinse, and repeat.
>     
>     As a couple of examples of places we have this implemented just inside
>     Facebook, although this isn't a Facebook-specific problem, we have this
>     inside our netconsole processing (for alarm classification), and as part
>     of our machine health checking. We use these messages to determine
>     fairly important metrics around production health, and it's important
>     that we get them right.
>     
>     While for some kinds of issues we have counters, tracepoints, or metrics
>     with a stable interface which can reliably indicate the issue, in order
>     to react to production issues quickly we need to work with the interface
>     which most kernel developers naturally use when developing: printk.
>     
>     Most production issues come from unexpected phenomena, and as such
>     usually the code in question doesn't have easily usable tracepoints or
>     other counters available for the specific problem being mitigated. We
>     have a number of lines of monitoring defence against problems in
>     production (host metrics, process metrics, service metrics, etc), and
>     where it's not feasible to reliably monitor at another level, this kind
>     of pragmatic netconsole monitoring is essential.
>     
>     As one would expect, monitoring using printk is rather brittle for a
>     number of reasons -- most notably that the message might disappear
>     entirely in a new version of the kernel, or that the message may change
>     in some way that the regex or other classification methods start to
>     silently fail.
>     
>     One factor that makes this even harder is that, under normal operation,
>     many of these messages are never expected to be hit. For example, there
>     may be a rare hardware bug which one wants to detect if it was to ever
>     happen again, but its recurrence is not likely or anticipated. This
>     precludes using something like checking whether the printk in question
>     was printed somewhere fleetwide recently to determine whether the
>     message in question is still present or not, since we don't anticipate
>     that it should be printed anywhere, but still need to monitor for its
>     future presence in the long-term.
>     
>     This class of issue has happened on a number of occasions, causing
>     unhealthy machines with hardware issues to remain in production for
>     longer than ideal. As a recent example, some monitoring around
>     blk_update_request fell out of date and caused semi-broken machines to
>     remain in production for longer than would be desirable.
>     
>     Searching through the codebase to find the message is also extremely
>     fragile, because many of the messages are further constructed beyond
>     their callsite (eg. btrfs_printk and other module-specific wrappers,
>     each with their own functionality). Even if they aren't, guessing the
>     format and formulation of the underlying message based on the aesthetics
>     of the message emitted is not a recipe for success at scale, and our
>     previous issues with fleetwide machine health checking demonstrate as
>     much.
>     
>     This provides a solution to the issue of silently changed or deleted
>     printks: we record pointers to all printk format strings known at
>     compile time into a new .printk_index section, both in vmlinux and
>     modules. At runtime, this can then be iterated by looking at
>     <debugfs>/printk/index/<module>, which emits the following format, both
>     readable by humans and able to be parsed by machines:
>     
>         $ head -1 vmlinux; shuf -n 5 vmlinux
>         # <level[,flags]> filename:line function "format"
>         <5> block/blk-settings.c:661 disk_stack_limits "%s: Warning: Device %s is misaligned\n"
>         <4> kernel/trace/trace.c:8296 trace_create_file "Could not create tracefs '%s' entry\n"
>         <6> arch/x86/kernel/hpet.c:144 _hpet_print_config "hpet: %s(%d):\n"
>         <6> init/do_mounts.c:605 prepare_namespace "Waiting for root device %s...\n"
>         <6> drivers/acpi/osl.c:1410 acpi_no_auto_serialize_setup "ACPI: auto-serialization disabled\n"
>     
>     This mitigates the majority of cases where we have a highly-specific
>     printk which we want to match on, as we can now enumerate and check
>     whether the format changed or the printk callsite disappeared entirely
>     in userspace. This allows us to catch changes to printks we monitor
>     earlier and decide what to do about it before it becomes problematic.
>     
>     There is no additional runtime cost for printk callers or printk itself,
>     and the assembly generated is exactly the same.
>     
>     Signed-off-by: Chris Down <chris@...isdown.name>
>     Cc: Petr Mladek <pmladek@...e.com>
>     Cc: Jessica Yu <jeyu@...nel.org>
>     Cc: Sergey Senozhatsky <sergey.senozhatsky@...il.com>
>     Cc: John Ogness <john.ogness@...utronix.de>
>     Cc: Steven Rostedt <rostedt@...dmis.org>
>     Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
>     Cc: Johannes Weiner <hannes@...xchg.org>
>     Cc: Kees Cook <keescook@...omium.org>
>     Reviewed-by: Petr Mladek <pmladek@...e.com>
>     Tested-by: Petr Mladek <pmladek@...e.com>
>     Reported-by: kernel test robot <lkp@...el.com>
>     Acked-by: Andy Shevchenko <andy.shevchenko@...il.com>
>     Acked-by: Jessica Yu <jeyu@...nel.org> # for module.{c,h}
>     Signed-off-by: Petr Mladek <pmladek@...e.com>
>     Link: https://lore.kernel.org/r/e42070983637ac5e384f17fbdbe86d19c7b212a5.1623775748.git.chris@chrisdown.name
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ