Message-ID: <4oybtunobxtemenpg2lg7jv4cyl3xoaxrjlqivbhs6zo72hxpu@fqp6estf5mpc>
Date: Tue, 2 Dec 2025 02:18:44 -0800
From: Breno Leitao <leitao@...ian.org>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Andrew Lunn <andrew+netdev@...n.ch>, 
	"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, 
	Paolo Abeni <pabeni@...hat.com>, Shuah Khan <shuah@...nel.org>, Simon Horman <horms@...nel.org>, 
	Jonathan Corbet <corbet@....net>, netdev@...r.kernel.org, linux-kernel@...r.kernel.org, 
	linux-kselftest@...r.kernel.org, linux-doc@...r.kernel.org, gustavold@...il.com, 
	asantostc@...il.com, calvin@...nvd.org, kernel-team@...a.com, 
	Petr Mladek <pmladek@...e.com>
Subject: Re: [PATCH net-next 0/4] (no cover subject)

Hello Jakub,

On Mon, Dec 01, 2025 at 04:36:22PM -0800, Jakub Kicinski wrote:
> On Fri, 28 Nov 2025 06:20:45 -0800 Breno Leitao wrote:
> > This patch series introduces a new configfs attribute that enables sending
> > messages directly through netconsole without going through the kernel's logging
> > infrastructure.
> > 
> > This feature allows users to send custom messages, alerts, or status updates
> > directly to netconsole receivers by writing to
> > /sys/kernel/config/netconsole/<target>/send_msg, without polluting kernel
> > buffers, and sending msgs to the serial, which could be slow.
> > 
> > At Meta this is currently used in two cases (through printk for
> > now):
> > 
> >   a) When a new workload enters or leaves the machine.
> >   b) From time to time, as a "ping" to make sure the netconsole/machine
> >   is alive.
> > 
> > The implementation reuses the existing message transmission functions
> > (send_msg_udp() and send_ext_msg_udp()) to handle both basic and extended
> > message formats.
> > 
> > Regarding code organization, this version uses forward declarations for
> > send_msg_udp() and send_ext_msg_udp() functions rather than relocating them
> > within the file. While forward declarations do add a small amount of
> > redundancy, they avoid the larger churn that would result from moving entire
> > function definitions.
> 
> The two questions we need to address here are:
>  - why is the message important in the off-host message stream but not
>    important in local dmesg stream. You mention "serial, which could be
>    slow" - we need more details here.

Thanks for the questions. Let me share my view of the world. The way I
see and use netconsole at my company (Meta) is as a "kernel message"
stream on steroids: it provides more information about the system than
what is available in the kernel log buffers (dmesg).

These netconsole messages already carry extra data that adds context
to each message, such as:

 * scheduler configuration (for sched_ext context)
 * THP memory configuration
 * Job/workload running
 * CPU id
 * task->curr name
 * etc
 
So netconsole already sends extra information today that is not visible
on the kernel console (dmesg). This has proved so useful that 16 entries
are not enough, and Gustavo needs to switch to dynamic allocation
instead of limiting it to 16.

On top of that, printk() has a similar mechanism where extra data is not
printed to the console. printk buffers have a dictionary of structured
data attached to the message that is not printed to the screen but is
sent through netconsole.
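
To make the "dictionary" point concrete, here is a minimal sketch of how
a receiver might split one extended-format record into header, text and
key/value pairs. It assumes the /dev/kmsg-style layout
("prio,seq,timestamp,flags;text" followed by " KEY=value" continuation
lines). The function name and the sample record are made up for
illustration, and fragmentation and escaping are not handled.

```python
def parse_extended_record(record: str):
    """Split a /dev/kmsg-style record into header info, text and dictionary."""
    lines = record.rstrip("\n").split("\n")
    header, _, text = lines[0].partition(";")
    fields = header.split(",")
    prio = int(fields[0])
    info = {
        "level": prio & 7,       # low 3 bits: log level
        "facility": prio >> 3,   # remaining bits: syslog facility
        "seq": int(fields[1]),
        "timestamp_us": int(fields[2]),
        "flags": fields[3],
    }
    # Continuation lines start with one space and carry KEY=value pairs.
    extra = {}
    for line in lines[1:]:
        if line.startswith(" "):
            key, _, value = line[1:].partition("=")
            extra[key] = value
    return info, text, extra


# Illustrative record only, not captured from a real machine.
sample = (
    "6,416,1758426,-;host bridge window ignored\n"
    " SUBSYSTEM=acpi\n"
    " DEVICE=+acpi:PNP0A03:00\n"
)
info, text, extra = parse_extended_record(sample)
```

Here `text` is what a plain console would show, while `extra` holds the
part that only reaches consumers of the extended format.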

This feature (in this patchset) goes just one step further: it gives
netconsole a bit more power, so that extra information can be sent
beyond what is in dmesg.
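
As a rough sketch of the difference from the caller's side: today the
message is written to /dev/kmsg and goes through printk, while with this
series it would be written to the target's send_msg attribute. The
helper names below are hypothetical, send_msg exists only with this
series applied, and "cmdline0" is just an example target name.

```python
import os


def target_attr(target: str, attr: str) -> str:
    """Build the configfs path of a netconsole target attribute."""
    return f"/sys/kernel/config/netconsole/{target}/{attr}"


def send_line(path: str, message: str) -> int:
    """Write one message with a single write() call; return bytes written."""
    fd = os.open(path, os.O_WRONLY)
    try:
        return os.write(fd, message.encode() + b"\n")
    finally:
        os.close(fd)
```

Usage, as root: send_line("/dev/kmsg", "ping") is the current approach,
and send_line(target_attr("cmdline0", "send_msg"), "ping") would be the
proposed one.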

>  - why do we need the kernel API, netcons is just a UDP message, which
>    is easy enough to send from user space. A little bit more detail
>    about the advantages would be good to have.

The primary advantage is leveraging the existing configured netconsole
infrastructure. At Meta, for example, we have a "continuous ping"
mechanism configured by our Configuration Management software that
simply runs 'echo "ping" > /dev/kmsg'.

A userspace solution would require deploying a binary to millions of
machines, parsing /sys/kernel/configfs/netconsole/cmdline0/configs, and
sending packets directly.

While certainly feasible, it's less convenient than using the
existing infrastructure (though I may just be looking for the easier
path here).
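
For reference, the userspace alternative could look roughly like the
sketch below. It assumes the target directory exposes the usual
remote_ip and remote_port attributes, and the function names are
hypothetical. It does not reproduce the source port, the outgoing
interface, the extended header, or the userdata that netconsole itself
would add.

```python
import socket
from pathlib import Path


def read_attr(target_dir: str, name: str) -> str:
    """Read one attribute file from a netconsole target directory."""
    return (Path(target_dir) / name).read_text().strip()


def send_from_userspace(target_dir: str, message: str) -> int:
    """Send message as one UDP datagram to the target's configured receiver."""
    ip = read_attr(target_dir, "remote_ip")
    port = int(read_attr(target_dir, "remote_port"))
    # Pick the address family from the textual form of the address.
    family = socket.AF_INET6 if ":" in ip else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        return sock.sendto(message.encode(), (ip, port))
```

Something like send_from_userspace("/sys/kernel/config/netconsole/cmdline0",
"ping\n") would then stand in for the echo above.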

> The 2nd point is trivial, the first one is what really gives me pause.
> Why do we not care about the logs on host? If the serial is very slow
> presumably it impacts a lot of things, certainly boot speed, so...

This is spot-on - slow serial definitely impacts things like boot speed.

See my constant complaints about slow boot here:

	https://lore.kernel.org/all/aGVn%2FSnOvwWewkOW@gmail.com/

And something similar in the reboot/kexec path:

	https://lore.kernel.org/all/sqwajvt7utnt463tzxgwu2yctyn5m6bjwrslsnupfexeml6hkd@v6sqmpbu3vvu/

> perhaps it should be configured to only log messages at a high level?

Chris is actually working on per-console log levels to solve exactly
this problem, so we could filter serial console messages while keeping
everything in other consoles (aka netconsole):

	https://lore.kernel.org/all/cover.1764272407.git.chris@chrisdown.name/

That work has been in progress for years though, and I'm not sure
when/if it'll land upstream. But if it does, we'd be able to have
different log levels per console and then use your suggested approach.

Thanks for the review, and feel free to yell at me if I am missing the
point,
--breno
