lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <1429225433-11946-1-git-send-email-tj@kernel.org>
Date:	Thu, 16 Apr 2015 19:03:37 -0400
From:	Tejun Heo <tj@...nel.org>
To:	akpm@...ux-foundation.org, davem@...emloft.net
Cc:	linux-kernel@...r.kernel.org, netdev@...r.kernel.org
Subject: [PATCHSET] printk, netconsole: implement reliable netconsole

In a lot of configurations, netconsole is a useful way to collect
system logs; however, all netconsole does is simply emitting UDP
packets for the raw messages and there's no way for the receiver to
find out whether the packets were lost and/or reordered in flight.

printk already keeps log metadata which contains enough information to
make netconsole reliable.  This patchset does the followings.

* Make printk metadata available to console drivers.  A console driver
  can request this mode by setting CON_EXTENDED.  The metadata is
  emitted in the same format as /dev/kmsg.  This also makes all
  logging metadata including facility, loglevel and dictionary
  available to console receivers.

* Implement extended mode support in netconsole.  When enabled,
  netconsole transmits messages with extended header which is enough
  for the receiver to detect missing messages.

* Implement netconsole retransmission support.  Matching rx socket on
  the source port is automatically created for extended targets and
  the log receiver can request retransmission by sending reponse
  packets.  This is completely decoupled from the main write path and
  doesn't make netconsole less robust when things start go south.

* Implement netconsole ack support.  The response packet can
  optionally contain ack which enables emergency transmission timer.
  If acked sequence lags the current sequence for over 10s, netconsole
  repeatedly re-sends unacked messages with increasing interval.  This
  ensures that the receiver has the latest messages and also that all
  messages are transferred even while the kernel is failing as long as
  timer and netpoll are operational.  This too is completely decoupled
  from the main write path and doesn't make netconsole less robust.

* Implement the receiver library and simple receiver using it
  respectively in tools/lib/netconsole/libncrx.a and tools/ncrx/ncrx.
  In a simulated test with heavy packet loss (50%), ncrx logs all
  messages reliably and handle exceptional conditions including
  reboots as expected.

An obvious alternative for reliable loggin would be using a separate
TCP connection in addition to the UDP packets; however, I decided for
UDP based retransmission and ack mechanism for the following reasons.

* kernel side doesn't get simpler by using TCP.  It'd still need to
  transmit extended format messages, which BTW are useful regardless
  of reliable transmission, to match up UDP and TCP messages and
  detect missing ones from TCP send buffer filling up.  Also, the
  timeout and emergency transmission support would still be necessary
  to ensure that messages are transmitted in case of, e.g., network
  stack faiure.  It'd at least be about the same amount of code as the
  UDP based implementation.

* Receiver side might be a bit simpler but not by much.  It'd still
  need to keep track of the UDP based messages and then match them up
  with TCP messages and put messages from both sources in order (each
  stream may miss different ones) and would have to deal with
  reestablishing connections after reboots.  The only part which can
  completely go away would be the actual ack and retransmission part
  and that isn't a lot of logic.

* When the network condition is good, the only thing the UDP based
  implementation adds is occassional ack messages.  TCP based
  implementation would end up transmitting all messages twice which
  still isn't much but kinda silly given that using TCP doesn't lower
  the complexity in meaningful ways.

This patchset contains the following 16 patches.

 0001-printk-guard-the-amount-written-per-line-by-devkmsg_.patch
 0002-printk-factor-out-message-formatting-from-devkmsg_re.patch
 0003-printk-move-LOG_NOCONS-skipping-into-call_console_dr.patch
 0004-printk-implement-support-for-extended-console-driver.patch
 0005-printk-implement-log_seq_range-and-ext_log_from_seq.patch
 0006-netconsole-make-netconsole_target-enabled-a-bool.patch
 0007-netconsole-factor-out-alloc_netconsole_target.patch
 0008-netconsole-punt-disabling-to-workqueue-from-netdevic.patch
 0009-netconsole-replace-target_list_lock-with-console_loc.patch
 0010-netconsole-introduce-netconsole_mutex.patch
 0011-netconsole-consolidate-enable-disable-and-create-des.patch
 0012-netconsole-implement-extended-console-support.patch
 0013-netconsole-implement-retransmission-support-for-exte.patch
 0014-netconsole-implement-ack-handling-and-emergency-tran.patch
 0015-netconsole-implement-netconsole-receiver-library.patch
 0016-netconsole-update-documentation-for-extended-netcons.patch

0001-0005 implement extended console support in printk.

0006-0011 are prep patches for netconsole.

0012-0014 implement extended mode, retransmission and ack support.

0015 implements receiver library, libncrx, and a simple receiver using
the library, ncrx.

0016 updates documentation.

As the patchset touches both printk and netconsole, I'm not sure how
these patches should be routed once acked.  Either -mm or net should
work, I think.

This patchset is on top of linus#master[1] and available in the
following git branch.

 git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git review-netconsole-ext

diffstat follows.  Thanks.

 Documentation/networking/netconsole.txt |   95 +++
 drivers/net/netconsole.c                |  800 +++++++++++++++++++++++-----
 include/linux/console.h                 |    1 
 include/linux/printk.h                  |   16 
 kernel/printk/printk.c                  |  411 +++++++++++---
 tools/Makefile                          |   16 
 tools/lib/netconsole/Makefile           |   36 +
 tools/lib/netconsole/ncrx.c             |  906 ++++++++++++++++++++++++++++++++
 tools/lib/netconsole/ncrx.h             |  204 +++++++
 tools/ncrx/Makefile                     |   14 
 tools/ncrx/ncrx.c                       |  143 +++++
 11 files changed, 2419 insertions(+), 223 deletions(-)

--
tejun

[1] 497a5df7bf6f ("Merge tag 'stable/for-linus-4.1-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip")
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ