Message-ID: <20240930201358.2638665-1-aahringo@redhat.com>
Date: Mon, 30 Sep 2024 16:13:46 -0400
From: Alexander Aring <aahringo@...hat.com>
To: teigland@...hat.com
Cc: gfs2@...ts.linux.dev,
	song@...nel.org,
	yukuai3@...wei.com,
	agruenba@...hat.com,
	mark@...heh.com,
	jlbec@...lplan.org,
	joseph.qi@...ux.alibaba.com,
	gregkh@...uxfoundation.org,
	rafael@...nel.org,
	akpm@...ux-foundation.org,
	linux-kernel@...r.kernel.org,
	linux-raid@...r.kernel.org,
	ocfs2-devel@...ts.linux.dev,
	netdev@...r.kernel.org,
	vvidic@...entin-vidic.from.hr,
	heming.zhao@...e.com,
	lucien.xin@...il.com,
	donald.hunter@...il.com,
	aahringo@...hat.com
Subject: [PATCHv2 dlm/next 00/12] dlm: net-namespace functionality

Hi,

this patch series is huge, but it brings a lot of basic "fun"
net-namespace functionality to DLM. Currently you need a couple of Linux
kernel instances running, e.g. in virtual machines. With this patch
series I want to break out of this virtual machine world of dealing with
multiple kernels, booting them all individually, etc. Now you can use
DLM in a single Linux kernel instance, and each "node" (previously
represented by a virtual machine) is separated by a net-namespace. Why
net-namespaces? They just fit the DLM design for now; you need them
anyway because the internal DLM socket handling works on a per-node
basis. What we do additionally is separate the DLM lockspaces (the
lockspace that is being registered) by net-namespace, as a net-namespace
represents a "network entity" (node). There might be reasons to
introduce a completely new kind of namespace (locking namespace?), but I
don't want to take this step now and, as I said, net-namespaces are
required anyway for the DLM sockets.
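
As a rough illustration of the "one net-namespace per node" layout
described above, the following sketch prints the iproute2 commands that
would create the namespaces and attach them to a bridge. It is not taken
from the series or its tooling: the namespace names (node1..nodeN), the
veth names and the 192.0.2.0/24 addresses are assumptions made for the
example, and only the bridge name "dlmsw" comes from this mail.

```shell
#!/bin/sh
# Sketch only: one net-namespace per DLM "node", joined by a bridge.
# node<N>, veth-node<N> and 192.0.2.<N> are illustrative assumptions.
# Prints the commands by default; set DRY_RUN=0 and run as root to apply.

DRY_RUN=${DRY_RUN:-1}

run() {
	if [ "$DRY_RUN" = 1 ]; then
		echo "$*"
	else
		"$@"
	fi
}

setup_nodes() {
	count=$1
	run ip link add dlmsw type bridge
	run ip link set dlmsw up
	i=1
	while [ "$i" -le "$count" ]; do
		run ip netns add "node$i"
		# veth pair: one end stays on the bridge, the peer lives in the namespace
		run ip link add "veth-node$i" type veth peer name eth0 netns "node$i"
		run ip link set "veth-node$i" master dlmsw
		run ip link set "veth-node$i" up
		run ip -n "node$i" addr add "192.0.2.$i/24" dev eth0
		run ip -n "node$i" link set eth0 up
		i=$((i + 1))
	done
}

setup_nodes "${1:-3}"
```

Run without arguments it only prints the commands for three nodes, so
the layout can be reviewed before anything is changed on the system.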

You need some new user space tooling, as a new net-namespace-aware
netlink UAPI is introduced (it can co-exist with configfs, which
operates on init_net only). See [0] for more steps. There is a copr repo
for the new tooling, which can be enabled by:

$ dnf copr enable aring/nldlm
$ dnf install nldlm

or compile it yourself.

Then there is currently a very simple script [1] that sets up a 3-node
cluster using gfs2 on multiple loop block devices on a shared loop block
device image (sounds weird, but I do something like that). There are
currently some user space synchronization issues that I work around with
simple sleeps, but they are only user space problems.

To test it I recommend using a virtual machine, "but only one", and
running the [1] script. Afterwards you have, in the net-namespace you
executed it in, the 3 mountpoints /cluster/node1, /cluster/node2 and
/cluster/node3. Any vfs operation on those mountpoints acts as a
per-node operation.
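
A quick way to see the per-node behaviour is to write a file through one
mountpoint and read it back through the others. The sketch below assumes
the three mountpoints from the paragraph above are already set up; the
helper name check_shared and the file name dlm-netns-test are made up
for the example.

```shell
#!/bin/sh
# Sketch: write through node1's mountpoint, read back via node2 and node3.
# check_shared and dlm-netns-test are hypothetical names for this example.

check_shared() {
	root=${1:-/cluster}
	token="hello from node1 $$"
	echo "$token" > "$root/node1/dlm-netns-test" || return 1
	for n in node2 node3; do
		seen=$(cat "$root/$n/dlm-netns-test" 2>/dev/null)
		if [ "$seen" = "$token" ]; then
			echo "$n: ok"
		else
			echo "$n: MISMATCH"
			return 1
		fi
	done
}

# Only touch /cluster if the mountpoints actually exist.
if [ -d /cluster/node1 ]; then
	check_shared /cluster
fi
```

If all three mounts really share one filesystem, each of the other nodes
should report "ok".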

We can use it for testing, development and also scale testing with a
large number of nodes joining a lockspace (which seems to be a problem
right now). Instead of running 1000 VMs, we can run 1000 net-namespaces
in a more resource-limited environment. To me it seems gfs2 can handle
several mounts and still separate the resources held in its global
variables. Its data structures, e.g. the glock hash, seem to have a
separation for that in their key (fsid?). However, this is still an
experimental feature and we might run into issues that require more
separation related to net-namespaces. Basic testing seems to run just
fine, though.

Limitations

I disabled all functionality of the DLM character device that allows
plock handling or DLM locking from user space. Just don't use any plock
locking in gfs2 for now. Basic vfs operations should work, though. You
can even sniff DLM traffic on the created "dlmsw" virtual bridge.
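
Sniffing on the bridge could look like the sketch below. It assumes
tcpdump is installed, that "dlmsw" is visible from where the command
runs, and that DLM listens on its usual default port 21064; pass another
port if the cluster is configured differently. The helper name
capture_dlm and the output file dlm.pcap are made up for the example.

```shell
#!/bin/sh
# Sketch: capture DLM traffic on the bridge into a pcap file.
# Assumptions: tcpdump is available and DLM uses its default port 21064.
# Prints the command by default; set DRY_RUN=0 and run as root to capture.

DRY_RUN=${DRY_RUN:-1}

capture_dlm() {
	iface=${1:-dlmsw}
	port=${2:-21064}
	pcap=${3:-dlm.pcap}
	set -- tcpdump -i "$iface" -w "$pcap" port "$port"
	if [ "$DRY_RUN" = 1 ]; then
		echo "$*"
	else
		"$@"
	fi
}

capture_dlm "$@"
```

The resulting file can then be opened in any pcap viewer.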

- Alex

[0] https://gitlab.com/netcoder/nldlm
[1] https://gitlab.com/netcoder/gfs2ns-examples/-/blob/main/three_nodes

changes since v1:
 - move to ynl and introduce and use netlink yaml spec
 - put the nldlm.h DLM netlink header under UAPI directory
 - fix build issues building with CONFIG_NET disabled
 - fix possible NULL pointer dereference if lookup of lockspace failed

Alexander Aring (12):
  dlm: introduce dlm_find_lockspace_name()
  dlm: disallow different configs nodeid storages
  dlm: add struct net to dlm_new_lockspace()
  dlm: handle port as __be16 network byte order
  dlm: use dlm_config as only cluster configuration
  dlm: dlm_config_info config fields to unsigned int
  dlm: rename config to configfs
  kobject: add kset_type_create_and_add() helper
  kobject: export generic helper ops
  dlm: separate dlm lockspaces per net-namespace
  dlm: add nldlm net-namespace aware UAPI
  gfs2: separate mount context by net-namespaces

 Documentation/netlink/specs/nldlm.yaml |  438 ++++++++
 drivers/md/md-cluster.c                |    3 +-
 fs/dlm/Makefile                        |    3 +
 fs/dlm/config.c                        | 1291 +++++++++--------------
 fs/dlm/config.h                        |  215 +++-
 fs/dlm/configfs.c                      |  882 ++++++++++++++++
 fs/dlm/configfs.h                      |   19 +
 fs/dlm/debug_fs.c                      |   24 +-
 fs/dlm/dir.c                           |    4 +-
 fs/dlm/dlm_internal.h                  |   24 +-
 fs/dlm/lock.c                          |   64 +-
 fs/dlm/lock.h                          |    3 +-
 fs/dlm/lockspace.c                     |  220 ++--
 fs/dlm/lockspace.h                     |   12 +-
 fs/dlm/lowcomms.c                      |  525 +++++-----
 fs/dlm/lowcomms.h                      |   29 +-
 fs/dlm/main.c                          |    5 -
 fs/dlm/member.c                        |   36 +-
 fs/dlm/midcomms.c                      |  287 ++---
 fs/dlm/midcomms.h                      |   31 +-
 fs/dlm/netlink2.c                      | 1330 ++++++++++++++++++++++++
 fs/dlm/nldlm-kernel.c                  |  290 ++++++
 fs/dlm/nldlm-kernel.h                  |   50 +
 fs/dlm/nldlm.c                         |  847 +++++++++++++++
 fs/dlm/plock.c                         |    2 +-
 fs/dlm/rcom.c                          |   16 +-
 fs/dlm/rcom.h                          |    3 +-
 fs/dlm/recover.c                       |   17 +-
 fs/dlm/user.c                          |   63 +-
 fs/dlm/user.h                          |    2 +-
 fs/gfs2/glock.c                        |    8 +
 fs/gfs2/incore.h                       |    2 +
 fs/gfs2/lock_dlm.c                     |    6 +-
 fs/gfs2/ops_fstype.c                   |    5 +
 fs/gfs2/sys.c                          |   35 +-
 fs/ocfs2/stack_user.c                  |    2 +-
 include/linux/dlm.h                    |    9 +-
 include/linux/kobject.h                |   10 +-
 include/uapi/linux/nldlm.h             |  153 +++
 lib/kobject.c                          |   65 +-
 40 files changed, 5566 insertions(+), 1464 deletions(-)
 create mode 100644 Documentation/netlink/specs/nldlm.yaml
 create mode 100644 fs/dlm/configfs.c
 create mode 100644 fs/dlm/configfs.h
 create mode 100644 fs/dlm/netlink2.c
 create mode 100644 fs/dlm/nldlm-kernel.c
 create mode 100644 fs/dlm/nldlm-kernel.h
 create mode 100644 fs/dlm/nldlm.c
 create mode 100644 include/uapi/linux/nldlm.h

-- 
2.43.0

