lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20250618031522.3859138-2-konrad.wilk@oracle.com>
Date: Tue, 17 Jun 2025 23:13:09 -0400
From: Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
To: allison.henderson@...cle.com, netdev@...r.kernel.org,
        linux-rdma@...r.kernel.org, rds-devel@....oracle.com, tj@...nel.org,
        andrew@...n.ch
Cc: hannes@...xchg.org, mkoutny@...e.com, cgroups@...r.kernel.org,
        Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
Subject: [PATCH net v3] rds: Expose feature parameters via sysfs (and ELF)

We would like to have a programmatic way for applications
to query which of the features defined in include/uapi/linux/rds.h
are actually implemented by the kernel.

The problem is that applications can be built against newer
kernel (or older) and they may have the feature implemented or not.

The lack of a certain feature would signify that the kernel
does not support it. The presence of it signifies the existence
of it.

This would provide the application to query the sysfs and figure
out what is supported on a running system.

This patch would expose this extra sysfs file:

/sys/kernel/rds/features/ioctl_get_tos
/sys/kernel/rds/features/ioctl_set_tos
/sys/kernel/rds/features/socket_cancel_sent_to
/sys/kernel/rds/features/socket_cong_monitor
/sys/kernel/rds/features/socket_free_mr
/sys/kernel/rds/features/socket_get_mr
/sys/kernel/rds/features/socket_get_mr_for_dest
/sys/kernel/rds/features/socket_recverr
/sys/kernel/rds/features/socket_so_rxpath_latency
/sys/kernel/rds/features/socket_so_transport

With the value of 'supported' in them. In the future this value
could change to say 'deprecated' or have other values (for example
different versions) or can be runtime changed.

The choice to use sysfs and this particular way is modeled on the
filesystems usage exposing their features.

Alternative solution such as exposing one file ('features') with
each feature enumerated (which cgroup does) is a bit limited in
that it does not provide means to provide extra content in the future
for each feature. For example if one of the features had three
modes and one wanted to set a particular one at runtime - that
does not exist in cgroup (albeit it can be implemented but it would
be quite hectic to have just one single attribute).

Another solution of using an ioctl to expose a bitmask has the
disadvantage of being less flexible in the future and while it can
have a bit of supported/unsupported, it is not clear how one would
change modes or expose versions. It is most certainly feasible
but it can get seriously complex fast.

As such this mechanism offers the basic support we require
now and offers the flexibility for the future.

Lastly, we also utilize the ELF note macro to expose these via
so that applications that have not yet initialized RDS transport
can inspect the kernel module to see if they have the appropiate
support and choose an alternative protocol if they wish so.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>
---
v3: Add the missing documentation
    Remove the CONFIG_SYSFS #ifdef machinations
    Redo it with each feature as a seperate file in 'features'
    directory.
---
 Documentation/ABI/stable/sysfs-transport-rds | 55 +++++++++++++++++
 net/rds/af_rds.c                             | 62 +++++++++++++++++++-
 2 files changed, 116 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/ABI/stable/sysfs-transport-rds

diff --git a/Documentation/ABI/stable/sysfs-transport-rds b/Documentation/ABI/stable/sysfs-transport-rds
new file mode 100644
index 000000000000..db4de277f717
--- /dev/null
+++ b/Documentation/ABI/stable/sysfs-transport-rds
@@ -0,0 +1,55 @@
+What:          /sys/kernel/rds/features/*
+Date:          June 2025
+KernelVersion: 6.17
+Contact:       rds-devel@....oracle.com
+Description:   This directory contains the features that this kernel
+               has been built with and supports. They correspond
+               to the include/uapi/linux/rds.h features.
+
+	       The intent is for applications compiled against rds.h
+	       to be able to query and find out what features the
+	       driver supports. The current expected value is 'supported'.
+
+	       The features so far are:
+
+	       ioctl_get_tos
+	       ioctl_set_tos
+
+		Allows the user to set on the socket a type of
+		service(tos) value associated forever.
+
+	       socket_cancel_sent_to
+
+		Allows to cancel all pending messages to a given destination.
+
+	       socket_cong_monitor
+
+		RDS provides an explicit monitoring wherein a 64-bit mask value
+		on the socket and each bit corresponds to a group of ports.
+
+		When a congestion update arrives, RDS checks the set of ports
+		that became uncongested against the bit mask.
+
+		If they overlap, a control messages is enqueued on the socket,
+		and the application is woken up.
+
+	       socket_get_mr
+	       socket_get_mr_for_dest
+	       socket_free_mr
+
+		RDS allows a process to register or release memory ranges for
+		RDMA.
+
+	       socket_recverr
+
+		RDS will send RDMA notification messages to the application for
+		any RDMA operation that fails. By default this is off.
+
+	       socket_so_rxpath_latency
+
+		Receive path latency in various stages of receive path.
+
+	       socket_so_transport
+
+		Attach the socket to the underlaying transport (TCP or RDMA)
+		before invoking bind on the socket.
diff --git a/net/rds/af_rds.c b/net/rds/af_rds.c
index 8435a20968ef..449b20f236a5 100644
--- a/net/rds/af_rds.c
+++ b/net/rds/af_rds.c
@@ -32,7 +32,9 @@
  */
 #include <linux/module.h>
 #include <linux/errno.h>
+#include <linux/elfnote.h>
 #include <linux/kernel.h>
+#include <linux/kobject.h>
 #include <linux/gfp.h>
 #include <linux/in.h>
 #include <linux/ipv6.h>
@@ -871,6 +873,53 @@ static void rds6_sock_info(struct socket *sock, unsigned int len,
 }
 #endif
 
+static ssize_t supported_show(struct kobject *kobj, struct kobj_attribute *attr,
+			     char *buf)
+{
+	return sysfs_emit(buf, "supported\n");
+}
+
+#define SYSFS_ATTR(_name)					\
+ELFNOTE64("rds." #_name, 0, 1);				\
+static struct kobj_attribute rds_attr_##_name = {		\
+	.attr = {.name = __stringify(_name), .mode = 0444 },	\
+	.show = supported_show,					\
+}
+
+SYSFS_ATTR(ioctl_set_tos);
+SYSFS_ATTR(ioctl_get_tos);
+SYSFS_ATTR(socket_cancel_sent_to);
+SYSFS_ATTR(socket_cong_monitor);
+SYSFS_ATTR(socket_get_mr);
+SYSFS_ATTR(socket_get_mr_for_dest);
+SYSFS_ATTR(socket_free_mr);
+SYSFS_ATTR(socket_recverr);
+SYSFS_ATTR(socket_so_rxpath_latency);
+SYSFS_ATTR(socket_so_transport);
+
+#define ATTR_LIST(_name) &rds_attr_##_name.attr
+
+static struct attribute *rds_feat_attrs[] = {
+	ATTR_LIST(ioctl_set_tos),
+	ATTR_LIST(ioctl_get_tos),
+	ATTR_LIST(socket_cancel_sent_to),
+	ATTR_LIST(socket_cong_monitor),
+	ATTR_LIST(socket_get_mr),
+	ATTR_LIST(socket_get_mr_for_dest),
+	ATTR_LIST(socket_free_mr),
+	ATTR_LIST(socket_recverr),
+	ATTR_LIST(socket_so_rxpath_latency),
+	ATTR_LIST(socket_so_transport),
+	NULL,
+};
+
+static const struct attribute_group rds_feat_group = {
+	.attrs = rds_feat_attrs,
+	.name = "features",
+};
+
+static struct kobject *rds_sysfs_kobj;
+
 static void rds_exit(void)
 {
 	sock_unregister(rds_family_ops.family);
@@ -882,6 +931,8 @@ static void rds_exit(void)
 	rds_stats_exit();
 	rds_page_exit();
 	rds_bind_lock_destroy();
+	sysfs_remove_group(rds_sysfs_kobj, &rds_feat_group);
+	kobject_put(rds_sysfs_kobj);
 	rds_info_deregister_func(RDS_INFO_SOCKETS, rds_sock_info);
 	rds_info_deregister_func(RDS_INFO_RECV_MESSAGES, rds_sock_inc_info);
 #if IS_ENABLED(CONFIG_IPV6)
@@ -923,6 +974,14 @@ static int __init rds_init(void)
 	if (ret)
 		goto out_proto;
 
+	rds_sysfs_kobj = kobject_create_and_add("rds", kernel_kobj);
+	if (!rds_sysfs_kobj)
+		goto out_proto;
+
+	ret = sysfs_create_group(rds_sysfs_kobj, &rds_feat_group);
+	if (ret)
+		goto out_kobject;
+
 	rds_info_register_func(RDS_INFO_SOCKETS, rds_sock_info);
 	rds_info_register_func(RDS_INFO_RECV_MESSAGES, rds_sock_inc_info);
 #if IS_ENABLED(CONFIG_IPV6)
@@ -931,7 +990,8 @@ static int __init rds_init(void)
 #endif
 
 	goto out;
-
+out_kobject:
+	kobject_put(rds_sysfs_kobj);
 out_proto:
 	proto_unregister(&rds_proto);
 out_stats:
-- 
2.43.5


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ