lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aOfp6Ieds0n57POf@devvm11784.nha0.facebook.com>
Date: Thu, 9 Oct 2025 09:59:34 -0700
From: Bobby Eshleman <bobbyeshleman@...il.com>
To: Stefano Garzarella <sgarzare@...hat.com>
Cc: Shuah Khan <shuah@...nel.org>, "David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Simon Horman <horms@...nel.org>,
	Stefan Hajnoczi <stefanha@...hat.com>,
	"Michael S. Tsirkin" <mst@...hat.com>,
	Jason Wang <jasowang@...hat.com>,
	Xuan Zhuo <xuanzhuo@...ux.alibaba.com>,
	Eugenio Pérez <eperezma@...hat.com>,
	"K. Y. Srinivasan" <kys@...rosoft.com>,
	Haiyang Zhang <haiyangz@...rosoft.com>,
	Wei Liu <wei.liu@...nel.org>, Dexuan Cui <decui@...rosoft.com>,
	Bryan Tan <bryan-bt.tan@...adcom.com>,
	Vishnu Dasa <vishnu.dasa@...adcom.com>,
	Broadcom internal kernel review list <bcm-kernel-feedback-list@...adcom.com>,
	virtualization@...ts.linux.dev, netdev@...r.kernel.org,
	linux-kselftest@...r.kernel.org, linux-kernel@...r.kernel.org,
	kvm@...r.kernel.org, linux-hyperv@...r.kernel.org,
	berrange@...hat.com, Bobby Eshleman <bobbyeshleman@...a.com>
Subject: Re: [PATCH net-next v6 9/9] selftests/vsock: add namespace tests

On Tue, Sep 30, 2025 at 10:58:46AM +0200, Stefano Garzarella wrote:
> On Tue, Sep 16, 2025 at 04:43:53PM -0700, Bobby Eshleman wrote:
> > From: Bobby Eshleman <bobbyeshleman@...a.com>
> > 
> > Add tests for namespace support in vsock. Use socat for basic connection
> > failure tests and vsock_test for full functionality tests when
> > communication is expected to succeed. vsock_test is not used for failure
> > cases because in theory vsock_test could allow connection and some
> > traffic flow but fail on some other case (e.g., fail on MSG_ZEROCOPY).
> > 
> > Tests cover all cases of clients and servers being in all variants of
> > local ns, global ns, host process, and VM process.
> > 
> > Legacy tests are retained and executed in the init ns.
> > 
> > Signed-off-by: Bobby Eshleman <bobbyeshleman@...a.com>
> > 
> > ---
> > Changes in v6:
> > - check for namespace support in vmtest.sh
> > 
> > Changes in v5:
> > - use /proc/sys/net/vsock/ns_mode
> > - clarify logic of tests that reuse the same VM and tests that require
> >  netns setup
> > - fix unassigned BUILD bug
> > ---
> > tools/testing/selftests/vsock/vmtest.sh | 954 ++++++++++++++++++++++++++++----
> > 1 file changed, 849 insertions(+), 105 deletions(-)
> > 
> > diff --git a/tools/testing/selftests/vsock/vmtest.sh b/tools/testing/selftests/vsock/vmtest.sh
> > index 5e36d1068f6f..59621b32cf1a 100755
> > --- a/tools/testing/selftests/vsock/vmtest.sh
> > +++ b/tools/testing/selftests/vsock/vmtest.sh
> > @@ -7,6 +7,7 @@
> > #		* virtme-ng
> > #		* busybox-static (used by virtme-ng)
> > #		* qemu	(used by virtme-ng)
> > +#		* socat
> > 
> > readonly SCRIPT_DIR="$(cd -P -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd -P)"
> > readonly KERNEL_CHECKOUT=$(realpath "${SCRIPT_DIR}"/../../../../)
> > @@ -23,7 +24,7 @@ readonly VSOCK_CID=1234
> > readonly WAIT_PERIOD=3
> > readonly WAIT_PERIOD_MAX=60
> > readonly WAIT_TOTAL=$(( WAIT_PERIOD * WAIT_PERIOD_MAX ))
> > -readonly QEMU_PIDFILE=$(mktemp /tmp/qemu_vsock_vmtest_XXXX.pid)
> > +readonly WAIT_QEMU=5
> > 
> > # virtme-ng offers a netdev for ssh when using "--ssh", but we also need a
> > # control port forwarded for vsock_test.  Because virtme-ng doesn't support
> > @@ -33,23 +34,146 @@ readonly QEMU_PIDFILE=$(mktemp /tmp/qemu_vsock_vmtest_XXXX.pid)
> > # add the kernel cmdline options that virtme-init uses to setup the interface.
> > readonly QEMU_TEST_PORT_FWD="hostfwd=tcp::${TEST_HOST_PORT}-:${TEST_GUEST_PORT}"
> > readonly QEMU_SSH_PORT_FWD="hostfwd=tcp::${SSH_HOST_PORT}-:${SSH_GUEST_PORT}"
> > -readonly QEMU_OPTS="\
> > -	 -netdev user,id=n0,${QEMU_TEST_PORT_FWD},${QEMU_SSH_PORT_FWD} \
> > -	 -device virtio-net-pci,netdev=n0 \
> > -	 -device vhost-vsock-pci,guest-cid=${VSOCK_CID} \
> > -	 --pidfile ${QEMU_PIDFILE} \
> > -"
> 
> I expected this patch to only add new tests, but we are changing a few things.
> Are they related, or can they be moved to another preparation patch?
> 

Unfortunately they are related. For example, this pidfile change is
because this patch introduces testing two VMs up at once (to test if
their CIDs collide), so they each need unique pidfiles.

Definitely can be moved to a preparation patch, will do next rev.

> > readonly KERNEL_CMDLINE="\
> > 	virtme.dhcp net.ifnames=0 biosdevname=0 \
> > 	virtme.ssh virtme_ssh_channel=tcp virtme_ssh_user=$USER \
> > "
> > readonly LOG=$(mktemp /tmp/vsock_vmtest_XXXX.log)
> > -readonly TEST_NAMES=(vm_server_host_client vm_client_host_server vm_loopback)
> > +readonly TEST_NAMES=(
> > +	vm_server_host_client
> > +	vm_client_host_server
> > +	vm_loopback
> > +	host_vsock_ns_mode_ok
> 
> can we use a `ns_` prefix for all ns related tests?
> 

Yep, np.

> > +	host_vsock_ns_mode_write_once_ok
> > +	global_same_cid_fails
> > +	local_same_cid_ok
> > +	global_local_same_cid_ok
> > +	local_global_same_cid_ok
> > +	diff_ns_global_host_connect_to_global_vm_ok
> > +	diff_ns_global_host_connect_to_local_vm_fails
> > +	diff_ns_global_vm_connect_to_global_host_ok
> > +	diff_ns_global_vm_connect_to_local_host_fails
> > +	diff_ns_local_host_connect_to_local_vm_fails
> > +	diff_ns_local_vm_connect_to_local_host_fails
> > +	diff_ns_global_to_local_loopback_local_fails
> > +	diff_ns_local_to_global_loopback_fails
> > +	diff_ns_local_to_local_loopback_fails
> > +	diff_ns_global_to_global_loopback_ok
> > +	same_ns_local_loopback_ok
> > +	same_ns_local_host_connect_to_local_vm_ok
> > +	same_ns_local_vm_connect_to_local_host_ok
> > +)
> > +
> > readonly TEST_DESCS=(
> > +	# vm_server_host_client
> > 	"Run vsock_test in server mode on the VM and in client mode on the host."
> > +
> > +	# vm_client_host_server
> > 	"Run vsock_test in client mode on the VM and in server mode on the host."
> > +
> > +	# vm_loopback
> > 	"Run vsock_test using the loopback transport in the VM."
> > +
> > +	# host_vsock_ns_mode_ok
> > +	"Check /proc/sys/net/vsock/ns_mode strings on the host."
> > +
> > +	# host_vsock_ns_mode_write_once_ok
> > +	"Check /proc/sys/net/vsock/ns_mode is write-once on the host."
> > +
> > +	# global_same_cid_fails
> > +	"Check QEMU fails to start two VMs with same CID in two different global namespaces."
> > +
> > +	# local_same_cid_ok
> > +	"Check QEMU successfully starts two VMs with same CID in two different local namespaces."
> > +
> > +	# global_local_same_cid_ok
> > +	"Check QEMU successfully starts one VM in a global ns and then another VM in a local ns with the same CID."
> > +
> > +	# local_global_same_cid_ok
> > +	"Check QEMU successfully starts one VM in a local ns and then another VM in a global ns with the same CID."
> > +
> > +	# diff_ns_global_host_connect_to_global_vm_ok
> > +	"Run vsock_test client in global ns with server in VM in another global ns."
> > +
> > +	# diff_ns_global_host_connect_to_local_vm_fails
> > +	"Run socat to test a process in a global ns fails to connect to a VM in a local ns."
> > +
> > +	# diff_ns_global_vm_connect_to_global_host_ok
> > +	"Run vsock_test client in VM in a global ns with server in another global ns."
> > +
> > +	# diff_ns_global_vm_connect_to_local_host_fails
> > +	"Run socat to test a VM in a global ns fails to connect to a host process in a local ns."
> > +
> > +	# diff_ns_local_host_connect_to_local_vm_fails
> > +	"Run socat to test a host process in a local ns fails to connect to a VM in another local ns."
> > +
> > +	# diff_ns_local_vm_connect_to_local_host_fails
> > +	"Run socat to test a VM in a local ns fails to connect to a host process in another local ns."
> > +
> > +	# diff_ns_global_to_local_loopback_local_fails
> > +	"Run socat to test a loopback vsock in a global ns fails to connect to a vsock in a local ns."
> > +
> > +	# diff_ns_local_to_global_loopback_fails
> > +	"Run socat to test a loopback vsock in a local ns fails to connect to a vsock in a global ns."
> > +
> > +	# diff_ns_local_to_local_loopback_fails
> > +	"Run socat to test a loopback vsock in a local ns fails to connect to a vsock in another local ns."
> > +
> > +	# diff_ns_global_to_global_loopback_ok
> > +	"Run socat to test a loopback vsock in a global ns successfully connects to a vsock in another global ns."
> > +
> > +	# same_ns_local_loopback_ok
> > +	"Run socat to test a loopback vsock in a local ns successfully connects to a vsock in the same ns."
> > +
> > +	# same_ns_local_host_connect_to_local_vm_ok
> > +	"Run vsock_test client in a local ns with server in VM in same ns."
> > +
> > +	# same_ns_local_vm_connect_to_local_host_ok
> > +	"Run vsock_test client in VM in a local ns with server in same ns."
> 
> Should we run some test to check edge cases like namespace deletion
> during active connections or changing ns mode from global to local while
> running.
> 

Sgtm!

> > +)
> > +
> > +readonly USE_SHARED_VM=(vm_server_host_client vm_client_host_server vm_loopback)
> > +readonly USE_INIT_NETNS=(
> > +	global_same_cid_fails
> > +	local_same_cid_ok
> > +	global_local_same_cid_ok
> > +	local_global_same_cid_ok
> > +	diff_ns_global_host_connect_to_global_vm_ok
> > +	diff_ns_global_host_connect_to_local_vm_fails
> > +	diff_ns_global_vm_connect_to_global_host_ok
> > +	diff_ns_global_vm_connect_to_local_host_fails
> > +	diff_ns_local_host_connect_to_local_vm_fails
> > +	diff_ns_local_vm_connect_to_local_host_fails
> > +	diff_ns_global_to_local_loopback_local_fails
> > +	diff_ns_local_to_global_loopback_fails
> > +	diff_ns_local_to_local_loopback_fails
> > +	diff_ns_global_to_global_loopback_ok
> > +	same_ns_local_loopback_ok
> > +	same_ns_local_host_connect_to_local_vm_ok
> > +	same_ns_local_vm_connect_to_local_host_ok
> > +)
> > +readonly REQUIRES_NETNS=(
> > +	host_vsock_ns_mode_ok
> > +	host_vsock_ns_mode_write_once_ok
> > +	global_same_cid_fails
> > +	local_same_cid_ok
> > +	global_local_same_cid_ok
> > +	local_global_same_cid_ok
> > +	diff_ns_global_host_connect_to_global_vm_ok
> > +	diff_ns_global_host_connect_to_local_vm_fails
> > +	diff_ns_global_vm_connect_to_global_host_ok
> > +	diff_ns_global_vm_connect_to_local_host_fails
> > +	diff_ns_local_host_connect_to_local_vm_fails
> > +	diff_ns_local_vm_connect_to_local_host_fails
> > +	diff_ns_global_to_local_loopback_local_fails
> > +	diff_ns_local_to_global_loopback_fails
> > +	diff_ns_local_to_local_loopback_fails
> > +	diff_ns_global_to_global_loopback_ok
> > +	same_ns_local_loopback_ok
> > +	same_ns_local_host_connect_to_local_vm_ok
> > +	same_ns_local_vm_connect_to_local_host_ok
> > )
> > +readonly MODES=("local" "global")
> 
> What about NS_MODES ?
> 

Roger that.

> > 
> > readonly LOG_LEVEL_DEBUG=0
> > readonly LOG_LEVEL_INFO=1
> > @@ -58,6 +182,12 @@ readonly LOG_LEVEL_ERROR=3
> > 
> > VERBOSE="${LOG_LEVEL_WARN}"
> > 
> > +# Test pass/fail counters
> > +cnt_pass=0
> > +cnt_fail=0
> > +cnt_skip=0
> > +cnt_total=0
> > +
> > usage() {
> > 	local name
> > 	local desc
> > @@ -77,7 +207,7 @@ usage() {
> > 	for ((i = 0; i < ${#TEST_NAMES[@]}; i++)); do
> > 		name=${TEST_NAMES[${i}]}
> > 		desc=${TEST_DESCS[${i}]}
> > -		printf "\t%-35s%-35s\n" "${name}" "${desc}"
> > +		printf "\t%-55s%-35s\n" "${name}" "${desc}"
> > 	done
> > 	echo
> > 
> > @@ -89,21 +219,87 @@ die() {
> > 	exit "${KSFT_FAIL}"
> > }
> > 
> > +add_namespaces() {
> > +	# add namespaces local0, local1, global0, and global1
> > +	for mode in "${MODES[@]}"; do
> > +		ip netns add "${mode}0" 2>/dev/null
> > +		ip netns add "${mode}1" 2>/dev/null
> > +	done
> > +}
> > +
> > +init_namespaces() {
> > +	for mode in "${MODES[@]}"; do
> > +		ns_set_mode "${mode}0" "${mode}"
> > +		ns_set_mode "${mode}1" "${mode}"
> > +
> > +		log_host "set ns ${mode}0 to mode ${mode}"
> > +		log_host "set ns ${mode}1 to mode ${mode}"
> > +
> > +		# we need lo for qemu port forwarding
> > +		ip netns exec "${mode}0" ip link set dev lo up
> > +		ip netns exec "${mode}1" ip link set dev lo up
> > +	done
> > +}
> > +
> > +del_namespaces() {
> > +	for mode in "${MODES[@]}"; do
> > +		ip netns del "${mode}0"
> > +		ip netns del "${mode}1"
> > +		log_host "removed ns ${mode}0"
> > +		log_host "removed ns ${mode}1"
> > +	done &>/dev/null
> > +}
> > +
> > +ns_set_mode() {
> > +	local ns=$1
> > +	local mode=$2
> > +
> > +	echo "${mode}" | ip netns exec "${ns}" \
> > +		tee /proc/sys/net/vsock/ns_mode &>/dev/null
> > +}
> > +
> > vm_ssh() {
> > -	ssh -q -o UserKnownHostsFile=/dev/null -p ${SSH_HOST_PORT} localhost "$@"
> > +	local ns_exec
> > +
> > +	if [[ "${1}" == none ]]; then
> > +		local ns_exec=""
> > +	else
> > +		local ns_exec="ip netns exec ${1}"
> > +	fi
> > +
> > +	shift
> > +
> > +	${ns_exec} ssh -q -o UserKnownHostsFile=/dev/null -p ${SSH_HOST_PORT} localhost $*
> > +
> > 	return $?
> > }
> > 
> > cleanup() {
> > -	if [[ -s "${QEMU_PIDFILE}" ]]; then
> > -		pkill -SIGTERM -F "${QEMU_PIDFILE}" > /dev/null 2>&1
> > -	fi
> > +	del_namespaces
> > +}
> > 
> > -	# If failure occurred during or before qemu start up, then we need
> > -	# to clean this up ourselves.
> > -	if [[ -e "${QEMU_PIDFILE}" ]]; then
> > -		rm "${QEMU_PIDFILE}"
> > -	fi
> > +terminate_pidfiles() {
> > +	local pidfile
> > +
> > +	for pidfile in "$@"; do
> > +		if [[ -s "${pidfile}" ]]; then
> > +			pkill -SIGTERM -F "${pidfile}" 2>&1 > /dev/null
> > +		fi
> > +
> > +		# If failure occurred during or before qemu start up, then we need
> > +		# to clean this up ourselves.
> > +		if [[ -e "${pidfile}" ]]; then
> > +			rm -f "${pidfile}"
> > +		fi
> > +	done
> > +}
> > +
> > +terminate_pids() {
> > +	local pid
> > +
> > +	for pid in "$@"; do
> > +		kill -SIGTERM "${pid}" &>/dev/null || :
> > +	done
> > }
> > 
> > check_args() {
> > @@ -133,7 +329,7 @@ check_args() {
> > }
> > 
> > check_deps() {
> > -	for dep in vng ${QEMU} busybox pkill ssh; do
> > +	for dep in vng ${QEMU} busybox pkill ssh socat; do
> > 		if [[ ! -x $(command -v "${dep}") ]]; then
> > 			echo -e "skip:    dependency ${dep} not found!\n"
> > 			exit "${KSFT_SKIP}"
> > @@ -147,6 +343,20 @@ check_deps() {
> > 	fi
> > }
> > 
> > +check_test_deps() {
> > +	local tname=$1
> > +
> > +	# If the test requires NS support, check if NS support exists
> > +	# using /proc/self/ns
> > +	if [[ "${tname}" =~ "${REQUIRES_NETNS[@]}" ]] &&
> > +	   [[ ! -e /proc/self/ns ]]; then
> > +		log_host "No NS support detected for test ${tname}"
> > +		return 1
> > +	fi
> > +
> > +	return 0
> > +}
> > +
> > check_vng() {
> > 	local tested_versions
> > 	local version
> > @@ -170,6 +380,20 @@ check_vng() {
> > 	fi
> > }
> > 
> > +check_socat() {
> > +	local support_string
> > +
> > +	support_string="$(socat -V)"
> > +
> > +	if [[ "${support_string}" != *"WITH_VSOCK 1"* ]]; then
> > +		die "err: socat is missing vsock support"
> > +	fi
> > +
> > +	if [[ "${support_string}" != *"WITH_UNIX 1"* ]]; then
> > +		die "err: socat is missing unix support"
> > +	fi
> > +}
> > +
> > handle_build() {
> > 	if [[ ! "${BUILD}" -eq 1 ]]; then
> > 		return
> > @@ -194,9 +418,14 @@ handle_build() {
> > }
> > 
> > vm_start() {
> > +	local cid=$1
> > +	local ns=$2
> > +	local pidfile=$3
> > 	local logfile=/dev/null
> > 	local verbose_opt=""
> > +	local qemu_opts=""
> > 	local kernel_opt=""
> > +	local ns_exec=""
> > 	local qemu
> > 
> > 	qemu=$(command -v "${QEMU}")
> > @@ -206,27 +435,37 @@ vm_start() {
> > 		logfile=/dev/stdout
> > 	fi
> > 
> > +	qemu_opts="\
> > +		 -netdev user,id=n0,${QEMU_TEST_PORT_FWD},${QEMU_SSH_PORT_FWD} \
> > +		 -device virtio-net-pci,netdev=n0 \
> > +		${QEMU_OPTS} -device vhost-vsock-pci,guest-cid=${cid} \
> 
> Have we removed QEMU_OPTS, right?
> 
> (I still prefer to have it defined on top, but maybe there is a reason
> to remove it)
> 
> > +		--pidfile ${pidfile}
> > +	"
> > +
> > 	if [[ "${BUILD}" -eq 1 ]]; then
> > 		kernel_opt="${KERNEL_CHECKOUT}"
> > 	fi
> > 
> > -	vng \
> > +	if [[ "${ns}" != "none" ]]; then
> > +		ns_exec="ip netns exec ${ns}"
> > +	fi
> > +
> > +	${ns_exec} vng \
> > 		--run \
> > 		${kernel_opt} \
> > 		${verbose_opt} \
> > -		--qemu-opts="${QEMU_OPTS}" \
> > +		--qemu-opts="${qemu_opts}" \
> > 		--qemu="${qemu}" \
> > 		--user root \
> > 		--append "${KERNEL_CMDLINE}" \
> > 		--rw  &> ${logfile} &
> > 
> > -	if ! timeout ${WAIT_TOTAL} \
> 
> So WAIT_TOTAL is now unused, right?
> 

Yes, its definition can be dropped.

> Can you explain better this change?
> 

It turned out that WAIT_TOTAL (three minutes) was very long wait time to
just see if qemu places the pidfile or not. It's a suitable wait time
for VM boot up, but qemu creates the pidfile far before that. Waiting
three minutes made the cases where we expect the VM to fail bootup to
take a a very long time. Waiting only 5 seconds to check the pidfile
offers a lot of savings.

Sounds like this should also be a different patch.

[..]

> 
> Sorry, there are too many changes and reviewing it is complicated. Can you
> at least divide it into patches to fix pre-existing bugs, patches to support
> namespaces (and use init_ns for the ones we already have), and patches to
> add tests?
> 
> Thanks,
> Stefano
> 

No problem, totally understandable looking at it in hindsight. I'll
break it up and send that out after the window opens.

Thanks again for your patient review!

Best,
Bobby

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ