lists.openwall.net - Open Source and information security mailing list archives
Date: Thu, 18 Aug 2022 02:11:35 +0300
From: Andrey Zhadchenko <andrey.zhadchenko@...tuozzo.com>
To: Ilya Maximets <i.maximets@....org>, netdev@...r.kernel.org
Cc: dev@...nvswitch.org, brauner@...nel.org, edumazet@...gle.com,
	avagin@...gle.com, alexander.mikhalitsyn@...tuozzo.com,
	kuba@...nel.org, pabeni@...hat.com, davem@...emloft.net,
	ptikhomirov@...tuozzo.com, Aaron Conole <aconole@...hat.com>
Subject: Re: [ovs-dev] [PATCH net-next 0/1] openvswitch: allow specifying ifindex of new interfaces

On 8/18/22 01:15, Ilya Maximets wrote:
> On 8/17/22 22:35, Andrey Zhadchenko wrote:
>> On 8/17/22 21:19, Ilya Maximets wrote:
>>> On 8/17/22 14:49, Andrey Zhadchenko via dev wrote:
>>>> Hi!
>>>>
>>>> CRIU currently does not support checkpoint/restore of OVS
>>>> configurations, but there have been several requests for it.
>>>> For example, https://github.com/lxc/lxc/issues/2909
>>>>
>>>> The main problem is the ifindexes of newly created interfaces. We
>>>> really need to preserve them across restore. The current
>>>> openvswitch API does not allow specifying an ifindex. Most of the
>>>> time we can just create an interface via generic netlink requests
>>>> and plug it into ovs, but datapaths (generally any
>>>> OVS_VPORT_TYPE_INTERNAL) can only be created via openvswitch
>>>> requests, which do not support selecting an ifindex.
>>>
>>> Hmm. Assuming you restore network interfaces by re-creating them
>>> on the target system, I'm curious how you restore the upcall PID.
>>> Are you somehow re-creating the netlink socket with the same PID?
>>> If that is not done, no traffic will be able to flow through OVS
>>> anyway until you remove/re-add the port in userspace or re-start
>>> OVS. Or am I missing something?
>>>
>>> Best regards, Ilya Maximets.
>>
>> Yes, CRIU is able to restore a socket's nl_pid. We get it via
>> NDIAG_PROTO_ALL netlink diag requests (see net/netlink/diag.c).
>> The upcall pid is exported by get requests via
>> OVS_VPORT_ATTR_UPCALL_PID. So everything is fine here.
>
> I didn't dig deep into how that works, but it sounds interesting.
> Thanks for the pointers!
>
>> I should note that I did not test *complicated* setups with
>> ovs-vswitchd, mostly basic ones like veth plugging and several
>> containers in a network. We mainly supported the Weave Net k8s SDN,
>> which is based on ovs but does not use its daemon.
>>
>> Maybe if this is merged and people start using it we will find more
>> problems with checkpoint/restore, but for now the only problem is
>> the volatile ifindex.
>
> The current implementation, even with the ifindexes sorted out, will
> not work on recent versions of OVS for at least one reason. Since
> last year, OVS doesn't use OVS_VPORT_ATTR_UPCALL_PID if the kernel
> supports OVS_DP_ATTR_PER_CPU_PIDS instead. It's a datapath-wide
> CPU-ID-to-PID mapping for the per-CPU upcall dispatch mode. It is
> used by default starting with OVS 2.16.
>
> So, you need to make sure you're correctly restoring 'user_features'
> and OVS_DP_ATTR_PER_CPU_PIDS. The problem here is that
> OVS_DP_ATTR_PER_CPU_PIDS is currently not dumped to userspace via
> GET requests, simply because ovs-vswitchd has no use for it. So,
> you'll need to add that as well.

Thanks, this is very important! I will make v2 soon.

>
> And there could be some issues when starting OVS from a checkpoint
> created on a system with a different number of CPU cores. Traffic
> will not be broken, but performance may be affected, and there might
> be some kernel warnings.

Migrating to another type of CPU is usually a challenge due to
different CPUID values and some other problems (do we handle cpu
cgroup values if the number of CPUs changed? no idea, honestly).
Anyway, thanks for pointing that out.

>
> If you don't restore OVS_DP_ATTR_PER_CPU_PIDS, traffic will not work
> on recent versions of OVS, including 2.17 LTS, on more or less
> recent kernels.
>
> Another fairly recent addition is OVS_DP_ATTR_MASKS_CACHE_SIZE,
> which is not critical, but would be nice to restore as well, if
> you're not doing that already.
Looks like ovs_dp_cmd_fill_info() already fills it, so there is no
need to additionally patch the kernel part. The current CRIU
implementation does not handle it, but it would not be hard to
include.

>
>> Best regards, Andrey Zhadchenko
>>
>>>> This patch allows doing so.
>>>> For new datapaths I decided to use dp_ifindex in the header as
>>>> the ifindex, because it controls the ifindex for other requests
>>>> too. For internal vports I reused OVS_VPORT_ATTR_IFINDEX.
>>>>
>>>> The only concern I have is that previously dp_ifindex was not
>>>> used for OVS_DP_CMD_NEW requests, so some software may not set it
>>>> to zero. However, we have been running this patch at Virtuozzo
>>>> for 2 years and have not encountered this problem. I am not sure
>>>> whether it is worth adding a new ovs_datapath_attr instead.
>>>>
>>>> As a broader solution, another generic approach is possible:
>>>> modify __dev_change_net_namespace() to allow changing ifindexes
>>>> within the same netns. Yet we would still need to ignore
>>>> NETIF_F_NETNS_LOCAL, and I am not sure that all its users are
>>>> ready for an ifindex change. This would indeed be better for
>>>> CRIU, since we wouldn't need to change every SDN module to
>>>> support checkpoint/restore. And it would probably avoid some
>>>> bloat. What do you think of this?
>>>>
>>>> Andrey Zhadchenko (1):
>>>>   openvswitch: allow specifying ifindex of new interfaces
>>>>
>>>>  include/uapi/linux/openvswitch.h     |  4 ++++
>>>>  net/openvswitch/datapath.c           | 16 ++++++++++++++--
>>>>  net/openvswitch/vport-internal_dev.c |  1 +
>>>>  net/openvswitch/vport.h              |  2 ++
>>>>  4 files changed, 21 insertions(+), 2 deletions(-)