lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <495de273-9679-5186-3d6c-41f44e9280e4@ovn.org>
Date:   Thu, 18 Aug 2022 00:15:13 +0200
From:   Ilya Maximets <i.maximets@....org>
To:     Andrey Zhadchenko <andrey.zhadchenko@...tuozzo.com>,
        netdev@...r.kernel.org
Cc:     i.maximets@....org, dev@...nvswitch.org, brauner@...nel.org,
        edumazet@...gle.com, avagin@...gle.com,
        alexander.mikhalitsyn@...tuozzo.com, kuba@...nel.org,
        pabeni@...hat.com, davem@...emloft.net, ptikhomirov@...tuozzo.com,
        Aaron Conole <aconole@...hat.com>
Subject: Re: [ovs-dev] [PATCH net-next 0/1] openvswitch: allow specifying
 ifindex of new interfaces

On 8/17/22 22:35, Andrey Zhadchenko wrote:
> 
> 
> On 8/17/22 21:19, Ilya Maximets wrote:
>> On 8/17/22 14:49, Andrey Zhadchenko via dev wrote:
>>> Hi!
>>>
>>> CRIU currently do not support checkpoint/restore of OVS configurations, but
>>> there was several requests for it. For example,
>>> https://github.com/lxc/lxc/issues/2909
>>>
>>> The main problem is ifindexes of newly created interfaces. We realy need to
>>> preserve them after restore. Current openvswitch API does not allow to
>>> specify ifindex. Most of the time we can just create an interface via
>>> generic netlink requests and plug it into ovs but datapaths (generally any
>>> OVS_VPORT_TYPE_INTERNAL) can only be created via openvswitch requests which
>>> do not support selecting ifindex.
>>
>> Hmm.  Assuming you restored network interfaces by re-creating them
>> on a target system, but I'm curious how do you restore the upcall PID?
>> Are you somehow re-creating the netlink socket with the same PID?
>> If that will not be done, no traffic will be able to flow through OVS
>> anyway until you remove/re-add the port in userspace or re-start OVS.
>> Or am I missing something?
>>
>> Best regards, Ilya Maximets.
> 
> Yes, CRIU is able to restore socket nl_pid. We get it via NDIAG_PROTO_ALL
> netlink protocol requests (see net/netlink/diag.c)  Upcall pid is exported
> by get requests via OVS_VPORT_ATTR_UPCALL_PID.
> So everything is fine here.

I didn't dig deep into how that works, but sounds interesting.
Thanks for the pointers!

> 
> I should note that I did not test *complicated* setups with ovs-vswitchd,
> mostly basic ones like veth plugging and several containers in network. We
> mainly supported Weave Net k8s SDN  which is based on ovs but do not use its
> daemon.
> 
> Maybe if this is merged and people start use this we will find more problems
> with checkpoint/restore, but for now the only problem is volatile ifindex.

Current implementation even with ifindexes sorted out will not work for
at least one reason for recent versions of OVS.  Since last year OVS doesn't
use OVS_VPORT_ATTR_UPCALL_PID if kernel supports OVS_DP_ATTR_PER_CPU_PIDS
instead.  It's a datapath-wide CPU ID to PID mapping for per-CPU upcall
dispatch mode.  It is used by default starting with OVS 2.16.

So, you need to make sure you're correctly restoring 'user_features' and
the OVS_DP_ATTR_PER_CPU_PIDS.  Problem here is that OVS_DP_ATTR_PER_CPU_PIDS
currently not dumped to userpsace via GET request, simply because ovs-vswitchd
has no use for it.  So, you'll need to add that as well.

And there could be some issues when starting OVS from a checkpoint created
on a system with different number of CPU cores.  Traffic will not be broken,
but performance may be affected, and there might be some kernel warnings.

If you won't restore OVS_DP_ATTR_PER_CPU_PIDS, traffic will not work on
recent versions of OVS, including 2.17 LTS, on more or less recent kernels.

Another fairly recent addition is OVS_DP_ATTR_MASKS_CACHE_SIZE, which is
not critical, but would be nice to restore as well, if you're not doing
that already.

> 
> Best regards, Andrey Zhadchenko
>>
>>>
>>> This patch allows to do so.
>>> For new datapaths I decided to use dp_infindex in header as infindex
>>> because it control ifindex for other requests too.
>>> For internal vports I reused OVS_VPORT_ATTR_IFINDEX.
>>>
>>> The only concern I have is that previously dp_ifindex was not used for
>>> OVS_DP_VMD_NEW requests and some software may not set it to zero. However
>>> we have been running this patch at Virtuozzo for 2 years and have not
>>> encountered this problem. Not sure if it is worth to add new
>>> ovs_datapath_attr instead.
>>>
>>>
>>> As a broader solution, another generic approach is possible: modify
>>> __dev_change_net_namespace() to allow changing ifindexes within the same
>>> netns. Yet we will still need to ignore NETIF_F_NETNS_LOCAL and I am not
>>> sure that all its users are ready for ifindex change.
>>> This will be indeed better for CRIU so we won't need to change every SDN
>>> module to be able to checkpoint/restore it. And probably avoid some bloat.
>>> What do you think of this?
>>>
>>> Andrey Zhadchenko (1):
>>>    openvswitch: allow specifying ifindex of new interfaces
>>>
>>>   include/uapi/linux/openvswitch.h     |  4 ++++
>>>   net/openvswitch/datapath.c           | 16 ++++++++++++++--
>>>   net/openvswitch/vport-internal_dev.c |  1 +
>>>   net/openvswitch/vport.h              |  2 ++
>>>   4 files changed, 21 insertions(+), 2 deletions(-)
>>>
>>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ