[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <MN2PR12MB45172880B30787CA283A3A22D8009@MN2PR12MB4517.namprd12.prod.outlook.com>
Date: Thu, 1 Jul 2021 08:50:42 +0000
From: Danielle Ratson <danieller@...dia.com>
To: Vladimir Oltean <vladimir.oltean@....com>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"davem@...emloft.net" <davem@...emloft.net>,
"kuba@...nel.org" <kuba@...nel.org>,
"shuah@...nel.org" <shuah@...nel.org>,
Ido Schimmel <idosch@...dia.com>,
Nikolay Aleksandrov <nikolay@...dia.com>,
"gnault@...hat.com" <gnault@...hat.com>,
"baowen.zheng@...igine.com" <baowen.zheng@...igine.com>,
Amit Cohen <amcohen@...dia.com>
Subject: RE: [PATCH net-next] selftests: net: Change the indication for iface
is up
> -----Original Message-----
> From: Vladimir Oltean <vladimir.oltean@....com>
> Sent: Friday, June 25, 2021 12:51 AM
> To: Danielle Ratson <danieller@...dia.com>
> Cc: netdev@...r.kernel.org; davem@...emloft.net; kuba@...nel.org; shuah@...nel.org; Ido Schimmel <idosch@...dia.com>;
> Nikolay Aleksandrov <nikolay@...dia.com>; gnault@...hat.com; baowen.zheng@...igine.com; Amit Cohen
> <amcohen@...dia.com>
> Subject: Re: [PATCH net-next] selftests: net: Change the indication for iface is up
>
> Hi Danielle,
>
> On Thu, Jun 24, 2021 at 06:15:15PM +0300, Danielle Ratson wrote:
> > Currently, the indication that an iface is up, is the mark 'state UP'
> > in the iface info. That situation can be achieved also when the
> > carrier is not ready, and therefore after the state was found as 'up',
> > it can be still changed to 'down'.
> >
> > For example, the below presents a part of a test output with one of
> > the ifaces involved detailed info and timestamps:
> >
> > In setup_wait()
> > 45: swp13: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel master
> > vswp13 state UP mode DEFAULT group default qlen 1000
> > link/ether 7c:fe:90:fc:7d:f1 brd ff:ff:ff:ff:ff:ff promiscuity 0
> > minmtu 0 maxmtu 65535
> > vrf_slave table 1 addrgenmode eui64 numtxqueues 1 numrxqueues 1
> > gso_max_size 65536 gso_max_segs 65535 portname p13 switchid
> > 7cfe90fc7dc0
> > 17:58:27:242634417
> >
> > In dst_mac_match_test()
>
> What is dst_mac_match_test()?
>
> > 45: swp13: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel
> > master vswp13 state DOWN mode DEFAULT group default qlen 1000
> > link/ether 7c:fe:90:fc:7d:f1 brd ff:ff:ff:ff:ff:ff promiscuity 0
> > minmtu 0 maxmtu 65535
> > vrf_slave table 1 addrgenmode eui64 numtxqueues 1 numrxqueues 1
> > gso_max_size 65536 gso_max_segs 65535 portname p13 switchid
> > 7cfe90fc7dc0
> > 17:58:32:254535834
> > TEST: dst_mac match (skip_hw) [FAIL]
> > Did not match on correct filter
> >
> > In src_mac_match_test()
>
> What is src_mac_match_test()?
>
> > 45: swp13: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel
> > master vswp13 state UP mode DEFAULT group default qlen 1000
> > link/ether 7c:fe:90:fc:7d:f1 brd ff:ff:ff:ff:ff:ff promiscuity 0
> > minmtu 0 maxmtu 65535
> > vrf_slave table 1 addrgenmode eui64 numtxqueues 1 numrxqueues 1
> > gso_max_size 65536 gso_max_segs 65535 portname p13 switchid
> > 7cfe90fc7dc0
> > 17:58:34:446367468
>
> Can you please really show the output of 'ip link show dev swp13 up'?
> The format you are showing is not that and it is really confusing.
>
> > TEST: src_mac match (skip_hw) [ OK ]
> >
> > In dst_ip_match_test()
> > 45: swp13: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel
> > master vswp13 state UP mode DEFAULT group default qlen 1000
> > link/ether 7c:fe:90:fc:7d:f1 brd ff:ff:ff:ff:ff:ff promiscuity 0
> > minmtu 0 maxmtu 65535
> > vrf_slave table 1 addrgenmode eui64 numtxqueues 1 numrxqueues 1
> > gso_max_size 65536 gso_max_segs 65535 portname p13 switchid
> > 7cfe90fc7dc0
> > 17:58:35:633518622
> >
> > In the example, after the setup_prepare() phase, the iface state was
> > indeed 'UP' so the setup_wait() phase pass successfully. But since
> > 'LOWER_UP' flag was not set yet, the next print, which was right
> > before the first test case has started, the status turned into 'DOWN'.
>
> Why?
>
> > While, UP is an indicator that the interface has been enabled and
> > running, LOWER_UP is a physical layer link flag. It indicates that an
> > Ethernet cable was plugged in and that the device is connected to the network.
> >
> > Therefore, the existence of the 'LOWER_UP' flag is necessary for
> > indicating that the port is up before testing communication.
>
> Documentation/networking/operstates.rst says:
>
> IF_OPER_UP (6):
> Interface is operational up and can be used.
>
> Additionally, RFC2863 says:
>
> ifOperStatus OBJECT-TYPE
> SYNTAX INTEGER {
> up(1), -- ready to pass packets
>
> You have not proven why the UP operstate is not sufficient and additional checks must be made for link flags. Also you have not
> explained how this fixes your problem.
>
> > Change the indication for iface is up to include the existence of
> > 'LOWER_UP'.
> >
Hi Vladimir,
After a consultation with Ido Schimmel, we came up with the commit message below:
According to Documentation/networking/operstates.rst, the administrative and operational state of an interface can be determined via both the 'ifi_flags' field in the ancillary header of a RTM_NEWLINK message and the 'IFLA_OPERSTATE' attribute in the message.
When a driver signals loss of carrier via netif_carrier_off(), the change is reflected immediately in the 'ifi_flags' field, but not in the 'IFLA_OPERSTATE' attribute. This is because changes in 'IFLA_OPERSTATE'
are performed in a link watch delayed work. From the document:
"Whenever the driver CHANGES one of these flags, a workqueue event is scheduled to translate the flag combination to IFLA_OPERSTATE as follows"
This means that it is possible for user space that is constantly querying the kernel for interface state to see the output below when the following commands are run. Their purpose is to simulate tearing down of a selftest and start of a new one.
Commands:
# ip link set dev veth1 down
# ip link set dev veth0 down
# ip link set dev veth0 up
# ip link set dev veth1 up
Output:
11: veth0@...h1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 06:49:03:9b:96:12 brd ff:ff:ff:ff:ff:ff
11: veth0@...h1: <BROADCAST,MULTICAST,UP,M-DOWN> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 06:49:03:9b:96:12 brd ff:ff:ff:ff:ff:ff
11: veth0@...h1: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
link/ether 06:49:03:9b:96:12 brd ff:ff:ff:ff:ff:ff
11: veth0@...h1: <BROADCAST,MULTICAST,UP,M-DOWN> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 06:49:03:9b:96:12 brd ff:ff:ff:ff:ff:ff
11: veth0@...h1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 06:49:03:9b:96:12 brd ff:ff:ff:ff:ff:ff
In the third line, the operational state is set by rtnl_fill_ifinfo() to 'IF_OPER_DOWN' because the interface is administratively down, but 'dev->operstate' is still 'IF_OPER_UP' as the delayed work has yet to run. That is why the interface's operational state is reported as 'IF_OPER_UP' in the next line after it was put administratively up again.
The output in the fourth line would make user space believe that the interface is fully operational if only the 'IFLA_OPERSTATE' attribute is considered. This is false as the interface does not have a carrier ('LOWER_UP' is not set) at this stage.
Solve this by determining if an interface is operational solely based on the presence of the 'IFF_UP' and 'IFF_LOWER_UP' bits in the 'ifi_flags'
field.
Hope it answers all your questions and we will send the new commit for net-next.
Thanks.
> > Signed-off-by: Danielle Ratson <danieller@...dia.com>
> > Signed-off-by: Ido Schimmel <idosch@...dia.com>
> > ---
> > tools/testing/selftests/net/forwarding/lib.sh | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/tools/testing/selftests/net/forwarding/lib.sh
> > b/tools/testing/selftests/net/forwarding/lib.sh
> > index 42e28c983d41..a46076b8ebdd 100644
> > --- a/tools/testing/selftests/net/forwarding/lib.sh
> > +++ b/tools/testing/selftests/net/forwarding/lib.sh
> > @@ -399,7 +399,7 @@ setup_wait_dev_with_timeout()
> >
> > for ((i = 1; i <= $max_iterations; ++i)); do
> > ip link show dev $dev up \
> > - | grep 'state UP' &> /dev/null
> > + | grep 'state UP' | grep 'LOWER_UP' &> /dev/null
> > if [[ $? -ne 0 ]]; then
> > sleep 1
> > else
> > --
> > 2.31.1
> >
Powered by blists - more mailing lists