Message-ID: <5e0249d-b6e1-44fa-147b-e2af65e56f64@ssi.bg>
Date:   Wed, 26 Oct 2022 13:15:27 +0300 (EEST)
From:   Julian Anastasov <ja@....bg>
To:     Ziyang Xuan <william.xuanziyang@...wei.com>
cc:     netdev@...r.kernel.org, linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH net] ipv4: fix source address and gateway mismatch under
 multiple default gateways


	Hello,

On Wed, 26 Oct 2022, Ziyang Xuan wrote:

> We found a problem that source address doesn't match with selected gateway
> under multiple default gateways. The reproducer is as following:
> 
> Setup in client as following:
> 
> $ ip link add link eth2 dev eth2.71 type vlan id 71
> $ ip link add link eth2 dev eth2.72 type vlan id 72
> $ ip addr add 192.168.71.41/24 dev eth2.71
> $ ip addr add 192.168.72.41/24 dev eth2.72
> $ ip link set eth2.71 up
> $ ip link set eth2.72 up
> $ route add -net default gw 192.168.71.1 dev eth2.71
> $ route add -net default gw 192.168.72.1 dev eth2.72

	The second route goes to the first position because
the route add command performs a prepend operation. That is
why 192.168.72.41 is selected.
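
	The ordering effect can be sketched with a toy Python model.
It is only an illustration: the list, the add_default() helper and
the "first entry wins" selection are assumptions made for this
sketch, and it ignores metrics and gateway reachability.

```python
# Toy model of an ordered list of equal-priority default routes.
# add_default() is a hypothetical helper, not a kernel or iproute2 API.

def add_default(routes, gw, dev, mode):
    """Insert a default route at the head ("prepend") or tail ("append")."""
    entry = (gw, dev)
    if mode == "prepend":
        routes.insert(0, entry)
    elif mode == "append":
        routes.append(entry)
    else:
        raise ValueError("mode must be 'prepend' or 'append'")
    return routes

# The two "route add" commands from the reproducer, modeled as prepends.
prepended = []
add_default(prepended, "192.168.71.1", "eth2.71", "prepend")
add_default(prepended, "192.168.72.1", "eth2.72", "prepend")

# The same two routes added with append semantics instead.
appended = []
add_default(appended, "192.168.71.1", "eth2.71", "append")
add_default(appended, "192.168.72.1", "eth2.72", "append")

# In this model the first entry is the one a lookup would pick.
print("prepend order, first route:", prepended[0])
print("append order, first route:", appended[0])
```

In this model the prepend order puts the eth2.72 route first, while
the append order keeps the eth2.71 route first.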

> Add a nameserver configuration in the following file:
> $ cat /etc/resolv.conf
> nameserver 8.8.8.8
> 
> Setup in peer server as following:
> 
> $ ip link add link eth2 dev eth2.71 type vlan id 71
> $ ip link add link eth2 dev eth2.72 type vlan id 72
> $ ip addr add 192.168.71.1/24 dev eth2.71
> $ ip addr add 192.168.72.1/24 dev eth2.72
> $ ip link set eth2.71 up
> $ ip link set eth2.72 up
> 
> Use the following command trigger DNS packet in client:
> $ ping www.baidu.com
> 
> Capture packets with tcpdump in client when ping:
> $ tcpdump -i eth2 -vne
> ...
> 20:30:22.996044 52:54:00:20:23:a9 > 52:54:00:d2:4f:e3, ethertype 802.1Q (0x8100), length 77: vlan 71, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 25407, offset 0, flags [DF], proto UDP (17), length 59)
>     192.168.72.41.42666 > 8.8.8.8.domain: 58562+ A? www.baidu.com. (31)
> ...
> 
> We get the problem that IPv4 saddr "192.168.72.41" do not match with
> selected VLAN device "eth2.71".

	The problem could be that the source address is selected
once and later reused as the source address in the following routing
lookups.

	Also, your routing rules do not express the restriction that
a specific route cannot be used with both addresses. If you have
such a restriction, which is common, you should use source-specific
routes:

1. An ip rule to consult table main only for link routes;
no default routes are kept there:

ip rule add prio 10 table main

2. source-specific routes:

ip rule add prio 20 from 192.168.71.0/24 table 20
ip route append default via 192.168.71.1 dev eth2.71 src 192.168.71.41 table 20
ip rule add prio 30 from 192.168.72.0/24 table 30
ip route append default via 192.168.72.1 dev eth2.72 src 192.168.72.41 table 30

3. Store the alternative default routes in a table other than main:

ip rule add prio 200 table 200
ip route append default via 192.168.71.1 dev eth2.71 src 192.168.71.41 table 200
ip route append default via 192.168.72.1 dev eth2.72 src 192.168.72.41 table 200

	The above routes should work even without specifying prefsrc.

	As a result, table 200 is used only for routing lookups
without a specific source address, usually for the first packet in
a connection; the next packets should hit tables 20/30.
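
	The lookup order described above can be sketched as a small
Python model. It is a simplification under stated assumptions: the
TABLES and RULES structures and the lookup() function are hypothetical
names for this sketch, table main is assumed to hold only the two link
routes, an unspecified source is modeled as 0.0.0.0, and the first
matching default route in a table wins (the real kernel also considers
gateway reachability).

```python
import ipaddress

# table name -> ordered list of (prefix, gateway, device, preferred source)
TABLES = {
    "main": [
        ("192.168.71.0/24", None, "eth2.71", "192.168.71.41"),
        ("192.168.72.0/24", None, "eth2.72", "192.168.72.41"),
    ],
    "20": [("0.0.0.0/0", "192.168.71.1", "eth2.71", "192.168.71.41")],
    "30": [("0.0.0.0/0", "192.168.72.1", "eth2.72", "192.168.72.41")],
    "200": [
        ("0.0.0.0/0", "192.168.71.1", "eth2.71", "192.168.71.41"),
        ("0.0.0.0/0", "192.168.72.1", "eth2.72", "192.168.72.41"),
    ],
}

# (priority, "from" selector or None, table), as in steps 1-3
RULES = [
    (10, None, "main"),
    (20, "192.168.71.0/24", "20"),
    (30, "192.168.72.0/24", "30"),
    (200, None, "200"),
]

def lookup(dst, saddr=None):
    """Return (table, gateway, device, source) for the first matching rule."""
    dst_ip = ipaddress.ip_address(dst)
    src_ip = ipaddress.ip_address(saddr or "0.0.0.0")
    for _prio, selector, table in sorted(RULES, key=lambda r: r[0]):
        # A "from" rule is skipped when the source is outside its prefix.
        if selector and src_ip not in ipaddress.ip_network(selector):
            continue
        best = None
        for prefix, gw, dev, src in TABLES[table]:
            net = ipaddress.ip_network(prefix)
            # Longest prefix wins; among equal prefixes the first entry stays.
            if dst_ip in net and (best is None or net.prefixlen > best[0]):
                best = (net.prefixlen, gw, dev, src)
        if best:
            return table, best[1], best[2], saddr or best[3]
    return None

print(lookup("8.8.8.8"))                   # no source yet: table 200
print(lookup("8.8.8.8", "192.168.71.41"))  # bound source: table 20
print(lookup("8.8.8.8", "192.168.72.41"))  # bound source: table 30
```

Once a source address is fixed, the model always returns the gateway
and device that belong to that address, which is the restriction the
rules are meant to express.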

	You can check https://ja.ssi.bg/dgd-usage.txt for such
examples; see under "2. Alternative routes and dead gateway detection".

> In above scenario, the process does __ip_route_output_key() twice in
> ip_route_connect(), the two processes have chosen different default gateway,
> and the last choice is not the best.
> 
> Add flowi4->saddr and fib_nh_common->nhc_gw.ipv4 matching consideration in
> fib_select_default() to fix that.

	Other setups may not have such a restriction; they can
prefer any gateway in a reachable state, no matter the saddr.
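
	To illustrate the concern, here is a rough Python model of
saddr-based gateway preference. It models only the apparent intent
of the quoted hunk (prefer the gateway sharing the longest leading
prefix with saddr), not the exact bit arithmetic of __ffs() on a
network-order value. The names common_prefix_len() and
pick_by_saddr() are hypothetical, and reachability is not modeled.

```python
import ipaddress

def common_prefix_len(a, b):
    """Number of leading bits shared by two IPv4 addresses (0..32)."""
    x = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - x.bit_length()

def pick_by_saddr(saddr, gateways):
    """Pick the gateway with the longest common prefix; first one wins ties."""
    best_gw, best_len = None, -1
    for gw in gateways:
        n = common_prefix_len(saddr, gw)
        if n > best_len:
            best_gw, best_len = gw, n
    return best_gw

gateways = ["192.168.71.1", "192.168.72.1"]

# A source on one of the two subnets maps to the gateway on that subnet.
print(pick_by_saddr("192.168.72.41", gateways))

# A source unrelated to both subnets shares no leading bits with either
# gateway, so the result depends only on list order.
print(pick_by_saddr("10.0.0.5", gateways))
```

For a source address outside both gateway subnets, such matching says
nothing useful about which gateway is preferable, which is where
reachability-based selection matters.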

> Fixes: 19baf839ff4a ("[IPV4]: Add LC-Trie FIB lookup algorithm.")
> Signed-off-by: Ziyang Xuan <william.xuanziyang@...wei.com>
> ---
>  net/ipv4/fib_semantics.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
> index e9a7f70a54df..8bd94875a009 100644
> --- a/net/ipv4/fib_semantics.c
> +++ b/net/ipv4/fib_semantics.c
> @@ -2046,6 +2046,7 @@ static void fib_select_default(const struct flowi4 *flp, struct fib_result *res)
>  	int order = -1, last_idx = -1;
>  	struct fib_alias *fa, *fa1 = NULL;
>  	u32 last_prio = res->fi->fib_priority;
> +	u8 prefix, max_prefix = 0;
>  	dscp_t last_dscp = 0;
>  
>  	hlist_for_each_entry_rcu(fa, fa_head, fa_list) {
> @@ -2078,6 +2079,11 @@ static void fib_select_default(const struct flowi4 *flp, struct fib_result *res)
>  		if (!nhc->nhc_gw_family || nhc->nhc_scope != RT_SCOPE_LINK)
>  			continue;
>  
> +		prefix = __ffs(flp->saddr ^ nhc->nhc_gw.ipv4);
> +		if (prefix < max_prefix)
> +			continue;
> +		max_prefix = max_t(u8, prefix, max_prefix);
> +
>  		fib_alias_accessed(fa);
>  
>  		if (!fi) {
> -- 
> 2.25.1

Regards

--
Julian Anastasov <ja@....bg>
