Message-ID: <1515691504.1318.9.camel@cohaesio.com>
Date:   Thu, 11 Jan 2018 17:25:16 +0000
From:   "Anders K. Pedersen | Cohaesio" <akp@...aesio.com>
To:     "weiwan@...gle.com" <weiwan@...gle.com>
CC:     "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "netfilter-devel@...r.kernel.org" <netfilter-devel@...r.kernel.org>
Subject: Re: [bisected] Forwarded packets occasionally has loopback output
 interface in Netfilter

On Tue, 2017-12-26 at 12:05 +0100, Anders K. Pedersen | Cohaesio wrote:
> Hello,
> 
> On one of our border routers, Netfilter is occasionally logging packets
> with "OUT=lo" (output interface lo) even though the packet should be
> going out via a regular interface. This behavior is present on Linux
> 4.13.0 to 4.14.9, and a bisection of the problem points to
> 
> [95c47f9cf5e028d1ae77dc6c767c1edc8a18025b] ipv4: call dst_dev_put() properly
> 
> as the first bad commit. This commit adds dst_dev_put() calls before
> some dst_release() calls, and dst_dev_put() does
> 
>     dst->dev = dev_net(dst->dev)->loopback_dev;
> 
> (among other things), which fits the problem we're seeing.
> 
> The essential part of our nftables rule set that shows this behavior is
> 
> chain forward {
>                 type filter hook forward priority 0;
> 
>                 meta oif { $internal_interfaces } accept
> 
>                 meta oif lo ip daddr != 127.0.0.0/8 \
>                         log group 0 snaplen 80 prefix "oif-lo" counter
> 
>                 ip saddr { $our_ip_series } \
>                         flow table acct_out \
>                         { meta oif . rt nexthop . ip saddr timeout 12m counter } \
>                         accept
> 
>                 log group 0 snaplen 80 prefix "DROP" counter drop
> }
> 
> The router only does stateless packet filtering and no redirection or
> rewriting of the packets (connection tracking, NAT, ipvs etc. are not
> even compiled for this kernel).
> 
> As a result of this problem we see packets that should be going to an
> internal interface (and thus accepted by the first rule above) being
> logged and dropped by the last rule. Some examples:
> 
> Dec 22 11:57:02 cix4 oif-lo IN=eth10 OUT=lo MAC=90:e2:ba:5c:b6:95:10:f3:11:38:06:77:08:00 SRC=81.170.163.118 DST=212.97.158.33 LEN=1500 TOS=00 PREC=0x00 TTL=116 ID=25932 DF PROTO=TCP SPT=35118 DPT=8443 SEQ=604358330 ACK=1182278705 WINDOW=3295 ACK URGP=0 MARK=0
> Dec 22 11:57:02 cix4 DROP IN=eth10 OUT=lo MAC=90:e2:ba:5c:b6:95:10:f3:11:38:06:77:08:00 SRC=81.170.163.118 DST=212.97.158.33 LEN=1500 TOS=00 PREC=0x00 TTL=116 ID=25932 DF PROTO=TCP SPT=35118 DPT=8443 SEQ=604358330 ACK=1182278705 WINDOW=3295 ACK URGP=0 MARK=0
> 
> Dec 22 12:47:07 cix4 oif-lo IN=eth10 OUT=lo MAC=90:e2:ba:5c:b6:95:0e:86:10:27:99:f3:08:00 SRC=40.101.30.18 DST=212.97.130.32 LEN=245 TOS=00 PREC=0x00 TTL=118 ID=10370 DF PROTO=TCP SPT=443 DPT=44988 SEQ=1141545913 ACK=3844573103 WINDOW=65535 ACK PSH URGP=0 MARK=0
> Dec 22 12:47:07 cix4 DROP IN=eth10 OUT=lo MAC=90:e2:ba:5c:b6:95:0e:86:10:27:99:f3:08:00 SRC=40.101.30.18 DST=212.97.130.32 LEN=245 TOS=00 PREC=0x00 TTL=118 ID=10370 DF PROTO=TCP SPT=443 DPT=44988 SEQ=1141545913 ACK=3844573103 WINDOW=65535 ACK PSH URGP=0 MARK=0
> 
> Dec 22 12:53:56 cix4 oif-lo IN=eth10 OUT=lo MAC=90:e2:ba:5c:b6:95:0e:86:10:27:99:f3:08:00 SRC=40.101.12.34 DST=212.97.130.32 LEN=245 TOS=00 PREC=0x00 TTL=115 ID=27728 DF PROTO=TCP SPT=443 DPT=39724 SEQ=3797156404 ACK=3944234612 WINDOW=65535 ACK PSH URGP=0 MARK=0
> Dec 22 12:53:56 cix4 DROP IN=eth10 OUT=lo MAC=90:e2:ba:5c:b6:95:0e:86:10:27:99:f3:08:00 SRC=40.101.12.34 DST=212.97.130.32 LEN=245 TOS=00 PREC=0x00 TTL=115 ID=27728 DF PROTO=TCP SPT=443 DPT=39724 SEQ=3797156404 ACK=3944234612 WINDOW=65535 ACK PSH URGP=0 MARK=0
> 
> It also happens for outbound traffic, where the packets are logged and
> counted in the acct_out flow table with "meta oif" = "lo", but a
> correct "rt nexthop" - an example:
> 
> Dec 22 12:29:13 cix4 oif-lo IN=team0.20 OUT=lo MAC=3c:fd:fe:15:db:a8:00:24:a8:ff:f0:00:08:00 SRC=212.97.129.25 DST=95.166.119.129 LEN=40 TOS=00 PREC=0x00 TTL=62 ID=19481 DF PROTO=TCP SPT=443 DPT=52560 SEQ=3034827396 ACK=2862814901 WINDOW=12618 ACK URGP=0 MARK=0
> 
> # nft list flow table filter acct_out|tr ',' '\n'|grep lo
>         flow table acct_out {
>  "lo" . 94.101.208.217 . 212.97.129.25 expires 3m17s : counter packets 1 bytes 40
> 
> I don't know if these packets are actually sent out on the correct
> outbound interface thanks to the proper nexthop (the MAC= information
> in the Netfilter log is from the received packet and thus not useful
> here).
> 
> I tried running a tcpdump on the lo interface to see if these packets
> would show up there, but during the three days I had it running, it
> only logged one such packet, while Netfilter logs 20+ outbound packets
> every day, and the one packet logged by tcpdump was *not* logged by
> Netfilter.

Further testing of the individual parts of the first bad commit shows
that the first five additions of the dst_dev_put() call don't trigger
the problem, while the last one does (even without the first five), so
the problematic part is:

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 3dee004..d986d80 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1369,6 +1372,7 @@ static bool rt_cache_route(struct fib_nh *nh, struct rtable *rt)
        prev = cmpxchg(p, orig, rt);
        if (prev == orig) {
                if (orig) {
+                       dst_dev_put(&orig->dst);
                        dst_release(&orig->dst);
                        rt_free(orig);
                }
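
To make the suspected mechanism concrete, here is a minimal userspace
sketch of one way the hunk above could lead to OUT=lo. It is only a
model under an unverified assumption: that a packet still in flight
holds a reference to the old cached route at the moment
rt_cache_route() replaces it. All model_* names are hypothetical
stand-ins, not kernel structures or functions, and the only part of
dst_dev_put() modelled is the one assignment quoted earlier.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-ins; NOT the kernel's net_device / dst_entry. */
struct model_dev {
	const char *name;
};

struct model_dst {
	struct model_dev *dev;	/* what "meta oif" / OUT= would report */
	int refcnt;
};

static struct model_dev model_loopback = { "lo" };

/* Models only the quoted line: dst->dev = ...->loopback_dev; */
static void model_dst_dev_put(struct model_dst *dst)
{
	dst->dev = &model_loopback;
}

/* Drops one reference and returns the number remaining. */
static int model_dst_release(struct model_dst *dst)
{
	return --dst->refcnt;
}

/* Output interface name as a forward-hook rule would read it. */
static const char *model_oif_name(const struct model_dst *dst)
{
	return dst->dev->name;
}

/*
 * Replays the assumed sequence and returns the interface name the
 * in-flight packet sees after the cache slot has been replaced.
 */
static const char *model_replay(void)
{
	static struct model_dev eth = { "eth10" };
	struct model_dst orig = { &eth, 1 };	/* one reference: the cache */

	orig.refcnt++;		/* assumed: in-flight packet takes a reference */
	assert(strcmp(model_oif_name(&orig), "eth10") == 0);

	model_dst_dev_put(&orig);		/* the added call */
	assert(model_dst_release(&orig) == 1);	/* cache ref gone, packet ref left */

	return model_oif_name(&orig);
}
```

In this model, model_replay() returns "lo": the route is still
referenced by the packet, but its device has already been repointed at
loopback. Whether the real kernel path behaves this way is exactly what
I have not been able to confirm.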

Any help with fixing this would be much appreciated. If there's
anything I should try to debug this further, please let me know.

Regards,
Anders K. Pedersen
