netdev - Re: [PATCH net-next] netfilter: conntrack: avoid sending RST to reply out-of-window skb

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20240307141025.GL4420@breakpoint.cc>
Date: Thu, 7 Mar 2024 15:10:25 +0100
From: Florian Westphal <fw@...len.de>
To: Jason Xing <kerneljasonxing@...il.com>
Cc: Florian Westphal <fw@...len.de>, edumazet@...gle.com,
	pablo@...filter.org, kadlec@...filter.org, kuba@...nel.org,
	pabeni@...hat.com, davem@...emloft.net,
	netfilter-devel@...r.kernel.org, coreteam@...filter.org,
	netdev@...r.kernel.org, Jason Xing <kernelxing@...cent.com>
Subject: Re: [PATCH net-next] netfilter: conntrack: avoid sending RST to
 reply out-of-window skb

Jason Xing <kerneljasonxing@...il.com> wrote:
> On Thu, Mar 7, 2024 at 8:00 PM Florian Westphal <fw@...len.de> wrote:
> >
> > Jason Xing <kerneljasonxing@...il.com> wrote:
> > > > This change disables most of the tcp_in_window() test, this will
> > > > pretend everything is fine even though tcp_in_window says otherwise.
> > >
> > > Thanks for the information. It does make sense.
> > >
> > > What I've done is quite similar to nf_conntrack_tcp_be_liberal sysctl
> > > knob which you also pointed out. It also pretends to ignore those
> > > out-of-window skbs.
> > >
> > > >
> > > > You could:
> > > >  - drop invalid tcp packets in input hook
> > >
> > > How about changing the return value only as below? Only two cases will
> > > be handled:
> > >
> > > diff --git a/net/netfilter/nf_conntrack_proto_tcp.c
> > > b/net/netfilter/nf_conntrack_proto_tcp.c
> > > index ae493599a3ef..c88ce4cd041e 100644
> > > --- a/net/netfilter/nf_conntrack_proto_tcp.c
> > > +++ b/net/netfilter/nf_conntrack_proto_tcp.c
> > > @@ -1259,7 +1259,7 @@ int nf_conntrack_tcp_packet(struct nf_conn *ct,
> > >         case NFCT_TCP_INVALID:
> > >                 nf_tcp_handle_invalid(ct, dir, index, skb, state);
> > >                 spin_unlock_bh(&ct->lock);
> > > -               return -NF_ACCEPT;
> > > +               return -NF_DROP;
> >
> > Lets not do this.  conntrack should never drop packets and defer to ruleset
> > whereever possible.
> 
> Hmm, sorry, it is against my understanding.
> 
> If we cannot return -NF_DROP, why have we already added some 'return
> NF_DROP' in the nf_conntrack_handle_packet() function? And why does
> this test statement exist?

Sure we can drop.  But we should only do it if there is no better
alternative.

> nf_conntrack_in()
>   -> nf_conntrack_handle_packet()
>   -> if (ret <= 0) {
>          if (ret == -NF_DROP) NF_CT_STAT_INC_ATOMIC(state->net, drop);

AFAICS this only happens when we receive syn for an existing conntrack
that is being removed already so we'd expect next syn to create a new
connection.  Feel free to send patches that replace drop with -accept
where possible/where it makes sense, but I don't think the
TCP_CONNTRACK_SYN_SENT one can reasonably be avoided.

> My only purpose is not to let the TCP layer sending strange RST to the
> right flow.

AFAIU tcp layer is correct, no?  Out of the blue packet to some listener
socket?

> Besides, resorting to turning on nf_conntrack_tcp_be_liberal sysctl
> knob seems odd to me though it can workaround :S

I don't see a better alternative, other than -p tcp -m conntrack
--ctstate INVALID -j DROP rule, if you wish for tcp stack to not see
such packets.

> I would like to prevent sending such an RST as default behaviour.

I don't see a way to make this work out of the box, without possible
unwanted side effects.

MAYBE we could drop IFF we check that the conntrack entry candidate
that fails sequence validation has NAT translation applied to it, and
thus the '-NF_ACCEPT' packet won't be translated.

Not even compile tested:

diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -1256,10 +1256,14 @@ int nf_conntrack_tcp_packet(struct nf_conn *ct,
        case NFCT_TCP_IGNORE:
                spin_unlock_bh(&ct->lock);
                return NF_ACCEPT;
-       case NFCT_TCP_INVALID:
+       case NFCT_TCP_INVALID: {
+               verdict = -NF_ACCEPT;
+               if (ct->status & IPS_NAT_MASK)
+                       res = NF_DROP; /* skb would miss nat transformation */
                nf_tcp_handle_invalid(ct, dir, index, skb, state);
                spin_unlock_bh(&ct->lock);
-               return -NF_ACCEPT;
+               return verdict;
+       }
        case NFCT_TCP_ACCEPT:
                break;
        }

But I don't really see the advantage compared to doing drop decision in
iptables/nftables ruleset.

I also have a hunch that someone will eventually complain about this
change in behavior.