[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <55a00270-6750-ca72-f20c-4be84afa0c3b@blackhole.kfki.hu>
Date: Thu, 7 Mar 2024 20:00:43 +0100 (CET)
From: Jozsef Kadlecsik <kadlec@...ckhole.kfki.hu>
To: Jason Xing <kerneljasonxing@...il.com>
cc: Florian Westphal <fw@...len.de>, edumazet@...gle.com,
Pablo Neira Ayuso <pablo@...filter.org>, kuba@...nel.org,
pabeni@...hat.com, David Miller <davem@...emloft.net>,
netfilter-devel@...r.kernel.org, coreteam@...filter.org,
netdev@...r.kernel.org, Jason Xing <kernelxing@...cent.com>
Subject: Re: [PATCH net-next] netfilter: conntrack: avoid sending RST to
reply out-of-window skb
On Thu, 7 Mar 2024, Jason Xing wrote:
> > > Allow me to finish the full sentence: my only purpose is not to let
> > > the TCP layer send strange RST to the _established_ socket due to
> > > receiving strange out-of-window skbs.
> >
> > I don't understand why do you want to modify conntrack at all: conntrack
> > itself does not send RST packets. And the TCP layer don't send RST packets
> > to out of window ones either.
>
> To normal TCP flow, you're right because the TCP layer doesn't send RST
> to out-of-window skbs.
>
> But the DNAT policy on the server should have converted the port of
> incoming skb from A_port to B_port as my description in this patch said.
>
> It actually doesn't happen. The conntrack clears the skb->_nfct value
> after returning -NF_ACCEPT in nf_conntrack_tcp_packet() and then DNAT
> would not convert the A_port to B_port.
The packet is INVALID therefore it's not NATed. I don't think that could
simply be changed in netfilter.
> So the TCP layer is not able to look up the correct socket (client_ip,
> client_port, server_ip, B_port) because A_port doesn't match B_port,
> then an RST would be sent to the client.
Yes, if you let the packet continue to traverse the stack.
> > The only possibility I see for such packets is an iptables/nftables rule
> > which rejects packets classified as INVALID by conntrack.
> >
> > As Florian suggested, why don't you change that rule?
>
> As I said, just for the workaround method, only turning on that sysctl
> knob can solve the issue.
Sorry, but why? If you drop the packet by a rule, then there's no need to
use the sysctl setting and there's no unwanted RST packets.
Best regards,
Jozsef
> > The conntrack states are not fine-grained to express different TCP states
> > which covered with INVALID. It was never a good idea to reject INVALID
> > packets or let them through (leaking internal addresses).
> >
> > Best regards,
> > Jozsef
> >
> > > > > Besides, resorting to turning on nf_conntrack_tcp_be_liberal sysctl
> > > > > knob seems odd to me though it can workaround :S
> > > >
> > > > I don't see a better alternative, other than -p tcp -m conntrack
> > > > --ctstate INVALID -j DROP rule, if you wish for tcp stack to not see
> > > > such packets.
> > > >
> > > > > I would like to prevent sending such an RST as default behaviour.
> > > >
> > > > I don't see a way to make this work out of the box, without possible
> > > > unwanted side effects.
> > > >
> > > > MAYBE we could drop IFF we check that the conntrack entry candidate
> > > > that fails sequence validation has NAT translation applied to it, and
> > > > thus the '-NF_ACCEPT' packet won't be translated.
> > > >
> > > > Not even compile tested:
> > > >
> > > > diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
> > > > --- a/net/netfilter/nf_conntrack_proto_tcp.c
> > > > +++ b/net/netfilter/nf_conntrack_proto_tcp.c
> > > > @@ -1256,10 +1256,14 @@ int nf_conntrack_tcp_packet(struct nf_conn *ct,
> > > > case NFCT_TCP_IGNORE:
> > > > spin_unlock_bh(&ct->lock);
> > > > return NF_ACCEPT;
> > > > - case NFCT_TCP_INVALID:
> > > > + case NFCT_TCP_INVALID: {
> > > > + verdict = -NF_ACCEPT;
> > > > + if (ct->status & IPS_NAT_MASK)
> > > > + res = NF_DROP; /* skb would miss nat transformation */
> > >
> > > Above line, I guess, should be 'verdict = NF_DROP'? Then this skb
> > > would be dropped in nf_hook_slow() eventually and would not be passed
> > > to the TCP layer.
> > >
> > > > nf_tcp_handle_invalid(ct, dir, index, skb, state);
> > > > spin_unlock_bh(&ct->lock);
> > > > - return -NF_ACCEPT;
> > > > + return verdict;
> > > > + }
> > > > case NFCT_TCP_ACCEPT:
> > > > break;
> > > > }
> > >
> > > Great! I think your draft patch makes sense really, which takes NAT
> > > into consideration.
> > >
> > > >
> > > > But I don't really see the advantage compared to doing drop decision in
> > > > iptables/nftables ruleset.
> > >
> > > From our views, especially to kernel developers, you're right: we
> > > could easily turn on that knob or add a drop policy to prevent it
> > > happening. Actually I did this in production to prevent such a case.
> > > It surely works.
> > >
> > > But from the views of normal users and those who do not understand how
> > > it works in the kernel, it looks strange: people may ask why we get
> > > some unknown RSTs in flight?
> > >
> > > > I also have a hunch that someone will eventually complain about this
> > > > change in behavior.
> > >
> > > Well, I still think the patch you suggested is proper and don't know
> > > why people could complain about it.
> > >
> > > Thanks for your patience :)
> > >
> > > Thanks,
> > > Jason
> > >
> >
> > --
> > E-mail : kadlec@...ckhole.kfki.hu, kadlecsik.jozsef@...ner.hu
> > PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt
> > Address : Wigner Research Centre for Physics
> > H-1525 Budapest 114, POB. 49, Hungary
>
--
E-mail : kadlec@...ckhole.kfki.hu, kadlecsik.jozsef@...ner.hu
PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics
H-1525 Budapest 114, POB. 49, Hungary
Powered by blists - more mailing lists