Message-ID: <58b14269-4d81-8939-020e-c33ed70df483@ssi.bg>
Date: Wed, 6 Dec 2023 14:33:47 +0200 (EET)
From: Julian Anastasov <ja@....bg>
To: Lev Pantiukhin <kndrvt@...dex-team.ru>
cc: mitradir@...dex-team.ru, Simon Horman <horms@...ge.net.au>,
linux-kernel <linux-kernel@...r.kernel.org>, netdev@...r.kernel.org,
lvs-devel@...r.kernel.org, netfilter-devel@...r.kernel.org,
coreteam@...filter.org
Subject: Re: [PATCH] ipvs: add a stateless type of service and a stateless
Maglev hashing scheduler
Hello,
On Mon, 4 Dec 2023, Lev Pantiukhin wrote:
> +#define IP_VS_SVC_F_STATELESS 0x0040 /* stateless scheduling */
I have another idea for traffic that does not need
per-client state. We still need some per-dest cp to forward the packet.
If we replace the cp->caddr usage with iph->saddr/daddr usage we can try
it. cp->caddr is used in the following places:
- tcp_snat_handler (iph->daddr), tcp_dnat_handler (iph->saddr): iph is
already provided. tcp_snat_handler requires IP_VS_SVC_F_STATELESS
to be set for a service with a vaddr present, i.e. non-fwmark based.
So, NAT+svc->fwmark is another restriction for IP_VS_SVC_F_STATELESS
because we do not know what VIP to use as saddr for outgoing traffic.
- ip_vs_nfct_expect_related
- we should investigate any problems when IP_VS_CONN_F_NFCT
is set; probably we cannot work with NFCT?
- ip_vs_conn_drop_conntrack
- FTP:
- sets IP_VS_CONN_F_NFCT, uses cp->app
Maybe IP_VS_CONN_F_NFCT should be a restriction for
IP_VS_SVC_F_STATELESS mode? cp->app for sure, because we keep TCP
seq/ack state for the app in cp->in_seq/out_seq.
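The list above can be illustrated with a small userspace sketch. It is not kernel code: the sketch_* types and functions are hypothetical stand-ins, and only the field names caddr, saddr and daddr come from the discussion. It shows the client address being taken from the IP header instead of cp->caddr, and a check for the restrictions mentioned above. Treating NFCT as a hard restriction is an assumption, since the text only raises it as a question.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct sketch_iphdr {
	uint32_t saddr;
	uint32_t daddr;
};

struct sketch_conn {
	uint32_t caddr;   /* client address, only valid for stateful conns */
	bool stateless;   /* cp is shared by all clients of one dest */
};

enum sketch_dir {
	SKETCH_DIR_IN,    /* client -> real server (DNAT path) */
	SKETCH_DIR_OUT    /* real server -> client (SNAT path) */
};

/*
 * Pick the client address. A shared stateless cp has no meaningful
 * caddr, so the address is read from the packet: the source address
 * on the way in, the destination address on the way out.
 */
uint32_t sketch_client_addr(const struct sketch_conn *cp,
			    const struct sketch_iphdr *iph,
			    enum sketch_dir dir)
{
	if (!cp->stateless)
		return cp->caddr;
	return dir == SKETCH_DIR_IN ? iph->saddr : iph->daddr;
}

/* Decide whether a service could be allowed to run in stateless mode. */
bool sketch_stateless_allowed(bool is_nat, bool is_fwmark,
			      bool has_app, bool uses_nfct)
{
	if (is_nat && is_fwmark)
		return false; /* no VIP known to use as saddr for replies */
	if (has_app)
		return false; /* app helpers keep per-connection seq/ack state */
	if (uses_nfct)
		return false; /* assumption: open question in the discussion */
	return true;
}
```

With this shape the NAT handlers would never need to read cp->caddr for a stateless service, which is what makes sharing one cp between clients possible.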
We can keep some dest->cp_route (or another name) that will
hold our cp for such connections. The idea is not to allocate a cp for
every packet but to reuse this saved cp. It has all the info needed to
forward the skb to the real server. The first packet will create it and
save it, with some locking, into dest; the next packets will reuse it.
Probably, it should be a ONE_PACKET entry (not hashed in the table) but
it can have a running timer, if needed. One refcnt for attaching to dest,
a new temp refcnt for every packet. But in this mode __ip_vs_conn_put_timer
uses a 0-second timer, so we have to handle that somehow. The cp should be
released when dest is removed and on edit_dest if needed.
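The refcounting scheme described above can be modelled in userspace as follows. This is only a sketch under stated assumptions: all sketch_* names are hypothetical, only the cp_route field name comes from the text, a C11 atomic_flag spinlock stands in for whatever locking the kernel code would use, and the 0-second timer issue is not modelled at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the kernel's ip_vs_conn / ip_vs_dest. */
struct sketch_conn {
	atomic_int refcnt;
};

struct sketch_dest {
	atomic_flag lock;              /* protects cp_route */
	struct sketch_conn *cp_route;  /* cached cp shared by all packets */
};

atomic_int sketch_conns_alive;         /* demo-only allocation counter */

void sketch_dest_init(struct sketch_dest *dest)
{
	atomic_flag_clear(&dest->lock);
	dest->cp_route = NULL;
}

/* Drop one reference; free the cp when the last one goes away. */
void sketch_conn_put(struct sketch_conn *cp)
{
	int prev = atomic_fetch_sub(&cp->refcnt, 1);

	assert(prev > 0);
	if (prev == 1) {
		free(cp);
		atomic_fetch_sub(&sketch_conns_alive, 1);
	}
}

/*
 * Per-packet lookup: the first packet creates the cp and attaches it
 * to dest (one reference held by dest); every caller also gets its own
 * temporary reference, to be dropped with sketch_conn_put().
 */
struct sketch_conn *sketch_dest_get_cp(struct sketch_dest *dest)
{
	struct sketch_conn *cp;

	while (atomic_flag_test_and_set(&dest->lock))
		; /* spin */
	cp = dest->cp_route;
	if (!cp) {
		cp = malloc(sizeof(*cp));
		if (cp) {
			atomic_init(&cp->refcnt, 1); /* dest's reference */
			dest->cp_route = cp;
			atomic_fetch_add(&sketch_conns_alive, 1);
		}
	}
	if (cp)
		atomic_fetch_add(&cp->refcnt, 1);    /* packet's reference */
	atomic_flag_clear(&dest->lock);
	return cp;
}

/* Called when dest is removed (or edited): detach and drop dest's ref. */
void sketch_dest_release_cp(struct sketch_dest *dest)
{
	struct sketch_conn *cp;

	while (atomic_flag_test_and_set(&dest->lock))
		; /* spin */
	cp = dest->cp_route;
	dest->cp_route = NULL;
	atomic_flag_clear(&dest->lock);
	if (cp)
		sketch_conn_put(cp);
}
```

Because the temporary reference is taken while the lock is held, a packet that is still in flight keeps the cp alive even if the dest is removed in the meantime; the cp is freed only when the last holder calls sketch_conn_put().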
There are other problems to solve, such as set_tcp_state()
changing dest->activeconns and dest->inactconns. They are also used
in ip_vs_bind_dest() and ip_vs_unbind_dest(). As we do not keep previous
connection state, and as a conn can start in established state, we should
avoid touching these counters. For UDP, ONE_PACKET has no such problem
with states, but for TCP/SCTP we should take care.
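The counter problem can be pictured with a simplified userspace model. The sketch_* names are hypothetical, only activeconns and inactconns come from the text, and the transition logic is a reduced approximation of what a TCP state handler does, not the kernel's actual code. The point is only the early return for a shared stateless cp.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the per-dest counters and the conn. */
struct sketch_dest {
	int activeconns;
	int inactconns;
};

struct sketch_conn {
	struct sketch_dest *dest;
	bool stateless;  /* shared cp, no per-client state is kept */
	bool inactive;   /* currently counted in inactconns */
};

/*
 * Simplified state change: move the conn between the active and
 * inactive counters of its dest. A stateless cp is skipped entirely,
 * because there is no previous state to compare with and a flow may
 * be first seen when it is already established.
 */
void sketch_set_state(struct sketch_conn *cp, bool established)
{
	struct sketch_dest *dest = cp->dest;

	if (cp->stateless || !dest)
		return;

	if (cp->inactive && established) {
		dest->activeconns++;
		dest->inactconns--;
		cp->inactive = false;
	} else if (!cp->inactive && !established) {
		dest->activeconns--;
		dest->inactconns++;
		cp->inactive = true;
	}
}
```

The same guard would be needed wherever the counters are adjusted when a conn is bound to or unbound from a dest, otherwise the shared cp would skew the numbers the schedulers rely on.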
Regards
--
Julian Anastasov <ja@....bg>