Message-ID: <CADvbK_eeJVoWps8UrygEfNdXL76Q2XMoNOoELWHFqOTq2634cA@mail.gmail.com>
Date: Tue, 22 Jun 2021 23:48:33 -0400
From: Xin Long <lucien.xin@...il.com>
To: David Laight <David.Laight@...lab.com>
Cc: network dev <netdev@...r.kernel.org>,
"davem@...emloft.net" <davem@...emloft.net>,
"kuba@...nel.org" <kuba@...nel.org>,
Marcelo Ricardo Leitner <marcelo.leitner@...il.com>,
"linux-sctp@...r.kernel.org" <linux-sctp@...r.kernel.org>
Subject: Re: [PATCHv2 net-next 00/14] sctp: implement RFC8899: Packetization
Layer Path MTU Discovery for SCTP transport
On Tue, Jun 22, 2021 at 9:09 PM Xin Long <lucien.xin@...il.com> wrote:
>
> On Tue, Jun 22, 2021 at 6:13 PM David Laight <David.Laight@...lab.com> wrote:
> >
> > From: Xin Long
> > > Sent: 22 June 2021 19:05
> > >
> > > Overview(From RFC8899):
> > >
> > > In contrast to PMTUD, Packetization Layer Path MTU Discovery
> > > (PLPMTUD) [RFC4821] introduces a method that does not rely upon
> > > reception and validation of PTB messages. It is therefore more
> > > robust than Classical PMTUD. This has become the recommended
> > > approach for implementing discovery of the PMTU [BCP145].
> > >
> > > It uses a general strategy in which the PL sends probe packets to
> > > search for the largest size of unfragmented datagram that can be sent
> > > over a network path. Probe packets are sent to explore using a
> > > larger packet size. If a probe packet is successfully delivered (as
> > > determined by the PL), then the PLPMTU is raised to the size of the
> > > successful probe. If a black hole is detected (e.g., where packets
> > > of size PLPMTU are consistently not received), the method reduces the
> > > PLPMTU.
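(For reference, the search strategy described in that paragraph boils down to
something like the toy loop below. This is only a user-space sketch of the
RFC's idea, not the kernel code; probe_acked() is a hypothetical stand-in for
however the PL confirms delivery of a probe.)

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for "the PL confirmed delivery of this probe size". */
static bool probe_acked(unsigned int size)
{
        return size <= 1460;    /* pretend the path carries up to 1460 bytes */
}

int main(void)
{
        unsigned int plpmtu = 1200;     /* conservative starting size */
        unsigned int probe = plpmtu;
        unsigned int step = 32;

        /* Probe ever larger sizes; raise the PLPMTU on each confirmed probe. */
        while (probe_acked(probe + step)) {
                probe += step;
                plpmtu = probe;
        }

        /* If probes of size plpmtu later go unanswered (a black hole),
         * the method reduces plpmtu again -- not modelled here. */
        printf("PLPMTU: %u\n", plpmtu);
        return 0;
}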
> >
> > This seems to take a long time (probably well over a minute)
> > to determine the mtu.
> I just noticed this was a misreading of RFC8899 on my side: the next probe
> packet should be sent immediately once the ACK of the last probe is
> received, instead of waiting for the timeout, which should only be used for
> a missing probe.
>
> I will fix this with:
>
> diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
> index d29b579da904..f3aca1acf93a 100644
> --- a/net/sctp/sm_statefuns.c
> +++ b/net/sctp/sm_statefuns.c
> @@ -1275,6 +1275,8 @@ enum sctp_disposition sctp_sf_backbeat_8_3(struct net *net,
>                          return SCTP_DISPOSITION_DISCARD;
>
>                  sctp_transport_pl_recv(link);
> +                sctp_add_cmd_sf(commands, SCTP_CMD_PROBE_TIMER_UPDATE,
> +                                SCTP_TRANSPORT(link));
>                  return SCTP_DISPOSITION_CONSUME;
>          }
>
> diff --git a/net/sctp/transport.c b/net/sctp/transport.c
> index f27b856ea8ce..88815b98d9d0 100644
> --- a/net/sctp/transport.c
> +++ b/net/sctp/transport.c
> @@ -215,6 +215,11 @@ void sctp_transport_reset_probe_timer(struct sctp_transport *transport)
>  {
>          int scale = 1;
>
> +        if (transport->pl.probe_count == 0) {
> +                if (!mod_timer(&transport->probe_timer, jiffies + transport->rto))
> +                        sctp_transport_hold(transport);
> +                return;
> +        }
>          if (timer_pending(&transport->probe_timer))
>                  return;
>          if (transport->pl.state == SCTP_PL_COMPLETE &&
>
> Thanks for the comment.
A more efficient improvement:
diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c
index d29b579da904..d5cb0124bafa 100644
--- a/net/sctp/sm_statefuns.c
+++ b/net/sctp/sm_statefuns.c
@@ -1275,7 +1275,13 @@ enum sctp_disposition sctp_sf_backbeat_8_3(struct net *net,
                         return SCTP_DISPOSITION_DISCARD;

                 sctp_transport_pl_recv(link);
-                return SCTP_DISPOSITION_CONSUME;
+
+                if (link->pl.state == SCTP_PL_COMPLETE) {
+                        sctp_add_cmd_sf(commands, SCTP_CMD_PROBE_TIMER_UPDATE,
+                                        SCTP_TRANSPORT(link));
+                        return SCTP_DISPOSITION_CONSUME;
+                }
+                return sctp_sf_send_probe(net, ep, asoc, type, link, commands);
         }

         max_interval = link->hbinterval + link->rto;
diff --git a/net/sctp/transport.c b/net/sctp/transport.c
index f27b856ea8ce..37f65f617f05 100644
--- a/net/sctp/transport.c
+++ b/net/sctp/transport.c
@@ -215,10 +215,8 @@ void sctp_transport_reset_probe_timer(struct sctp_transport *transport)
 {
         int scale = 1;

-        if (timer_pending(&transport->probe_timer))
-                return;
         if (transport->pl.state == SCTP_PL_COMPLETE &&
-            transport->pl.probe_count == 1)
+            transport->pl.probe_count == 0)
                 scale = 30;     /* works as PMTU_RAISE_TIMER */
         if (!mod_timer(&transport->probe_timer,
                        jiffies + transport->probe_interval * scale))
[103] pl_send: PLPMTUD: state: 1, size: 1200, high: 0 <--[a]
[103] pl_recv: PLPMTUD: state: 1, size: 1200, high: 0
[103] pl_send: PLPMTUD: state: 2, size: 1232, high: 0
[103] pl_recv: PLPMTUD: state: 2, size: 1232, high: 0
[103] pl_send: PLPMTUD: state: 2, size: 1264, high: 0
[103] pl_recv: PLPMTUD: state: 2, size: 1264, high: 0
[103] pl_send: PLPMTUD: state: 2, size: 1296, high: 0
[103] pl_recv: PLPMTUD: state: 2, size: 1296, high: 0
[103] pl_send: PLPMTUD: state: 2, size: 1328, high: 0
[103] pl_recv: PLPMTUD: state: 2, size: 1328, high: 0
[103] pl_send: PLPMTUD: state: 2, size: 1360, high: 0
[103] pl_recv: PLPMTUD: state: 2, size: 1360, high: 0
[103] pl_send: PLPMTUD: state: 2, size: 1392, high: 0
[103] pl_recv: PLPMTUD: state: 2, size: 1392, high: 0
[103] pl_send: PLPMTUD: state: 2, size: 1424, high: 0
[103] pl_recv: PLPMTUD: state: 2, size: 1424, high: 0
[103] pl_send: PLPMTUD: state: 2, size: 1456, high: 0
[103] pl_recv: PLPMTUD: state: 2, size: 1456, high: 0 <--[b]
[103] pl_send: PLPMTUD: state: 2, size: 1488, high: 0
[108] pl_send: PLPMTUD: state: 2, size: 1488, high: 0
[113] pl_send: PLPMTUD: state: 2, size: 1488, high: 0
[118] pl_send: PLPMTUD: state: 2, size: 1488, high: 0
[118] pl_recv: PLPMTUD: state: 2, size: 1456, high: 1488 <---[c]
[118] pl_send: PLPMTUD: state: 2, size: 1460, high: 1488
[118] pl_recv: PLPMTUD: state: 2, size: 1460, high: 1488 <--- [d]
[118] pl_send: PLPMTUD: state: 2, size: 1464, high: 1488
[124] pl_send: PLPMTUD: state: 2, size: 1464, high: 1488
[129] pl_send: PLPMTUD: state: 2, size: 1464, high: 1488
[134] pl_send: PLPMTUD: state: 2, size: 1464, high: 1488
[134] pl_recv: PLPMTUD: state: 2, size: 1460, high: 1464 <-- around 30s "search complete from 1200 bytes"
[287] pl_send: PLPMTUD: state: 3, size: 1460, high: 0
[287] pl_recv: PLPMTUD: state: 3, size: 1460, high: 0
[287] pl_send: PLPMTUD: state: 2, size: 1464, high: 0 <-- [aa]
[292] pl_send: PLPMTUD: state: 2, size: 1464, high: 0
[298] pl_send: PLPMTUD: state: 2, size: 1464, high: 0
[303] pl_send: PLPMTUD: state: 2, size: 1464, high: 0
[303] pl_recv: PLPMTUD: state: 2, size: 1460, high: 1464 <--[bb] <-- around 15s "re-search complete from current pmtu"
So since there is no interval before sending the next probe once the ACK
for the last one is received, it takes very little time to go from [a] to
[b] and from [c] to [d]. There are at most 2 probe-size failures before the
right pmtu is found, and each failure takes 5s * 3 = 15s.
When it goes back to Search from Search Complete after the long raise
timeout, it takes only 1 failure to get the right pmtu, from [aa] to [bb].
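For the numbers behind "around 30s" and "around 15s", a back-of-the-envelope
sketch; the 5s probe interval and 3 probes per failed size are taken from the
trace above, not quoted from the code:

#include <stdio.h>

/* Successful probes cost ~0s because the next probe is sent as soon as
 * the ACK for the previous one arrives, so only failed sizes wait out
 * the probe timer. */
static unsigned int search_time(unsigned int failed_sizes,
                                unsigned int probes_per_size,
                                unsigned int probe_interval)
{
        return failed_sizes * probes_per_size * probe_interval;
}

int main(void)
{
        /* initial search from 1200: at most 2 failed sizes (1488 and 1464) */
        printf("initial search: ~%us\n", search_time(2, 3, 5));
        /* re-search from the current pmtu: only 1 failed size (1464) */
        printf("re-search:      ~%us\n", search_time(1, 3, 5));
        return 0;
}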
Thanks.
> >
> > What is used for the actual mtu while this is in progress?
> >
> > Does packet loss and packet retransmission cause the mtu
> > to be reduced as well?
> No, the data packet is not a probe in this implementation.
>
> >
> > I can imagine that there is an expectation (from the application)
> > that the mtu is that of an ethernet link - perhaps less a PPPoE
> > header.
> > Starting with an mtu of 1200 will break this assumption and may
> > have odd side effects.
> The search starts from an mtu of 1200, but the real pmtu is only updated
> when the search is done and the optimal mtu is found.
> So at the beginning, it will still use the dst mtu as before.
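(In code terms the idea is roughly the toy model below; the names are
illustrative, not the patch's:)

#include <stdio.h>

/* Toy model, not kernel code: data keeps using the MTU learned from
 * the route until the probing state machine reports "complete". */
enum pl_state { PL_DISABLED, PL_BASE, PL_SEARCH, PL_COMPLETE };

struct toy_transport {
        enum pl_state state;
        unsigned int dst_mtu;   /* MTU learned from the route */
        unsigned int pl_pmtu;   /* largest probe size verified so far */
};

static unsigned int data_mtu(const struct toy_transport *t)
{
        return t->state == PL_COMPLETE ? t->pl_pmtu : t->dst_mtu;
}

int main(void)
{
        struct toy_transport t = { PL_SEARCH, 1500, 1200 };

        printf("during search: %u\n", data_mtu(&t));    /* still 1500 */
        t.state = PL_COMPLETE;
        t.pl_pmtu = 1460;
        printf("after search:  %u\n", data_mtu(&t));    /* now 1460 */
        return 0;
}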
>
> > For TCP/UDP the ICMP segmentation required error is immediate
> > and gets used for the retransmissions.
> > This code seems to be looking at separate timeouts - so a lot of
> > packets could get discarded and application timers expire before
> > it determines the correct mtu.
> This patch also processes the ICMP error msg and gets the 'mtu' size from
> it, but before using it, it will verify (probe) it first:
>
> see Patch: sctp: do state transition when receiving an icmp TOOBIG packet
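(Again only a toy model with illustrative names, not the patch itself: the
ICMP value is treated as a hint for the next probe size, and the pmtu is
only changed after a probe of that size is actually acknowledged.)

#include <stdio.h>

struct toy_pl {
        unsigned int pmtu;              /* currently verified pmtu */
        unsigned int probe_size;        /* next size to probe */
};

/* For simplicity, only handle hints that fall inside the current
 * search window; the real state transitions are more involved. */
static void icmp_toobig_hint(struct toy_pl *pl, unsigned int reported)
{
        if (reported <= pl->pmtu || reported >= pl->probe_size)
                return;
        pl->probe_size = reported;      /* probe the hinted size next */
}

static void probe_acked(struct toy_pl *pl)
{
        pl->pmtu = pl->probe_size;      /* only now does the pmtu change */
}

int main(void)
{
        struct toy_pl pl = { 1200, 1488 };

        icmp_toobig_hint(&pl, 1460);    /* router reports mtu 1460 */
        printf("probe next: %u, pmtu still: %u\n", pl.probe_size, pl.pmtu);
        probe_acked(&pl);
        printf("pmtu after verified probe: %u\n", pl.pmtu);
        return 0;
}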
>
> >
> > Maybe I missed something about this only being done on inactive
> > paths?
> >
> > David
> >
> > -
> > Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> > Registration No: 1397386 (Wales)
> >