[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <5700e27c-2d82-4a7a-8275-740140b3b279@auristor.com>
Date: Fri, 6 Aug 2021 09:45:25 -0400
From: Jeffrey E Altman <jaltman@...istor.com>
To: "David Howells (dhowells@...hat.com)" <dhowells@...hat.com>,
Marc Dionne <marc.dionne@...istor.com>
Cc: Benjamin Kaduk <kaduk@....edu>, linux-afs@...ts.infradead.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC][PATCH] rxrpc: Support reception of extended-SACK ACK packet
David,
Thanks for working on this.
For the benefit of other readers. The RxRPC protocol is not
standardized by the IETF
or any other standards organization. The best description of the
protocol is present in
the OpenAFS source repository (doc/txt/rx-spec.txt). Proposed
updates to this
specification are available in OpenAFS Gerrit
doc: rx-spec Update for accuracy with current Rx implementations
https://gerrit.openafs.org/#/c/14692
doc: rx-spec Document the Extended SACK Table protocol extension
https://gerrit.openafs.org/#/c/14693
On 8/6/2021 6:08 AM, David Howells (dhowells@...hat.com) wrote:
> The RxRPC ACK packet supports selective ACK of up to 255 DATA packets. It
> contains a variable length array with one octet allocated for each DATA
> packet to be ACK'd. Each octet is either 0 or 1 depending on whether it is
> a negative or positive ACK. 7 bits in each octet are effectively unused
> and, further, there are three reserved octets following the ACK array that
> are all set to 0.
The proposed EXTENDED_SACK ACK packet format assigns usage for all three
of the
unused octets. The first is used to extend the SACK table from 255
octets to 256 octets.
This octet will be zero unless it is actively being used to represent a
positive
acknowledgement of a DATA packet.
The second unused octet is designated as the <Trailer Count> field.
The three
unused octets are the result of a math error in the mid-90s when IBM
extended
the ACK packet format with a series of 32-bit fields. Over time
additional fields
were added but there has never been a clear method for the ACK packet
receiver
to determine how many fields were sent. The EXTENDED_SACK proposal
addresses
this oversight with the <Trailer Count> field which should be set to the
number of
32-bit trailer fields. At present this value is 4, not 0. This field
will permit
additional trailers to be defined in the future.
The third unused octet is designated as the <Extra SACK Count> field.
This field
designates how many additional variable length SACK tables are present
following
the designated number of trailer fields. Each additional SACK table
can represent
up to 2048 additional DATA packets.
>
> To extend the ACK window up to 2048 ACKs, it is proposed[1]:
>
> (1) that the ACKs for DATA packets first+0...first+254 in the Rx window
> are in bit 0 of the octets in the array, ie. acks[0...254], pretty
> much as now; and
>
> (2) that if the ACK count is >=256, the first reserved byte after the ACK
> table is annexed to the ACK table as acks[255] and contains the ACK
> for packet first+255 in bit 0; and
>
> (3) that if the ACK count is >256, horizontal striping be employed such
> that the ACK for packet first+256 in the window is then in bit 1 of
> acks[0], first+257 is in bit 1 of acks[1], up to first+511 being in
> bit 1 of the borrowed reserved byte (ie. acks[255]).
>
> first+512 is then in bit 2 of acks[0], going all the way up to
> first+2048 being in bit 7 of acks[255].
I think this should say "first + 2047 in bit 7 of acks[255]."
> If extended SACK is employed in an ACK packet, it should have EXTENDED-SACK
> (0x08) set in the RxRPC packet header.
The EXTENDED_SACK format proposal also addresses a historical deficiency
in the
specification and usage of the <previousPacket> field.
https://gerrit.openafs.org/#/c/14692
adds version specific details to rx-spec.txt but in summary, the
<previousPacket> field
is unreliable at present and cannot be used in any meaningful way.
Various Rx
implementations (including OpenAFS) attempt to use it to filter
out-of-sequence
ACK packets but this is unreliable because the value sent by many Rx
peers can move
backwards, or represent a sequence number that was out of range, or even
not be a
sequence number at all. When the EXTENDED_SACK flag is set the
<previousPacket>
field is defined to be the largest DATA packet sequence number accepted
by the
sender of the ACK packet. As such the <previousPacket> field can
reliably be
used for two purposes:
1. <previousPacket> can be used to detect out-of-sequence ACK packets
2. <previousPacket> - <firstPacket> + 1 represents the maximum number
of SACK table bits that contain valid acknowledgements.
Regardless of
how many SACK tables are present in the ACK packet.
> Alter rxrpc_input_ack() to sanity check the ACK count.
>
> Alter rxrpc_input_ack() to limit the number of bytes it extracts from the
> packet for the ack array to 256.
>
> Alter rxrpc_input_soft_acks() to handle an extended SACK table.
EXTENDED_SACK is still a proposal awaiting review.
What has been agreed upon at this point in time is that the acks[] array
be interpreted
as bit fields. I suggest that the first patch to rxrpc implement the
equivalent change
to OpenAFS
rx: compare RX_ACK_TYPE_ACK as a bit-field
https://gerrit.openafs.org/#/c/14465/4
I also suggest that rxrpc be updated if necessary to always set
<previousPacket> as
described by the EXTENDED_SACK proposal regardless of whether or not the
EXTENDED_SACK proposal is implemented. Existing OpenAFS Rx peers expect
that behavior when receiving ACK packets even though they do not the field
according to those expectations.
Thanks again for taking the time to review and implement the proposal.
Jeffrey Altman
Download attachment "smime.p7s" of type "application/pkcs7-signature" (4033 bytes)
Powered by blists - more mailing lists