netdev - Re: [PATCHv4 net-next 3/3] ipv4: Create probe timer for tcp PMTU as per RFC4821

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Wed, 4 Mar 2015 08:26:52 -0500
From:	John Heffner <johnwheffner@...il.com>
To:	Fan Du <fengyuleidian0615@...il.com>
Cc:	Fan Du <fan.du@...el.com>, Eric Dumazet <edumazet@...gle.com>,
	David Miller <davem@...emloft.net>,
	Netdev <netdev@...r.kernel.org>
Subject: Re: [PATCHv4 net-next 3/3] ipv4: Create probe timer for tcp PMTU as
 per RFC4821

On Tue, Mar 3, 2015 at 9:11 PM, Fan Du <fengyuleidian0615@...il.com> wrote:
> 于 2015年03月04日 00:39, John Heffner 写道:
>
>> On Tue, Mar 3, 2015 at 4:19 AM, Fan Du <fan.du@...el.com> wrote:
>>>
>>> As per RFC4821 7.3.  Selecting Probe Size, a probe timer should
>>> be armed once probing has converged. Once this timer expired,
>>> probing again to take advantage of any path PMTU change. The
>>> recommended probing interval is 10 minutes per RFC1981. Probing
>>> interval could be sysctled by sysctl_tcp_probe_interval.
>>>
>>> Eric Dumazet suggested to implement pseudo timer based on 32bits
>>> jiffies tcp_time_stamp instead of using classic timer for such
>>> rare event.
>>>
>>> Signed-off-by: Fan Du <fan.du@...el.com>
>>
>>
>>> @@ -1823,6 +1825,31 @@ send_now:
>>>          return false;
>>>   }
>>>
>>> +static inline void tcp_mtu_check_reprobe(struct sock *sk)
>>> +{
>>> +       struct inet_connection_sock *icsk = inet_csk(sk);
>>> +       struct tcp_sock *tp = tcp_sk(sk);
>>> +       struct net *net = sock_net(sk);
>>> +       u32 interval;
>>> +       s32 delta;
>>> +
>>> +       interval = net->ipv4.sysctl_tcp_probe_interval;
>>> +       delta = tcp_time_stamp - icsk->icsk_mtup.probe_timestamp;
>>> +       if (unlikely(delta >= interval * HZ)) {
>>> +               int mss = tcp_current_mss(sk);
>>> +
>>> +               /* Update current search range */
>>> +               icsk->icsk_mtup.search_high = tp->rx_opt.mss_clamp +
>>> +                       sizeof(struct tcphdr) +
>>> +                       icsk->icsk_af_ops->net_header_len;
>>> +               icsk->icsk_mtup.search_low = tcp_mss_to_mtu(sk, mss);
>>> +               icsk->icsk_mtup.probe_size = 0;
>>> +
>>> +               /* Update probe time stamp */
>>> +               icsk->icsk_mtup.probe_timestamp = tcp_time_stamp;
>>> +       }
>>> +}
>>> +
>>>   /* Create a new MTU probe if we are ready.
>>>    * MTU probe is regularly attempting to increase the path MTU by
>>>    * deliberately sending larger packets.  This discovers routing
>>
>>
>> I think the only update to the search range required here is
>> potentially moving search_high upward.  Touching search_low seems
>> unnecessary,
>
> No.
>
> If you restoring search_low to its original, the first probe will probably
> be no better than using current mss, because current mss is working good,
> there may be better mss between mtu(current mss) and search_high. Reprobing
> from search_low to search_high is definitely not worth the effort.

Set search_high to the maximal MSS.  The next probe will be halfway
between search_low and MSS.  There's no reason to reduce search_low
when the current MSS == search_low.


>>> @@ -1866,8 +1893,11 @@ static int tcp_mtu_probe(struct sock *sk)
>>>          size_needed = probe_size + (tp->reordering + 1) * tp->mss_cache;
>>>          interval = icsk->icsk_mtup.search_high -
>>> icsk->icsk_mtup.search_low;
>>>          if (probe_size > tcp_mtu_to_mss(sk, icsk->icsk_mtup.search_high)
>>> ||
>>> -           interval < max(1, net->ipv4.sysctl_tcp_probe_threshold)) {
>>> -               /* TODO: set timer for probe_converge_event */
>>> +               interval < net->ipv4.sysctl_tcp_probe_threshold) {
>>> +               /* Check whether enough time has elaplased for
>>> +                * another round of probing.
>>> +                */
>>> +               tcp_mtu_check_reprobe(sk);
>>>                  return -1;
>>>          }
>>
>>
>> The way this check works, I think putting it here may not be exactly
>> what we want.  The comment was to set a timer here, but that's not
>> exactly what tcp_mtu_check_reprobe does.  Since it may update the
>> search range, I think it would be better to call prior to comparing
>> the candidate probe_size to search_high.
>
> Semantically speaking, reprobe checking should be placed inside where
> a decision to not probe has been made, reprobing from a stable state ;)
> If we moving the reporobe checking outside, current probing may be legal,
> but the reprobe check will rule it out.
>
> So I don't see any compelling reason to move the reprobe checking outside.

The reprobe check should happen when you know you don't already have a
pending probe, not when a decision has already been made not to probe.
The impact is relatively minor because practically you're just
extending the duration of a long timer a bit more, but there's not a
good reason to do so.

Thanks,
  -John
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html