lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 05 Mar 2015 10:36:50 +0800
From:	Fan Du <fengyuleidian0615@...il.com>
To:	John Heffner <johnwheffner@...il.com>
CC:	Fan Du <fan.du@...el.com>, Eric Dumazet <edumazet@...gle.com>,
	David Miller <davem@...emloft.net>,
	Netdev <netdev@...r.kernel.org>
Subject: Re: [PATCHv4 net-next 3/3] ipv4: Create probe timer for tcp PMTU
 as per RFC4821

于 2015年03月04日 21:26, John Heffner 写道:
> On Tue, Mar 3, 2015 at 9:11 PM, Fan Du <fengyuleidian0615@...il.com> wrote:
>> 于 2015年03月04日 00:39, John Heffner 写道:
>>
>>> On Tue, Mar 3, 2015 at 4:19 AM, Fan Du <fan.du@...el.com> wrote:
>>>>
>>>> As per RFC4821 7.3.  Selecting Probe Size, a probe timer should
>>>> be armed once probing has converged. Once this timer expired,
>>>> probing again to take advantage of any path PMTU change. The
>>>> recommended probing interval is 10 minutes per RFC1981. Probing
>>>> interval could be sysctled by sysctl_tcp_probe_interval.
>>>>
>>>> Eric Dumazet suggested to implement pseudo timer based on 32bits
>>>> jiffies tcp_time_stamp instead of using classic timer for such
>>>> rare event.
>>>>
>>>> Signed-off-by: Fan Du <fan.du@...el.com>
>>>
>>>
>>>> @@ -1823,6 +1825,31 @@ send_now:
>>>>           return false;
>>>>    }
>>>>
>>>> +static inline void tcp_mtu_check_reprobe(struct sock *sk)
>>>> +{
>>>> +       struct inet_connection_sock *icsk = inet_csk(sk);
>>>> +       struct tcp_sock *tp = tcp_sk(sk);
>>>> +       struct net *net = sock_net(sk);
>>>> +       u32 interval;
>>>> +       s32 delta;
>>>> +
>>>> +       interval = net->ipv4.sysctl_tcp_probe_interval;
>>>> +       delta = tcp_time_stamp - icsk->icsk_mtup.probe_timestamp;
>>>> +       if (unlikely(delta >= interval * HZ)) {
>>>> +               int mss = tcp_current_mss(sk);
>>>> +
>>>> +               /* Update current search range */
>>>> +               icsk->icsk_mtup.search_high = tp->rx_opt.mss_clamp +
>>>> +                       sizeof(struct tcphdr) +
>>>> +                       icsk->icsk_af_ops->net_header_len;
>>>> +               icsk->icsk_mtup.search_low = tcp_mss_to_mtu(sk, mss);
>>>> +               icsk->icsk_mtup.probe_size = 0;
>>>> +
>>>> +               /* Update probe time stamp */
>>>> +               icsk->icsk_mtup.probe_timestamp = tcp_time_stamp;
>>>> +       }
>>>> +}
>>>> +
>>>>    /* Create a new MTU probe if we are ready.
>>>>     * MTU probe is regularly attempting to increase the path MTU by
>>>>     * deliberately sending larger packets.  This discovers routing
>>>
>>>
>>> I think the only update to the search range required here is
>>> potentially moving search_high upward.  Touching search_low seems
>>> unnecessary,
>>
>> No.
>>
>> If you restoring search_low to its original, the first probe will probably
>> be no better than using current mss, because current mss is working good,
>> there may be better mss between mtu(current mss) and search_high. Reprobing
>> from search_low to search_high is definitely not worth the effort.
>
> Set search_high to the maximal MSS.  The next probe will be halfway
> between search_low and MSS.  There's no reason to reduce search_low
> when the current MSS == search_low.

I think you mis-interpret the code and my intention here, put it in a simple way:
Search range is 512 ~ 1500, if current mss is 1024, then for another reprobing,
Should we start from A or B?
A: 1024 ~ 1500
B: 512  ~ 1500

The modification choose searching from A, because more optimal mss lies between
current mss and search_high.

In case of current mss == search_low, the modification also cover this scenario.
Again reprobing from current mss to search high.

>
>>>> @@ -1866,8 +1893,11 @@ static int tcp_mtu_probe(struct sock *sk)
>>>>           size_needed = probe_size + (tp->reordering + 1) * tp->mss_cache;
>>>>           interval = icsk->icsk_mtup.search_high -
>>>> icsk->icsk_mtup.search_low;
>>>>           if (probe_size > tcp_mtu_to_mss(sk, icsk->icsk_mtup.search_high)
>>>> ||
>>>> -           interval < max(1, net->ipv4.sysctl_tcp_probe_threshold)) {
>>>> -               /* TODO: set timer for probe_converge_event */
>>>> +               interval < net->ipv4.sysctl_tcp_probe_threshold) {
>>>> +               /* Check whether enough time has elaplased for
>>>> +                * another round of probing.
>>>> +                */
>>>> +               tcp_mtu_check_reprobe(sk);
>>>>                   return -1;
>>>>           }
>>>
>>>
>>> The way this check works, I think putting it here may not be exactly
>>> what we want.  The comment was to set a timer here, but that's not
>>> exactly what tcp_mtu_check_reprobe does.  Since it may update the
>>> search range, I think it would be better to call prior to comparing
>>> the candidate probe_size to search_high.
>>
>> Semantically speaking, reprobe checking should be placed inside where
>> a decision to not probe has been made, reprobing from a stable state ;)
>> If we moving the reporobe checking outside, current probing may be legal,
>> but the reprobe check will rule it out.
>>
>> So I don't see any compelling reason to move the reprobe checking outside.
>
> The reprobe check should happen when you know you don't already have a
> pending probe, not when a decision has already been made not to probe.
> The impact is relatively minor because practically you're just
> extending the duration of a long timer a bit more, but there's not a
> good reason to do so.
When a reprobe check tell us enough time has elapsed, it's time for another probe
attempt, but *IF *we have already been probing at the time being due to misfortunes,
and we are approaching to the optimal mss no longer in distance than reprobing
from the very beginning. Why bother ignoring current probe effort?

Restarting the reprobe in exactly time interval sounds logical, but not so in
scenario where a *active* probing is going on.

> Thanks,
>    -John
>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists