[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALx6S344y8ze7Tp_yQ8yRW5YW2Vs6aY18HpKyE1L9Vq2VP9Yyg@mail.gmail.com>
Date: Tue, 2 Aug 2016 18:59:05 -0700
From: Tom Herbert <tom@...bertland.com>
To: Feng Gao <gfree.wind@...il.com>
Cc: Gao Feng <fgao@...ai8.com>,
"David S. Miller" <davem@...emloft.net>,
Philp Prindeville <philipp@...fish-solutions.com>,
Stephen Hemminger <stephen@...workplumber.org>,
Pravin B Shelar <pshelar@...ira.com>,
Alex Duyck <aduyck@...antis.com>,
Linux Kernel Network Developers <netdev@...r.kernel.org>
Subject: Re: [PATCH 1/1] rps: Inspect PPTP encapsulated by GRE to get flow hash
On Tue, Aug 2, 2016 at 5:28 PM, Feng Gao <gfree.wind@...il.com> wrote:
> Hi Tom.
>
> I quote some description in RFC 2637.
>
> /*********************************************************************/
> C
> (Bit 0) Checksum Present. Set to zero (0).
>
Note that the requirement is "*Set* to zero"; it is not must check to
be zero on reception. Per the robustness principle (part about be
liberal in what you receive) it is allowable to process the flag if
the bit is set set. In practice this is nice to have because we know
that overloading fields in GRE (seq, csum in particular) with private
data has been done before. Consolidating makes for cleaner code, and
if someone is really violating the protocol the worst thing that would
happen it that they would just get suboptimal RPS for those packets.
> R
> (Bit 1) Routing Present. Set to zero (0).
>
> K
> (Bit 2) Key Present. Set to one (1).
> S
> (Bit 3) Sequence Number Present. Set to one (1) if a payload
> (data) packet is present. Set to zero (0) if payload is not
> present (GRE packet is an Acknowledgment only).
>
> s
> (Bit 4) Strict source route present. Set to zero (0).
>
> Recur
> (Bits 5-7) Recursion control. Set to zero (0).
>
> A
> (Bit 8) Acknowledgment sequence number present. Set to one (1) if
> packet contains Acknowledgment Number to be used for acknowledging
> previously transmitted data.
>
> Flags
> (Bits 9-12) Must be set to zero (0).
>
> Ver
> (Bits 13-15) Must contain 1 (enhanced GRE).
>
> Protocol Type
> Set to hex 880B [8].
>
> Key
> Use of the Key field is up to the implementation. PPTP uses it as
> follows:
> Payload Length
> (High 2 octets of Key) Size of the payload, not including
> the GRE header
>
> Call ID
> (Low 2 octets) Contains the Peer's Call ID for the session
> to which this packet belongs.
> /*********************************************************************/
>
> Just a reminder, the version description is "contain 1", not "set 1".
>
I think your misinterpreting that (at least I hope :-) ). "Contains 1"
should be that the contents of the Ver field is the value of "1".
Otherwise, the RFC has appropriated half of the eight possible
versions to be enhanced GRE.
Tom
> Best Regards
> Feng
>
> On Wed, Aug 3, 2016 at 4:35 AM, Tom Herbert <tom@...bertland.com> wrote:
>> On Thu, Jul 28, 2016 at 12:14 AM, <fgao@...ai8.com> wrote:
>>> From: Gao Feng <fgao@...ai8.com>
>>>
>>> The PPTP is encapsulated by GRE header with that GRE_VERSION bits
>>> must contain one. But current GRE RPS needs the GRE_VERSION must be
>>> zero. So RPS does not work for PPTP traffic.
>>>
>>> In my test environment, there are four MIPS cores, and all traffic
>>> are passed through by PPTP. As a result, only one core is 100% busy
>>> while other three cores are very idle. After this patch, the usage
>>> of four cores are balanced well.
>>>
>>> Signed-off-by: Gao Feng <fgao@...ai8.com>
>>> ---
>>> v2: Update according to Tom and Philp's advice.
>>> 1) Consolidate the codes with GRE version 0 path;
>>> 2) Use PPP_PROTOCOL to get ppp protol;
>>> 3) Set the FLOW_DIS_ENCAPSULATION flag;
>>> v1: Initial patch
>>>
>>> include/uapi/linux/if_tunnel.h | 5 +-
>>> net/core/flow_dissector.c | 146 ++++++++++++++++++++++++++---------------
>>> 2 files changed, 97 insertions(+), 54 deletions(-)
>>>
>>> diff --git a/include/uapi/linux/if_tunnel.h b/include/uapi/linux/if_tunnel.h
>>> index 1046f55..dda4e4b 100644
>>> --- a/include/uapi/linux/if_tunnel.h
>>> +++ b/include/uapi/linux/if_tunnel.h
>>> @@ -24,9 +24,12 @@
>>> #define GRE_SEQ __cpu_to_be16(0x1000)
>>> #define GRE_STRICT __cpu_to_be16(0x0800)
>>> #define GRE_REC __cpu_to_be16(0x0700)
>>> -#define GRE_FLAGS __cpu_to_be16(0x00F8)
>>> +#define GRE_ACK __cpu_to_be16(0x0080)
>>> +#define GRE_FLAGS __cpu_to_be16(0x0078)
>>> #define GRE_VERSION __cpu_to_be16(0x0007)
>>>
>>> +#define GRE_PROTO_PPP __cpu_to_be16(0x880b)
>>> +
>>> struct ip_tunnel_parm {
>>> char name[IFNAMSIZ];
>>> int link;
>>> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
>>> index 61ad43f..33e957b 100644
>>> --- a/net/core/flow_dissector.c
>>> +++ b/net/core/flow_dissector.c
>>> @@ -346,63 +346,103 @@ ip_proto_again:
>>> hdr = __skb_header_pointer(skb, nhoff, sizeof(_hdr), data, hlen, &_hdr);
>>> if (!hdr)
>>> goto out_bad;
>>> - /*
>>> - * Only look inside GRE if version zero and no
>>> - * routing
>>> - */
>>> - if (hdr->flags & (GRE_VERSION | GRE_ROUTING))
>>> - break;
>>> -
>>> - proto = hdr->proto;
>>> - nhoff += 4;
>>> - if (hdr->flags & GRE_CSUM)
>>> - nhoff += 4;
>>> - if (hdr->flags & GRE_KEY) {
>>> - const __be32 *keyid;
>>> - __be32 _keyid;
>>> -
>>> - keyid = __skb_header_pointer(skb, nhoff, sizeof(_keyid),
>>> - data, hlen, &_keyid);
>>>
>>> - if (!keyid)
>>> - goto out_bad;
>>> + /* Only look inside GRE without routing */
>>> + if (!(hdr->flags & GRE_ROUTING)) {
>>> + proto = hdr->proto;
>>> +
>>> + if (hdr->flags & GRE_VERSION) {
>>
>> This matches any non-zero GRE version (1-7). Is that really what you want?
>>
>>> + /* It should be the PPTP in GRE */
>>> + u8 _ppp_hdr[PPP_HDRLEN];
>>> + u8 *ppp_hdr;
>>> + int offset = 0;
>>> +
>>> + /* Check the flags according to RFC 2637*/
>>> + if (!(proto == GRE_PROTO_PPP && (hdr->flags & GRE_KEY) &&
>>> + !(hdr->flags & (GRE_CSUM | GRE_STRICT | GRE_REC | GRE_FLAGS)))) {
>>> + break;
>>> + }
>>> +
>> I might be missing something, but I only see two material differences
>> between v0 and v1 of GRE. First is that keyid in v1 has specific use
>> that is not appropriate to use as entropy. Second is how the payload
>> is parsed. All the flag processing is common so it stills seems like
>> this should be consolidated.
>>
>>> + /* Skip GRE header */
>>> + offset += 4;
>>> + /* Skip payload length and call id */
>>> + offset += 4;
>>> +
>>> + if (hdr->flags & GRE_SEQ)
>>> + offset += 4;
>>> +
>>> + if (hdr->flags & GRE_ACK)
>>> + offset += 4;
>>> +
>>> + ppp_hdr = skb_header_pointer(skb, nhoff + offset, sizeof(_ppp_hdr), _ppp_hdr);
>>> + if (!ppp_hdr)
>>> + goto out_bad;
>>> + proto = PPP_PROTOCOL(ppp_hdr);
>>> + if (proto == PPP_IP) {
>>> + nhoff += (PPP_HDRLEN + offset);
>>> + proto = htons(ETH_P_IP);
>>> + key_control->flags |= FLOW_DIS_ENCAPSULATION;
>>> + goto again;
>>> + } else if (proto == PPP_IPV6) {
>>> + nhoff += (PPP_HDRLEN + offset);
>>> + proto = htons(ETH_P_IPV6);
>>> + key_control->flags |= FLOW_DIS_ENCAPSULATION;
>>> + goto again;
>>> + }
>>> + } else {
>>> + /* Original GRE */
>>> + nhoff += 4;
>>> + if (hdr->flags & GRE_CSUM)
>>> + nhoff += 4;
>>> + if (hdr->flags & GRE_KEY) {
>>> + const __be32 *keyid;
>>> + __be32 _keyid;
>>> +
>>> + keyid = __skb_header_pointer(skb, nhoff, sizeof(_keyid),
>>> + data, hlen, &_keyid);
>>> +
>>> + if (!keyid)
>>> + goto out_bad;
>>> +
>>> + if (dissector_uses_key(flow_dissector,
>>> + FLOW_DISSECTOR_KEY_GRE_KEYID)) {
>>> + key_keyid = skb_flow_dissector_target(flow_dissector,
>>> + FLOW_DISSECTOR_KEY_GRE_KEYID,
>>> + target_container);
>>> + key_keyid->keyid = *keyid;
>>> + }
>>> + nhoff += 4;
>>> + }
>>> + if (hdr->flags & GRE_SEQ)
>>> + nhoff += 4;
>>> + if (proto == htons(ETH_P_TEB)) {
>>> + const struct ethhdr *eth;
>>> + struct ethhdr _eth;
>>> +
>>> + eth = __skb_header_pointer(skb, nhoff,
>>> + sizeof(_eth),
>>> + data, hlen, &_eth);
>>> + if (!eth)
>>> + goto out_bad;
>>> + proto = eth->h_proto;
>>> + nhoff += sizeof(*eth);
>>> +
>>> + /* Cap headers that we access via pointers at the
>>> + * end of the Ethernet header as our maximum alignment
>>> + * at that point is only 2 bytes.
>>> + */
>>> + if (NET_IP_ALIGN)
>>> + hlen = nhoff;
>>> + }
>>> +
>>> + key_control->flags |= FLOW_DIS_ENCAPSULATION;
>>> + if (flags & FLOW_DISSECTOR_F_STOP_AT_ENCAP)
>>> + goto out_good;
>>>
>>> - if (dissector_uses_key(flow_dissector,
>>> - FLOW_DISSECTOR_KEY_GRE_KEYID)) {
>>> - key_keyid = skb_flow_dissector_target(flow_dissector,
>>> - FLOW_DISSECTOR_KEY_GRE_KEYID,
>>> - target_container);
>>> - key_keyid->keyid = *keyid;
>>> + goto again;
>>> }
>>> - nhoff += 4;
>>> - }
>>> - if (hdr->flags & GRE_SEQ)
>>> - nhoff += 4;
>>> - if (proto == htons(ETH_P_TEB)) {
>>> - const struct ethhdr *eth;
>>> - struct ethhdr _eth;
>>> -
>>> - eth = __skb_header_pointer(skb, nhoff,
>>> - sizeof(_eth),
>>> - data, hlen, &_eth);
>>> - if (!eth)
>>> - goto out_bad;
>>> - proto = eth->h_proto;
>>> - nhoff += sizeof(*eth);
>>> -
>>> - /* Cap headers that we access via pointers at the
>>> - * end of the Ethernet header as our maximum alignment
>>> - * at that point is only 2 bytes.
>>> - */
>>> - if (NET_IP_ALIGN)
>>> - hlen = nhoff;
>>> }
>>> -
>>> - key_control->flags |= FLOW_DIS_ENCAPSULATION;
>>> - if (flags & FLOW_DISSECTOR_F_STOP_AT_ENCAP)
>>> - goto out_good;
>>> -
>>> - goto again;
>>> + break;
>>> }
>>> case NEXTHDR_HOP:
>>> case NEXTHDR_ROUTING:
>>> --
>>> 1.9.1
>>>
Powered by blists - more mailing lists