Message-ID: <4A8AE76D.7040707@iki.fi>
Date:	Tue, 18 Aug 2009 20:39:57 +0300
From:	Timo Teräs <timo.teras@....fi>
To:	Patrick McHardy <kaber@...sh.net>
CC:	netfilter-devel@...r.kernel.org, netdev@...r.kernel.org
Subject: Re: bad nat connection tracking performance with ip_gre

Patrick McHardy wrote:
> Timo Teräs wrote:
>> Looped back by multicast routing:
>>
>> raw:PREROUTING:policy:1 IN=eth1 OUT= MAC= SRC=10.252.5.1
>> DST=239.255.12.42 LEN=1344 TOS=0x00 PREC=0x00 TTL=8 ID=36594 DF
>> PROTO=UDP SPT=33977 DPT=1234 LEN=1324
>>
>> mangle:PREROUTING:policy:1 IN=eth1 OUT= MAC= SRC=10.252.5.1
>> DST=239.255.12.42 LEN=1344 TOS=0x00 PREC=0x00 TTL=8 ID=36594 DF
>> PROTO=UDP SPT=33977 DPT=1234 LEN=1324
>>
>> The CPU hogging happens somewhere below this, since the more
>> multicast destinations I have, the more CPU it takes.
> 
> So you're sending to multiple destinations? That obviously increases
> the time spent in netfilter and the remaining networking stack.

Yes. But my observation was that, for the same number of packets,
CPU usage is significantly higher when they are sent locally than
when they are forwarded from a physical interface. That's what made
me curious.

If I had remembered that ICMP conntrack entries get pruned as soon
as the ICMP reply comes back, I probably would not have bothered to
bug you. But that made me think it was a more generic problem rather
than something in my patch.
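
To spell out the rule I had forgotten, here is a toy user-space
sketch (all names and timeouts are made up for illustration; this is
not the actual conntrack code): an ICMP entry is dropped as soon as
the reply direction is seen, while other protocols just get their
timer refreshed and live on until the timeout expires.

/* Toy model, not kernel code: ICMP conntrack entries are killed as
 * soon as the reply direction is seen; other protocols only have
 * their timer refreshed. */
#include <stdbool.h>
#include <stdio.h>

enum dir { DIR_ORIGINAL, DIR_REPLY };

struct toy_ct {
        bool is_icmp;
        bool dead;
        int timeout;            /* seconds left; decremented elsewhere */
};

static void toy_ct_seen(struct toy_ct *ct, enum dir d)
{
        if (ct->is_icmp && d == DIR_REPLY) {
                ct->dead = true;        /* pruned right away on the reply */
                return;
        }
        ct->timeout = 30;               /* everyone else is only refreshed */
}

int main(void)
{
        struct toy_ct icmp = { .is_icmp = true,  .timeout = 30 };
        struct toy_ct udp  = { .is_icmp = false, .timeout = 30 };

        toy_ct_seen(&icmp, DIR_REPLY);
        toy_ct_seen(&udp, DIR_REPLY);
        printf("icmp dead=%d, udp dead=%d\n", icmp.dead, udp.dead);
        return 0;
}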

>> Multicast forwarded (I hacked this into the code; but similar
>> dump happens on local sendto()):
>>
>> Actually, now that I think about it, at this point we should have
>> the inner IP contents, not the incomplete outer header yet. So
>> apparently ipgre_header() messes up the network_header position.
> 
> It shouldn't even have been called at this point. Please retry this
> without your changes.

I patched ipmr.c to explicitly call dev_hard_header() to set up the
ipgre NBMA receiver. Sadly, the call was on the wrong side of the
nf_hook. Adjusting that makes the forward hooks look OK.

I thought the hook was using network_header to figure out where the
IP header is, but it looks like that isn't the case.
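
In sketch form, the ordering I ended up with is roughly this (plain
user-space stand-ins, not the real ipmr.c/ip_gre functions; the okfn
callback only mimics how the hook hands the packet on to the output
path): run the FORWARD hook on the inner IP packet first, and push
the GRE link header only afterwards.

/* Simplified stand-in code, not the kernel's: the point is only the
 * ordering -- the FORWARD hook must see the inner IP packet, and the
 * tunnel header gets pushed in the output callback afterwards. */
#include <stdio.h>

struct pkt {
        const char *headers;    /* what the start of the packet looks like */
};

static void push_gre_header(struct pkt *p)
{
        /* roughly what the dev_hard_header() call ends up doing */
        p->headers = "outer IP proto 47 (GRE) + inner IP";
}

static int output_finish(struct pkt *p)
{
        push_gre_header(p);     /* moved here, after the hook has run */
        printf("xmit: %s\n", p->headers);
        return 0;
}

static int forward_hook(struct pkt *p, int (*okfn)(struct pkt *))
{
        /* the hook should see the inner IP (PROTO=UDP), not PROTO=47 */
        printf("FORWARD hook sees: %s\n", p->headers);
        return okfn(p);
}

int main(void)
{
        struct pkt p = { .headers = "inner IP (PROTO=UDP)" };
        return forward_hook(&p, output_finish);
}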

>> mangle:FORWARD:policy:1 IN=eth1 OUT=gre1 SRC=0.0.0.0 DST=re.mo.te.ip
>> LEN=0 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=47
>>
>> filter:FORWARD:rule:2 IN=eth1 OUT=gre1 SRC=0.0.0.0 DST=re.mo.te.ip
>> LEN=0 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=47
> 
> This looks really broken. Why is the protocol already 47 before it even
> reaches the gre tunnel?

Broken by me as explained.

>> ip_gre xmit sends out:
> 
> There should be a POSTROUTING hook here.

Hmm... Looking at the code, I probably broke this too. Could missing
this hook cause a performance penalty for future packets of the same
flow?
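
My worry, in sketch form, assuming I read the conntrack code right in
that entries are only confirmed into the lookup table by the last
hook (the names below are made up, this is not the conntrack API): if
POSTROUTING never runs, the entry is never confirmed, and every later
packet of the flow misses the lookup and allocates a fresh entry.

/* Toy model with made-up names, not the conntrack API: "confirmation"
 * into the lookup table only happens in the last hook, so skipping
 * POSTROUTING makes every packet of the flow look like a new flow. */
#include <stdbool.h>
#include <stdio.h>

static bool confirmed;          /* stands in for presence in the hash */
static int allocations;         /* new unconfirmed entries created */

static void handle_packet(bool run_postrouting)
{
        if (!confirmed)
                allocations++;          /* lookup miss -> fresh entry */
        if (run_postrouting)
                confirmed = true;       /* the confirm step in the last hook */
}

int main(void)
{
        int i;

        for (i = 0; i < 1000; i++)
                handle_packet(false);
        printf("hook skipped: %d allocations\n", allocations);

        allocations = 0;
        for (i = 0; i < 1000; i++)
                handle_packet(true);
        printf("hook present: %d allocations\n", allocations);
        return 0;
}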

OK. I'll go back to the drawing board. I should have done the
multicast handling for NBMA destinations on the ip_gre side, as I was
wondering earlier. I'll also double-check with oprofile where the
local sendto() approach dies.

- Timo
