[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <87wppsc5sy.fsf@pingu.sky.yk.fujitsu.co.jp>
Date: Fri, 26 Feb 2016 16:02:53 +0900
From: Kouya Shimura <kouya@...fujitsu.com>
To: David Miller <davem@...emloft.net>
CC: <netdev@...r.kernel.org>
Subject: Re: [PATCH] ipv4: do not cache local route when using SO_BINDTODEVICE
Thanks for reviewing.
David Miller <davem@...emloft.net> writes:
> From: Kouya Shimura <kouya@...fujitsu.com>
> Date: Tue, 23 Feb 2016 14:27:32 +0900
>
>> After v3.6, output routes are cached, however, the 'rt_iif' field of
>> struct rtable can not be shared by various traffics using SO_BINDTODEVICE.
>> It causes that traffic can not reach the local receiver which also uses
>> SO_BINDTODEVICE after another traffic creates a local route cache.
>>
>> /* sender and receiver are running on same host. */
>> sender:
>> - socket(SOCK_DGRAM)
>> - setsockopt(SO_BINDTODEVICE, "eth0")
>> - connect(INADDR_ANY)
>> - send("message") ------>[loopback]------+
>> |
>> receiver: |
>> - socket(SOCK_DGRAM) |
>> - setsockopt(SO_BINDTODEVICE, "eth0") |
>> - bind(INADDR_ANY) |
>> - recv() <------------------------------+
>> /* sometimes reach, sometimes not reach!!! */
>>
>> Before v3.6, above traffics always reached. It seems to be a
>> regression. Actually the 'dhcp_release' command relies on it.
>>
>> Commit 7bd86cc282a458b66c41e3f6 ("ipv4: Cache local output routes")
>> originated this issue. Revert it conditionally.
>>
>> Signed-off-by: Kouya Shimura <kouya@...fujitsu.com>
>
> I strongly fear that this will cause a performance regression for some important
> cases. If you elide the cache, it means that every route lookup has
> to allocate,
> create, and destroy the cache object. This is really expensive.
The performace degradation was on my mind. Thus, I intended that this
patch affects only special (abnormal?) case that destination is local
and using SO_BINDTODEVICE. Isn't it acceptable?
This patch deesn't elide the existing cache. Normal (not using
SO_BINDTODEVICE) routes are still cached (not destoyed). Outer
routes, too. I confirmed it.
If you worry about non-zeroed 'orig_oif' case, this patch certainly
implies a perfomance problem. 'orig_oif' seems to be non-zero in
cases of ICMP, GRE, MULTICAST besides SO_BINDTODEVICE.
>
> There has to be a better way to do so, so I'm not applying this patch,
> sorry.
Yes, it has to be. The root cause is sharing the 'rt_iif' field.
I'm looking forward to the right fix.
Thanks,
Kouya
Powered by blists - more mailing lists