Message-ID: <5A630F46702DD1498FFD48394B4A664C3472BAC2@john.ad.clarku.edu>
Date: Fri, 7 Dec 2007 17:21:02 -0500
From: Brian S Julin <BJulin@...rku.edu>
To: "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: [RFC] [PATCH] easier PBR for dynamic source tables (via multipath)
This is a first swat and not in final form. I hope folks here will help vet
my thinking on it.
This fills in a missed niche in policy routing support. It allows multipath
routes to select nexthop based on the source realm, inside the routing
decision step, immediately after RPF is performed. It moves RPF before
multipath selection.
This would be for people wanting to do policy routing based on a table
injected by a dynamic routing protocol, e.g. quagga, rather than static rules.
The existing methods for achieving this effect are all a bit tacky for
various reasons:
1) "iptables -m realm --realm X -j ROUTE" in FORWARD,mangle
because ipt_ROUTE is not a well-supported iptables target
and has started to get dropped from mainstream distros. Maybe
for lack of maintenance, but perhaps it is intentionally
deprecated. (?)
2) "tc route from" ... "action mirred egress redirect" happens
too late in the packet processing to do much else to the
packet, such as rewriting the MAC addresses, which remain what they
were on the original output dev. Doing this is really an
abuse of the queueing system and involves setting up qdiscs in
weird ways when one may only want to route.
3) Userspace scripts to glue loading from a kernel routing table
to a pre-routing ipset, iptables -j MARK, then "ip rule add fwmark"
because the kernel then has to check the source address against two
tables rather than one, and they could get quite large. Plus it's
hackery.
This patch is a raw proof-of-concept I put together to get
things working just enough to ensure that nothing blows up when
packets are routed this way. As such it does a couple of distasteful
things and has a couple rough edges:
  - Reuses the nh_weight field as the realm
  - Does not allow normal load balancing to fully mix in
  - IPv4 only
  - Forward only; no code for the local/output route
  - Probably will break if CONFIG_NET_CLS_ROUTE is not defined
Were this general idea to be deemed worthy, and as long as limiting
sizeof(struct fib_nh) is not a major concern to any Linux routing
application, I could work up a more thorough/cleaner patch allowing
statistical multipath and SAD policy-routing multipath to play nicely
together.
Especially needing comments on proper multipath RPF: The mainline code
only checks the selected path and if RPF fails it does not choose a
different one. From this I assumed it is OK to do RPF on any old nexthop,
and we just assume the user won't or can't put any PR rule in that would gum
that up. Otherwise both the mainline code and this code would have to
RPF multiple times, defeating the goal of good performance. (Not to
mention that could get extra confusing when you are using the source
realm to choose.) Special attention to the spec_dest handling, which
should be (?) OK since this is forward-only.
Also to consider is what this means to multipath caching should that
make a comeback.
I've only tested this code lightly so far, just bouncing things around
to static ARP maps on the same interface.
After patching iproute2, just substitute "weight X" with "byrealm X" to
activate it. Probably you want to avoid realm 0. You should be able to
put catch-all nexthops in with "weight X" alongside the "byrealm" ones
but they do not interact statistically. Comments on that syntax
also welcome.
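To make the intended semantics concrete, here is a rough userspace
illustration of the selection described above. It is only a sketch under
stated assumptions, not code from the attached patches: struct nexthop,
select_nexthop() and demo_route are hypothetical names, the addresses are
documentation examples, and the "weight" fallback simply takes the first
catch-all entry instead of doing real statistical balancing.

```c
#include <stddef.h>

/* Hypothetical userspace model of one multipath nexthop.
 * This is NOT the kernel's struct fib_nh. */
struct nexthop {
    const char *gw;   /* gateway address, for illustration only */
    int by_realm;     /* nonzero: 'value' is a realm to match ("byrealm X") */
                      /* zero:    'value' is an ordinary weight ("weight X") */
    int value;
};

/*
 * Return the first "byrealm" nexthop whose realm equals the source realm.
 * If none matches, fall back to the first ordinary "weight" nexthop
 * (assumption: real weighted balancing is omitted from this sketch).
 * Returns NULL when no nexthop is usable.
 */
static const struct nexthop *select_nexthop(const struct nexthop *nh,
                                            size_t n, int src_realm)
{
    const struct nexthop *fallback = NULL;

    for (size_t i = 0; i < n; i++) {
        if (nh[i].by_realm) {
            if (nh[i].value == src_realm)
                return &nh[i];
        } else if (fallback == NULL) {
            fallback = &nh[i];
        }
    }
    return fallback;
}

/* Example route: two realm-keyed nexthops plus one catch-all. */
static const struct nexthop demo_route[] = {
    { "192.0.2.1",   1, 10 },  /* byrealm 10 */
    { "192.0.2.2",   1, 20 },  /* byrealm 20 */
    { "192.0.2.254", 0, 1  },  /* weight 1, catch-all */
};
```

With this table, a packet whose source realm is 10 or 20 gets the matching
nexthop, and any other source realm lands on the catch-all. The two kinds of
entry are consulted separately, which is what "do not interact
statistically" amounts to in this model.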
Sorry about the attachments, no real MUAs available here that won't
corrupt tabs.
View attachment "linux-2.6.23.dsad.diff" of type "text/plain" (5737 bytes)
View attachment "iproute.dsad.diff" of type "text/plain" (2107 bytes)