Message-ID: <55a4f86e0910181609o6b21d667g8e65638667a1d687@mail.gmail.com>
Date: Sun, 18 Oct 2009 16:09:48 -0700
From: Maciej Żenczykowski <zenczykowski@...il.com>
To: hadi@...erus.ca
Cc: Eric Dumazet <eric.dumazet@...il.com>, netdev@...r.kernel.org,
David Miller <davem@...emloft.net>,
Atis Elsts <atis@...rotik.com>
Subject: Re: [PATCH][RFC]: ingress socket filter by mark
I haven't looked at the patches, but I don't think requiring marking
to be symmetric is a good idea.
Example:
- a relatively complex router/load balancer setup
- normal (no mark) packets get routed/load balanced to destinations
which are healthy
- a separate job which health checks the destinations (and updates the
no mark routing table on destination health state transitions) - it
uses socket marking to select a separate routing table which throws
all load at a specific destination host.
i.e. the socket mark is used to explicitly direct load at a specific
destination host.
Obviously only the transmit path is affected. There is no marking
happening on the receive path.
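
A minimal sketch of the transmit-side marking this example relies on, in Python. It assumes Linux, a process with CAP_NET_ADMIN, and that the policy routing for the mark (an `ip rule ... fwmark` entry and its table) is set up separately. The name `open_marked_socket` and the mark value are hypothetical, and the fallback value 36 for SO_MARK is the asm-generic one, which is an assumption on other architectures.

```python
import socket

# SO_MARK is Linux-specific; fall back to the asm-generic value (36)
# if this Python build does not export it. The fallback is an assumption.
SO_MARK = getattr(socket, "SO_MARK", 36)


def open_marked_socket(mark):
    """Return a TCP socket whose outgoing packets carry `mark`.

    Only the transmit path is affected: the mark is attached to packets
    this socket sends, so an fwmark routing rule can pick a table for them.
    Setting the option normally requires CAP_NET_ADMIN.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, SO_MARK, mark)
    except OSError:
        # Do not leak the descriptor if the kernel refuses the option.
        sock.close()
        raise
    return sock
```

A health checker along these lines would call `open_marked_socket(0x1234)` and then connect to the service address; nothing here marks the replies.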
Another example would be tunneling. I'd envision something like:
ip tunnel add vtun0 mode gre remote ... local ... tos inherit ttl 64 mark 0x1234
ip rule add fwmark 0x1234 lookup 250
ip route add 192.168.1.0/24 dev eth0 table 250
ip route add default via 192.168.1.1 dev eth0 table 250
ip route add default dev vtun0 table main
(Obviously this is just an example and not fully fleshed out;
furthermore, ip tunnel doesn't currently support mark, nor do the
tunnel modules themselves.)
If you do want to allow explicit incoming mark filtering, use another
socket option (SO_MARK_RCV) and a different/new socket field.
>> I vote for extending BPF, and not adding the price of a compare
>> for each packet. Only users wanting mark filtering should pay the price.
I agree that being able to filter on mark in bpf makes a lot of sense.
I wonder if we're not hitting the filters potentially before the mark
is set though (on receive at least)...
I'm nowhere near sure, but I think packets get diverted/cloned to
tcpdump before they hit the IP stack (and thus before they can be
marked by ip(6)tables mangle rules).
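
To make the BPF idea concrete, here is a rough sketch in Python of a classic BPF socket filter that accepts only packets carrying a given mark. It assumes a kernel that exposes the mark as an ancillary load at SKF_AD_OFF + SKF_AD_MARK; that did not exist when this was written, and the value 20 is taken from later linux/filter.h headers. The helpers `insn`, `mark_filter` and `attach_filter` are hypothetical names, and the fallback 26 for SO_ATTACH_FILTER is the asm-generic value. It does not settle whether the mark is already set when the filter runs on receive.

```python
import ctypes
import socket
import struct

# Classic BPF opcodes, composed as in linux/bpf_common.h.
BPF_LD_W_ABS = 0x00 | 0x00 | 0x20  # BPF_LD | BPF_W | BPF_ABS
BPF_JEQ_K = 0x05 | 0x10 | 0x00     # BPF_JMP | BPF_JEQ | BPF_K
BPF_RET_K = 0x06 | 0x00            # BPF_RET | BPF_K

# Ancillary-data offsets; SKF_AD_MARK = 20 is assumed from later headers.
SKF_AD_OFF = -0x1000
SKF_AD_MARK = 20

# asm-generic value used as a fallback (an assumption on other arches).
SO_ATTACH_FILTER = getattr(socket, "SO_ATTACH_FILTER", 26)


def insn(code, jt, jf, k):
    """Pack one struct sock_filter { u16 code; u8 jt; u8 jf; u32 k; }."""
    return struct.pack("HBBI", code, jt, jf, k & 0xFFFFFFFF)


def mark_filter(mark):
    """Build a 4-instruction program: accept iff skb mark == mark."""
    return b"".join([
        insn(BPF_LD_W_ABS, 0, 0, SKF_AD_OFF + SKF_AD_MARK),  # A = mark
        insn(BPF_JEQ_K, 0, 1, mark),        # equal: fall through, else skip 1
        insn(BPF_RET_K, 0, 0, 0xFFFFFFFF),  # accept the whole packet
        insn(BPF_RET_K, 0, 0, 0),           # drop
    ])


def attach_filter(sock, prog):
    """Attach `prog` via SO_ATTACH_FILTER using a struct sock_fprog."""
    buf = ctypes.create_string_buffer(prog, len(prog))
    # struct sock_fprog { unsigned short len; struct sock_filter *filter; }
    fprog = struct.pack("HP", len(prog) // 8, ctypes.addressof(buf))
    sock.setsockopt(socket.SOL_SOCKET, SO_ATTACH_FILTER, fprog)
```

With a filter like this, only sockets that attach it pay for the compare, which is the point made in the quoted text.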