[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <YmszSXueTxYOC41G@zx2c4.com>
Date: Fri, 29 Apr 2022 02:37:29 +0200
From: "Jason A. Donenfeld" <Jason@...c4.com>
To: Eric Dumazet <eric.dumazet@...il.com>
Cc: Netdev <netdev@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>, kuba@...nel.org,
hannes@...essinduktion.org
Subject: Re: Routing loops & TTL tracking with tunnel devices
Hey Eric,
On Tue, Nov 17, 2015 at 03:41:35AM +0100, Jason A. Donenfeld wrote:
> On Mon, Nov 16, 2015 at 11:28 PM, Eric Dumazet <eric.dumazet@...il.com> wrote:
> > There is very little chance we'll accept a new member in sk_buff, unless
> > proven needed.
>
> I actually have no intention of doing this! I'm wondering if there
> already is a member in sk_buff that moonlights as my desired ttl
> counter, or if there's another mechanism for avoiding routing loops. I
> want to work with what's already there, rather than meddling with the
> innards of important and memory sensitive structures such as sk_buff.
Well, 7 years later... Maybe you have a better idea now of what I was
working on then. :)
As an update on this issue, it's still quasi problematic. To review, I
can't use the TTL value, because the outer packet always must get the
TTL of the route to the outer destination, not the inner packet minus
one. I can't rely on reaching MTU size, because people want this to work
with fragmentation (see [1] for my attempt to disallow fragmentation for
this issue, which resulted in hoots and hollers). I can't use the
per-cpu xmit_recursion variable, because I use threads.
What I can sort of use is taking advantage of what looks like a bug in
pskb expansion, such that it always allocates too much, and pretty
quickly fails allocations after a few loops. Only powerpc64 and s390x
don't appear to have this bug. See [2] for a description of this in
depth I wrote a few months ago to you.
Anyway, it'd be nice if there were a free u8 somewhere in sk_buff that I
could use for tracking times through the stack. Other kernels have this
but afaict Linux still does not. I looked into trying to overload some
existing fields -- tstamp/skb_mstamp_ns or queue_mapping -- which I was
thinking might be totally unused on TX?
Any ideas about this?
Thanks,
Jason
[1] https://lore.kernel.org/wireguard/CAHmME9rNnBiNvBstb7MPwK-7AmAN0sOfnhdR=eeLrowWcKxaaQ@mail.gmail.com/
[2] https://lore.kernel.org/netdev/CAHmME9pv1x6C4TNdL6648HydD8r+txpV4hTUXOBVkrapBXH4QQ@mail.gmail.com/
Powered by blists - more mailing lists