[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKgT0UdiYRHrSUGb9qDJ-GGMBj53P1L4KHSV7tv+omA5FjRZNQ@mail.gmail.com>
Date: Thu, 2 Sep 2021 19:01:22 -0700
From: Alexander Duyck <alexander.duyck@...il.com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: Netdev <netdev@...r.kernel.org>,
David Miller <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Ido Schimmel <idosch@...sch.org>,
chouhan.shreyansh630@...il.com
Subject: Re: [PATCH net] ip_gre: validate csum_start only if CHECKSUM_PARTIAL
On Thu, Sep 2, 2021 at 2:45 PM Willem de Bruijn
<willemdebruijn.kernel@...il.com> wrote:
>
> On Thu, Sep 2, 2021 at 5:17 PM Alexander Duyck
> <alexander.duyck@...il.com> wrote:
> >
> > On Thu, Sep 2, 2021 at 1:30 PM Willem de Bruijn
> > <willemdebruijn.kernel@...il.com> wrote:
> > >
> > > On Thu, Sep 2, 2021 at 4:25 PM Alexander Duyck
> > > <alexander.duyck@...il.com> wrote:
> > > >
> > > > On Thu, Sep 2, 2021 at 12:38 PM Willem de Bruijn
> > > > <willemdebruijn.kernel@...il.com> wrote:
> > > > >
> > > > > From: Willem de Bruijn <willemb@...gle.com>
> > > > >
> > > > > Only test integrity of csum_start if checksum offload is configured.
> > > > >
> > > > > With checksum offload and GRE tunnel checksum, gre_build_header will
> > > > > cheaply build the GRE checksum using local checksum offload. This
> > > > > depends on inner packet csum offload, and thus that csum_start points
> > > > > behind GRE. But validate this condition only with checksum offload.
> > > > >
> > > > > Link: https://lore.kernel.org/netdev/YS+h%2FtqCJJiQei+W@shredder/
> > > > > Fixes: 1d011c4803c7 ("ip_gre: add validation for csum_start")
> > > > > Signed-off-by: Willem de Bruijn <willemb@...gle.com>
> > > > > ---
> > > > > net/ipv4/ip_gre.c | 5 ++++-
> > > > > 1 file changed, 4 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
> > > > > index 177d26d8fb9c..09311992a617 100644
> > > > > --- a/net/ipv4/ip_gre.c
> > > > > +++ b/net/ipv4/ip_gre.c
> > > > > @@ -473,8 +473,11 @@ static void __gre_xmit(struct sk_buff *skb, struct net_device *dev,
> > > > >
> > > > > static int gre_handle_offloads(struct sk_buff *skb, bool csum)
> > > > > {
> > > > > - if (csum && skb_checksum_start(skb) < skb->data)
> > > > > + /* Local checksum offload requires csum offload of the inner packet */
> > > > > + if (csum && skb->ip_summed == CHECKSUM_PARTIAL &&
> > > > > + skb_checksum_start(skb) < skb->data)
> > > > > return -EINVAL;
> > > > > +
> > > > > return iptunnel_handle_offloads(skb, csum ? SKB_GSO_GRE_CSUM : SKB_GSO_GRE);
> > > > > }
> > >
> > > Thanks for taking a look.
> > >
> > > > So a few minor nits.
> > > >
> > > > First I think we need this for both v4 and v6 since it looks like this
> > > > code is reproduced for net/ipv6/ip6_gre.c.
> > >
> > > I sent a separate patch for v6. Perhaps should have made it a series
> > > to make this more clear.
> >
> > Yeah, that was part of the reason I assumed the ipv6 patch was overlooked.
>
> I was in two minds only because a series should come with a cover
> letter and thus one extra email added to the firehose. But this makes
> clear the value. Will just do that in the future.
>
> > > > Second I don't know if we even need to bother including the "csum"
> > > > portion of the lookup since that technically is just telling us if the
> > > > GRE tunnel is requesting a checksum or not and I am not sure that
> > > > applies to the fact that the inner L4 header is going to be what is
> > > > requesting the checksum offload most likely.
> > >
> > > This test introduced in the original patch specifically protects the
> > > GRE tunnel checksum calculation using lco_csum. The regular inner
> > > packet path likely is already robust (as similar bug reports would be
> > > more likely for that more common case).
> >
> > I was just thinking in terms of shaving off some extra comparisons. I
> > suppose it depends on if this is being inlined or not. If it is being
> > inlined there are at least 2 cases where the if statement would be
> > dropped since csum is explicitly false. My thought was that by just
> > jumping straight to the skb->ip_summed check we can drop the lookup
> > for csum since it would be implied by the fact that skb->ip_summed is
> > being checked. It would create a broader check, but at the same time
> > it would reduce the number of comparisons in the call.
>
> Most GRE tunnels don't have checksums and csum is likely in a register,
> as function argument, so it likely is the cheaper test?
>
> More functional argument: if !csum, the GRE tunnel does not care about
> the integrity of csum_start. So I think that it should not read it at all.
The problem as I see it is that it is just passing the problem on.
Adding the check to the GRE drivers only really fixes one spot that
exposes the issue. In this case it was triggered because the lco_csum
was causing issues. What happens when one of these packets makes it
down to hardware and it has to deal with the malformed csum_start? I
suspect we end up potentially causing issues where Tx metadata could
be malformed resulting in a dropped packet at the best case, and
kernel panics at the worst.
> > > > Also maybe this should be triggering a WARN_ON_ONCE if we hit this as
> > > > the path triggering this should be fixed rather than us silently
> > > > dropping frames. We should be figuring out what cases are resulting in
> > > > us getting CHECKSUM_PARTIAL without skb_checksum_start being set.
> > >
> > > We already know that bad packets can enter the kernel and trigger
> > > this, using packet sockets and virtio_net_hdr. Unfortunately, this
> > > *is* the fix.
> >
> > It sounds almost like we need a CHECKSUM_DODGY to go along with the
> > SKB_GSO_DODGY in order to resolve these kinds of issues.
> >
> > So essentially we have a source that we know can give us bad packets.
> > We really should be performing some sort of validation on these much
> > earlier in order to clean them up so that we don't have to add this
> > sort of exception handling code all over the Tx path.
>
> Agreed with the concern. I've been arguing for validation at kernel
> entry of virtio_net_hdr. As an optional feature, if nothing else:
> https://patchwork.kernel.org/project/netdevbpf/patch/20210616203448.995314-3-tannerlove.kernel@gmail.com/
>
> But unless we accept the cost of full parsing to identify the
> transport headers, we cannot predict at that stage whether the field
> is bogus, let alone whether it might trigger a bug later on. We do
> basic validation: csum_start is verified to be within the skb linear,
> so not totally out of bounds.
>
> That the offset is not just bogus, but causes a bug appears to be a
> rare exception peculiar to the GRE tunnel. Only it pulls the outer
> header (in ipgre_xmit), applies lco_csum and can so trigger negative
> overflow, as far as I could tell. That's why we decided to add the
> limited check local to that code.
>
> I'm not sure how we would use CHECKSUM_DODGY in practice.
The issue is drivers with NETIF_F_HW_CSUM would be expecting a
reasonable csum_start and csum_offset. If the hardware is only
advertising that and we are expecting it to offload the checksum we
should probably be doing some sort of validation on user derived
inputs to make sure that they aren't totally ridiculous as is the case
here where the original issue was that the csum_start wasn't even in
the packet data.
Would it maybe make sense to look at reverting the earlier fixes and
instead updating skb_partial_csum_set so that we cannot write a
csum_start value that is less than the start of skb->data? That way we
are addressing this at the source instead of allowing the garbage data
to propagate further down the stack and having to drop it at the
driver level which is going to have us playing whack a mole trying to
fix it where it pops up. It seems like we are already validating the
upper bounds for these values in that function so why not validate the
lower bounds as well?
Powered by blists - more mailing lists