netdev - Re: [PATCH net v2 1/2] net: Fix return value of qdisc ingress handling on success

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <d0ae6e4b-5392-704a-5c87-2939bc2d9c44@nvidia.com>
Date:   Sun, 2 Oct 2022 12:30:18 +0300
From:   Paul Blakey <paulb@...dia.com>
To:     Cong Wang <xiyou.wangcong@...il.com>
Cc:     Daniel Borkmann <daniel@...earbox.net>,
        Vlad Buslov <vladbu@...dia.com>, Oz Shlomo <ozsh@...dia.com>,
        Roi Dayan <roid@...dia.com>, netdev@...r.kernel.org,
        Saeed Mahameed <saeedm@...dia.com>,
        Eric Dumazet <edumazet@...gle.com>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>
Subject: Re: [PATCH net v2 1/2] net: Fix return value of qdisc ingress
 handling on success



On 01/10/2022 23:19, Cong Wang wrote:
> On Wed, Sep 28, 2022 at 10:55:49AM +0300, Paul Blakey wrote:
>>
>>
>> On 25/09/2022 21:00, Cong Wang wrote:
>>> On Sun, Sep 25, 2022 at 11:14:21AM +0300, Paul Blakey wrote:
>>>> Currently qdisc ingress handling (sch_handle_ingress()) doesn't
>>>> set a return value and it is left to the old return value of
>>>> the caller (__netif_receive_skb_core()) which is RX drop, so if
>>>> the packet is consumed, caller will stop and return this value
>>>> as if the packet was dropped.
>>>>
>>>> This causes a problem in the kernel tcp stack when having a
>>>> egress tc rule forwarding to a ingress tc rule.
>>>> The tcp stack sending packets on the device having the egress rule
>>>> will see the packets as not successfully transmitted (although they
>>>> actually were), will not advance it's internal state of sent data,
>>>> and packets returning on such tcp stream will be dropped by the tcp
>>>> stack with reason ack-of-unsent-data. See reproduction in [0] below.
>>>>
>>>
>>> Hm, but how is this return value propagated to egress? I checked
>>> tcf_mirred_act() code, but don't see how it is even used there.
>>>
>>> 318         err = tcf_mirred_forward(want_ingress, skb2);
>>> 319         if (err) {
>>> 320 out:
>>> 321                 tcf_action_inc_overlimit_qstats(&m->common);
>>> 322                 if (tcf_mirred_is_act_redirect(m_eaction))
>>> 323                         retval = TC_ACT_SHOT;
>>> 324         }
>>> 325         __this_cpu_dec(mirred_rec_level);
>>> 326
>>> 327         return retval;
>>>
>>>
>>> What am I missing?
>>
>> for the ingress acting act_mirred it will return TC_ACT_CONSUMED above
>> the code you mentioned (since redirect=1, use_reinsert=1. Although
>> TC_ACT_STOLEN which is the retval set for this action, will also act the
>> same)
>>
>>
>> It is propagated as such (TX stack starting from tcp):
>>
> 
> Sorry for my misunderstanding.
> 
> I meant to say those TC_ACT_* return value, not NET_RX_*, but I worried
> too much here, as mirred lets user specify the return value

Yes TC_ACT_* start at the action mirred case, and end in 
tcf_handle_ingress/egresss switch cases, which then should be converted 
to NET_RX and NET_XMIT if done.

> 
> BTW, it seems you at least miss the drop case, which is NET_RX_DROP for
> TC_ACT_SHOT at least? Possibly other code paths in sch_handle_ingress()
> too.

I'll add the SHOT for v3 as the packet was handled in this case, but I 
should only update ret where the packet/skb was handled, which is where 
we also return NULL, as otherwise rx pipeline should continue and will 
update ret once handled (say in running the rx_handler).

For example, if there are not tc filters (tcf_classify returns 
TC_ACT_UNSPEC) I should not update *ret, and it will continue to the rx 
handler, and if there isn't any, it would return the default ret RX_DROP 
value.


> 
> Thanks.