Date:   Sun, 2 Oct 2022 12:30:18 +0300
From:   Paul Blakey <>
To:     Cong Wang <>
Cc:     Daniel Borkmann <>,
        Vlad Buslov <>, Oz Shlomo <>,
        Roi Dayan <>,,
        Saeed Mahameed <>,
        Eric Dumazet <>,
        "David S. Miller" <>,
        Jakub Kicinski <>,
        Paolo Abeni <>
Subject: Re: [PATCH net v2 1/2] net: Fix return value of qdisc ingress
 handling on success

On 01/10/2022 23:19, Cong Wang wrote:
> On Wed, Sep 28, 2022 at 10:55:49AM +0300, Paul Blakey wrote:
>> On 25/09/2022 21:00, Cong Wang wrote:
>>> On Sun, Sep 25, 2022 at 11:14:21AM +0300, Paul Blakey wrote:
>>>> Currently qdisc ingress handling (sch_handle_ingress()) doesn't
>>>> set a return value and it is left to the old return value of
>>>> the caller (__netif_receive_skb_core()) which is RX drop, so if
>>>> the packet is consumed, caller will stop and return this value
>>>> as if the packet was dropped.
>>>> This causes a problem in the kernel tcp stack when having an
>>>> egress tc rule forwarding to an ingress tc rule.
>>>> The tcp stack sending packets on the device having the egress rule
>>>> will see the packets as not successfully transmitted (although they
>>>> actually were), will not advance its internal state of sent data,
>>>> and packets returning on such tcp stream will be dropped by the tcp
>>>> stack with reason ack-of-unsent-data. See reproduction in [0] below.
>>> Hm, but how is this return value propagated to egress? I checked
>>> tcf_mirred_act() code, but don't see how it is even used there.
>>>         err = tcf_mirred_forward(want_ingress, skb2);
>>>         if (err) {
>>> out:
>>>                 tcf_action_inc_overlimit_qstats(&m->common);
>>>                 if (tcf_mirred_is_act_redirect(m_eaction))
>>>                         retval = TC_ACT_SHOT;
>>>         }
>>>         __this_cpu_dec(mirred_rec_level);
>>>
>>>         return retval;
>>> What am I missing?
>> For the act_mirred instance acting on ingress, it will return
>> TC_ACT_CONSUMED above the code you mentioned (since redirect=1 and
>> use_reinsert=1; TC_ACT_STOLEN, which is the retval set for this
>> action, would behave the same way).
>> It is propagated as such (TX stack starting from tcp):
> Sorry for my misunderstanding.
> I meant to say those TC_ACT_* return values, not NET_RX_*, but I worried
> too much here, as mirred lets the user specify the return value.

Yes, the TC_ACT_* values start at the mirred action case and end in the
switch cases of sch_handle_ingress()/sch_handle_egress(), where they
should then be converted to NET_RX_* and NET_XMIT_* if the packet is done.
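
A minimal userspace sketch of that conversion chain may help make the propagation concrete. It is not kernel code: the enum values are local stand-ins, the model_* helpers are hypothetical names introduced only for illustration, and the egress mapping is reduced to "TC_ACT_SHOT means drop, anything else means success". The error check follows the tcf_mirred_act() excerpt quoted above, where a non-zero result from forwarding turns a redirect into TC_ACT_SHOT.

```c
#include <assert.h>

/* Local stand-ins for the kernel constants; this is a userspace model only. */
enum { NET_RX_SUCCESS = 0, NET_RX_DROP = 1 };
enum { NET_XMIT_SUCCESS = 0, NET_XMIT_DROP = 1 };
enum { TC_ACT_OK = 0, TC_ACT_SHOT = 2, TC_ACT_STOLEN = 4 };

/*
 * Hypothetical helper: verdict of a redirecting mirred action, given the
 * value returned by forwarding the skb. Non-zero is treated as an error,
 * as in the quoted tcf_mirred_act() excerpt.
 */
static int model_mirred_redirect(int forward_err)
{
	return forward_err ? TC_ACT_SHOT : TC_ACT_STOLEN;
}

/* Hypothetical helper: egress-side conversion of a TC verdict to NET_XMIT_*. */
static int model_egress_to_xmit(int tc_verdict)
{
	assert(tc_verdict >= TC_ACT_OK);
	return tc_verdict == TC_ACT_SHOT ? NET_XMIT_DROP : NET_XMIT_SUCCESS;
}

/*
 * Whole chain: the return value of the ingress receive path becomes the
 * forwarding result seen by mirred, and ends up as the transmit status
 * reported to the sender on the egress device.
 */
static int model_xmit_via_redirect(int ingress_ret)
{
	return model_egress_to_xmit(model_mirred_redirect(ingress_ret));
}
```

In this model, an ingress path that leaves the stale NET_RX_DROP in place makes model_xmit_via_redirect() report NET_XMIT_DROP even though the packet was consumed, which is the symptom described in the commit message.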

> BTW, it seems you miss at least the drop case, which should be
> NET_RX_DROP for TC_ACT_SHOT? Possibly other code paths in
> sch_handle_ingress() too.

I'll add the SHOT case for v3, as the packet was handled in this case.
However, I should only update ret where the packet/skb was handled, which
is also where we return NULL. Otherwise the rx pipeline should continue,
and ret will be updated once the packet is handled (say, when running the
rx_handler).

For example, if there are no tc filters (tcf_classify returns
TC_ACT_UNSPEC) I should not update *ret. The packet will continue to the
rx handler, and if there isn't one, the default ret of NET_RX_DROP is
returned.
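
The rule described above (write *ret only on the paths that also return NULL) can be sketched as a small userspace model. This is an illustration under stated assumptions, not the actual patch: struct pkt, model_handle_ingress() and model_receive() are hypothetical names, the enum values are local stand-ins for the kernel constants, and the rx_handler is reduced to a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the kernel constants; this is a userspace model only. */
enum { NET_RX_SUCCESS = 0, NET_RX_DROP = 1 };
enum { TC_ACT_UNSPEC = -1, TC_ACT_OK = 0, TC_ACT_SHOT = 2,
       TC_ACT_STOLEN = 4, TC_ACT_CONSUMED = 9 };

/* Hypothetical packet: carries the verdict the classifier would give it. */
struct pkt {
	int verdict;
};

/*
 * Model of the ingress hook. It returns NULL when the packet was handled
 * here, and only on those paths does it write *ret. On every other path
 * the packet is handed back and *ret is left untouched.
 */
static struct pkt *model_handle_ingress(struct pkt *p, int *ret)
{
	assert(p != NULL && ret != NULL);

	switch (p->verdict) {
	case TC_ACT_SHOT:
		*ret = NET_RX_DROP;	/* handled: dropped by a tc action */
		return NULL;
	case TC_ACT_STOLEN:
	case TC_ACT_CONSUMED:
		*ret = NET_RX_SUCCESS;	/* handled: consumed by a tc action */
		return NULL;
	default:
		return p;		/* not handled: the pipeline continues */
	}
}

/* Model of the caller: ret starts as a drop and is updated once handled. */
static int model_receive(struct pkt *p, int have_rx_handler)
{
	int ret = NET_RX_DROP;

	p = model_handle_ingress(p, &ret);
	if (!p)
		return ret;

	if (have_rx_handler)
		ret = NET_RX_SUCCESS;	/* the rx_handler took the packet */

	return ret;
}
```

With no filters (TC_ACT_UNSPEC) and no rx_handler, model_receive() returns the default NET_RX_DROP; a consumed packet returns NET_RX_SUCCESS regardless of what would have run later in the pipeline.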

> Thanks.
