lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-ID: <d0ae6e4b-5392-704a-5c87-2939bc2d9c44@nvidia.com> Date: Sun, 2 Oct 2022 12:30:18 +0300 From: Paul Blakey <paulb@...dia.com> To: Cong Wang <xiyou.wangcong@...il.com> Cc: Daniel Borkmann <daniel@...earbox.net>, Vlad Buslov <vladbu@...dia.com>, Oz Shlomo <ozsh@...dia.com>, Roi Dayan <roid@...dia.com>, netdev@...r.kernel.org, Saeed Mahameed <saeedm@...dia.com>, Eric Dumazet <edumazet@...gle.com>, "David S. Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com> Subject: Re: [PATCH net v2 1/2] net: Fix return value of qdisc ingress handling on success On 01/10/2022 23:19, Cong Wang wrote: > On Wed, Sep 28, 2022 at 10:55:49AM +0300, Paul Blakey wrote: >> >> >> On 25/09/2022 21:00, Cong Wang wrote: >>> On Sun, Sep 25, 2022 at 11:14:21AM +0300, Paul Blakey wrote: >>>> Currently qdisc ingress handling (sch_handle_ingress()) doesn't >>>> set a return value and it is left to the old return value of >>>> the caller (__netif_receive_skb_core()) which is RX drop, so if >>>> the packet is consumed, caller will stop and return this value >>>> as if the packet was dropped. >>>> >>>> This causes a problem in the kernel tcp stack when having a >>>> egress tc rule forwarding to a ingress tc rule. >>>> The tcp stack sending packets on the device having the egress rule >>>> will see the packets as not successfully transmitted (although they >>>> actually were), will not advance it's internal state of sent data, >>>> and packets returning on such tcp stream will be dropped by the tcp >>>> stack with reason ack-of-unsent-data. See reproduction in [0] below. >>>> >>> >>> Hm, but how is this return value propagated to egress? I checked >>> tcf_mirred_act() code, but don't see how it is even used there. >>> >>> 318 err = tcf_mirred_forward(want_ingress, skb2); >>> 319 if (err) { >>> 320 out: >>> 321 tcf_action_inc_overlimit_qstats(&m->common); >>> 322 if (tcf_mirred_is_act_redirect(m_eaction)) >>> 323 retval = TC_ACT_SHOT; >>> 324 } >>> 325 __this_cpu_dec(mirred_rec_level); >>> 326 >>> 327 return retval; >>> >>> >>> What am I missing? >> >> for the ingress acting act_mirred it will return TC_ACT_CONSUMED above >> the code you mentioned (since redirect=1, use_reinsert=1. Although >> TC_ACT_STOLEN which is the retval set for this action, will also act the >> same) >> >> >> It is propagated as such (TX stack starting from tcp): >> > > Sorry for my misunderstanding. > > I meant to say those TC_ACT_* return value, not NET_RX_*, but I worried > too much here, as mirred lets user specify the return value Yes TC_ACT_* start at the action mirred case, and end in tcf_handle_ingress/egresss switch cases, which then should be converted to NET_RX and NET_XMIT if done. > > BTW, it seems you at least miss the drop case, which is NET_RX_DROP for > TC_ACT_SHOT at least? Possibly other code paths in sch_handle_ingress() > too. I'll add the SHOT for v3 as the packet was handled in this case, but I should only update ret where the packet/skb was handled, which is where we also return NULL, as otherwise rx pipeline should continue and will update ret once handled (say in running the rx_handler). For example, if there are not tc filters (tcf_classify returns TC_ACT_UNSPEC) I should not update *ret, and it will continue to the rx handler, and if there isn't any, it would return the default ret RX_DROP value. > > Thanks.
Powered by blists - more mailing lists