lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <eb5601d0-2c95-4485-31b2-e5cb5ff3037c@redhat.com>
Date:   Thu, 19 Jan 2023 14:18:12 +0100
From:   Jesper Dangaard Brouer <jbrouer@...hat.com>
To:     Eric Dumazet <edumazet@...gle.com>,
        Jesper Dangaard Brouer <jbrouer@...hat.com>
Cc:     brouer@...hat.com, Jakub Kicinski <kuba@...nel.org>,
        netdev@...r.kernel.org, "David S. Miller" <davem@...emloft.net>,
        pabeni@...hat.com
Subject: Re: [PATCH net-next V2 2/2] net: kfree_skb_list use
 kmem_cache_free_bulk


On 19/01/2023 12.46, Eric Dumazet wrote:
> On Thu, Jan 19, 2023 at 12:22 PM Jesper Dangaard Brouer
> <jbrouer@...hat.com> wrote:
>>
>>
>>
>> On 19/01/2023 11.28, Eric Dumazet wrote:
>>> On Thu, Jan 19, 2023 at 11:18 AM Jesper Dangaard Brouer
>>> <jbrouer@...hat.com> wrote:
>>>>
>>>>
>>>>
>>>> On 19/01/2023 03.26, Jakub Kicinski wrote:
>>>>> On Wed, 18 Jan 2023 22:37:47 +0100 Jesper Dangaard Brouer wrote:
>>>>>>> +           skb_mark_not_on_list(segs);
>>>>>>
>>>>>> The syzbot[1] bug goes way if I remove this skb_mark_not_on_list().
>>>>>>
>>>>>> I don't understand why I cannot clear skb->next here?
>>>>>
>>>>> Some of the skbs on the list are not private?
>>>>> IOW we should only unlink them if skb_unref().
>>>>
>>>> Yes, you are right.
>>>>
>>>> The skb_mark_not_on_list() should only be called if __kfree_skb_reason()
>>>> returns true, meaning the SKB is ready to be free'ed (as it calls/check
>>>> skb_unref()).
>>>
>>>
>>> This was the case already before your changes.
>>>
>>> skb->next/prev can not be shared by multiple users.
>>>
>>> One skb can be put on a single list by definition.
>>>
>>> Whoever calls kfree_skb_list(list) owns all the skbs->next|prev found
>>> in the list
>>>
>>> So you can mangle skb->next as you like, even if the unref() is
>>> telling that someone
>>> else has a reference on skb.
>>
>> Then why does the bug go way if I remove the skb_mark_not_on_list() call
>> then?
>>
> 
> Some side effects.
> 
> This _particular_ repro uses a specific pattern that might be defeated
> by a small change.
> (just working around another bug)
> 
> Instead of setting skb->next to NULL, try to set it to
> 
> skb->next = (struct sk_buff *)0x800;
> 
> This might show a different pattern.

Nice trick, I'll use this next time.

I modified to code and added a kfree_skb tracepoint with a known reason
(PROTO_MEM) to capture the callstack. See end of email.  Which shows
multicast code path is involved.

  trace_kfree_skb(segs, __builtin_return_address(0), 
SKB_DROP_REASON_PROTO_MEM);

>>>>
>>>> I will send a proper fix patch shortly... after syzbot do a test on it.
>>>>
>>
>> I've send a patch for syzbot that only calls skb_mark_not_on_list() when
>> unref() and __kfree_skb_reason() "permits" this.
>> I tested it locally with reproducer and it also fixes/"removes" the bug.
> 
> This does not mean we will accept a patch with no clear explanation
> other than "this removes a syzbot bug, so this must be good"
> 
> Make sure to give precise details on _why_ this is needed or not.
> 
> Again, the user of kfree_skb_list(list) _owns_ skb->next for sure.
> If you think this assertion is not true, we are in big trouble.
> 

I think I have found an explanation, why/when refcnt can be elevated on
an SKB-list.  The skb_shinfo(skb)->flag_list can increase refcnt.

See code:
  static void skb_clone_fraglist(struct sk_buff *skb)
  {
	struct sk_buff *list;

	skb_walk_frags(skb, list)
		skb_get(list);
  }

This walks the SKBs on the shinfo->frag_list and increase the refcnt
(skb->users).

Notice that kfree_skb_list is also called when freeing the SKBs
"frag_list" in skb_release_data().

IMHO this explains why we can only remove the SKB from the list, when
"permitted" by skb_unref(), e.g. if __kfree_skb_reason() returns true.

--Jesper


Call-stack of case with elevated refcnt when walking SKB-list:
--------------------------------------------------------------

repro  3048 [003]   101.689670: skb:kfree_skb: 
skbaddr=0xffff888104086600 protocol=0 
location=skb_release_data.cold+0x25 reason: PROTO_MEM
         ffffffff81bb6448 kfree_skb_list_reason.cold+0x3b 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff81bb6448 kfree_skb_list_reason.cold+0x3b 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff81bb6484 skb_release_data.cold+0x25 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff819137e1 kfree_skb_reason+0x41 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff81a1e246 igmp_rcv+0xf6 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff819c5e85 ip_protocol_deliver_rcu+0x165 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff819c5f12 ip_local_deliver_finish+0x72 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff81930b89 __netif_receive_skb_one_core+0x69 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff81930e31 process_backlog+0x91 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff8193197b __napi_poll+0x2b 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff81931e86 net_rx_action+0x276 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff81bcf083 __do_softirq+0xd3 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff81087243 do_softirq+0x63 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff810872e4 __local_bh_enable_ip+0x64 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff8192ba8f netif_rx+0xdf 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff8192bb27 dev_loopback_xmit+0x77 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff819c9195 ip_mc_finish_output+0x65 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff819cc597 ip_mc_output+0x137 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff819cd2b8 ip_push_pending_frames+0xa8 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff81a04f67 raw_sendmsg+0x607 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff8190153b sock_sendmsg+0x8b 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff8190323b __sys_sendto+0xeb 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff819032b0 __x64_sys_sendto+0x20 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff81bb9ffa do_syscall_64+0x3a 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
         ffffffff81c000aa entry_SYSCALL_64+0xaa 
(/boot/vmlinux-6.2.0-rc3-net-next-kfunc-xdp-hints+)
                   10af3d syscall+0x1d (/usr/lib64/libc.so.6)
                     46d3 do_sandbox_none+0x81 
(/home/jbrouer/syzbot-kfree_skb_list/repro)
                     48ce main+0xa5 
(/home/jbrouer/syzbot-kfree_skb_list/repro)
                    29550 __libc_start_call_main+0x80 (/usr/lib64/libc.so.6)




Resolving some symbols:

ip_mc_finish_output+0x65
------------------------
[net-next]$ ./scripts/faddr2line net/ipv4/ip_output.o 
ip_mc_finish_output+0x65
ip_mc_finish_output+0x65/0x190:
ip_mc_finish_output at net/ipv4/ip_output.c:356

ip_mc_output+0x137
------------------
[net-next]$ ./scripts/faddr2line vmlinux ip_mc_output+0x137
ip_mc_output+0x137/0x2a0:
skb_network_header at include/linux/skbuff.h:2829
(inlined by) ip_hdr at include/linux/ip.h:21
(inlined by) ip_mc_output at net/ipv4/ip_output.c:401

igmp_rcv+0xf6
-------------
[net-next]$ ./scripts/faddr2line vmlinux igmp_rcv+0xf6
igmp_rcv+0xf6/0x2e0:
igmp_rcv at net/ipv4/igmp.c:1130

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ