Date:   Tue, 3 May 2022 05:14:05 +0000
From:   Chris Packham <Chris.Packham@...iedtelesis.co.nz>
To:     Lokesh Dhoundiyal <Lokesh.Dhoundiyal@...iedtelesis.co.nz>,
        "wenxu@...oud.cn" <wenxu@...oud.cn>,
        David Miller <davem@...emloft.net>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: Re: Regarding _skb_refdst memory alloc/dealloc

+ Dave and Wen

On 3/05/22 15:10, Lokesh Dhoundiyal wrote:

> Hi,
>
> I have the tunnel destination entry set via skb_dst_set inside
> ip_tunnel_rcv. I wish to release the memory referenced by
> skb->_skb_refdst after use.
>
> Could you please advise the api to use for it. I am assuming that it is
> skb_dst_drop, Is that correct?

A bit more context: we've been seeing a memory leak that seems to have 
appeared when we updated our Linux kernel from v4.4.16 to v5.7.19. The 
test scenario involves learning OSPF routes over a tunnel. I don't 
imagine there's anything particularly special about OSPF, just that it 
uses multicast traffic to communicate.

Some debugging pointed us at the kmalloc-256 slab, and kmemleak seemed to 
confirm the suspicion:

unreferenced object 0x8000000044beb900 (size 256):
   comm "softirq", pid 0, jiffies 4294984455 (age 35.980s)
   hex dump (first 32 bytes):
     00 00 00 00 00 00 00 00 80 00 00 00 05 13 74 80 ..............t.
     80 00 00 00 04 9b bf f9 00 00 00 00 00 00 00 00 ................
   backtrace:
     [<00000000f83947e0>] __kmalloc+0x1e8/0x300
     [<00000000b7ed8dca>] metadata_dst_alloc+0x24/0x58
     [<0000000081d32c20>] __ipgre_rcv+0x100/0x2b8
     [<00000000824f6cf1>] gre_rcv+0x178/0x540
     [<00000000ccd4e162>] gre_rcv+0x7c/0xd8
     [<00000000c024b148>] ip_protocol_deliver_rcu+0x124/0x350
     [<000000006a483377>] ip_local_deliver_finish+0x54/0x68
     [<00000000d9271b3a>] ip_local_deliver+0x128/0x168
     [<00000000bd4968ae>] xfrm_trans_reinject+0xb8/0xf8
     [<0000000071672a19>] tasklet_action_common.isra.16+0xc4/0x1b0
     [<0000000062e9c336>] __do_softirq+0x1fc/0x3e0
     [<00000000013d7914>] irq_exit+0xc4/0xe0
     [<00000000a4d73e90>] plat_irq_dispatch+0x7c/0x108
     [<000000000751eb8e>] handle_int+0x16c/0x178
     [<00000000a0c43b3e>] put_object+0x20/0xd8
     [<000000009439acbb>] scan_gray_list+0x18c/0x268

It appears that the leak is due to commit c0d59da79534 ("ip_gre: Make 
none-tun-dst gre tunnel store tunnel info as metadat_dst in recv"). 
Prior to c0d59da79534 we'd only allocate a new dst if tunnel->collect_md 
were true, but now we'll also allocate one if tnl_params->daddr == 0. 
When ip_route_input_mc() is eventually called, it calls skb_dst_set(), 
which overwrites skb->_skb_refdst without releasing the dst already 
stored there, so whatever was in skb->_skb_refdst leaks.
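
To make the suspected sequence concrete, here is a small userspace model 
of it. This is a sketch, not kernel code: model_dst, model_skb, 
model_pool and the model_* helpers are made-up names, and the refcount 
is a plain int. Only the "set overwrites without releasing" behaviour is 
meant to mirror skb_dst_set().

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; none of these are kernel definitions. */
struct model_dst {
	int refcnt;		/* 0 means the slot is free */
};

struct model_skb {
	struct model_dst *dst;	/* stands in for skb->_skb_refdst */
};

#define MODEL_POOL_SIZE 4
static struct model_dst model_pool[MODEL_POOL_SIZE];

/* Hand out a free slot with one reference, or NULL if the pool is full. */
static struct model_dst *model_dst_alloc(void)
{
	for (size_t i = 0; i < MODEL_POOL_SIZE; i++) {
		if (model_pool[i].refcnt == 0) {
			model_pool[i].refcnt = 1;
			return &model_pool[i];
		}
	}
	return NULL;
}

/* Count slots that still hold a reference. */
static int model_live_dsts(void)
{
	int live = 0;

	for (size_t i = 0; i < MODEL_POOL_SIZE; i++)
		live += model_pool[i].refcnt > 0;
	return live;
}

/* Like skb_dst_set(): overwrite the pointer; the old dst is not released. */
static void model_skb_dst_set(struct model_skb *skb, struct model_dst *dst)
{
	skb->dst = dst;
}

/* Like skb_dst_drop(): put the held reference and clear the pointer. */
static void model_skb_dst_drop(struct model_skb *skb)
{
	if (skb->dst) {
		assert(skb->dst->refcnt > 0);
		skb->dst->refcnt--;
		skb->dst = NULL;
	}
}

/* One multicast packet through the tunnel; returns dsts still held after. */
int model_rx_leaks(void)
{
	struct model_skb skb = { NULL };

	for (size_t i = 0; i < MODEL_POOL_SIZE; i++)
		model_pool[i].refcnt = 0;

	model_skb_dst_set(&skb, model_dst_alloc());	/* tunnel rx: metadata dst */
	model_skb_dst_set(&skb, model_dst_alloc());	/* mc route: overwrites it */
	model_skb_dst_drop(&skb);			/* skb freed: route dst only */
	return model_live_dsts();
}
```

In this model one dst is still held after the packet has been freed, 
which is the pattern the kmemleak report above would be consistent with.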

A naive fix would be to call skb_dst_drop() in ip_route_input_mc() just 
before calling skb_dst_set() (hence Lokesh's question), but I'm worried 
we've missed something. I can't rule out that this has already been 
fixed or is due to other changes in our kernel fork. I can't see 
any commit tagged "Fixes: c0d59da79534", so if it has been fixed, 
c0d59da79534 doesn't appear to have been named as the culprit. I've 
asked Lokesh to try to reproduce the problem with the latest kernel so 
we can rule out any changes we've made and confirm that the leak still 
exists.
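
For illustration, the naive fix can be sketched in a userspace model 
(again not kernel code; model_dst, model_skb, model_pool and the model_* 
helpers are hypothetical names). It only shows the ordering of 
drop-then-set. Whether ip_route_input_mc() is the right place for that 
in the real kernel is exactly the open question.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; none of these are kernel definitions. */
struct model_dst {
	int refcnt;		/* 0 means the slot is free */
};

struct model_skb {
	struct model_dst *dst;	/* stands in for skb->_skb_refdst */
};

#define MODEL_POOL_SIZE 4
static struct model_dst model_pool[MODEL_POOL_SIZE];

/* Hand out a free slot with one reference, or NULL if the pool is full. */
static struct model_dst *model_dst_alloc(void)
{
	for (size_t i = 0; i < MODEL_POOL_SIZE; i++) {
		if (model_pool[i].refcnt == 0) {
			model_pool[i].refcnt = 1;
			return &model_pool[i];
		}
	}
	return NULL;
}

/* Count slots that still hold a reference. */
static int model_live_dsts(void)
{
	int live = 0;

	for (size_t i = 0; i < MODEL_POOL_SIZE; i++)
		live += model_pool[i].refcnt > 0;
	return live;
}

/* Like skb_dst_set(): overwrite the pointer; the old dst is not released. */
static void model_skb_dst_set(struct model_skb *skb, struct model_dst *dst)
{
	skb->dst = dst;
}

/* Like skb_dst_drop(): put the held reference and clear the pointer. */
static void model_skb_dst_drop(struct model_skb *skb)
{
	if (skb->dst) {
		assert(skb->dst->refcnt > 0);
		skb->dst->refcnt--;
		skb->dst = NULL;
	}
}

/* Same packet path, but the route step drops the old dst before setting. */
int model_rx_with_drop(void)
{
	struct model_skb skb = { NULL };
	struct model_dst *route;

	for (size_t i = 0; i < MODEL_POOL_SIZE; i++)
		model_pool[i].refcnt = 0;

	model_skb_dst_set(&skb, model_dst_alloc());	/* tunnel rx: metadata dst */

	route = model_dst_alloc();			/* mc route lookup result */
	model_skb_dst_drop(&skb);			/* naive fix: release old dst */
	model_skb_dst_set(&skb, route);

	model_skb_dst_drop(&skb);			/* skb freed */
	return model_live_dsts();
}
```

With the drop in place the model ends with no dst held. It says nothing 
about other callers that might rely on the metadata dst surviving past 
the route lookup.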

I wanted to get this out now just in case it rings any bells or if 
someone has got a tunnel+multicast setup that might show the problem.

Thanks,
Chris

