Message-ID: <CAHmME9pv1x6C4TNdL6648HydD8r+txpV4hTUXOBVkrapBXH4QQ@mail.gmail.com>
Date:   Thu, 20 Jan 2022 15:22:19 +0100
From:   "Jason A. Donenfeld" <Jason@...c4.com>
To:     Eric Dumazet <edumazet@...gle.com>
Cc:     Netdev <netdev@...r.kernel.org>
Subject: pskb_expand_head always allocates in next power-of-two bucket

Hi Eric,

I saw you played with pskb_expand_head logic recently, so thought I'd
run something by you...

I've got some test code that sets up a nested tunnel routing loop for
the absolute most pathological case. To my satisfaction, packets are
dropped after a few times through the loop. Great. But then I tried
testing on powerpc64 and noticed this wasn't happening, so I decided
to jump in and figure out what's happening.

Actually, the question becomes why packets are being dropped on other
platforms, rather than why they aren't on powerpc64. Here's what my
trace shows:

A packet makes its way to ip6_finish_output2, and hits:

        if (unlikely(hh_len > skb_headroom(skb)) && dev->header_ops) {
                skb = skb_expand_head(skb, hh_len);
                if (!skb) {
                        IP6_INC_STATS(net, idev, IPSTATS_MIB_OUTDISCARDS);
                        return -ENOMEM;
                }
        }

Each time through there, skb_expand_head is called, which then calls
into pskb_expand_head, where the fun begins.

On the way into pskb_expand_head, osize =
SKB_DATA_ALIGN(skb_end_offset(skb)) = 768, and then a new data buffer
is allocated with size + SKB_DATA_ALIGN(sizeof(struct
skb_shared_info)) bytes, and osize is then set to
SKB_WITH_OVERHEAD(ksize(data)). On the way out, it's then 1728. Rinse
and repeat a few times and it blows up:

[    2.218080] skbuff: size in 768 out 1728
[    2.218426] skbuff: size in 1792 out 3776
[    2.218774] skbuff: size in 3840 out 7872
[    2.219123] skbuff: size in 7936 out 16064
[    2.219482] skbuff: size in 16128 out 32448
[    2.219837] skbuff: size in 32512 out 65216
[    2.220215] skbuff: size in 65280 out 130752
[    2.220608] skbuff: size in 130816 out 261824
[    2.221005] skbuff: size in 261888 out 523968
[    2.221401] skbuff: size in 524032 out 1048256
[    2.221807] skbuff: size in 1048320 out 2096832
[    2.222222] skbuff: size in 2096896 out 4193984
[    2.222618] skbuff: kmalloc_reserve failure for 4194368

As you can see, each time it's being reallocated in the next larger
power-of-two allocation bucket. Something seems wrong with this. If
you do kmalloc(ksize(kmalloc(ksize(data) + n)) + n), you're always
going to bump up to the next kmalloc bucket, because you're adding to
the allocated bucket size rather than to the size that was requested.

I don't understand this code super well and so far I haven't had luck
poking at it. Any idea what's going on here? Is this behavior
intentional?

With regards to my routing loop, I set out to "fix" the powerpc64
behavior so it'd be like the other platforms, but now it's looking
more like the other platforms need fixing. So the loop from my test is
still an issue, but one I'll table until I understand what's happening
in pskb_expand_head better.

Thanks,
Jason
