Message-ID: <CAEf4BzZT5SSGaEBP3ep6ZZkEqhnznzaAo=EUB-juDzbLwjyErA@mail.gmail.com>
Date: Tue, 25 Feb 2025 13:47:58 -0800
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Stephen Brennan <stephen.s.brennan@...cle.com>
Cc: Alexei Starovoitov <alexei.starovoitov@...il.com>, Masahiro Yamada <masahiroy@...nel.org>,
Andrii Nakryiko <andrii@...nel.org>, Nicolas Schier <nicolas@...sle.eu>, Kees Cook <kees@...nel.org>,
KP Singh <kpsingh@...nel.org>, Martin KaFai Lau <martin.lau@...ux.dev>,
Sami Tolvanen <samitolvanen@...gle.com>, Eduard Zingerman <eddyz87@...il.com>,
linux-arch <linux-arch@...r.kernel.org>, Stanislav Fomichev <sdf@...ichev.me>,
Kent Overstreet <kent.overstreet@...ux.dev>, Pasha Tatashin <pasha.tatashin@...een.com>,
Jiri Olsa <jolsa@...nel.org>, John Fastabend <john.fastabend@...il.com>,
Jann Horn <jannh@...gle.com>, Ard Biesheuvel <ardb@...nel.org>,
Yonghong Song <yonghong.song@...ux.dev>, Hao Luo <haoluo@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Linux Kbuild mailing list <linux-kbuild@...r.kernel.org>, Daniel Borkmann <daniel@...earbox.net>,
Arnd Bergmann <arnd@...db.de>, Nathan Chancellor <nathan@...nel.org>, linux-debuggers@...r.kernel.org,
Alexei Starovoitov <ast@...nel.org>, Song Liu <song@...nel.org>, LKML <linux-kernel@...r.kernel.org>,
bpf <bpf@...r.kernel.org>
Subject: Re: [PATCH 2/2] btf: Add the option to include global variable types

On Tue, Feb 18, 2025 at 3:10 PM Stephen Brennan
<stephen.s.brennan@...cle.com> wrote:
>
> Alexei Starovoitov <alexei.starovoitov@...il.com> writes:
> > On Tue, Feb 11, 2025 at 3:59 PM Stephen Brennan
> [...]
> >> We can dust that off and include it for a new version of this series.
> >> I'd be curious of what you'd like to see for kernel modules? A
> >> three-level tree would be too complex, in my opinion.
> >
> > What is the use case for vars in kernel modules?
>
> The use case would be the same as for the core kernel. My primary
> motivation is to allow drgn to understand the types of global variables,
> and that extends to kernel modules too.
>
> >> module BTF size increased by 53.2%.
> >
> > This is the sum of all mods with vars divided by
> > the sum of all mods without?
>
> That was a poorly done comparison, so let me provide this one that I did
> using 6.13 and these patches. It was essentially a localmodconfig for a
> VM instance, so I could still do better by picking a popular
> distribution config. But I think this is far more representative.
>
> MODULE BASE COMP CHG PCT
> drm.ko 115833 123410 7577 6.54%
> iscsi_boot_sysfs.ko 2627 5380 2753 104.80%
> joydev.ko 1816 2289 473 26.05%
> libcxgbi.ko 24556 25266 710 2.89%
> drm_vram_helper.ko 22325 22751 426 1.91%
> nvme-tcp.ko 25044 25973 929 3.71%
> vfat.ko 3448 3953 505 14.65%
> btrfs.ko 275139 343686 68547 24.91%
> libiscsi.ko 21177 21977 800 3.78%
> xt_owner.ko 449 803 354 78.84%
> nft_ct.ko 4912 6157 1245 25.35%
> iscsi_ibft.ko 3967 4463 496 12.50%
> pcspkr.ko 283 682 399 140.99%
> crc32-pclmul.ko 390 771 381 97.69%
> nf_conntrack.ko 23686 28191 4505 19.02%
> iscsi_tcp.ko 16827 17750 923 5.49%
> nft_fib.ko 835 1117 282 33.77%
> nf_reject_ipv6.ko 699 981 282 40.34%
> rfkill.ko 4233 6410 2177 51.43%
> dm-region-hash.ko 6214 6496 282 4.54%
> cxgb3i.ko 35469 37078 1609 4.54%
> dm-mirror.ko 7576 8191 615 8.12%
> pvpanic-pci.ko 174 574 400 229.89%
> crct10dif-pclmul.ko 146 525 379 259.59%
> nvme-fabrics.ko 17341 18124 783 4.52%
> kvm-amd.ko 47302 51914 4612 9.75%
> crc8.ko 221 405 184 83.26%
> ib_iser.ko 27769 29116 1347 4.85%
> sg.ko 4234 5656 1422 33.59%
> intel_rapl_common.ko 5678 8446 2768 48.75%
> bochs.ko 35643 36997 1354 3.80%
> sha1-ssse3.ko 790 1305 515 65.19%
> kvm-intel.ko 53802 59220 5418 10.07%
> nft_chain_nat.ko 279 714 435 155.91%
> vmlinux 5484970 7330096 1845126 33.64%
> sha256-ssse3.ko 851 1378 527 61.93%
> nf_nat.ko 6341 7240 899 14.18%
> configs.ko 72 256 184 255.56%
> xt_comment.ko 151 507 356 235.76%
> ccp.ko 30433 34782 4349 14.29%
> cxgb3.ko 44981 47504 2523 5.61%
> crypto_simd.ko 1331 1613 282 21.19%
> iptable_filter.ko 855 1456 601 70.29%
> qedi.ko 70653 72786 2133 3.02%
> drm_kms_helper.ko 63238 65000 1762 2.79%
> cnic.ko 117074 117790 716 0.61%
> failover.ko 780 1216 436 55.90%
> nft_redir.ko 874 1529 655 74.94%
> serio_raw.ko 708 1234 526 74.29%
> nf_defrag_ipv6.ko 1520 2253 733 48.22%
> nf_defrag_ipv4.ko 306 770 464 151.63%
> nft_reject_ipv4.ko 517 939 422 81.62%
> nft_nat.ko 1192 1732 540 45.30%
> nft_reject_inet.ko 554 976 422 76.17%
> fuse.ko 32181 41859 9678 30.07%
> nft_compat.ko 3705 4404 699 18.87%
> zstd_compress.ko 42597 43622 1025 2.41%
> tls.ko 15140 20683 5543 36.61%
> virtio_pci.ko 8456 9193 737 8.72%
> blake2b_generic.ko 1364 1699 335 24.56%
> cryptd.ko 3697 4297 600 16.23%
> xor.ko 1358 1879 521 38.37%
> intel_rapl_msr.ko 2851 3440 589 20.66%
> kvm.ko 177060 256377 79317 44.80%
> cxgb4.ko 215865 220844 4979 2.31%
> bnx2i.ko 39524 41477 1953 4.94%
> dm-round-robin.ko 1795 2123 328 18.27%
> virtio_pci_legacy_dev.ko 909 1191 282 31.02%
> qla4xxx.ko 79040 82694 3654 4.62%
> nfs.ko 108350 169642 61292 56.57%
> libata.ko 47301 66188 18887 39.93%
> ghash-clmulni-intel.ko 578 997 419 72.49%
> nf_reject_ipv4.ko 706 988 282 39.94%
> nft_reject.ko 820 1196 376 45.85%
> sunrpc.ko 127496 197841 70345 55.17%
> nft_fib_ipv4.ko 803 1257 454 56.54%
> scsi_transport_iscsi.ko 40419 57633 17214 42.59%
> lockd.ko 36144 42137 5993 16.58%
> drm_shmem_helper.ko 32555 33043 488 1.50%
> nvme-core.ko 50275 58298 8023 15.96%
> iw_cm.ko 13405 14796 1391 10.38%
> mdio.ko 857 1041 184 21.47%
> bnx2.ko 20354 21611 1257 6.18%
> net_failover.ko 1742 2187 445 25.55%
> ip_set.ko 11812 13093 1281 10.84%
> libcxgb.ko 8698 8980 282 3.24%
> dm-multipath.ko 8124 8898 774 9.53%
> grace.ko 462 890 428 92.64%
> virtio_net.ko 12322 14896 2574 20.89%
> qed.ko 228735 232231 3496 1.53%
> cdc-acm.ko 2923 3679 756 25.86%
> i2c-piix4.ko 1124 2341 1217 108.27%
> pvpanic-mmio.ko 177 625 448 253.11%
> virtio_scsi.ko 3154 3898 744 23.59%
> uio.ko 2602 4295 1693 65.07%
> nft_fib_ipv6.ko 956 1410 454 47.49%
> cec.ko 28370 29266 896 3.16%
> qemu_fw_cfg.ko 1601 3476 1875 117.11%
> ttm.ko 23672 25727 2055 8.68%
> sd_mod.ko 9976 13030 3054 30.61%
> xfs.ko 574594 926637 352043 61.27%
> libiscsi_tcp.ko 17444 17911 467 2.68%
> ib_cm.ko 32324 62373 30049 92.96%
> aesni-intel.ko 3370 4922 1552 46.05%
> drm_client_lib.ko 27449 27794 345 1.26%
> virtio_pci_modern_dev.ko 2537 2819 282 11.12%
> rdma_cm.ko 32504 51823 19319 59.44%
> fat.ko 11958 13297 1339 11.20%
> dm-log.ko 6529 6986 457 7.00%
> pata_acpi.ko 9231 9700 469 5.08%
> ata_piix.ko 10998 12598 1600 14.55%
> ipt_REJECT.ko 956 1311 355 37.13%
> drm_ttm_helper.ko 33160 33544 384 1.16%
> be2iscsi.ko 55078 56993 1915 3.48%
> i2c-smbus.ko 582 973 391 67.18%
> cuse.ko 8435 9241 806 9.56%
> nft_fib_inet.ko 579 995 416 71.85%
> ib_core.ko 103656 123701 20045 19.34%
> pulse8-cec.ko 9153 9890 737 8.05%
> pvpanic.ko 494 1087 593 120.04%
> dm-mod.ko 31377 35265 3888 12.39%
> raid6_pq.ko 2774 4207 1433 51.66%
> nft_reject_ipv6.ko 517 939 422 81.62%
> cxgb4i.ko 47490 49021 1531 3.22%
> ata_generic.ko 9008 9666 658 7.30%
> vboxvideo.ko 47622 48844 1222 2.57%
> ip_tables.ko 3109 3564 455 14.63%
>
> ALL MODS 9153268 11895301 2742033 29.96%
> vmlinux 5484970 7330096 1845126 33.64%
> TOTAL 14638238 19225397 4587159 31.34%
>
> So this shows a 1.8 MB increase in vmlinux BTF size, or 33.6%.
> And for these modules in aggregate, an increase of 2.7 MB, or 30.0%.
>
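For anyone who wants to re-derive the summary figures, here is a minimal sketch that recomputes the change and percentage from the BASE and COMP byte counts in the three summary rows. The `growth` helper and the `SUMMARY` dict are names made up for this illustration; only the numbers come from the table.

```python
# Byte counts copied from the summary rows of the table above.
SUMMARY = {
    "ALL MODS": (9153268, 11895301),
    "vmlinux": (5484970, 7330096),
    "TOTAL": (14638238, 19225397),
}


def growth(base, comp):
    """Return (absolute change in bytes, percentage change relative to base)."""
    change = comp - base
    return change, 100.0 * change / base


for name, (base, comp) in SUMMARY.items():
    change, pct = growth(base, comp)
    # Show both decimal megabytes and binary mebibytes for the change.
    print(f"{name:8s} {change:>8d} bytes  {change / 1e6:.1f} MB  "
          f"{change / 2**20:.1f} MiB  {pct:.2f}%")
```

The percentages match the table. Note that the 1.8 and 2.7 figures correspond to decimal megabytes; in binary units the module delta is closer to 2.6 MiB.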
> > Any outliers there?
> > I would expect modules to have few global variables.
>
> In terms of outliers, there are groups that stand out to me:
>
> 1. Large percentage increases are almost always for modules that had
> very tiny BTF before. The module system inherently creates a few
> global variables for each module, so there's always a slight constant
> increase of the BTF size (184 bytes, as far as I can tell), and in those
> cases it can be quite a large percentage. Here's an example,
> "configs.ko", which comes from enabling CONFIG_IKCONFIG:
>
> BEFORE:
> $ bpftool btf dump file ../build_pahole_novars/kernel/configs.ko -B ../build_pahole_novars/vmlinux
> [127877] CONST '(anon)' type_id=11124
> [127878] ARRAY '(anon)' type_id=127877 index_type_id=21 nr_elems=1
> [127879] CONST '(anon)' type_id=127878
>
> AFTER:
> $ bpftool btf dump file ../build_pahole_vars/kernel/configs.ko -B ../build_pahole_vars/vmlinux
> [162827] CONST '(anon)' type_id=11124
> [162828] ARRAY '(anon)' type_id=162827 index_type_id=21 nr_elems=1
> [162829] CONST '(anon)' type_id=162828
> [162830] VAR '____versions' type_id=162829, linkage=static
> [162831] DATASEC '__versions' size=64 vlen=1
> type_id=162830 offset=0 size=64 (VAR '____versions')
> [162832] VAR 'orc_header' type_id=8667, linkage=static
> [162833] DATASEC '.orc_header' size=20 vlen=1
> type_id=162832 offset=0 size=20 (VAR 'orc_header')
> [162834] VAR '__this_module' type_id=312, linkage=global
> [162835] DATASEC '.gnu.linkonce.this_module' size=1344 vlen=1
> type_id=162834 offset=0 size=1344 (VAR '__this_module')
>
> What I think is interesting is that the types in that module were
> totally useless to begin with, because they were used by a variable
> which didn't even get emitted. So while this is a substantial
> percentage-wise increase, I think it's a net improvement for this and
> other modules.
>
> 2. The largest absolute increases come from large, complex modules like
> xfs, kvm, sunrpc, btrfs, etc. For example, xfs had 5696 VAR
> declarations. What is disappointing is how much of this is due to
> automatically-generated "variables" from macros (e.g. tracepoints).
> Here is a list of variable prefixes like that:
>
> print_fmt_*
> trace_event_fields_*
> trace_event_type_funcs_*
> event_*
> __SCK__tp_func_*
> __bpf_trace_tp_map_*
> __event_*
> event_class_*
> TRACE_SYSTEM_*
> __TRACE_SYSTEM_*
> __tracepoint_*
>
> These are, unfortunately, all valid declarations produced by macros and
> they correspond to valid symbols as well. If you look at the kallsyms
> for the modules (and core kernel), these variables are present there as
> well. It may indeed make sense to have kallsyms entries for them: I
> don't know.
>
> These are all, as far as I'm concerned, totally uninteresting types. If
> you want to access any of this data, you probably already know its type
> and wouldn't need a BTF declaration. Unfortunately, the flip side is
> that I don't think we have a good way to automatically detect these,
> outside of prefix matching, which quickly goes out of date as the kernel
> changes, and can have false positives as well. For kernel modules, many
> of these may appear in separate ELF sections, but for vmlinux, they
> don't. I'd be happy to eliminate types for these auto-generated kinds of
> variables, if we could somehow annotate them so that pahole knows to
> ignore them. For instance, maybe we could use
>
> __attribute__((btf_decl_tag("btf_omit")))
>
> as an instruction to pahole to omit declarations for these things?
>
Can't we just put all such tracepoint-related variables into some
separate ELF section, and teach pahole to ignore global variables
from that section? btf_decl_tag is a similar idea, but (currently)
won't work for GCC-built kernels. So I'd go with the ELF section.
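
To make the difference between the two filtering strategies concrete, here is a rough sketch. It is not pahole code and does not use any pahole interface. The section name `.data..btf_omit` and the variable/section pairs are hypothetical, invented purely for illustration; only the prefix patterns are taken from the list quoted above.

```python
from fnmatch import fnmatchcase

# Prefix patterns listed earlier in this thread.
GENERATED_PATTERNS = [
    "print_fmt_*", "trace_event_fields_*", "trace_event_type_funcs_*",
    "event_*", "__SCK__tp_func_*", "__bpf_trace_tp_map_*", "__event_*",
    "event_class_*", "TRACE_SYSTEM_*", "__TRACE_SYSTEM_*", "__tracepoint_*",
]

# Hypothetical section name; nothing in the kernel defines it today.
OMIT_SECTIONS = {".data..btf_omit"}

# Hypothetical (variable name, ELF section) pairs, for illustration only.
VARS = [
    ("__tracepoint_foo", ".data..btf_omit"),
    ("print_fmt_foo", ".data..btf_omit"),
    ("event_log_level", ".data"),  # hand-written, but matches "event_*"
    ("foo_stats", ".bss"),
]


def keep_by_prefix(variables):
    """Keep variables whose name matches none of the generated-name patterns."""
    return [name for name, _ in variables
            if not any(fnmatchcase(name, pat) for pat in GENERATED_PATTERNS)]


def keep_by_section(variables):
    """Keep variables that do not live in an omitted section."""
    return [name for name, section in variables
            if section not in OMIT_SECTIONS]


print(keep_by_prefix(VARS))
print(keep_by_section(VARS))
```

In this made-up data set, prefix matching also drops `event_log_level`, which is the kind of false positive mentioned above, while the section-based filter keeps it and needs no pattern list to maintain.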
> Thanks,
> Stephen
>
> > So before we decide on what to do with vars in mods let's figure out
> > the need.