lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 1 Mar 2019 14:03:23 -0800
From:   Guenter Roeck <linux@...ck-us.net>
To:     Alexei Starovoitov <ast@...nel.org>
Cc:     Daniel Borkmann <daniel@...earbox.net>, netdev@...r.kernel.org,
        bpf@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] bpf: enable program stats

Hi,

On Mon, Feb 25, 2019 at 02:28:39PM -0800, Alexei Starovoitov wrote:
> JITed BPF programs are indistinguishable from kernel functions, but unlike
> kernel code BPF code can be changed often.
> Typical approach of "perf record" + "perf report" profiling and tuning of
> kernel code works just as well for BPF programs, but kernel code doesn't
> need to be monitored whereas BPF programs do.
> Users load and run large amount of BPF programs.
> These BPF stats allow tools monitor the usage of BPF on the server.
> The monitoring tools will turn sysctl kernel.bpf_stats_enabled
> on and off for few seconds to sample average cost of the programs.
> Aggregated data over hours and days will provide an insight into cost of BPF
> and alarms can trigger in case given program suddenly gets more expensive.
> 
> The cost of two sched_clock() per program invocation adds ~20 nsec.
> Fast BPF progs (like selftests/bpf/progs/test_pkt_access.c) will slow down
> from ~10 nsec to ~30 nsec.
> static_key minimizes the cost of the stats collection.
> There is no measurable difference before/after this patch
> with kernel.bpf_stats_enabled=0
> 

This patch causes my qemu tests for 'parisc' to crash. Reverting this patch
as well as "bpf: expose program stats via bpf_prog_info" fixes the problem.

Crash log and bisect results are attached. Bisect ends with the merge;
I identified the two patches manually.

I suspect that
	prog->aux->stats = alloc_percpu_gfp(struct bpf_prog_stats, gfp_flags);
	...
	u64_stats_init(&prog->aux->stats->syncp);
may be wrong. At the very least it looks odd, and I don't find a similar use
of u64_stats_init() anywhere else in the kernel.

Guenter

---
Crash logs:

..
PDC_CHASSIS: Fault (1), CHASSIS  0
Backtrace:
 [<109c6dac>] bpf_prog_create+0x58/0xc4
 [<10131fd0>] 0x10131fd0
 [<101314ac>] 0x101314ac
 [<1018e1c4>] do_one_initcall+0x54/0x1a8
 [<101011e8>] 0x101011e8
 [<10b218d0>] kernel_init+0x1c/0x150
 [<1019601c>] ret_from_kernel_thread+0x1c/0x24
Kernel Fault: Code=18 (Data memory protection/unaligned access trap) at addr 10193fd0
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc8-next-20190301 #1
Hardware name: 9000/778/B160L
     YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00000000000001000011111100001111 Not tainted
r00-03  00043f0f 1017ec10 10260078 2f83c380
r04-07  00000dc0 00810000 2f834620 10dbbb44
r08-11  10dcf000 00000002 10179528 10e4cc10
r12-15  10e4cc10 1013c010 00000008 f0018f04
r16-19  f001672e 00000001 00000028 1131d000
r20-23  00000000 11370fd8 00000018 00000006
r24-27  00000018 00000000 2f834620 10db9c10
r28-31  10193fc0 00000400 2f83c400 11370fc0
sr00-03  00000000 00000000 00000000 00000000
sr04-07  00000000 00000000 00000000 00000000
IASQ: 00000000 00000000 IAOQ: 10260088 1026008c
 IIR: 6b800020    ISR: 00000000  IOR: 10193fd0
 CPU:        0   CR30: 2f83c000 CR31: 00000000
 ORIG_R28: 00000000
 IAOQ[0]: bpf_prog_alloc+0x50/0x8c
 IAOQ[1]: bpf_prog_alloc+0x54/0x8c
 RP(r2): bpf_prog_alloc+0x40/0x8c
Backtrace:
 [<109c6dac>] bpf_prog_create+0x58/0xc4
 [<10131fd0>] 0x10131fd0
 [<101314ac>] 0x101314ac
 [<1018e1c4>] do_one_initcall+0x54/0x1a8
 [<101011e8>] 0x101011e8
 [<10b218d0>] kernel_init+0x1c/0x150
 [<1019601c>] ret_from_kernel_thread+0x1c/0x24
SeaBIOS wants SYSTEM HALT.

Note: The missing call is from ptp_classifier_init(), called from
sock_init(). The actual crash occurs in u64_stats_init().

---
Another crash log:

...
random: get_random_u32 called from bucket_table_alloc+0xa0/0x1e4 with crng_init=0
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
futex hash table entries: 1024 (order: 3, 49152 bytes)
xor: measuring software checksum speed
   8regs     :  1326.000 MB/sec
   8regs_prefetch:   719.000 MB/sec
   32regs    :   722.000 MB/sec
   32regs_prefetch:   690.000 MB/sec
xor: using function: 8regs (1326.000 MB/sec)
PDC_CHASSIS: Fault (1), CHASSIS  0
Backtrace:
 [<109c6dac>] bpf_prog_create+0x58/0xc4
 [<10131fd0>] 0x10131fd0
 [<101314ac>] 0x101314ac
 [<1018e1c4>] do_one_initcall+0x54/0x1a8
 [<101011e8>] 0x101011e8
 [<10b218d0>] kernel_init+0x1c/0x150
 [<1019601c>] ret_from_kernel_thread+0x1c/0x24
Kernel Fault: Code=18 (Data memory protection/unaligned access trap) at addr 10193fd0
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.0.0-rc8-next-20190301 #1
Hardware name: 9000/778/B160L
     YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00000000000001100011111100001111 Not tainted
r00-03  00063f0f 1017ec10 10260078 2fc3c380
r04-07  00000dc0 00810000 2fc356e0 10dbbb44
r08-11  10dcf000 00000002 10179528 10e4cc10
r12-15  10e4cc10 1013c010 00000008 f0018f04
r16-19  f001672e 00000001 00000028 1131d000
r20-23  00000000 11370fd8 00000018 00000006
r24-27  00000018 00000000 2fc356e0 10db9c10
r28-31  10193fc0 00000400 2fc3c400 11370fc0
sr00-03  00000000 00000000 00000000 00000000
sr04-07  00000000 00000000 00000000 00000000
IASQ: 00000000 00000000 IAOQ: 10260088 1026008c
 IIR: 6b800020    ISR: 00000000  IOR: 10193fd0
 CPU:        0   CR30: 2fc3c000 CR31: 00000000
 ORIG_R28: 00000000
 IAOQ[0]: bpf_prog_alloc+0x50/0x8c
 IAOQ[1]: bpf_prog_alloc+0x54/0x8c
 RP(r2): bpf_prog_alloc+0x40/0x8c
Backtrace:
 [<109c6dac>] bpf_prog_create+0x58/0xc4
 [<10131fd0>] 0x10131fd0
 [<101314ac>] 0x101314ac
 [<1018e1c4>] do_one_initcall+0x54/0x1a8
 [<101011e8>] 0x101011e8
 [<10b218d0>] kernel_init+0x1c/0x150
 [<1019601c>] ret_from_kernel_thread+0x1c/0x24
SeaBIOS wants SYSTEM HALT.

---
# bad: [42fd8df9d1d9c83046226e771a4f5f3d05a706ff] Add linux-next specific files for 20190228
# good: [5908e6b738e3357af42c10e1183753c70a0117a9] Linux 5.0-rc8
git bisect start 'HEAD' 'v5.0-rc8'
# good: [7c3eb312dbd9653e6a932eec87ae55d0a0cc43b6] Merge remote-tracking branch 'net-next/master'
git bisect good 7c3eb312dbd9653e6a932eec87ae55d0a0cc43b6
# bad: [f7ae930a8cb9ca6429ac096c7fdac52560732524] Merge remote-tracking branch 'spi/for-next'
git bisect bad f7ae930a8cb9ca6429ac096c7fdac52560732524
# bad: [45821a15aebf0d27a37dfa9e8fc0ed4e6bfa9307] Merge remote-tracking branch 'drm/drm-next'
git bisect bad 45821a15aebf0d27a37dfa9e8fc0ed4e6bfa9307
# good: [148fb2e2e38769d99e3dfafdc03d06eeedcf7ad3] Merge branch 'drm-next-5.1' of git://people.freedesktop.org/~agd5f/linux into drm-next
git bisect good 148fb2e2e38769d99e3dfafdc03d06eeedcf7ad3
# good: [a5f2fafece141ef3509e686cea576366d55cabb6] Merge https://gitlab.freedesktop.org/drm/msm into drm-next
git bisect good a5f2fafece141ef3509e686cea576366d55cabb6
# bad: [07b71f8ab1b2714eef4f00eed8086fbeb30f85bf] Merge remote-tracking branch 'mtd/mtd/next'
git bisect bad 07b71f8ab1b2714eef4f00eed8086fbeb30f85bf
# bad: [3e23a64449722b73f941341f3a4b91a4963d3b0e] Merge remote-tracking branch 'wireless-drivers-next/master'
git bisect bad 3e23a64449722b73f941341f3a4b91a4963d3b0e
# bad: [236de6acd6dd91716cdc468e6ba3b413324d4064] Merge remote-tracking branch 'netfilter-next/master'
git bisect bad 236de6acd6dd91716cdc468e6ba3b413324d4064
# good: [d2e614cb0795d935aee879e47aab231247274f13] samples: bpf: fix: broken sample regarding removed function
git bisect good d2e614cb0795d935aee879e47aab231247274f13
# good: [03fe5efc4c9c8c7157092d0cbb5d1bbea629ce48] netfilter: nat: remove csum_update hook
git bisect good 03fe5efc4c9c8c7157092d0cbb5d1bbea629ce48
# good: [35acfbab6e5b229ea3c4807d1949c3ab8fda9818] netfilter: remove unneeded switch fall-through
git bisect good 35acfbab6e5b229ea3c4807d1949c3ab8fda9818
# good: [a01cbae57ec29b161d42ee1caa4ffffda5d519c2] netfilter: nft_set_hash: bogus element self comparison from deactivation path
git bisect good a01cbae57ec29b161d42ee1caa4ffffda5d519c2
# good: [123f89c8aab782854d8b27b1c01f3f2cff33c20d] netfilter: nft_set_hash: remove nft_hash_key()
git bisect good 123f89c8aab782854d8b27b1c01f3f2cff33c20d
# bad: [688a5cf30666a8145c3ef8da6d089e6a9cdc414d] Merge remote-tracking branch 'bpf-next/master'
git bisect bad 688a5cf30666a8145c3ef8da6d089e6a9cdc414d
# first bad commit: [688a5cf30666a8145c3ef8da6d089e6a9cdc414d] Merge remote-tracking branch 'bpf-next/master'

Powered by blists - more mailing lists