Message-ID: <97786b24-dc45-2638-e0f1-a4ae9b490f7a@gmail.com>
Date: Fri, 8 Mar 2019 17:59:08 -0800
From: Eric Dumazet <eric.dumazet@...il.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Daniel Borkmann <daniel@...earbox.net>,
Alexei Starovoitov <ast@...com>,
Networking <netdev@...r.kernel.org>,
Eric Dumazet <edumazet@...gle.com>
Subject: Re: [BUG] BPF splat on latest kernels
On 03/08/2019 04:29 PM, Alexei Starovoitov wrote:
> On Fri, Mar 8, 2019 at 12:33 PM Eric Dumazet <eric.dumazet@...il.com> wrote:
>>
>> Running test_progs on a LOCKDEP-enabled kernel (latest David Miller net tree)
>>
>> I got the following splat.
>>
>> It is not immediately obvious to me. Any idea?
>
> I think I saw something similar running test_maps,
> but I had other changes in the tree. And later I was
> never able to reproduce it, so figured it must have
> been my local change. Looks like it was for real.
> Do you have a reproducer?
You might need to tweak test_maps a bit [1] to stress the percpu allocator.
Then run test_maps, followed by test_progs (both from tools/testing/selftests/bpf/); that produces another splat, shown in [2]. (A rough standalone alternative to the test_maps tweak is sketched after the diff.)
[1]
diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c
index 3c627771f9654b676a6eca5a143ffcaf0d65d8bc..44c331bf114530057f0d7b2e67effe1649be132a 100644
--- a/tools/testing/selftests/bpf/test_maps.c
+++ b/tools/testing/selftests/bpf/test_maps.c
@@ -164,7 +164,7 @@ static void test_hashmap_percpu(unsigned int task, void *data)
 	int fd, i;
 
 	fd = bpf_create_map(BPF_MAP_TYPE_PERCPU_HASH, sizeof(key),
-			    sizeof(bpf_percpu(value, 0)), 2, map_flags);
+			    sizeof(bpf_percpu(value, 0)), 10000, map_flags);
 	if (fd < 0) {
 		printf("Failed to create hashmap '%s'!\n", strerror(errno));
 		exit(1);
@@ -213,8 +213,8 @@ static void test_hashmap_percpu(unsigned int task, void *data)
 	 * inserted due to max_entries limit.
 	 */
 	key = 0;
-	assert(bpf_map_update_elem(fd, &key, value, BPF_NOEXIST) == -1 &&
-	       errno == E2BIG);
+//	assert(bpf_map_update_elem(fd, &key, value, BPF_NOEXIST) == -1 &&
+//	       errno == E2BIG);
 
 	/* Check that key = 0 doesn't exist. */
 	assert(bpf_map_delete_elem(fd, &key) == -1 && errno == ENOENT);
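
If patching the selftest is inconvenient, something along these lines should apply the same percpu pressure standalone. This is only a rough sketch, not from the thread: NR_ENTRIES, the 256-CPU value buffer, and the error handling are mine; it assumes this tree's tools/lib/bpf for bpf_create_map()/bpf_map_update_elem().

/* Hypothetical standalone stress: create a large BPF_MAP_TYPE_PERCPU_HASH
 * and churn it, so pcpu_alloc()/free_percpu() run often enough to queue
 * percpu balance work. Build against tools/lib/bpf from this tree.
 */
#include <linux/bpf.h>
#include <bpf/bpf.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define NR_ENTRIES	10000	/* arbitrary; big enough to stress pcpu_alloc() */

int main(void)
{
	long long key, value[256];	/* one 8-byte slot per CPU, up to 256 CPUs */
	int fd, i;

	memset(value, 0, sizeof(value));
	fd = bpf_create_map(BPF_MAP_TYPE_PERCPU_HASH, sizeof(key),
			    sizeof(value[0]), NR_ENTRIES, 0);
	if (fd < 0) {
		printf("Failed to create map: %s\n", strerror(errno));
		exit(1);
	}
	/* Every update allocates percpu storage; every delete frees it. */
	for (i = 0; i < NR_ENTRIES; i++) {
		key = i;
		if (bpf_map_update_elem(fd, &key, value, BPF_ANY) < 0) {
			printf("update %d failed: %s\n", i, strerror(errno));
			break;
		}
	}
	for (i = 0; i < NR_ENTRIES; i++) {
		key = i;
		bpf_map_delete_elem(fd, &key);
	}
	close(fd);	/* last ref drop defers htab_map_free() to a workqueue */
	return 0;
}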
[2]
[ 215.615945] ======================================================
[ 215.622140] WARNING: possible circular locking dependency detected
[ 215.628366] 5.0.0-dbg-DEV #597 Not tainted
[ 215.632487] ------------------------------------------------------
[ 215.638702] kworker/43:158/25017 is trying to acquire lock:
[ 215.644306] 0000000098ba7f09 (&(&pool->lock)->rlock){-.-.}, at: __queue_work+0xb2/0x540
[ 215.652353]
but task is already holding lock:
[ 215.658243] 00000000ccf019e1 (pcpu_lock){..-.}, at: free_percpu+0x44/0x270
[ 215.665169]
which lock already depends on the new lock.
[ 215.673366]
the existing dependency chain (in reverse order) is:
[ 215.680873]
-> #4 (pcpu_lock){..-.}:
[ 215.685921] _raw_spin_lock_irqsave+0x3a/0x50
[ 215.690823] pcpu_alloc+0xfa/0x790
[ 215.694739] __alloc_percpu_gfp+0x12/0x20
[ 215.699299] alloc_htab_elem+0x182/0x2b0
[ 215.703779] __htab_percpu_map_update_elem+0x270/0x2c0
[ 215.709472] bpf_percpu_hash_update+0x7c/0x130
[ 215.714442] __do_sys_bpf+0x194a/0x1c10
[ 215.718831] __x64_sys_bpf+0x1a/0x20
[ 215.722946] do_syscall_64+0x5a/0x460
[ 215.727135] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 215.732742]
-> #3 (&htab->buckets[i].lock){....}:
[ 215.738942] _raw_spin_lock_irqsave+0x3a/0x50
[ 215.743838] htab_map_update_elem+0x1b6/0x3a0
[ 215.748731]
-> #2 (&rq->lock){-.-.}:
[ 215.753819] _raw_spin_lock+0x2f/0x40
[ 215.758015] task_fork_fair+0x37/0x160
[ 215.762314] sched_fork+0x206/0x310
[ 215.766353] copy_process.part.41+0x7eb/0x2370
[ 215.771350] _do_fork+0xda/0x6a0
[ 215.775120] kernel_thread+0x29/0x30
[ 215.779240] rest_init+0x22/0x260
[ 215.783086] arch_call_rest_init+0xe/0x10
[ 215.787649] start_kernel+0x4d8/0x4f9
[ 215.791887] x86_64_start_reservations+0x24/0x26
[ 215.797044] x86_64_start_kernel+0x6f/0x72
[ 215.801697] secondary_startup_64+0xa4/0xb0
[ 215.806436]
-> #1 (&p->pi_lock){-.-.}:
[ 215.811664] _raw_spin_lock_irqsave+0x3a/0x50
[ 215.816559] try_to_wake_up+0x41/0x610
[ 215.820830] wake_up_process+0x15/0x20
[ 215.825095] create_worker+0x16b/0x1e0
[ 215.829363] workqueue_init+0x1ff/0x31d
[ 215.833744] kernel_init_freeable+0x116/0x2a5
[ 215.838651] kernel_init+0xf/0x180
[ 215.842602] ret_from_fork+0x24/0x30
[ 215.846699]
-> #0 (&(&pool->lock)->rlock){-.-.}:
[ 215.852822] lock_acquire+0xa7/0x190
[ 215.856943] _raw_spin_lock+0x2f/0x40
[ 215.861124] __queue_work+0xb2/0x540
[ 215.865250] queue_work_on+0x38/0x80
[ 215.869356] free_percpu+0x231/0x270
[ 215.873482] pcpu_freelist_destroy+0x11/0x20
[ 215.878275] htab_map_free+0x48/0x110
[ 215.882505] bpf_map_free_deferred+0x3c/0x50
[ 215.887307] process_one_work+0x1f4/0x590
[ 215.891837] worker_thread+0x6f/0x430
[ 215.896041] kthread+0x132/0x170
[ 215.899827] ret_from_fork+0x24/0x30
[ 215.903941]
other info that might help us debug this:
[ 215.911951] Chain exists of:
&(&pool->lock)->rlock --> &htab->buckets[i].lock --> pcpu_lock
[ 215.923391] Possible unsafe locking scenario:
[ 215.929362]        CPU0                    CPU1
[ 215.933895]        ----                    ----
[ 215.938453]   lock(pcpu_lock);
[ 215.941528]                                lock(&htab->buckets[i].lock);
[ 215.948272]                                lock(pcpu_lock);
[ 215.953872]   lock(&(&pool->lock)->rlock);
[ 215.958000]
*** DEADLOCK ***
[ 215.963968] 3 locks held by kworker/43:158/25017:
[ 215.968706] #0: 00000000f6b8c08b ((wq_completion)"events"){+.+.}, at: process_one_work+0x177/0x590
[ 215.977788] #1: 00000000ca9d2a82 ((work_completion)(&map->work)){+.+.}, at: process_one_work+0x177/0x590
[ 215.987399] #2: 00000000ccf019e1 (pcpu_lock){..-.}, at: free_percpu+0x44/0x270
[ 215.994747]
stack backtrace:
[ 215.999145] CPU: 43 PID: 25017 Comm: kworker/43:158 Not tainted 5.0.0-dbg-DEV #597
[ 216.006732] Hardware name: Intel RML,PCH/Iota_QC_19, BIOS 2.54.0 06/07/2018
[ 216.013738] Workqueue: events bpf_map_free_deferred
[ 216.018659] Call Trace:
[ 216.021140] dump_stack+0x67/0x95
[ 216.024501] print_circular_bug.isra.36+0x1fd/0x20b
[ 216.029410] __lock_acquire+0x15b3/0x1850
[ 216.033447] ? __lock_acquire+0x3f4/0x1850
[ 216.037577] lock_acquire+0xa7/0x190
[ 216.041165] ? lock_acquire+0xa7/0x190
[ 216.044959] ? __queue_work+0xb2/0x540
[ 216.048712] _raw_spin_lock+0x2f/0x40
[ 216.052404] ? __queue_work+0xb2/0x540
[ 216.056183] __queue_work+0xb2/0x540
[ 216.059789] queue_work_on+0x38/0x80
[ 216.063403] free_percpu+0x231/0x270
[ 216.067009] pcpu_freelist_destroy+0x11/0x20
[ 216.071315] htab_map_free+0x48/0x110
[ 216.075015] bpf_map_free_deferred+0x3c/0x50
[ 216.079310] process_one_work+0x1f4/0x590
[ 216.083317] worker_thread+0x6f/0x430
[ 216.087000] ? rescuer_thread+0x3d0/0x3d0
[ 216.091047] kthread+0x132/0x170
[ 216.094315] ? kthread_stop+0x200/0x200
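
In case it helps to restate the cycle: free_percpu() is still holding pcpu_lock when it queues the percpu balance work, and __queue_work() then grabs the workqueue pool->lock; lockdep has already recorded a pool->lock --> ... --> pcpu_lock chain through the wakeup and map-update paths above, so the two orders conflict. A minimal userspace sketch of the same ABBA pattern, with pthread mutexes standing in for the kernel locks (illustrative names only, not the real primitives):

/* Illustrative only: two mutexes named after the kernel locks, acquired
 * in the two conflicting orders lockdep reports above.
 */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t pcpu_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;

/* free_percpu() path: pcpu_lock held, then queue_work() wants pool->lock. */
static void *free_percpu_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&pcpu_lock);
	pthread_mutex_lock(&pool_lock);		/* __queue_work() */
	pthread_mutex_unlock(&pool_lock);
	pthread_mutex_unlock(&pcpu_lock);
	return NULL;
}

/* Recorded chain: pool->lock held upstream, pcpu_lock reached later
 * via the wakeup/map-update paths ending in pcpu_alloc().
 */
static void *recorded_chain_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&pool_lock);
	pthread_mutex_lock(&pcpu_lock);		/* pcpu_alloc() */
	pthread_mutex_unlock(&pcpu_lock);
	pthread_mutex_unlock(&pool_lock);
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, free_percpu_path, NULL);
	pthread_create(&t2, NULL, recorded_chain_path, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	printf("finished (with unlucky timing this ordering deadlocks)\n");
	return 0;
}

Run the two paths concurrently often enough and they eventually wedge, which is exactly the interleaving lockdep warns about before it happens.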