lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Tue, 8 Jan 2019 14:08:49 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Vlastimil Babka <vbabka@...e.cz>
Cc:     syzbot <syzbot+93d94a001cfbce9e60e1@...kaller.appspotmail.com>,
        aarcange@...hat.com, akpm@...ux-foundation.org,
        kirill.shutemov@...ux.intel.com, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, linux@...inikbrodowski.net, mhocko@...e.com,
        rientjes@...gle.com, syzkaller-bugs@...glegroups.com,
        xieyisheng1@...wei.com, zhongjiang@...wei.com,
        Mel Gorman <mgorman@...hsingularity.net>,
        Ingo Molnar <mingo@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>, hannes@...xchg.org
Subject: Re: possible deadlock in __wake_up_common_lock

On Wed, Jan 02, 2019 at 01:51:01PM +0100, Vlastimil Babka wrote:

> > syz-executor0/8529 is trying to acquire lock:
> > 000000005e7fb829 (&pgdat->kswapd_wait){....}, at:  
> > __wake_up_common_lock+0x19e/0x330 kernel/sched/wait.c:120
> 
> From the backtrace at the end of report I see it's coming from
> 
> >   wakeup_kswapd+0x5f0/0x930 mm/vmscan.c:3982
> >   steal_suitable_fallback+0x538/0x830 mm/page_alloc.c:2217
> 
> This wakeup_kswapd is new due to Mel's 1c30844d2dfe ("mm: reclaim small
> amounts of memory when an external fragmentation event occurs") so CC Mel.

Right; and I see Mel already has a fix for that.

> > the existing dependency chain (in reverse order) is:
> > 
> > -> #4 (&(&zone->lock)->rlock){-.-.}:
> >         __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
> >         _raw_spin_lock_irqsave+0x99/0xd0 kernel/locking/spinlock.c:152
> >         rmqueue mm/page_alloc.c:3082 [inline]
> >         get_page_from_freelist+0x9eb/0x52a0 mm/page_alloc.c:3491
> >         __alloc_pages_nodemask+0x4f3/0xde0 mm/page_alloc.c:4529
> >         __alloc_pages include/linux/gfp.h:473 [inline]
> >         alloc_page_interleave+0x25/0x1c0 mm/mempolicy.c:1988
> >         alloc_pages_current+0x1bf/0x210 mm/mempolicy.c:2104
> >         alloc_pages include/linux/gfp.h:509 [inline]
> >         depot_save_stack+0x3f1/0x470 lib/stackdepot.c:260
> >         save_stack+0xa9/0xd0 mm/kasan/common.c:79
> >         set_track mm/kasan/common.c:85 [inline]
> >         kasan_kmalloc+0xcb/0xd0 mm/kasan/common.c:482
> >         kasan_slab_alloc+0x12/0x20 mm/kasan/common.c:397
> >         kmem_cache_alloc+0x130/0x730 mm/slab.c:3541
> >         kmem_cache_zalloc include/linux/slab.h:731 [inline]
> >         fill_pool lib/debugobjects.c:134 [inline]
> >         __debug_object_init+0xbb8/0x1290 lib/debugobjects.c:379
> >         debug_object_init lib/debugobjects.c:431 [inline]
> >         debug_object_activate+0x323/0x600 lib/debugobjects.c:512
> >         debug_timer_activate kernel/time/timer.c:708 [inline]
> >         debug_activate kernel/time/timer.c:763 [inline]
> >         __mod_timer kernel/time/timer.c:1040 [inline]
> >         mod_timer kernel/time/timer.c:1101 [inline]
> >         add_timer+0x50e/0x1490 kernel/time/timer.c:1137
> >         __queue_delayed_work+0x249/0x380 kernel/workqueue.c:1533
> >         queue_delayed_work_on+0x1a2/0x1f0 kernel/workqueue.c:1558
> >         queue_delayed_work include/linux/workqueue.h:527 [inline]
> >         schedule_delayed_work include/linux/workqueue.h:628 [inline]
> >         start_dirtytime_writeback+0x4e/0x53 fs/fs-writeback.c:2043
> >         do_one_initcall+0x145/0x957 init/main.c:889
> >         do_initcall_level init/main.c:957 [inline]
> >         do_initcalls init/main.c:965 [inline]
> >         do_basic_setup init/main.c:983 [inline]
> >         kernel_init_freeable+0x4c1/0x5af init/main.c:1136
> >         kernel_init+0x11/0x1ae init/main.c:1056
> >         ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
> > 
> > -> #3 (&base->lock){-.-.}:

However I really, _really_ hate that dependency. We really should not
get memory allocations under rq->lock.

We seem to avoid this for the existing hrtimer usage, because of
hrtimer_init() doing: debug_init() -> debug_hrtimer_init() ->
debug_object_init().

But that isn't done for the (PSI) schedule_delayed_work() thing for some
raisin; even though: group_init() does INIT_DELAYED_WORK() ->
__INIT_DELAYED_WORK() -> __init_timer() -> init_timer_key() ->
debug_init() -> debug_timer_init() -> debug_object_init().

But _somehow_ that isn't doing it.

Now debug_object_activate() has this case:

	if (descr->is_static_object && descr->is_static_object(addr)) {
		debug_object_init()

which does an debug_object_init() for static allocations, which brings
us to:

  static DEFINE_PER_CPU(struct psi_group_cpu, system_group_pcpu);
  static struct psi_group psi_system = {

But that _should_ get initialized by psi_init(), which is called from
sched_init() which _should_ be waaay before do_basic_setup().

Something goes wobbly.. but I'm not seeing it.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ