Message-ID: <CACT4Y+Z5i+MOc+in9DuFj0b6cyyuHur5fpgu4e9-_6i4Luiygw@mail.gmail.com>
Date: Wed, 14 Apr 2021 09:07:16 +0200
From: Dmitry Vyukov <dvyukov@...gle.com>
To: "Zhang, Qiang" <Qiang.Zhang@...driver.com>
Cc: Andrew Halaney <ahalaney@...hat.com>,
"andreyknvl@...il.com" <andreyknvl@...il.com>,
"ryabinin.a.a@...il.com" <ryabinin.a.a@...il.com>,
"akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"kasan-dev@...glegroups.com" <kasan-dev@...glegroups.com>
Subject: Re: Question on KASAN calltrace record in RT
On Wed, Apr 14, 2021 at 8:58 AM Zhang, Qiang <Qiang.Zhang@...driver.com> wrote:
> ________________________________________
> From: Dmitry Vyukov <dvyukov@...gle.com>
> Sent: April 13, 2021 23:29
> To: Zhang, Qiang
> Cc: Andrew Halaney; andreyknvl@...il.com; ryabinin.a.a@...il.com; akpm@...ux-foundation.org; linux-kernel@...r.kernel.org; kasan-dev@...glegroups.com
> Subject: Re: Question on KASAN calltrace record in RT
>
> On Tue, Apr 6, 2021 at 10:26 AM Zhang, Qiang <Qiang.Zhang@...driver.com> wrote:
> >
> > Hello everyone
> >
> > On an RT system, Andrew's testing hit the call trace below.
> > KASAN records call stacks through stack_depot_save(), which may call
> > alloc_pages(). On RT the spinlocks in alloc_pages() are replaced with
> > rt_mutex, so if stack_depot_save() is reached with interrupts
> > disabled, the call trace below is triggered.
> >
> > Maybe we could add an array[KASAN_STACK_DEPTH] to struct kasan_track
> > to record the call stack directly on RT systems.
> >
> > Is there a better solution?
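For concreteness, that idea would amount to roughly the following
change to struct kasan_track in mm/kasan/kasan.h. This is a sketch
only; the RT-specific field names are illustrative, not a tested
patch:

	/*
	 * Sketch: store the raw call stack inline instead of a
	 * stackdepot handle, so saving a stack needs no allocation.
	 * With KASAN_STACK_DEPTH == 64 this is 64 * sizeof(unsigned
	 * long) = 512 bytes per track on 64-bit, i.e. about 1 KiB per
	 * object for the alloc and free tracks together, versus one
	 * 4-byte depot_stack_handle_t per track today.
	 */
	struct kasan_track {
		u32 pid;
	#ifdef CONFIG_PREEMPT_RT
		unsigned long stack_entries[KASAN_STACK_DEPTH];
		unsigned int nr_entries;
	#else
		depot_stack_handle_t stack;
	#endif
	};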
>
> >Hi Qiang,
> >
> >Adding 2 full stacks per heap object can increase memory usage too much.
> >The stackdepot has a preallocation mechanism; I would start with
> >adding an interrupt check here:
> >https://elixir.bootlin.com/linux/v5.12-rc7/source/lib/stackdepot.c#L294
> >and just not do preallocation in interrupt context. This will solve
> >the problem, right?
>
> That seems useful; however, there is the following situation:
> if many stacks need to be saved in interrupt context, the memory
> preallocated to hold stack traces can be depleted, and when another
> stack then needs to be saved in an interrupt, no memory will be
> available.
Yes, this is true. It is also true today, because we allocate with
GFP_ATOMIC; that is a deliberate design decision.
Note that a unique allocation stack is saved only once, so it's enough
to be lucky only once per stack. Also, interrupts don't tend to
allocate thousands of objects. So I think, all in all, it should work
fine in practice.
If it turns out to be a problem, we could simply preallocate more
memory in the RT config.
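
For concreteness, the interrupt check suggested above could look
roughly like this against the v5.12-rc7 stack_depot_save() linked
earlier (page and prealloc are that function's existing locals). This
is a sketch only; the exact "unsafe to allocate" condition is an
assumption, not a tested patch:

	/*
	 * Sketch: skip the preallocation when alloc_pages() may not
	 * be called on PREEMPT_RT, i.e. in interrupt context or with
	 * interrupts disabled, and rely on a slab preallocated
	 * earlier from a sleepable context.
	 */
	if (unlikely(!smp_load_acquire(&next_slab_inited)) &&
	    (!IS_ENABLED(CONFIG_PREEMPT_RT) ||
	     (!in_interrupt() && !irqs_disabled()))) {
		/*
		 * Zero out zone modifiers, as we don't have specific
		 * zone requirements. Keep the flags related to
		 * allocation in atomic contexts and I/O.
		 */
		alloc_flags &= ~GFP_ZONEMASK;
		alloc_flags &= (GFP_ATOMIC | GFP_KERNEL);
		alloc_flags |= __GFP_NOWARN;
		page = alloc_pages(alloc_flags, STACK_ALLOC_ORDER);
		if (page)
			prealloc = page_address(page);
	}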
> Thanks
> Qiang
>
>
> > Thanks
> > Qiang
> >
> > BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:951
> > [ 14.522262] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 640, name: mount
> > [ 14.522304] Call Trace:
> > [ 14.522306] dump_stack+0x92/0xc1
> > [ 14.522313] ___might_sleep.cold.99+0x1b0/0x1ef
> > [ 14.522319] rt_spin_lock+0x3e/0xc0
> > [ 14.522329] local_lock_acquire+0x52/0x3c0
> > [ 14.522332] get_page_from_freelist+0x176c/0x3fd0
> > [ 14.522543] __alloc_pages_nodemask+0x28f/0x7f0
> > [ 14.522559] stack_depot_save+0x3a1/0x470
> > [ 14.522564] kasan_save_stack+0x2f/0x40
> > [ 14.523575] kasan_record_aux_stack+0xa3/0xb0
> > [ 14.523580] insert_work+0x48/0x340
> > [ 14.523589] __queue_work+0x430/0x1280
> > [ 14.523595] mod_delayed_work_on+0x98/0xf0
> > [ 14.523607] kblockd_mod_delayed_work_on+0x17/0x20
> > [ 14.523611] blk_mq_run_hw_queue+0x151/0x2b0
> > [ 14.523620] blk_mq_sched_insert_request+0x2ad/0x470
> > [ 14.523633] blk_mq_submit_bio+0xd2a/0x2330
> > [ 14.523675] submit_bio_noacct+0x8aa/0xfe0
> > [ 14.523693] submit_bio+0xf0/0x550
> > [ 14.523714] submit_bio_wait+0xfe/0x200
> > [ 14.523724] xfs_rw_bdev+0x370/0x480 [xfs]
> > [ 14.523831] xlog_do_io+0x155/0x320 [xfs]
> > [ 14.524032] xlog_bread+0x23/0xb0 [xfs]
> > [ 14.524133] xlog_find_head+0x131/0x8b0 [xfs]
> > [ 14.524375] xlog_find_tail+0xc8/0x7b0 [xfs]
> > [ 14.524828] xfs_log_mount+0x379/0x660 [xfs]
> > [ 14.524927] xfs_mountfs+0xc93/0x1af0 [xfs]
> > [ 14.525424] xfs_fs_fill_super+0x923/0x17f0 [xfs]
> > [ 14.525522] get_tree_bdev+0x404/0x680
> > [ 14.525622] vfs_get_tree+0x89/0x2d0
> > [ 14.525628] path_mount+0xeb2/0x19d0
> > [ 14.525648] do_mount+0xcb/0xf0
> > [ 14.525665] __x64_sys_mount+0x162/0x1b0
> > [ 14.525670] do_syscall_64+0x33/0x40
> > [ 14.525674] entry_SYSCALL_64_after_hwframe+0x44/0xae
> > [ 14.525677] RIP: 0033:0x7fd6c15eaade