lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Thu, 14 Feb 2019 14:41:27 +0530
From:   Pintu Agarwal <pintu.ping@...il.com>
To:     Sai Prakash Ranjan <saiprakash.ranjan@...eaurora.org>
Cc:     open list <linux-kernel@...r.kernel.org>,
        linux-arm-kernel@...ts.infradead.org,
        linux-rt-users@...r.kernel.org, linux-mm@...ck.org,
        Jorge Ramirez <jorge.ramirez-ortiz@...aro.org>,
        "Xenomai@...omai.org" <xenomai@...omai.org>
Subject: Re: BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:65

Hello Sai,

Thanks so much for your help.

On Thu, Feb 14, 2019 at 12:14 AM Sai Prakash Ranjan
<saiprakash.ranjan@...eaurora.org> wrote:
>
> Hi,
>
> On 2/13/2019 8:10 PM, Pintu Agarwal wrote:
> > OK thanks for your suggestions. sdm845-perf_defconfig did not work for
> > me. The target did not boot.
>
> Perf defconfig works fine. You need to enable serial console with below
> config added to perf defconfig.
>
> CONFIG_SERIAL_MSM_GENI_CONSOLE=y
>
Actually for me the kernel does not boot. It stuck in bootloader, with
"valid dtb not found".
I did not debug it further.
Anyways, we can look into this issue later.

> > However, disabling CONFIG_PANIC_ON_SCHED_BUG works, and I got a root
> > shell at least.
>
> >
> > But this seems to be a work around.
> > I still get a back trace in kernel logs from many different places.
> > So, it looks like there is some code in qualcomm specific drivers that
> > is calling a sleeping method from invalid context.
> > How to find that...
> > If this fix is already available in latest version, please let me know.
> >
>
> Seems like interrupts are disabled when down_write_killable() is called.
> It's not the drivers that is calling the sleeping method which can  be
> seen from the log.
>
> [   22.140224] [<ffffff88b8ce65a8>] ___might_sleep+0x140/0x188
> [   22.145862] [<ffffff88b8ce6648>] __might_sleep+0x58/0x90         <---
> [   22.151249] [<ffffff88b9d43f84>] down_write_killable+0x2c/0x80   <---
> [   22.157155] [<ffffff88b8e53cd8>] setup_arg_pages+0xb8/0x208      <---
> [   22.162792] [<ffffff88b8eb7534>] load_elf_binary+0x434/0x1298
> [   22.168600] [<ffffff88b8e55674>] search_binary_handler+0xac/0x1f0
> [   22.174763] [<ffffff88b8e560ec>]
> do_execveat_common.isra.15+0x504/0x6c8
> [   22.181452] [<ffffff88b8e562f4>] do_execve+0x44/0x58
> [   22.186481] [<ffffff88b8c84030>] run_init_process+0x38/0x48      <---
> [   22.192122] [<ffffff88b9d3db1c>] kernel_init+0x8c/0x108
> [   22.197411] [<ffffff88b8c83f00>] ret_from_fork+0x10/0x50
>
Yes, these are generic API, and I don't expect any changes in here.
We don't have this issue in another SOC 4.9 kernel.
Also I compared these APIs with mainline and there is no major changes here.
This is just one example.
This sleep issue is happening from other places as well.
May be one common similarity may be: during task loading, or switching.

>  >
>  > This at least proves that there is no issue in core ipipe patches, and
>  > I can proceed.
>
> I doubt the *IPIPE patches*. You said you removed the configs, but all
> code are not under IPIPE configs and as I see there are lots of
> changes to interrupt code in general with ipipe.
>
We observed that this issue is happening in normal sdm845 kernel as
well (without ipipe/xenomai patches applied in another branch).
Another point is, we don't see this issue in another arm64 target such
as hikey, with same 4.9 kernel.

> So to actually confirm whether the issue is with qcom drivers or ipipe,
> please *remove ipipe patches (not just configs)* and boot.
> Also paste the full dmesg logs for these 2 cases(with and without
> ipipe).
>
hmmm. This will be little tough.
I will try to find sometime to point the exact cause, and share findings here.
Currently, I am debugging another issue.

Thanks for your help.


Regards

Powered by blists - more mailing lists