Message-ID: <CAC=wTOhvDFg+9BCQhVhdP9N4HL6iETy+xe1mTGQw1NFPBo+c4A@mail.gmail.com>
Date:   Tue, 14 Dec 2021 19:58:34 +0000
From:   Chris Ward <tjcw01@...il.com>
To:     linux-kernel@...r.kernel.org
Subject: Kernel crash on ARM64

Please cc me personally on answers/comments, as I am not currently
subscribed to the LKML. Apologies for the formatting; I first tried to
send this from my business email, but that adds HTML, which your
filtering system rejected.

My team has a problem which is being bounced between Canonical support
and Xilinx support.
We are using kernel 5.4.0-xilinx-v2020.2, built from sources under
https://github.com/Xilinx/linux-xlnx, with an Ubuntu 20.04 userland on
an ARM64 embedded Linux machine (i.e. not x86-64). When trying to set
up a file system on a ramdisk, we get a kernel crash for ramdisk sizes
larger than 2GB while running 'dd if=/dev/zero ...' in
preparation for issuing mkfs.
The first few lines of the crash message are:
[   36.082810] cloud-init[858]: 2068-03-21 21:24:22,477 - cc_final_message.py[WARNING]: Used fallback datae
[   40.413307] overlayfs: filesystem on '/var/lib/docker/check-overlayfs-support002292139/upper' not suppor
[   40.937166] overlayfs: filesystem on '/var/lib/docker/check-overlayfs-support240114958/upper' not suppor
[  112.624740] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[  112.633516] Mem abort info:
[  112.636291]   ESR = 0x96000004
[  112.639330]   EC = 0x25: DABT (current EL), IL = 32 bits
[  112.644624]   SET = 0, FnV = 0
[  112.647662]   EA = 0, S1PTW = 0
[  112.650786] Data abort info:
[  112.653651]   ISV = 0, ISS = 0x00000004
[  112.657470]   CM = 0, WnR = 0
[  112.660423] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000b4f60b000
[  112.666845] [0000000000000000] pgd=0000000000000000
[  112.671707] Internal error: Oops: 96000004 [#1] SMP
[  112.676567] Modules linked in: br_netfilter
[  112.680747] CPU: 2 PID: 1060 Comm: dd Not tainted 5.4.0-xilinx-v2020.2 #1
[  112.687521] Hardware name: xlnx,zynqmp (DT)
[  112.691689] pstate: a0000085 (NzCv daIf -PAN -UAO)
[  112.696472] pc : __wake_up_common+0x58/0x170
[  112.700726] lr : __wake_up_common_lock+0x98/0x110
[  112.705410] sp : ffff800011ad3520
[  112.708709] x29: ffff800011ad3520 x28: 0000000000000080
[  112.714003] x27: ffff800011ad3650 x26: 0000000000000000
[  112.719298] x25: 0000000000000003 x24: 0000000000000000
[  112.724593] x23: 0000000000000001 x22: ffff800011ad35f0
[  112.729888] x21: ffff8000110d5d88 x20: 0000000000000001
[  112.735183] x19: ffff8000110d5d80 x18: fffffe002d6cd0c8
[  112.740477] x17: ffff000b6c001008 x16: ffff000b6c001028
[  112.745772] x15: 0001000000000000 x14: 0000000000000000
[  112.751067] x13: 0000000000000000 x12: 0000000000000000
[  112.756362] x11: 0000000000000000 x10: 0000000000000000
[  112.761657] x9 : 0000000000000000 x8 : 0000000000000000
[  112.766952] x7 : 0000000000000000 x6 : ffffffffffffffe8
[  112.772247] x5 : ffff800011ad35f0 x4 : ffff800011ad3650
[  112.777541] x3 : 0000000000000000 x2 : 0000000000000001
[  112.782836] x1 : 0000000000000000 x0 : 0000000000000000
[  112.788132] Call trace:
[  112.790565]  __wake_up_common+0x58/0x170
[  112.794479]  __wake_up_common_lock+0x98/0x110
[  112.798819]  __wake_up+0x14/0x20
[  112.802029]  wake_up_bit+0x78/0xa0
[  112.805416]  unlock_buffer+0x2c/0x38
[  112.808974]  end_buffer_async_write+0x98/0x1c0
[  112.813401]  end_bio_bh_io_sync+0x30/0x60
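
For anyone reading the trace, the ESR value in the "Mem abort info" lines can be decoded by hand. The following is a minimal Python sketch, assuming the standard ESR_ELx field layout from the Arm architecture (EC in bits [31:26], IL in bit 25, ISS in bits [24:0]); the function name decode_esr and the dictionary keys are illustrative only and are not taken from the kernel sources.

```python
def decode_esr(esr: int) -> dict:
    """Split an AArch64 ESR_ELx value into the fields the oops prints."""
    ec = (esr >> 26) & 0x3F    # exception class, bits [31:26]
    il = (esr >> 25) & 0x1     # instruction length, bit 25 (1 = 32-bit)
    iss = esr & 0x1FFFFFF      # instruction-specific syndrome, bits [24:0]
    return {
        "EC": ec,
        "IL": il,
        "ISS": iss,
        "WnR": (iss >> 6) & 0x1,  # data aborts only: 0 = read, 1 = write
        "DFSC": iss & 0x3F,       # data fault status code, bits [5:0]
    }


# ESR value taken from the log above.
fields = decode_esr(0x96000004)
print({name: hex(value) for name, value in fields.items()})
```

For 0x96000004 this gives EC = 0x25 (data abort taken at the current EL), IL = 1, ISS = 0x4, WnR = 0 and DFSC = 0x04. DFSC 0x04 is a level 0 translation fault, which agrees with the "pgd=0000000000000000" line: the faulting access was a read through a NULL pointer in __wake_up_common.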

I will attach the full kernel log from boot to crash. Some questions:
1) Is it possible that there is an incompatibility between the Ubuntu
userland and the Xilinx kernel? I do not think an incompatibility is
possible here, which would land the support question solidly in
Xilinx's court.
2) Is there a known problem with this kernel level on ARM64 hardware?
3) Would it be likely to be productive to move to a newer Xilinx kernel?
4) If I have to debug this myself, where do I start?

Thanks all!
T J (Chris) Ward, IBM Research.
Scalable Data-Centric Computing - IBM Spectrum MPI
IBM United Kingdom Ltd., Hursley Park, Winchester, Hants, SO21 2JN
011-44-1962-818679
LinkedIn https://www.linkedin.com/in/tjcward/
ResearchGate https://www.researchgate.net/profile/Thomas_Ward16

Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU


View attachment "crash.txt" of type "text/plain" (68968 bytes)
