Message-ID: <5050805753ac469e8d727c797c2218a9d780d434.camel@mediatek.com>
Date:   Wed, 29 Mar 2023 02:55:49 +0000
From:   Qun-wei Lin (林群崴) 
        <Qun-wei.Lin@...iatek.com>
To:     "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "surenb@...gle.com" <surenb@...gle.com>,
        "david@...hat.com" <david@...hat.com>,
        Chinwen Chang (張錦文) 
        <chinwen.chang@...iatek.com>,
        "kasan-dev@...glegroups.com" <kasan-dev@...glegroups.com>,
        Kuan-Ying Lee (李冠穎) 
        <Kuan-Ying.Lee@...iatek.com>,
        Casper Li (李中榮) <casper.li@...iatek.com>,
        "catalin.marinas@....com" <catalin.marinas@....com>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>
Subject: [BUG] Userspace MTE error with allocation tag 0 when low on memory

Hi,

We are seeing a large number of MTE errors on Android T with kernel 6.1.

When the system is under memory pressure, MTE often triggers error
reports in userspace.

As in the tombstone below, many of the reports show an allocation tag
of 0:

Build fingerprint: 'alps/vext_k6897v1_64/k6897v1_64:13/TP1A.220624.014/mp2ofp23:userdebug/dev-keys'
Revision: '0'
ABI: 'arm64'
Timestamp: 2023-03-14 06:39:40.344251744+0800
Process uptime: 0s
Cmdline: /vendor/bin/hw/camerahalserver
pid: 988, tid: 1395, name: binder:988_3  >>> /vendor/bin/hw/camerahalserver <<<
uid: 1047
tagged_addr_ctrl: 000000000007fff3 (PR_TAGGED_ADDR_ENABLE, PR_MTE_TCF_SYNC, mask 0xfffe)
signal 11 (SIGSEGV), code 9 (SEGV_MTESERR), fault addr 0x0d000075f1d8d7f0
    x0  00000075018d3fb0  x1  00000000c0306201  x2  00000075018d3ae8  x3  000000000000720c
    x4  0000000000000000  x5  0000000000000000  x6  00000642000004fe  x7  0000054600000630
    x8  00000000fffffff2  x9  b34a1094e7e33c3f  x10 00000075018d3a80  x11 00000075018d3a50
    x12 ffffff80ffffffd0  x13 0000061e0000072c  x14 0000000000000004  x15 0000000000000000
    x16 00000077f2dfcd78  x17 00000077da3a8ff0  x18 00000075011bc000  x19 0d000075f1d8d898
    x20 0d000075f1d8d7f0  x21 0d000075f1d8d910  x22 0000000000000000  x23 00000000fffffff7
    x24 00000075018d4000  x25 0000000000000000  x26 00000075018d3ff8  x27 00000000000fc000
    x28 00000000000fe000  x29 00000075018d3b20
    lr  00000077f2d9f164  sp  00000075018d3ad0  pc  00000077f2d9f134  pst 0000000080001000

backtrace:
      #00 pc 000000000005d134  /system/lib64/libbinder.so (android::IPCThreadState::talkWithDriver(bool)+244) (BuildId: 8b5612259e4a42521c430456ec5939c7)
      #01 pc 000000000005d448  /system/lib64/libbinder.so (android::IPCThreadState::getAndExecuteCommand()+24) (BuildId: 8b5612259e4a42521c430456ec5939c7)
      #02 pc 000000000005dd64  /system/lib64/libbinder.so (android::IPCThreadState::joinThreadPool(bool)+68) (BuildId: 8b5612259e4a42521c430456ec5939c7)
      #03 pc 000000000008dba8  /system/lib64/libbinder.so (android::PoolThread::threadLoop()+24) (BuildId: 8b5612259e4a42521c430456ec5939c7)
      #04 pc 0000000000013440  /system/lib64/libutils.so (android::Thread::_threadLoop(void*)+416) (BuildId: 10aac5d4a671e4110bc00c9b69d83d8a)
      #05 pc 00000000000c14cc  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+204) (BuildId: 718ecc04753b519b0f6289a7a2fcf117)
      #06 pc 0000000000054930  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64) (BuildId: 718ecc04753b519b0f6289a7a2fcf117)

Memory tags around the fault address (0xd000075f1d8d7f0), one tag per
16 bytes:
      0x75f1d8cf00: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
      0x75f1d8d000: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
      0x75f1d8d100: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
      0x75f1d8d200: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
      0x75f1d8d300: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
      0x75f1d8d400: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
      0x75f1d8d500: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
      0x75f1d8d600: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
    =>0x75f1d8d700: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0 [0]
      0x75f1d8d800: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
      0x75f1d8d900: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
      0x75f1d8da00: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
      0x75f1d8db00: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
      0x75f1d8dc00: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
      0x75f1d8dd00: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
      0x75f1d8de00: 0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

The same errors also appear in coredumps.

This problem only occurs when ZRAM is enabled, so we suspect an issue
in the swap-in/out path.

Comparing kernel 5.15 with kernel 6.1, we found that the order of
swap_free() and set_pte_at() changed in do_swap_page().

On a swap-in fault, do_swap_page() now calls swap_free() first:
do_swap_page() -> swap_free() -> __swap_entry_free() ->
free_swap_slot() -> swapcache_free_entries() -> swap_entry_free() ->
swap_range_free() -> arch_swap_invalidate_page() ->
mte_invalidate_tags_area() ->  mte_invalidate_tags() -> xa_erase()

and then call set_pte_at():
do_swap_page() -> set_pte_at() -> __set_pte_at() -> mte_sync_tags() ->
mte_sync_page_tags() -> mte_restore_tags() -> xa_load()

This means the swap slot is invalidated before the PTE is mapped, so
the MTE tags stored in the XArray are erased before they can be
restored.

After moving swap_free() to just after set_pte_at(), the problem
disappeared.

We suspect that the following patch, which changed this order, did not
take MTE tag restoration in the page-fault path into account:
https://lore.kernel.org/all/20220131162940.210846-5-david@redhat.com/

Any suggestion is appreciated.

Thank you.
