lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <56e1c63a-4c09-4d92-9ef2-aad5390879cc@gmail.com>
Date: Thu, 28 Mar 2024 17:50:51 +0100
From: Maximilian Luz <luzmaximilian@...il.com>
To: Bartosz Golaszewski <brgl@...ev.pl>, Andy Gross <agross@...nel.org>,
 Bjorn Andersson <andersson@...nel.org>,
 Konrad Dybcio <konrad.dybcio@...aro.org>,
 Elliot Berman <quic_eberman@...cinc.com>,
 Krzysztof Kozlowski <krzysztof.kozlowski@...aro.org>,
 Guru Das Srinagesh <quic_gurus@...cinc.com>,
 Andrew Halaney <ahalaney@...hat.com>, Alex Elder <elder@...aro.org>,
 Srini Kandagatla <srinivas.kandagatla@...aro.org>,
 Arnd Bergmann <arnd@...db.de>
Cc: linux-arm-msm@...r.kernel.org, linux-kernel@...r.kernel.org,
 linux-arm-kernel@...ts.infradead.org, kernel@...cinc.com,
 Bartosz Golaszewski <bartosz.golaszewski@...aro.org>
Subject: Re: [PATCH v9 00/13] firmware: qcom: qseecom: convert to using the TZ
 allocator

On 3/25/24 11:03 AM, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <bartosz.golaszewski@...aro.org>
> 
> SCM calls that take memory buffers as arguments require that they be
> page-aligned, physically continuous and non-cachable. The same
> requirements apply to the buffer used to pass additional arguments to SCM
> calls that take more than 4.
> 
> To that end drivers typically use dma_alloc_coherent() to allocate memory
> of suitable format which is slow and inefficient space-wise.
> 
> SHM Bridge is a safety mechanism that - once enabled - will only allow
> passing buffers to the TrustZone that have been explicitly marked as
> shared. It improves the overall system safety with SCM calls and is
> required by the upcoming scminvoke functionality.
> 
> The end goal of this series is to enable SHM bridge support for those
> architectures that support it but to that end we first need to unify the
> way memory for SCM calls is allocated. This in itself is beneficial as
> the current approach of using dma_alloc_coherent() in most places is quite
> slow.
> 
> First let's add a new TZ Memory allocator that allows users to create
> dynamic memory pools of format suitable for sharing with the TrustZone.
> Make it ready for implementing multiple build-time modes.
> 
> Convert all relevant drivers to using it. Add separate pools for SCM core
> and for qseecom.
> 
> Finally add support for SHM bridge and make it the default mode of
> operation with the generic allocator as fallback for the platforms that
> don't support SHM bridge.
> 
> Tested on db410c, RB5, sm8550-qrd. Previous iteration tested also on
> sa8775p-ride and lenovo X13s (please do retest on those platforms if you
> can).

Not sure in which version things changed (I haven't really kept up with
that, sorry), but this version (with the generic allocator selected in
the config) fails reading EFI vars on my Surface Pro X (sc8180x):

[    2.381020] BUG: scheduling while atomic: mount/258/0x00000002
[    2.383356] Modules linked in:
[    2.385669] Preemption disabled at:
[    2.385672] [<ffff800080f7868c>] qcom_tzmem_alloc+0x7c/0x224
[    2.390325] CPU: 1 PID: 258 Comm: mount Not tainted 6.8.0-1-surface-dev #2
[    2.392632] Hardware name: Microsoft Corporation Surface Pro X/Surface Pro X, BIOS 7.620.140 08/11/2023
[    2.394955] Call trace:
[    2.397269]  dump_backtrace+0x94/0x114
[    2.399583]  show_stack+0x18/0x24
[    2.401883]  dump_stack_lvl+0x48/0x60
[    2.404181]  dump_stack+0x18/0x24
[    2.406476]  __schedule_bug+0x84/0xa0
[    2.408770]  __schedule+0x6f4/0x7fc
[    2.411051]  schedule+0x30/0xf0
[    2.413323]  synchronize_rcu_expedited+0x158/0x1ec
[    2.415594]  lru_cache_disable+0x28/0x74
[    2.417853]  __alloc_contig_migrate_range+0x68/0x210
[    2.420122]  alloc_contig_range+0x140/0x280
[    2.422384]  cma_alloc+0x128/0x404
[    2.424643]  cma_alloc_aligned+0x44/0x6c
[    2.426881]  dma_alloc_contiguous+0x30/0x44
[    2.429111]  __dma_direct_alloc_pages.isra.0+0x60/0x20c
[    2.431343]  dma_direct_alloc+0x6c/0x2ec
[    2.433569]  dma_alloc_attrs+0x80/0xf4
[    2.435786]  qcom_tzmem_pool_add_memory+0x8c/0x178
[    2.438008]  qcom_tzmem_alloc+0xe8/0x224
[    2.440232]  qsee_uefi_get_next_variable+0x78/0x2cc
[    2.442443]  qcuefi_get_next_variable+0x50/0x94
[    2.444643]  efivar_get_next_variable+0x20/0x2c
[    2.446832]  efivar_init+0x8c/0x29c
[    2.449009]  efivarfs_fill_super+0xd4/0xec
[    2.451182]  get_tree_single+0x74/0xbc
[    2.453349]  efivarfs_get_tree+0x18/0x24
[    2.455513]  vfs_get_tree+0x28/0xe8
[    2.457680]  vfs_cmd_create+0x5c/0xf4
[    2.459840]  __do_sys_fsconfig+0x458/0x598
[    2.461993]  __arm64_sys_fsconfig+0x24/0x30
[    2.464143]  invoke_syscall+0x48/0x110
[    2.466281]  el0_svc_common.constprop.0+0x40/0xe0
[    2.468415]  do_el0_svc+0x1c/0x28
[    2.470546]  el0_svc+0x34/0xb4
[    2.472669]  el0t_64_sync_handler+0x120/0x12c
[    2.474793]  el0t_64_sync+0x190/0x194

and subsequently

[    2.477613] DEBUG_LOCKS_WARN_ON(val > preempt_count())
[    2.477618] WARNING: CPU: 4 PID: 258 at kernel/sched/core.c:5889 preempt_count_sub+0x90/0xd4
[    2.478682] Modules linked in:
[    2.479214] CPU: 4 PID: 258 Comm: mount Tainted: G        W          6.8.0-1-surface-dev #2
[    2.479752] Hardware name: Microsoft Corporation Surface Pro X/Surface Pro X, BIOS 7.620.140 08/11/2023
[    2.480296] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    2.480839] pc : preempt_count_sub+0x90/0xd4
[    2.481380] lr : preempt_count_sub+0x90/0xd4
[    2.481917] sp : ffff8000857cbb00
[    2.482450] x29: ffff8000857cbb00 x28: ffff8000806b759c x27: 8000000000000005
[    2.482988] x26: 0000000000000000 x25: ffff0000802cbaa0 x24: ffff0000809228b0
[    2.483525] x23: 0000000000000000 x22: ffff800082b462f0 x21: 0000000000001000
[    2.484062] x20: ffff80008363d000 x19: ffff000080922880 x18: fffffffffffc9660
[    2.484600] x17: 0000000000000020 x16: 0000000000000000 x15: 0000000000000038
[    2.485137] x14: 0000000000000000 x13: ffff800082649258 x12: 00000000000006db
[    2.485674] x11: 0000000000000249 x10: ffff8000826fc930 x9 : ffff800082649258
[    2.486207] x8 : 00000000ffffdfff x7 : ffff8000826f9258 x6 : 0000000000000249
[    2.486738] x5 : 0000000000000000 x4 : 40000000ffffe249 x3 : 0000000000000000
[    2.487267] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0000841fa300
[    2.487792] Call trace:
[    2.488311]  preempt_count_sub+0x90/0xd4
[    2.488831]  _raw_spin_unlock_irqrestore+0x1c/0x44
[    2.489352]  qcom_tzmem_alloc+0x1cc/0x224
[    2.489873]  qsee_uefi_get_next_variable+0x78/0x2cc
[    2.490390]  qcuefi_get_next_variable+0x50/0x94
[    2.490907]  efivar_get_next_variable+0x20/0x2c
[    2.491420]  efivar_init+0x8c/0x29c
[    2.491931]  efivarfs_fill_super+0xd4/0xec
[    2.492440]  get_tree_single+0x74/0xbc
[    2.492948]  efivarfs_get_tree+0x18/0x24
[    2.493453]  vfs_get_tree+0x28/0xe8
[    2.493957]  vfs_cmd_create+0x5c/0xf4
[    2.494459]  __do_sys_fsconfig+0x458/0x598
[    2.494963]  __arm64_sys_fsconfig+0x24/0x30
[    2.495468]  invoke_syscall+0x48/0x110
[    2.495972]  el0_svc_common.constprop.0+0x40/0xe0
[    2.496475]  do_el0_svc+0x1c/0x28
[    2.496976]  el0_svc+0x34/0xb4
[    2.497475]  el0t_64_sync_handler+0x120/0x12c
[    2.497975]  el0t_64_sync+0x190/0x194
[    2.498466] ---[ end trace 0000000000000000 ]---
[    2.507347] qcom_scm firmware:scm: qseecom: scm call failed with error -22
[    2.507813] efivars: get_next_variable: status=8000000000000007

If I understand correctly, it enters an atomic section in
qcom_tzmem_alloc() and then tries to schedule somewhere down the line.
So this shouldn't be qseecom specific.

I should probably also say that I'm currently testing this on a patched
v6.8 kernel, so there's a chance that it's my fault. However, as far as
I understand, it enters an atomic section in qcom_tzmem_alloc() and then
later tries to expand the pool memory with dma_alloc_coherent(). Which
AFAIK is allowed to sleep with GFP_KERNEL (and I guess that that's the
issue here).

I've also tried the shmem allocator option, but that seems to get stuck
quite early at boot, before I even have usb-serial access to get any
logs. If I can find some more time, I'll try to see if I can get some
useful output for that.

Best regards,
Max

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ