Message-ID: <CAMRc=Mf_pvrh2VMfTVE-ZTypyO010p=to-cd8Q745DzSDXLGFw@mail.gmail.com>
Date: Thu, 28 Mar 2024 19:55:22 +0100
From: Bartosz Golaszewski <brgl@...ev.pl>
To: Maximilian Luz <luzmaximilian@...il.com>
Cc: Andy Gross <agross@...nel.org>, Bjorn Andersson <andersson@...nel.org>,
Konrad Dybcio <konrad.dybcio@...aro.org>, Elliot Berman <quic_eberman@...cinc.com>,
Krzysztof Kozlowski <krzysztof.kozlowski@...aro.org>,
Guru Das Srinagesh <quic_gurus@...cinc.com>, Andrew Halaney <ahalaney@...hat.com>,
Alex Elder <elder@...aro.org>, Srini Kandagatla <srinivas.kandagatla@...aro.org>,
Arnd Bergmann <arnd@...db.de>, linux-arm-msm@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, kernel@...cinc.com,
Bartosz Golaszewski <bartosz.golaszewski@...aro.org>
Subject: Re: [PATCH v9 00/13] firmware: qcom: qseecom: convert to using the TZ allocator
On Thu, Mar 28, 2024 at 5:50 PM Maximilian Luz <luzmaximilian@...il.com> wrote:
>
> On 3/25/24 11:03 AM, Bartosz Golaszewski wrote:
> > From: Bartosz Golaszewski <bartosz.golaszewski@...aro.org>
> >
> > SCM calls that take memory buffers as arguments require that they be
> > page-aligned, physically contiguous and non-cacheable. The same
> > requirements apply to the buffer used to pass additional arguments to
> > SCM calls that take more than 4.
> >
> > To that end drivers typically use dma_alloc_coherent() to allocate
> > memory in a suitable format, which is slow and space-inefficient.
> >
> > SHM Bridge is a safety mechanism that - once enabled - will only allow
> > passing buffers to the TrustZone that have been explicitly marked as
> > shared. It improves the overall system safety with SCM calls and is
> > required by the upcoming scminvoke functionality.
> >
> > The end goal of this series is to enable SHM bridge support for those
> > architectures that support it but to that end we first need to unify the
> > way memory for SCM calls is allocated. This in itself is beneficial as
> > the current approach of using dma_alloc_coherent() in most places is quite
> > slow.
> >
> > First let's add a new TZ Memory allocator that allows users to create
> > dynamic memory pools of format suitable for sharing with the TrustZone.
> > Make it ready for implementing multiple build-time modes.
> >
> > Convert all relevant drivers to using it. Add separate pools for SCM core
> > and for qseecom.
> >
> > Finally add support for SHM bridge and make it the default mode of
> > operation with the generic allocator as fallback for the platforms that
> > don't support SHM bridge.
> >
> > Tested on db410c, RB5, sm8550-qrd. Previous iteration tested also on
> > sa8775p-ride and lenovo X13s (please do retest on those platforms if you
> > can).
>
> Not sure in which version things changed (I haven't really kept up with
> that, sorry), but this version (with the generic allocator selected in
> the config) fails reading EFI vars on my Surface Pro X (sc8180x):
>
> [ 2.381020] BUG: scheduling while atomic: mount/258/0x00000002
> [ 2.383356] Modules linked in:
> [ 2.385669] Preemption disabled at:
> [ 2.385672] [<ffff800080f7868c>] qcom_tzmem_alloc+0x7c/0x224
> [ 2.390325] CPU: 1 PID: 258 Comm: mount Not tainted 6.8.0-1-surface-dev #2
> [ 2.392632] Hardware name: Microsoft Corporation Surface Pro X/Surface Pro X, BIOS 7.620.140 08/11/2023
> [ 2.394955] Call trace:
> [ 2.397269] dump_backtrace+0x94/0x114
> [ 2.399583] show_stack+0x18/0x24
> [ 2.401883] dump_stack_lvl+0x48/0x60
> [ 2.404181] dump_stack+0x18/0x24
> [ 2.406476] __schedule_bug+0x84/0xa0
> [ 2.408770] __schedule+0x6f4/0x7fc
> [ 2.411051] schedule+0x30/0xf0
> [ 2.413323] synchronize_rcu_expedited+0x158/0x1ec
> [ 2.415594] lru_cache_disable+0x28/0x74
> [ 2.417853] __alloc_contig_migrate_range+0x68/0x210
> [ 2.420122] alloc_contig_range+0x140/0x280
> [ 2.422384] cma_alloc+0x128/0x404
> [ 2.424643] cma_alloc_aligned+0x44/0x6c
> [ 2.426881] dma_alloc_contiguous+0x30/0x44
> [ 2.429111] __dma_direct_alloc_pages.isra.0+0x60/0x20c
> [ 2.431343] dma_direct_alloc+0x6c/0x2ec
> [ 2.433569] dma_alloc_attrs+0x80/0xf4
> [ 2.435786] qcom_tzmem_pool_add_memory+0x8c/0x178
> [ 2.438008] qcom_tzmem_alloc+0xe8/0x224
> [ 2.440232] qsee_uefi_get_next_variable+0x78/0x2cc
> [ 2.442443] qcuefi_get_next_variable+0x50/0x94
> [ 2.444643] efivar_get_next_variable+0x20/0x2c
> [ 2.446832] efivar_init+0x8c/0x29c
> [ 2.449009] efivarfs_fill_super+0xd4/0xec
> [ 2.451182] get_tree_single+0x74/0xbc
> [ 2.453349] efivarfs_get_tree+0x18/0x24
> [ 2.455513] vfs_get_tree+0x28/0xe8
> [ 2.457680] vfs_cmd_create+0x5c/0xf4
> [ 2.459840] __do_sys_fsconfig+0x458/0x598
> [ 2.461993] __arm64_sys_fsconfig+0x24/0x30
> [ 2.464143] invoke_syscall+0x48/0x110
> [ 2.466281] el0_svc_common.constprop.0+0x40/0xe0
> [ 2.468415] do_el0_svc+0x1c/0x28
> [ 2.470546] el0_svc+0x34/0xb4
> [ 2.472669] el0t_64_sync_handler+0x120/0x12c
> [ 2.474793] el0t_64_sync+0x190/0x194
>
> and subsequently
>
> [ 2.477613] DEBUG_LOCKS_WARN_ON(val > preempt_count())
> [ 2.477618] WARNING: CPU: 4 PID: 258 at kernel/sched/core.c:5889 preempt_count_sub+0x90/0xd4
> [ 2.478682] Modules linked in:
> [    2.479214] CPU: 4 PID: 258 Comm: mount Tainted: G        W          6.8.0-1-surface-dev #2
> [ 2.479752] Hardware name: Microsoft Corporation Surface Pro X/Surface Pro X, BIOS 7.620.140 08/11/2023
> [ 2.480296] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 2.480839] pc : preempt_count_sub+0x90/0xd4
> [ 2.481380] lr : preempt_count_sub+0x90/0xd4
> [ 2.481917] sp : ffff8000857cbb00
> [ 2.482450] x29: ffff8000857cbb00 x28: ffff8000806b759c x27: 8000000000000005
> [ 2.482988] x26: 0000000000000000 x25: ffff0000802cbaa0 x24: ffff0000809228b0
> [ 2.483525] x23: 0000000000000000 x22: ffff800082b462f0 x21: 0000000000001000
> [ 2.484062] x20: ffff80008363d000 x19: ffff000080922880 x18: fffffffffffc9660
> [ 2.484600] x17: 0000000000000020 x16: 0000000000000000 x15: 0000000000000038
> [ 2.485137] x14: 0000000000000000 x13: ffff800082649258 x12: 00000000000006db
> [ 2.485674] x11: 0000000000000249 x10: ffff8000826fc930 x9 : ffff800082649258
> [ 2.486207] x8 : 00000000ffffdfff x7 : ffff8000826f9258 x6 : 0000000000000249
> [ 2.486738] x5 : 0000000000000000 x4 : 40000000ffffe249 x3 : 0000000000000000
> [ 2.487267] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0000841fa300
> [ 2.487792] Call trace:
> [ 2.488311] preempt_count_sub+0x90/0xd4
> [ 2.488831] _raw_spin_unlock_irqrestore+0x1c/0x44
> [ 2.489352] qcom_tzmem_alloc+0x1cc/0x224
> [ 2.489873] qsee_uefi_get_next_variable+0x78/0x2cc
> [ 2.490390] qcuefi_get_next_variable+0x50/0x94
> [ 2.490907] efivar_get_next_variable+0x20/0x2c
> [ 2.491420] efivar_init+0x8c/0x29c
> [ 2.491931] efivarfs_fill_super+0xd4/0xec
> [ 2.492440] get_tree_single+0x74/0xbc
> [ 2.492948] efivarfs_get_tree+0x18/0x24
> [ 2.493453] vfs_get_tree+0x28/0xe8
> [ 2.493957] vfs_cmd_create+0x5c/0xf4
> [ 2.494459] __do_sys_fsconfig+0x458/0x598
> [ 2.494963] __arm64_sys_fsconfig+0x24/0x30
> [ 2.495468] invoke_syscall+0x48/0x110
> [ 2.495972] el0_svc_common.constprop.0+0x40/0xe0
> [ 2.496475] do_el0_svc+0x1c/0x28
> [ 2.496976] el0_svc+0x34/0xb4
> [ 2.497475] el0t_64_sync_handler+0x120/0x12c
> [ 2.497975] el0t_64_sync+0x190/0x194
> [ 2.498466] ---[ end trace 0000000000000000 ]---
> [ 2.507347] qcom_scm firmware:scm: qseecom: scm call failed with error -22
> [ 2.507813] efivars: get_next_variable: status=8000000000000007
>
> If I understand correctly, it enters an atomic section in
> qcom_tzmem_alloc() and then tries to schedule somewhere down the line.
> So this shouldn't be qseecom specific.
>
> I should probably also say that I'm currently testing this on a patched
> v6.8 kernel, so there's a chance that it's my fault. However, as far as
> I understand, qcom_tzmem_alloc() enters an atomic section and then
> later tries to expand the pool memory with dma_alloc_coherent(), which
> AFAIK is allowed to sleep with GFP_KERNEL (and I guess that's the
> issue here).
>
> I've also tried the shmem allocator option, but that seems to get stuck
> quite early at boot, before I even have usb-serial access to get any
> logs. If I can find some more time, I'll try to see if I can get some
> useful output for that.
>
Ah, I think it happens here:
+ guard(spinlock_irqsave)(&pool->lock);
+
+again:
+ vaddr = gen_pool_alloc(pool->genpool, size);
+ if (!vaddr) {
+ if (qcom_tzmem_try_grow_pool(pool, size, gfp))
+ goto again;
We were called with GFP_KERNEL, so that's what we pass on to
qcom_tzmem_try_grow_pool(), but at this point we're already holding the
spinlock, so the allocation down that path must not sleep. I need to
revisit this. Thanks for the catch!
Bart