lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Wed, 26 Jun 2024 09:32:23 +0000
From: Jason-JH Lin (林睿祥) <Jason-JH.Lin@...iatek.com>
To: "jassisinghbrar@...il.com" <jassisinghbrar@...il.com>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"linux-mediatek@...ts.infradead.org" <linux-mediatek@...ts.infradead.org>,
	Singo Chang (張興國) <Singo.Chang@...iatek.com>,
	"chunkuang.hu@...nel.org" <chunkuang.hu@...nel.org>,
	Nancy Lin (林欣螢) <Nancy.Lin@...iatek.com>,
	Project_Global_Chrome_Upstream_Group
	<Project_Global_Chrome_Upstream_Group@...iatek.com>,
	"linux-arm-kernel@...ts.infradead.org"
	<linux-arm-kernel@...ts.infradead.org>, "matthias.bgg@...il.com"
	<matthias.bgg@...il.com>, "angelogioacchino.delregno@...labora.com"
	<angelogioacchino.delregno@...labora.com>
Subject: Re: [PATCH 2/2] mailbox: mtk-cmdq: Move pm_runimte_get and put to
 mbox_chan_ops API


[snip]

> > > > "I want numbers" is a sensible request, honestly I would do the
> > > same so I totally
> > > > understand that.
> > > >
> > > > Jason, can you please perform latency measurements on 60Hz and
> > > *especially* ISP/cam
> > > > cases while "continuously" calling startup() and shutdown() for
> > > every power saving
> > > > operation?
> >
> > You mean moving clk_builk_enable() and clk_bulk_disable() to
> startup()
> > and shutdown(), and then calling startup() in cmdq_runtime_resume()
> and
> > calling shutdown() in cmdq_runtime_suspend()?
> > If we do that, I think GCE clocks will never be closed because
> > mbox_request_channel() and mbox_free_channel() will call startup()
> and
> > shutdown() already.
> >
> > I found that display thread:0 will call mbox_send_message() every
> 16ms
> > when I moving UI continuously:
> > [   61.503604] thread_id:0, send_diff:16 ms
> > [   61.520266] thread_id:0, send_diff:16 ms
> > [   61.537143] thread_id:0, send_diff:16 ms
> > [   61.553609] thread_id:0, send_diff:16 ms
> > [   61.570811] thread_id:0, send_diff:17 ms
> > [   61.587075] thread_id:0, send_diff:16 ms
> > [   61.603527] thread_id:0, send_diff:16 ms
> > [   61.620082] thread_id:0, send_diff:16 ms
> > [   61.637496] thread_id:0, send_diff:17 ms
> > [   61.653876] thread_id:0, send_diff:16 ms
> > [   61.670162] thread_id:0, send_diff:16 ms
> > [   61.687318] thread_id:0, send_diff:17 ms
> > [   61.703747] thread_id:0, send_diff:16 ms
> > [   61.720656] thread_id:0, send_diff:16 ms
> > If I didn't do anything to UI, I won't see any mbox_send_message()
> > logs.
> >
> > It means mbox_send_message() will only be called when the display
> is
> > updated:
> > [  782.777205] thread_id:0, send_diff:20072 ms
> > [  820.296738] thread_id:0, send_diff:37519 ms
> > [  863.034270] thread_id:0, send_diff:42737 ms
> > [  880.297759] thread_id:0, send_diff:17263 ms
> > [  917.795432] thread_id:0, send_diff:37497 ms
> >
> >
> > When opening camera APP, it seems to become 30fps:
> > [  120.551778] BUG: sleeping function called from invalid context
> at
> > drivers/base/power/runtime.c:1164
> > [  120.560844] in_atomic(): 1, irqs_disabled(): 128, non_block: 0,
> pid:
> > 201, name: kworker/u17:1
> > [  120.569363] preempt_count: 1, expected: 0
> > [  120.573367] RCU nest depth: 0, expected: 0
> > [  120.577457] INFO: lockdep is turned off.
> > [  120.581372] irq event stamp: 0
> > [  120.584420] hardirqs last  enabled at (0): [<0000000000000000>]
> 0x0
> > [  120.590687] hardirqs last disabled at (0): [<ffffffd3624a86e0>]
> > copy_process+0x578/0xf10
> > [  120.598780] softirqs last  enabled at (0): [<ffffffd3624a86ec>]
> > copy_process+0x584/0xf10
> > [  120.606865] softirqs last disabled at (0): [<0000000000000000>]
> 0x0
> > [  120.613128] CPU: 0 PID: 201 Comm: kworker/u17:1 Not tainted
> 6.1.92-
> > lockdep-14419-g8c5f20e56598-dirty #1
> > e971112e83e796239a41e46e8747794c737fb18c
> > [  120.626073] Hardware name: Google Ciri sku0/unprovisioned board
> (DT)
> > [  120.632420] Workqueue: imgsys_runner imgsys_runner_func
> > [  120.637647] Call trace:
> > [  120.640086]  dump_backtrace+0x100/0x120
> > [  120.643919]  show_stack+0x20/0x2c
> > [  120.647230]  dump_stack_lvl+0x84/0xb4
> > [  120.650891]  dump_stack+0x18/0x44
> > [  120.654203]  __might_resched+0x204/0x214
> > [  120.658123]  __might_sleep+0x50/0x80
> > [  120.661692]  __pm_runtime_resume+0x48/0xbc
> > [  120.665786]  cmdq_mbox_send_data+0x8c/0x434
> > [  120.669966]  msg_submit+0x84/0x104
> > [  120.673362]  mbox_send_message+0xb8/0x124
> > [  120.677365]  imgsys_cmdq_sendtask+0x688/0x928
> > [  120.681717]  imgsys_runner_func+0x44/0x78
> > [  120.685722]  process_one_work+0x274/0x574
> > [  120.689731]  worker_thread+0x240/0x3f0
> > [  120.693475]  kthread+0xfc/0x11c
> > [  120.696612]  ret_from_fork+0x10/0x20
> > [  120.710570] mtk-cam raw 16030000.cam-raw-a: [SOF] frame done
> reading
> > lost frames. req(7),ts(120697780083)
> > [  120.720352] mtk-cam raw 16030000.cam-raw-a: [SOF] SCQ_DELAY
> state:4
> > ts:120697780083
> > [  120.990091] thread_id:0, send_diff:1470 ms
> > [  121.007210] thread_id:0, send_diff:17 ms
> > [  121.016484] thread_id:0, send_diff:9 ms
> > [  121.032652] thread_id:0, send_diff:16 ms
> > [  121.049366] thread_id:0, send_diff:16 ms
> > [  121.066880] thread_id:0, send_diff:17 ms
> > [  121.087023] thread_id:0, send_diff:20 ms
> > [  121.099325] thread_id:0, send_diff:12 ms
> > [  121.135286] thread_id:0, send_diff:35 ms
> > [  121.149349] thread_id:0, send_diff:14 ms
> > [  121.179259] thread_id:0, send_diff:29 ms
> > [  121.203183] thread_id:0, send_diff:23 ms
> > [  121.242234] thread_id:0, send_diff:39 ms
> > [  121.280645] thread_id:0, send_diff:38 ms
> > [  121.318651] thread_id:0, send_diff:38 ms
> > [  121.356366] thread_id:0, send_diff:37 ms
> > [  121.394484] thread_id:0, send_diff:38 ms
> >
> > The cmdq_runtime_suspend() will only be called when I disabled
> camera
> > APP:
> > [  468.137530] pm_on ts:465259, pm_off ts:4681375, diff:2877 ms
> > and the diff is the duration from opening the camera APP to closing
> the
> > camera APP.
> >
> > From booting to seeing the UI and any operation of UI, I didn't see
> > any cmdq_runtime_suspend() logs from display mbox channel.
> > So cmdq_runtime_resume() and cmdq_runtime_suspend() are never
> called by
> > display mbox channel.
> > This can tell why only opening the camera APP can see this BUG
> report.
> >
> > > >
> > > To be clear, do with the mailbox channel that you do with the
> clocks
> > > now, because your startup() is literally and shutdown() is
> > > practically
> > > empty.
> > > Call shutdown() when no request has come in for a while, so you
> know
> > > the client has quiesced likely. So try putting request/release in
> > > higher level RPM with autosuspend.
> >
> > I'm not quit sure how to modify this from current usage.
> > Do you mean:
> > 1. Move pm_runtime_get_sync() from cmdq_mbox_semd_data() to
> startup()
> > and keep pm_runtime_put_autosuspend() in cmdq_mbox_semd_data()?
> >
> > 2. Call shutdown() in cmdq_runtime_suspend()?
> >
> No shutdown() is called from channel release. Maybe release the
> channel from suspend() of user driver (imgsys).

When users call mbox_channel_request(), there is nothing to do with HW,
so CMDQ driver leaves startup() empty.
When users call mbox_sned_message(), users want to use HW to execute
some instruction, so CMDQ driver needs to enable the power/clocks in
this function.
When users call mbox_free_channel() or mbox_flush(), CMDQ driver needs
to check the HW executing status and clears all the executing tasks, so
it also needs to enable power/clocks in shutdown() and flush() to
control the HW.

The power saving goal is to disable the power/clk more than the
configurable timeout (100ms currently) after enabling it.
So I think calling suspend() of user driver can not achieve this.
Please correct me if I'm wrong :)

> 
> The idea is that
> 1) If the gap between transfers on a busy channel is ~10ms or
> more.  And ..
> 2) The silence on the mailbox channel is multiple of 100ms. And ...

The mailbox channel would not be always busy and the gap is not
consistent as well.
For example, display driver calls mbox_send_message() every ~16ms while
UI is updating in 60fps.
If UI dose not update, display driver won't call mbox_send_message(),
so user may not has the gap and the silence like this.

> 3) startup() and shutdown() are empty (as in your driver)
>   ... then you could afford to release/suspend and resume/request
> during the silence period because the _api_ cost of request/release
> should not even be 1ms. Your stack may be designed to make doing that
> difficult but numbers wise it should work.

Yes, that's true, but I think we can not achieve the goal during the
suspend/resume stage.

Currently, we use the pm_runtime_get_sync() and
pm_runtime_auto_suspend() to achieve this.
So I think adding the PM function hook in mailbox driver is much better
for users driver and also more flexible for the controller driver.

Please correct me if there is something I misunderstood :)

Regards,
Jason-JH.Lin

> 
> Thanks

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ