[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <895af04e45d286c38d01f48b29ad41598b7dadb7.camel@mediatek.com>
Date: Tue, 25 Jun 2024 03:40:59 +0000
From: Jason-JH Lin (林睿祥) <Jason-JH.Lin@...iatek.com>
To: "jassisinghbrar@...il.com" <jassisinghbrar@...il.com>,
"angelogioacchino.delregno@...labora.com"
<angelogioacchino.delregno@...labora.com>
CC: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mediatek@...ts.infradead.org" <linux-mediatek@...ts.infradead.org>,
Singo Chang (張興國) <Singo.Chang@...iatek.com>,
"chunkuang.hu@...nel.org" <chunkuang.hu@...nel.org>,
Nancy Lin (林欣螢) <Nancy.Lin@...iatek.com>,
Project_Global_Chrome_Upstream_Group
<Project_Global_Chrome_Upstream_Group@...iatek.com>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>, "matthias.bgg@...il.com"
<matthias.bgg@...il.com>
Subject: Re: [PATCH 2/2] mailbox: mtk-cmdq: Move pm_runimte_get and put to
mbox_chan_ops API
On Mon, 2024-06-24 at 12:45 -0500, Jassi Brar wrote:
>
> External email : Please do not click links or open attachments until
> you have verified the sender or the content.
> On Mon, Jun 24, 2024 at 6:29 AM AngeloGioacchino Del Regno
> <angelogioacchino.delregno@...labora.com> wrote:
> >
> > Il 20/06/24 16:39, Jassi Brar ha scritto:
> > > On Thu, Jun 20, 2024 at 1:33 AM Jason-JH Lin (林睿祥)
> > > <Jason-JH.Lin@...iatek.com> wrote:
> > >>
> > >> On Wed, 2024-06-19 at 10:38 -0500, Jassi Brar wrote:
> > >>>
> > >>> External email : Please do not click links or open attachments
> until
> > >>> you have verified the sender or the content.
> > >>> On Wed, Jun 19, 2024 at 3:18 AM AngeloGioacchino Del Regno
> > >>> <angelogioacchino.delregno@...labora.com> wrote:
> > >>>> Il 18/06/24 17:59, Jassi Brar ha scritto:
> > >>> .....
> > >>>
> > >>>> For example, when static content is displayed on screen, the
> CMDQ
> > >>> mailbox never
> > >>>> gets shut down, but no communication happens for a relatively
> long
> > >>> time; the
> > >>>> overhead of actually shutting down the mailbox and setting it
> back
> > >>> up would be
> > >>>> increasing latency in an unacceptable manner.
> > >>>>
> > >>> Hmm... in your driver, startup() is _empty_ and shutdown()
> is
> > >>> all
> > >>> under a spin-lock with irqs disabled, so that too shouldn't be
> > >>> expensive. Right?
> > >>> Then what causes unacceptable latencies?
> > >>>
> > >>
> > >> I found that the BUG report only occurred when opening the
> camera APP.
> > >> Maybe something in imgsys_cmdq_sendtask() is too expensive or
> somewhere
> > >> else in imgsys driver.
> > >>
> > > If you move anything from submit() into startup(), which is once
> per
> > > lifetime of a channel, it will only make imgsys_cmdq_sendtask()
> > > cheaper.
> > > Btw, the imgsys code is not public, I don't know how it looks.
> > >
It'll be upstreamed in the future, so I can only test it locally to
find the information we want.
> > >
> > >> I'm wondering why this BUG report is not occurred in display use
> case
> > >> which is more frequent than imgsys use case.
> > >> Does this mean sleep() is not always called in
> pm_runtime_get_sync()
> > >> and if we can guarantee this sleep() won't be called, then using
> > >> pm_runtime_get_sync() in atomic context is OK?
> > >>
> > > Instead of hacking around, maybe try using startup() and
> shutdown()
> > > which is for such uses? Maybe request/release channel as part of
> RPM
> > > in your client driver if you are worried about the noise?
> > >
> > >
> > >>>> This is why I opted for autosuspend - it's only bringing down
> > >>> certain clocks for
> > >>>> the CMDQ HW, adding up a bit of power saving to the mix which,
> for
> > >>> some use cases
> > >>>> such as mobile devices with relatively small batteries, is
> > >>> definitely important.
> > >>>>
> > >>>> I'll also briefly (and only briefly) mention that 120Hz
> displays
> > >>> are already a
> > >>>> common thing and in this case the gap between TX and ACK is
> ~8.33ms
> > >>> instead, let
> > >>>> alone that displays with a framerate of more than 120Hz also
> do
> > >>> exist even though
> > >>>> they're less common.
> > >>>>
> > >>> I don't know how even busier channels help your point.
> > >>>
> > >>>> All of the above describes a few of the reasons why
> autosuspend is
> > >>> a good choice
> > >>>> here, instead of a shutdown->startup flow.
> > >>>> And again - I can place some bets that PM would also be
> applicable
> > >>> to SoCs from
> > >>>> other vendors as well, with most probably different benefits
> (but
> > >>> still with some
> > >>>> power related benefits!) compared to MediaTek.
> > >>
> > >> I agree with Angelo's point!
> > >>
> > > Ok, but you or Angelo still don't explain "unacceptable
> latencies"
> > > when your startup() and shutdown() are empty. You just want api
> > > changed, which we can but at least do you part and tell me where
> the
> > > bottleneck (unexpected latency) comes from.
> > >
> >
> > "I want numbers" is a sensible request, honestly I would do the
> same so I totally
> > understand that.
> >
> > Jason, can you please perform latency measurements on 60Hz and
> *especially* ISP/cam
> > cases while "continuously" calling startup() and shutdown() for
> every power saving
> > operation?
You mean moving clk_builk_enable() and clk_bulk_disable() to startup()
and shutdown(), and then calling startup() in cmdq_runtime_resume() and
calling shutdown() in cmdq_runtime_suspend()?
If we do that, I think GCE clocks will never be closed because
mbox_request_channel() and mbox_free_channel() will call startup() and
shutdown() already.
I found that display thread:0 will call mbox_send_message() every 16ms
when I moving UI continuously:
[ 61.503604] thread_id:0, send_diff:16 ms
[ 61.520266] thread_id:0, send_diff:16 ms
[ 61.537143] thread_id:0, send_diff:16 ms
[ 61.553609] thread_id:0, send_diff:16 ms
[ 61.570811] thread_id:0, send_diff:17 ms
[ 61.587075] thread_id:0, send_diff:16 ms
[ 61.603527] thread_id:0, send_diff:16 ms
[ 61.620082] thread_id:0, send_diff:16 ms
[ 61.637496] thread_id:0, send_diff:17 ms
[ 61.653876] thread_id:0, send_diff:16 ms
[ 61.670162] thread_id:0, send_diff:16 ms
[ 61.687318] thread_id:0, send_diff:17 ms
[ 61.703747] thread_id:0, send_diff:16 ms
[ 61.720656] thread_id:0, send_diff:16 ms
If I didn't do anything to UI, I won't see any mbox_send_message()
logs.
It means mbox_send_message() will only be called when the display is
updated:
[ 782.777205] thread_id:0, send_diff:20072 ms
[ 820.296738] thread_id:0, send_diff:37519 ms
[ 863.034270] thread_id:0, send_diff:42737 ms
[ 880.297759] thread_id:0, send_diff:17263 ms
[ 917.795432] thread_id:0, send_diff:37497 ms
When opening camera APP, it seems to become 30fps:
[ 120.551778] BUG: sleeping function called from invalid context at
drivers/base/power/runtime.c:1164
[ 120.560844] in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid:
201, name: kworker/u17:1
[ 120.569363] preempt_count: 1, expected: 0
[ 120.573367] RCU nest depth: 0, expected: 0
[ 120.577457] INFO: lockdep is turned off.
[ 120.581372] irq event stamp: 0
[ 120.584420] hardirqs last enabled at (0): [<0000000000000000>] 0x0
[ 120.590687] hardirqs last disabled at (0): [<ffffffd3624a86e0>]
copy_process+0x578/0xf10
[ 120.598780] softirqs last enabled at (0): [<ffffffd3624a86ec>]
copy_process+0x584/0xf10
[ 120.606865] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 120.613128] CPU: 0 PID: 201 Comm: kworker/u17:1 Not tainted 6.1.92-
lockdep-14419-g8c5f20e56598-dirty #1
e971112e83e796239a41e46e8747794c737fb18c
[ 120.626073] Hardware name: Google Ciri sku0/unprovisioned board (DT)
[ 120.632420] Workqueue: imgsys_runner imgsys_runner_func
[ 120.637647] Call trace:
[ 120.640086] dump_backtrace+0x100/0x120
[ 120.643919] show_stack+0x20/0x2c
[ 120.647230] dump_stack_lvl+0x84/0xb4
[ 120.650891] dump_stack+0x18/0x44
[ 120.654203] __might_resched+0x204/0x214
[ 120.658123] __might_sleep+0x50/0x80
[ 120.661692] __pm_runtime_resume+0x48/0xbc
[ 120.665786] cmdq_mbox_send_data+0x8c/0x434
[ 120.669966] msg_submit+0x84/0x104
[ 120.673362] mbox_send_message+0xb8/0x124
[ 120.677365] imgsys_cmdq_sendtask+0x688/0x928
[ 120.681717] imgsys_runner_func+0x44/0x78
[ 120.685722] process_one_work+0x274/0x574
[ 120.689731] worker_thread+0x240/0x3f0
[ 120.693475] kthread+0xfc/0x11c
[ 120.696612] ret_from_fork+0x10/0x20
[ 120.710570] mtk-cam raw 16030000.cam-raw-a: [SOF] frame done reading
lost frames. req(7),ts(120697780083)
[ 120.720352] mtk-cam raw 16030000.cam-raw-a: [SOF] SCQ_DELAY state:4
ts:120697780083
[ 120.990091] thread_id:0, send_diff:1470 ms
[ 121.007210] thread_id:0, send_diff:17 ms
[ 121.016484] thread_id:0, send_diff:9 ms
[ 121.032652] thread_id:0, send_diff:16 ms
[ 121.049366] thread_id:0, send_diff:16 ms
[ 121.066880] thread_id:0, send_diff:17 ms
[ 121.087023] thread_id:0, send_diff:20 ms
[ 121.099325] thread_id:0, send_diff:12 ms
[ 121.135286] thread_id:0, send_diff:35 ms
[ 121.149349] thread_id:0, send_diff:14 ms
[ 121.179259] thread_id:0, send_diff:29 ms
[ 121.203183] thread_id:0, send_diff:23 ms
[ 121.242234] thread_id:0, send_diff:39 ms
[ 121.280645] thread_id:0, send_diff:38 ms
[ 121.318651] thread_id:0, send_diff:38 ms
[ 121.356366] thread_id:0, send_diff:37 ms
[ 121.394484] thread_id:0, send_diff:38 ms
The cmdq_runtime_suspend() will only be called when I disabled camera
APP:
[ 468.137530] pm_on ts:465259, pm_off ts:4681375, diff:2877 ms
and the diff is the duration from opening the camera APP to closing the
camera APP.
From booting to seeing the UI and any operation of UI, I didn't see
any cmdq_runtime_suspend() logs from display mbox channel.
So cmdq_runtime_resume() and cmdq_runtime_suspend() are never called by
display mbox channel.
This can tell why only opening the camera APP can see this BUG report.
> >
> To be clear, do with the mailbox channel that you do with the clocks
> now, because your startup() is literally and shutdown() is
> practically
> empty.
> Call shutdown() when no request has come in for a while, so you know
> the client has quiesced likely. So try putting request/release in
> higher level RPM with autosuspend.
I'm not quit sure how to modify this from current usage.
Do you mean:
1. Move pm_runtime_get_sync() from cmdq_mbox_semd_data() to startup()
and keep pm_runtime_put_autosuspend() in cmdq_mbox_semd_data()?
2. Call shutdown() in cmdq_runtime_suspend()?
Regards,
Jason-JH.Lin
>
> thanks
Powered by blists - more mailing lists