Message-ID: <CAK8P3a3PAxkctN6AXOsoTBTFgwHhk7_OSYwJ4Rgk7Dbs+bc0Qw@mail.gmail.com>
Date: Fri, 5 Aug 2022 21:32:13 +0200
From: Arnd Bergmann <arnd@...nel.org>
To: Nathan Chancellor <nathan@...nel.org>
Cc: Harry Wentland <harry.wentland@....com>,
"Siqueira, Rodrigo" <Rodrigo.Siqueira@....com>,
clang-built-linux <llvm@...ts.linux.dev>,
David Airlie <airlied@...ux.ie>,
"Pan, Xinhui" <Xinhui.Pan@....com>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
amd-gfx list <amd-gfx@...ts.freedesktop.org>,
Christian König <christian.koenig@....com>,
dri-devel <dri-devel@...ts.freedesktop.org>,
Alex Deucher <alexander.deucher@....com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
"Sudip Mukherjee (Codethink)" <sudipm.mukherjee@...il.com>
Subject: Re: mainline build failure for x86_64 allmodconfig with clang
On Fri, Aug 5, 2022 at 8:02 PM Nathan Chancellor <nathan@...nel.org> wrote:
> On Fri, Aug 05, 2022 at 06:16:45PM +0200, Arnd Bergmann wrote:
> > On Fri, Aug 5, 2022 at 5:32 PM Harry Wentland <harry.wentland@....com> wrote:
> > While splitting out sub-functions can help reduce the maximum stack
> > usage, it seems that in this case it makes the actual problem worse:
> > I see 2168 bytes for the combined
> > dml32_ModeSupportAndSystemConfigurationFull(), but marking
> > mode_support_configuration() as noinline gives me 1992 bytes
> > for the outer function plus 384 bytes for the inner one. So it does
> > avoid the warning (barely), but not the problem that the warning tries
> > to point out.
>
> I haven't had a chance to take a look at splitting things up yet, would
> you recommend a different approach?
Splitting up large functions can help when you have large local variables
that are used in different parts of the function, and the split gets the
compiler to reuse stack locations.
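As a rough illustration of that general case (not of the DML code itself), here is a minimal userspace sketch. The names phase_one(), phase_two(), process() and SCRATCH_SIZE are made up for the example, and the noinline attribute assumes gcc or clang. Each helper owns one large buffer that only exists while that helper runs, so the two buffers can occupy the same stack region one after the other instead of both sitting in the caller's frame:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SCRATCH_SIZE 1024	/* hypothetical buffer size for the sketch */

/* Phase 1: the large buffer lives only for the duration of this call. */
__attribute__((noinline))
unsigned int phase_one(const unsigned char *in, size_t len)
{
	unsigned char scratch[SCRATCH_SIZE];
	unsigned int sum = 0;
	size_t i;

	if (len > sizeof(scratch))
		len = sizeof(scratch);
	memcpy(scratch, in, len);
	for (i = 0; i < len; i++)
		sum += scratch[i];
	return sum;
}

/* Phase 2: a second large buffer, free to reuse the stack phase 1 gave back. */
__attribute__((noinline))
unsigned int phase_two(const unsigned char *in, size_t len)
{
	unsigned char scratch[SCRATCH_SIZE];
	unsigned int acc = 0;
	size_t i;

	if (len > sizeof(scratch))
		len = sizeof(scratch);
	for (i = 0; i < len; i++)
		scratch[i] = in[i] ^ 0xff;
	for (i = 0; i < len; i++)
		acc ^= scratch[i];
	return acc;
}

/* The caller's own frame stays small: it holds no large locals itself. */
unsigned int process(const unsigned char *in, size_t len)
{
	assert(in != NULL);
	return phase_one(in, len) + phase_two(in, len);
}
```

Whether this actually saves anything depends on the compiler: if it already overlaps the two buffers in a single frame, or inlines the helpers again, the numbers will not change, so the frame sizes still need to be checked in the generated output.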
I think in this particular function, the problem isn't actually local variables
but either arguments being pushed onto the stack for the function calls,
or something that causes the compiler to run out of registers so it
has to spill them to the stack.
In either case, one has to actually look at the generated output
and then try to rearrange the code so this does not happen.
One thing to try would be to condense a function call like
dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(
&v->dummy_vars.dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport,
mode_lib->vba.USRRetrainingRequiredFinal,
mode_lib->vba.UsesMALLForPStateChange,
mode_lib->vba.PrefetchModePerState[mode_lib->vba.VoltageLevel][mode_lib->vba.maxMpcComb],
mode_lib->vba.NumberOfActiveSurfaces,
mode_lib->vba.MaxLineBufferLines,
mode_lib->vba.LineBufferSizeFinal,
mode_lib->vba.WritebackInterfaceBufferSize,
mode_lib->vba.DCFCLK,
mode_lib->vba.ReturnBW,
mode_lib->vba.SynchronizeTimingsFinal,
mode_lib->vba.SynchronizeDRRDisplaysForUCLKPStateChangeFinal,
mode_lib->vba.DRRDisplay,
v->dpte_group_bytes,
v->meta_row_height,
v->meta_row_height_chroma,
v->dummy_vars.DISPCLKDPPCLKDCFCLKDeepSleepPrefetchParametersWatermarksAndPerformanceCalculation.mmSOCParameters,
mode_lib->vba.WritebackChunkSize,
mode_lib->vba.SOCCLK,
v->DCFCLKDeepSleep,
mode_lib->vba.DETBufferSizeY,
mode_lib->vba.DETBufferSizeC,
mode_lib->vba.SwathHeightY,
mode_lib->vba.SwathHeightC,
mode_lib->vba.LBBitPerPixel,
v->SwathWidthY,
v->SwathWidthC,
mode_lib->vba.HRatio,
mode_lib->vba.HRatioChroma,
mode_lib->vba.vtaps,
mode_lib->vba.VTAPsChroma,
mode_lib->vba.VRatio,
mode_lib->vba.VRatioChroma,
mode_lib->vba.HTotal,
mode_lib->vba.VTotal,
mode_lib->vba.VActive,
mode_lib->vba.PixelClock,
mode_lib->vba.BlendingAndTiming,
.... /* more arguments */);
into a calling convention that takes a pointer to 'mode_lib->vba' and another
one to 'v', so the individual values are no longer passed on the stack.
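A minimal sketch of what I mean, using made-up stand-ins rather than the real DML types: struct vba_params, struct v_state, MAX_SURFACES, both function names and the arithmetic are assumptions for illustration only. The first function takes every value as its own argument, as the call above does; the second takes two pointers and reads the fields where it needs them, so the call site has only two arguments to set up:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SURFACES 4	/* hypothetical array bound for the sketch */

/* Hypothetical stand-in for the parts of 'mode_lib->vba' that are used. */
struct vba_params {
	unsigned int NumberOfActiveSurfaces;
	unsigned int HTotal[MAX_SURFACES];
	double PixelClock[MAX_SURFACES];
};

/* Hypothetical stand-in for 'v'. */
struct v_state {
	double SwathWidthY[MAX_SURFACES];
};

/* Before: one argument per value, most of which end up on the stack
 * once the argument list grows past the available registers. */
double calc_many_args(unsigned int NumberOfActiveSurfaces,
		      const unsigned int HTotal[],
		      const double PixelClock[],
		      const double SwathWidthY[])
{
	double sum = 0.0;
	unsigned int k;

	for (k = 0; k < NumberOfActiveSurfaces; k++)
		sum += SwathWidthY[k] * HTotal[k] / PixelClock[k];
	return sum;
}

/* After: two pointers, with the fields read inside the callee. */
double calc_by_pointer(const struct vba_params *vba,
		       const struct v_state *v)
{
	double sum = 0.0;
	unsigned int k;

	assert(vba != NULL && v != NULL);
	for (k = 0; k < vba->NumberOfActiveSurfaces; k++)
		sum += v->SwathWidthY[k] * vba->HTotal[k] / vba->PixelClock[k];
	return sum;
}
```

With only four arguments the "before" version still fits in registers on x86_64, so the difference only shows up once the list grows to the size of the real call. The calculation itself is a placeholder and has nothing to do with what the watermark code computes.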
Arnd