Date:   Tue, 19 Feb 2019 11:01:51 -0800
From:   Nick Desaulniers <ndesaulniers@...gle.com>
To:     Arnd Bergmann <arnd@...db.de>
Cc:     Hans Verkuil <hans.verkuil@...co.com>,
        Mauro Carvalho Chehab <mchehab@...nel.org>,
        Mark Brown <broonie@...nel.org>,
        Nathan Chancellor <natechancellor@...il.com>,
        Dafna Hirschfeld <dafna3@...il.com>,
        Tom aan de Wiel <tom.aandewiel@...il.com>,
        linux-media@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 2/3] media: vicodec: avoid clang frame size warning

On Tue, Feb 19, 2019 at 9:02 AM Arnd Bergmann <arnd@...db.de> wrote:
>
> Clang-9 makes some different inlining decisions compared to gcc, which
> leads to a warning about a possible stack overflow problem when building
> with CONFIG_KASAN, including when setting asan-stack=0, which avoids
> most other frame overflow warnings:
>
> drivers/media/platform/vicodec/codec-fwht.c:673:12: error: stack frame size of 2224 bytes in function 'encode_plane'
>
> Manually adding noinline_for_stack annotations in those functions

Thanks for the fix! In general, for -Wframe-larger-than= warnings,
isn't it possible that the stack frames involved are already too large
whenever they are entered?  Sure, the inlining was a little aggressive
and used more stack space at runtime than strictly necessary, but a
function marked noinline can still be a problem once its frame is
actually entered.  Doesn't the kernel have a way of estimating the
stack depth at any given frame?  I've always been curious whether the
best fix for this kind of warning is to move certain large local
structs off the stack (kmalloc) or to mark the function noinline.
Surely there are cases where noinline is safe, but I wonder whether it
may still be dangerous to enter the problematic child-most stack
frame.
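
To make the two options concrete, here is a rough userspace sketch, not
kernel code.  The function names are made up for illustration, malloc()
stands in for kmalloc(), and __attribute__((noinline)) stands in for
noinline_for_stack (which the kernel defines as plain noinline).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BLOCK_COEFFS 64 /* 8 * 8 coefficients, like the s16 blocks in the patch */

/*
 * Option 1 (hypothetical helper): keep the scratch buffer on the stack
 * but forbid inlining, so the 128-byte buffer is only live while this
 * function runs instead of being merged into the caller's frame.
 */
__attribute__((noinline))
int sum_block_stack(const int16_t *in)
{
	int16_t block[BLOCK_COEFFS];
	int sum = 0;

	assert(in != NULL);
	memcpy(block, in, sizeof(block));
	for (size_t i = 0; i < BLOCK_COEFFS; i++)
		sum += block[i];
	return sum;
}

/*
 * Option 2 (hypothetical helper): move the scratch buffer off the
 * stack.  malloc() is an assumption standing in for kmalloc(); the
 * cost is an allocation that can fail and must be freed.
 * Returns 0 on success, -1 if the allocation fails.
 */
int sum_block_heap(const int16_t *in, int *sum_out)
{
	int16_t *block;
	int sum = 0;

	assert(in != NULL && sum_out != NULL);
	block = malloc(BLOCK_COEFFS * sizeof(*block));
	if (!block)
		return -1;

	memcpy(block, in, BLOCK_COEFFS * sizeof(*block));
	for (size_t i = 0; i < BLOCK_COEFFS; i++)
		sum += block[i];

	free(block);
	*sum_out = sum;
	return 0;
}
```

Option 1 silences the warning without shrinking the worst-case call
chain: caller plus callee still need the same total stack while the
callee runs.  Option 2 does shrink it, at the price of an error path.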

> called by encode_plane() or decode_plane() that require a significant
> amount of kernel stack makes this impossible to happen with any
> compiler.
>
> Signed-off-by: Arnd Bergmann <arnd@...db.de>
> ---
>  drivers/media/platform/vicodec/codec-fwht.c | 18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/media/platform/vicodec/codec-fwht.c b/drivers/media/platform/vicodec/codec-fwht.c
> index d1d6085da9f1..135d56bcc2c5 100644
> --- a/drivers/media/platform/vicodec/codec-fwht.c
> +++ b/drivers/media/platform/vicodec/codec-fwht.c
> @@ -47,7 +47,7 @@ static const uint8_t zigzag[64] = {
>  };
>
>
> -static int rlc(const s16 *in, __be16 *output, int blocktype)
> +static int noinline_for_stack rlc(const s16 *in, __be16 *output, int blocktype)
>  {
>         s16 block[8 * 8];
>         s16 *wp = block;
> @@ -106,8 +106,8 @@ static int rlc(const s16 *in, __be16 *output, int blocktype)
>   * This function will worst-case increase rlc_in by 65*2 bytes:
>   * one s16 value for the header and 8 * 8 coefficients of type s16.
>   */
> -static u16 derlc(const __be16 **rlc_in, s16 *dwht_out,
> -                const __be16 *end_of_input)
> +static noinline_for_stack u16
> +derlc(const __be16 **rlc_in, s16 *dwht_out, const __be16 *end_of_input)
>  {
>         /* header */
>         const __be16 *input = *rlc_in;
> @@ -373,7 +373,8 @@ static void fwht(const u8 *block, s16 *output_block, unsigned int stride,
>   * Furthermore values can be negative... This is just a version that
>   * works with 16 signed data
>   */
> -static void fwht16(const s16 *block, s16 *output_block, int stride, int intra)
> +static void noinline_for_stack
> +fwht16(const s16 *block, s16 *output_block, int stride, int intra)
>  {
>         /* we'll need more than 8 bits for the transformed coefficients */
>         s32 workspace1[8], workspace2[8];
> @@ -456,7 +457,8 @@ static void fwht16(const s16 *block, s16 *output_block, int stride, int intra)
>         }
>  }
>
> -static void ifwht(const s16 *block, s16 *output_block, int intra)
> +static noinline_for_stack void
> +ifwht(const s16 *block, s16 *output_block, int intra)
>  {
>         /*
>          * we'll need more than 8 bits for the transformed coefficients
> @@ -604,9 +606,9 @@ static int var_inter(const s16 *old, const s16 *new)
>         return ret;
>  }
>
> -static int decide_blocktype(const u8 *cur, const u8 *reference,
> -                           s16 *deltablock, unsigned int stride,
> -                           unsigned int input_step)
> +static noinline_for_stack int
> +decide_blocktype(const u8 *cur, const u8 *reference, s16 *deltablock,
> +                unsigned int stride, unsigned int input_step)
>  {
>         s16 tmp[64];
>         s16 old[64];
> --
> 2.20.0
>


-- 
Thanks,
~Nick Desaulniers
