Date:   Thu, 22 Dec 2022 18:21:03 +0530
From:   Prashanth K <quic_prashk@...cinc.com>
To:     David Laight <David.Laight@...LAB.COM>,
        'Greg Kroah-Hartman' <gregkh@...uxfoundation.org>,
        Dan Carpenter <error27@...il.com>
CC:     "Gustavo A . R . Silva" <gustavoars@...nel.org>,
        Shuah Khan <skhan@...uxfoundation.org>,
        John Keeping <john@...anate.com>,
        Pratham Pratap <quic_ppratap@...cinc.com>,
        Vincent Pelletier <plr.vincent@...il.com>,
        "Udipto Goswami" <quic_ugoswami@...cinc.com>,
        "linux-usb@...r.kernel.org" <linux-usb@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "# 5 . 15" <stable@...r.kernel.org>
Subject: Re: usb: f_fs: Fix CFI failure in ki_complete



On 14-12-22 11:05 pm, David Laight wrote:
> From: Greg Kroah-Hartman
>> Sent: 12 December 2022 13:35
>>
>> On Mon, Dec 12, 2022 at 06:54:24PM +0530, Prashanth K wrote:
>>> Function pointer ki_complete() expects 'long' as its second
>>> argument, but we pass integer from ffs_user_copy_worker. This
>>> might cause a CFI failure, as ki_complete is an indirect call
>>> with mismatched prototype. Fix this by typecasting the second
>>> argument to long.
>>
>> "might"?  Does it or not?  If it does, why hasn't this been reported
>> before?
> 
> Does the cast even help at all?
Actually, I have the same questions:
- Why haven't we seen any instances other than this one?
- Why isn't it seen on other indirect function calls?

Here is the call stack of the failure that we got.

[  323.288681][    T7] Kernel panic - not syncing: CFI failure (target: 0xffffffe5fc811f98)
[  323.288710][    T7] CPU: 6 PID: 7 Comm: kworker/u16:0 Tainted: G S     W  OE     5.15.41-android13-8-g5ffc5644bd20 #1
[  323.288730][    T7] Workqueue: adb ffs_user_copy_worker.cfi_jt
[  323.288752][    T7] Call trace:
[  323.288755][    T7]  dump_backtrace.cfi_jt+0x0/0x8
[  323.288772][    T7]  dump_stack_lvl+0x80/0xb8
[  323.288785][    T7]  panic+0x180/0x444
[  323.288797][    T7]  find_check_fn+0x0/0x218
[  323.288810][    T7]  ffs_user_copy_worker+0x1dc/0x204
[  323.288822][    T7]  kretprobe_trampoline.cfi_jt+0x0/0x8
[  323.288837][    T7]  worker_thread+0x3ec/0x920
[  323.288850][    T7]  kthread+0x168/0x1dc
[  323.288859][    T7]  ret_from_fork+0x10/0x20
[  323.288866][    T7] SMP: stopping secondary CPUs

From address-to-line translation, we got to know that the issue is at
ffs_user_copy_worker+0x1dc/0x204
		||
io_data->kiocb->ki_complete(io_data->kiocb, ret);

And "find_check_fn" was getting invoked from ki_complete. The only thing
that I found suspicious about ki_complete() is its argument types.
That's why I pushed this patch, so that we can discuss it here.
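
To make the argument-type question concrete, here is a minimal userspace
sketch. It is not the f_fs code: struct demo_kiocb, demo_complete() and the
two helper functions are hypothetical names, and the two-argument callback
shape is only assumed from the call shown above. It illustrates that, with a
prototype in scope, C converts an int argument to the 'long' parameter type
whether or not the cast is written, so both calls hand the callback the same
value.

```c
/* Hypothetical stand-in for struct kiocb; NOT the kernel definition. */
struct demo_kiocb {
	void (*ki_complete)(struct demo_kiocb *iocb, long ret);
	long seen;	/* value the callback received */
};

/* Callback whose definition matches the pointer type exactly. */
static void demo_complete(struct demo_kiocb *iocb, long ret)
{
	iocb->seen = ret;
}

/* Call with a plain int: the prototype converts it to long. */
long complete_without_cast(int ret)
{
	struct demo_kiocb iocb = { .ki_complete = demo_complete, .seen = 0 };

	iocb.ki_complete(&iocb, ret);
	return iocb.seen;
}

/* Call with an explicit cast: the same conversion, only written out. */
long complete_with_cast(int ret)
{
	struct demo_kiocb iocb = { .ki_complete = demo_complete, .seen = 0 };

	iocb.ki_complete(&iocb, (long)ret);
	return iocb.seen;
}
```

Both helpers return the same value for any int input, which suggests the
type of the argument expression alone is unlikely to be what the CFI check
is reacting to.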

Thanks in advance

> 
> ...
>>> -	io_data->kiocb->ki_complete(io_data->kiocb, ret);
>>> +	io_data->kiocb->ki_complete(io_data->kiocb, (long)ret);
> ...
> 
> If the parameter in the definition of the structure member ki_complete()
> is 'long' then the compiler has to promote 'ret' to long
> anyway. CFI has nothing to do with it.
> 
> OTOH if you've used a cast to assign a function with a
> different prototype to ki_complete then 'all bets are off'
> and you get all the run time errors you deserve.
> CFI just converts some of them to compile time errors.
> 
> For instance if you assign xx_complete(long) to (*ki_complete)(int)
> then it is very likely that xx_complete() will get an argument
> with some of the high bits set.
> But adding a cast to the call - ki_complete((long)int_var)
> will make absolutely no difference.
> The compiler won't zero/sign extend int_var to 64bits for you,
> that will just get optimised away and the high bits will
> be unchanged.
> 
> Your description seems to be the other way around (which might
> be safe, but CFI probably still barfs).
> But you need to fix the indirect calls so the function types
> match.
So does that mean we need to add casts in all indirect calls to match
the function signature?
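
For reference, one possible reading of "fix the indirect calls so the
function types match", as a userspace sketch. All names here
(demo_complete_fn, demo_handle_status, demo_complete_thunk, demo_run) are
hypothetical and not taken from f_fs. The assumption is that Clang's
indirect-call CFI compares the type of the called function with the type of
the pointer it is called through, so the change would go on the callee side
(a function defined with the exact pointer type) rather than as casts at the
call site.

```c
#include <stddef.h>

/* Hypothetical completion slot, shaped like the call in the trace. */
typedef void (*demo_complete_fn)(void *ctx, long ret);

/* Last status recorded by the handler, kept only for the demo. */
static int last_status;

/* Existing handler that takes an int status (hypothetical). */
static void demo_handle_status(void *ctx, int status)
{
	(void)ctx;
	last_status = status;
}

/*
 * Wrapper whose type is exactly demo_complete_fn. Storing
 * (demo_complete_fn)demo_handle_status instead would compile, but the
 * callee's type would then no longer match the pointer's type.
 */
static void demo_complete_thunk(void *ctx, long ret)
{
	demo_handle_status(ctx, (int)ret);
}

/* Indirect call site: no cast on the pointer, none on the argument. */
int demo_run(int ret)
{
	demo_complete_fn fn = demo_complete_thunk;

	fn(NULL, ret);
	return last_status;
}
```

With this shape the call site stays untouched, and the only narrowing
conversion is the explicit one inside the wrapper.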
> 
> 	David
> 
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
> 
