Message-ID: <CAJZ5v0hcwCqfSh32U5zVnpBxmziJHYYCfhJ9uJLJ0gVvkrP-5w@mail.gmail.com>
Date:   Wed, 9 Nov 2022 20:07:48 +0100
From:   "Rafael J. Wysocki" <rafael@...nel.org>
To:     Guenter Roeck <linux@...ck-us.net>
Cc:     "Rafael J . Wysocki" <rafael@...nel.org>,
        Daniel Lezcano <daniel.lezcano@...aro.org>,
        Amit Kucheria <amitk@...nel.org>,
        Zhang Rui <rui.zhang@...el.com>, linux-pm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/9] thermal/core: Ensure that thermal device is
 registered in thermal_zone_get_temp

On Mon, Oct 17, 2022 at 3:09 PM Guenter Roeck <linux@...ck-us.net> wrote:
>
> Calls to thermal_zone_get_temp() are not protected against thermal zone
> device removal. As a result, it is possible that the thermal zone operations
> callbacks are no longer valid when thermal_zone_get_temp() is called.
> This may result in crashes such as
>
> BUG: unable to handle page fault for address: ffffffffc04ef420
>  #PF: supervisor read access in kernel mode
>  #PF: error_code(0x0000) - not-present page
> PGD 5d60e067 P4D 5d60e067 PUD 5d610067 PMD 110197067 PTE 0
> Oops: 0000 [#1] PREEMPT SMP NOPTI
> CPU: 1 PID: 3209 Comm: cat Tainted: G        W         5.10.136-19389-g615abc6eb807 #1 02df41ac0b12f3a64f4b34245188d8875bb3bce1
> Hardware name: Google Coral/Coral, BIOS Google_Coral.10068.92.0 11/27/2018
> RIP: 0010:thermal_zone_get_temp+0x26/0x73
> Code: 89 c3 eb d3 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 53 48 85 ff 74 50 48 89 fb 48 81 ff 00 f0 ff ff 77 44 48 8b 83 98 03 00 00 <48> 83 78 10 00 74 36 49 89 f6 4c 8d bb d8 03 00 00 4c 89 ff e8 9f
> RSP: 0018:ffffb3758138fd38 EFLAGS: 00010287
> RAX: ffffffffc04ef410 RBX: ffff98f14d7fb000 RCX: 0000000000000000
> RDX: ffff98f17cf90000 RSI: ffffb3758138fd64 RDI: ffff98f14d7fb000
> RBP: ffffb3758138fd50 R08: 0000000000001000 R09: ffff98f17cf90000
> R10: 0000000000000000 R11: ffffffff8dacad28 R12: 0000000000001000
> R13: ffff98f1793a7d80 R14: ffff98f143231708 R15: ffff98f14d7fb018
> FS:  00007ec166097800(0000) GS:ffff98f1bbd00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffc04ef420 CR3: 000000010ee9a000 CR4: 00000000003506e0
> Call Trace:
>  temp_show+0x31/0x68
>  dev_attr_show+0x1d/0x4f
>  sysfs_kf_seq_show+0x92/0x107
>  seq_read_iter+0xf5/0x3f2
>  vfs_read+0x205/0x379
>  __x64_sys_read+0x7c/0xe2
>  do_syscall_64+0x43/0x55
>  entry_SYSCALL_64_after_hwframe+0x61/0xc6
>
> if a thermal device is removed while accesses to its device attributes
> are ongoing.
>
> The problem is exposed by code in iwl_op_mode_mvm_start(), which registers
> a thermal zone device only to unregister it shortly afterwards if an
> unrelated failure is encountered while accessing the hardware.
>
> Check if the thermal zone device is registered after acquiring the
> thermal zone device mutex to ensure this does not happen.
>
> The code was tested by triggering the failure in iwl_op_mode_mvm_start()
> on purpose. Without this patch, the kernel crashes reliably. The crash
> is no longer observed after applying this and the preceding patches.
>
> Signed-off-by: Guenter Roeck <linux@...ck-us.net>
> ---
>  drivers/thermal/thermal_helpers.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/drivers/thermal/thermal_helpers.c b/drivers/thermal/thermal_helpers.c
> index c65cdce8f856..3bac0b7a4c62 100644
> --- a/drivers/thermal/thermal_helpers.c
> +++ b/drivers/thermal/thermal_helpers.c
> @@ -115,7 +115,12 @@ int thermal_zone_get_temp(struct thermal_zone_device *tz, int *temp)
>         int ret;
>
>         mutex_lock(&tz->lock);
> +       if (!device_is_registered(&tz->device)) {
> +               ret = -ENODEV;
> +               goto unlock;
> +       }
>         ret = __thermal_zone_get_temp(tz, temp);
> +unlock:

I would do it this way:

if (device_is_registered(&tz->device))
        ret = __thermal_zone_get_temp(tz, temp);
else
        ret = -ENODEV;

>         mutex_unlock(&tz->lock);
>
>         return ret;
> --
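
For readers who want to try the pattern outside the kernel, here is a minimal userspace sketch of the check-under-lock idea discussed above, written in the if/else form. It is an analogy under stated assumptions, not the kernel code: `struct zone`, `zone_get_temp()`, `zone_unregister()` and `fake_sensor()` are hypothetical names, a plain `registered` flag stands in for device_is_registered(), and a pthread mutex stands in for tz->lock. It also assumes that unregistration clears the flag while holding the same lock, which is what makes the check meaningful.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct thermal_zone_device. */
struct zone {
	pthread_mutex_t lock;
	bool registered;             /* stands in for device_is_registered() */
	int (*get_temp)(int *temp);  /* stands in for the ops callback */
};

/* Hypothetical sensor callback returning a fixed reading. */
int fake_sensor(int *temp)
{
	*temp = 42000;
	return 0;
}

/* Read the temperature only if the zone is still registered. */
int zone_get_temp(struct zone *z, int *temp)
{
	int ret;

	pthread_mutex_lock(&z->lock);
	if (z->registered)
		ret = z->get_temp(temp);
	else
		ret = -ENODEV;
	pthread_mutex_unlock(&z->lock);

	return ret;
}

/* Assumption: removal flips the flag under the same lock as the readers. */
void zone_unregister(struct zone *z)
{
	pthread_mutex_lock(&z->lock);
	z->registered = false;
	z->get_temp = NULL;          /* callback is no longer valid */
	pthread_mutex_unlock(&z->lock);
}
```

With this shape there is a single unlock path and no label, and a reader that arrives after zone_unregister() gets -ENODEV instead of calling through a stale callback pointer.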
