linux-kernel - Re: [PATCH v4 11/18] PM: EM: Add runtime update interface to modify EM power

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <91d6e9be-d50c-d157-55a0-79134cbd01fb@arm.com>
Date:   Mon, 2 Oct 2023 15:09:57 +0100
From:   Lukasz Luba <lukasz.luba@....com>
To:     "Rafael J. Wysocki" <rafael@...nel.org>
Cc:     linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
        dietmar.eggemann@....com, rui.zhang@...el.com,
        amit.kucheria@...durent.com, amit.kachhap@...il.com,
        daniel.lezcano@...aro.org, viresh.kumar@...aro.org,
        len.brown@...el.com, pavel@....cz, mhiramat@...nel.org,
        qyousef@...alina.io, wvw@...gle.com
Subject: Re: [PATCH v4 11/18] PM: EM: Add runtime update interface to modify
 EM power



On 9/29/23 14:18, Rafael J. Wysocki wrote:
> On Fri, Sep 29, 2023 at 11:59 AM Lukasz Luba <lukasz.luba@....com> wrote:
>>

[snip]

>>
>> It's done above, next to '!dev || !dev->em_pd'
> 
> Yes, it is, I meant something like this:
> 
>      if (!cb || !cb->update_power || !dev)
>          return -EINVAL;
> 
>      mutex_lock(&em_pd_mutex);
> 
>      pd = dev->em_pd;
>      if (!pd) {
>          ret = -EINVAL; /* or perhaps -ENODATA */
>          goto unlock_em;
>      }
> 
> 

OK, I see what you mean. Let me change that.

>>>
>>>> +
>>>> +       runtime_table = kzalloc(sizeof(*runtime_table), GFP_KERNEL);
>>>> +       if (!runtime_table) {
>>>> +               ret = -ENOMEM;
>>>> +               goto unlock_em;
>>>> +       }
>>>> +
>>>> +       runtime_table->state = kcalloc(pd->nr_perf_states,
>>>> +                                      sizeof(struct em_perf_state),
>>>> +                                      GFP_KERNEL);
>>>> +       if (!runtime_table->state) {
>>>> +               ret = -ENOMEM;
>>>> +               goto free_runtime_table;
>>>> +       }
>>>
>>> The above allocations can be merged into one and allocating memory
>>> under the mutex is questionable.
>>
>> So how to make sure that there is no 2 callers trying to update the
>> same EM or unregistration is not in the background?
> 
> You can allocate memory upfront and take the mutex before accessing
> the shared data structures.  If there's an error in the code running
> under the mutex, release it and then free the memory.
> 
> Allocating memory is one operation, updating the shared data
> structures to use it is another one.  The former doesn't affect the
> shared state in any way, so why do it under the mutex?

Yes, make sense. I will shrink that critical section. Good catch,
thanks!

> 
>> [snip]
>>
>>>>
>>>> @@ -501,9 +598,23 @@ void em_dev_unregister_perf_domain(struct device *dev)
>>>>
>>>>           runtime_table = pd->runtime_table;
>>>>
>>>> +       /*
>>>> +        * Safely destroy runtime modifiable EM. By using the call
>>>> +        * synchronize_rcu() we make sure we don't progress till last user
>>>> +        * finished the RCU section and our update got applied.
>>>> +        */
>>>>           rcu_assign_pointer(pd->runtime_table, NULL);
>>>>           synchronize_rcu();
>>>>
>>>> +       /*
>>>> +        * After the sync no updates will be in-flight, so free the
>>>> +        * memory allocated for runtime table (if there was such).
>>>> +        */
>>>> +       if (runtime_table != pd->default_table) {
>>>> +               kfree(runtime_table->state);
>>>> +               kfree(runtime_table);
>>>> +       }
>>>
>>> Can't this race with the RCU callback freeing the runtime table?
>>
>> That's why there is this 'synchronize_rcu()' above and the mutex. The
>> updating caller if finished the update, would unlock the mutex and this
>> unregister code can go. Here we call the synchronize_rcu() so we assure
>> the callback has finished for the update path and than we explicitly
>> free the saved 'runtime_table' here. So all data should be freed and
>> code serialized in those two paths.
> 
> This doesn't quite agree with my understanding of what synchronize_rcu() does.
> 
> IIUC, RCU callbacks can run as soon as the grace period has elapsed
> and they need not wait for synchronize_rcu() to return.  Conversely,
> synchronize_rcu() doesn't wait for all of the RCU callbacks to
> complete.
> 
> Now, em_destroy_rt_table_rcu() doesn't actually use the mutex, so how
> exactly is it protected against racing with this code?


I'll try to draw in on some pictures...

(previous instance )
+---------------------+
|                     |
| runtime table    #1 |
|                     |
+---------------------+


(next instance )
+---------------------+
|                     |
| runtime table    #2 |
|                     |
+---------------------+


(not possible new instance)
+.....................+
.                     .
. runtime table    #3 .
.                     .
+.....................+



    cpu A - "updater"          |    cpu B - "remover"
                               |
------------------------------|------------------------------
    lock em_pd_mutex           |
                               |
       alloc runtime table #2  |   lock em_pd_mutex
                               |   (waiting)
       async free instance #1  |    .
                               |    .
    unlock em_pd_mutex         |    .
                               |   (enter critical section)
                               |
    lock em_pd_mutex           |   set NULL to runtime table ptr
    (waiting)                  |
    (wanted to create #3 inst) |   synchronize rcu to make it is visible
    .                          |
    .                          |   implicit free instance #2
    .                          |
    .                          |   free the rest of EM and EM
    .                          |
    .                          |   unlock em_pd_mutex
    (enter critical section)   |
    !dev->em_pd so             |
    unlock & exit              |
                               |
                               |


This should clean all involved memory and also prevent
of allocating new instance, when we unregister EM.