linux-kernel - Re: [PATCH v2 ] devcoredump : Serialize devcd

Message-ID: <87levt14kn.ffs@tglx>
Date:   Mon, 25 Apr 2022 19:00:08 +0200
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Mukesh Ojha <quic_mojha@...cinc.com>, linux-kernel@...r.kernel.org
Cc:     sboyd@...nel.org, rafael@...nel.org, johannes@...solutions.net,
        gregkh@...uxfoundation.org, Mukesh Ojha <quic_mojha@...cinc.com>
Subject: Re: [PATCH v2 ] devcoredump : Serialize devcd_del work

On Mon, Apr 25 2022 at 18:39, Mukesh Ojha wrote:
> v1->v2:
>  - Added del_wk_queued to serialize the race between devcd_data_write()
>    and disabled_store().

How so?

Neither the flag nor the mutex can prevent the race against the work
being executed in parallel.

disabled_store()                                worker()

  class_for_each_device(&devcd_class, NULL, NULL, devcd_free)
    ...
    while ((dev = class_dev_iter_next(&iter))) {
                                                devcd_del()
                                                  device_del()
                                                  put_device() <- last reference
          error = fn(dev, data)                     devcd_dev_release()
            devcd_free(dev, data)                     kfree(devcd)
              mutex_lock(&devcd->mutex);

There is zero protection of the class iterator against the work being
executed and removing the device and freeing its data. IOW, at the
point where fn(), i.e. devcd_free(), dereferences 'dev' to acquire the
mutex, it might be gone. No?

If disabled_store() really needs to flush all instances immediately,
then it requires global serialization, not device specific serialization.
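
To illustrate the difference, here is a rough userspace analogue of global
serialization. It is only a sketch, not kernel code and not a proposed patch:
pthreads stand in for the work items, and every name in it (fake_devcd,
table_lock, drop_slot_locked, run_demo, NDEV) is hypothetical. The point it
tries to show is that the lock guarding an object's lifetime must live outside
the object, so that neither side dereferences an entry without holding it.

```c
/* Userspace analogue only: pthreads stand in for kernel work items. */
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NDEV 8                  /* hypothetical number of dump devices */

struct fake_devcd {             /* hypothetical stand-in for a devcd entry */
	int id;
};

/* One global lock guards the table and the lifetime of every entry. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static struct fake_devcd *table[NDEV];
static int freed;               /* number of frees, protected by table_lock */

/* Free slot i if it is still populated. Caller must hold table_lock. */
static void drop_slot_locked(int i)
{
	if (table[i]) {
		free(table[i]);
		table[i] = NULL;
		freed++;
	}
}

/* Analogue of the delayed work: removes one device. */
static void *worker(void *arg)
{
	int i = *(int *)arg;

	pthread_mutex_lock(&table_lock);
	drop_slot_locked(i);
	pthread_mutex_unlock(&table_lock);
	return NULL;
}

/* Analogue of the flush in disabled_store(): removes all devices. */
static void *flush_all(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&table_lock);
	for (int i = 0; i < NDEV; i++)
		drop_slot_locked(i);
	pthread_mutex_unlock(&table_lock);
	return NULL;
}

/* Races NDEV workers against one flush; returns the number of frees. */
int run_demo(void)
{
	pthread_t workers[NDEV], flusher;
	int idx[NDEV];

	freed = 0;
	for (int i = 0; i < NDEV; i++) {
		table[i] = malloc(sizeof(*table[i]));
		assert(table[i]);
		table[i]->id = i;
		idx[i] = i;
	}
	for (int i = 0; i < NDEV; i++)
		pthread_create(&workers[i], NULL, worker, &idx[i]);
	pthread_create(&flusher, NULL, flush_all, NULL);

	for (int i = 0; i < NDEV; i++)
		pthread_join(workers[i], NULL);
	pthread_join(flusher, NULL);
	return freed;
}
```

Calling run_demo() from a main() should report NDEV frees regardless of which
thread wins each slot, because an entry is only ever looked at or freed with
table_lock held. With a per-entry mutex instead, the flush side would have to
dereference the entry to reach its lock, which is exactly the window shown in
the diagram above.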

Johannes, can you please explain whether this immediate flush in
disabled_store() is really required and if so, why?

Thanks,

        tglx