linux-kernel - Re: [patch, minor] workqueue: consistently use 'err' in __create_workqueue

Message-ID: <b647ffbd0807290720r3223461dt9d4f7f248e5e3c37@mail.gmail.com>
Date:	Tue, 29 Jul 2008 16:20:28 +0200
From:	"Dmitry Adamushko" <dmitry.adamushko@...il.com>
To:	"Oleg Nesterov" <oleg@...sign.ru>
Cc:	linux-kernel@...r.kernel.org, "Ingo Molnar" <mingo@...e.hu>
Subject: Re: [patch, minor] workqueue: consistently use 'err' in __create_workqueue_key()

2008/7/29 Oleg Nesterov <oleg@...sign.ru>:
> On 07/29, Oleg Nesterov wrote:
>>
>> On 07/29, Dmitry Adamushko wrote:
>> >
>> > And I'd say this behavior (of having a partially-created object
>> > visible to the outside world) is not that robust. e.g. the
>> > aforementioned race would be eliminated if we place a wq on the global
>> > list only when it's been successfully initialized.
>>
>> Yes, we can change __create_workqueue_key() to check err == 0 before
>> list_add(),
>
> Well no, we can't do even this.
>
> Then we have another race with cpu-hotplug. Suppose we have CPUs 0, 1, 2.
> create_workqueue() fails to create cwq->thread for CPU 2 and calls
> destroy_workqueue(). Before it takes the cpu_add_remove_lock, _cpu_down()
> removes CPU 1 from cpu_populated_map, but since we didn't add this wq
> on the global list, cwq[1]->thread remains alive.
>
> destroy_workqueue() takes cpu_add_remove_lock, and calls
> cleanup_workqueue_thread() for CPUs 0 and 2. cwq[1]->thread is lost.

Yes, I've actually seen this case, and that's why I said "the cleanup
path in __create_workqueue_key() would need to be altered" :-) Likely
to the extent that it would no longer be a call to
destroy_workqueue().

either something that only does

for_each_cpu_mask_nr(cpu, *cpu_map)
          cleanup_workqueue_thread(per_cpu_ptr(wq->cpu_wq, cpu));

and does so from the _same_ 'cpu_add_remove_lock' section that is used
to create the wq (so we don't drop the lock);

_or_ does it outside of the locked section _but_ doesn't rely on
for_each_cpu_mask_nr(cpu, *cpu_map)... e.g. it just deletes all per-cpu
wq->cpu_wq structures that have been initialized (no matter whether
their respective cpus are online or offline now).
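
The second variant can be modelled in user space roughly as follows. This
is only a sketch, not kernel code: struct fake_wq, struct fake_cwq, the
'initialized' bitmask and all fake_*() helpers are hypothetical names made
up for illustration, and a plain bitmask stands in for the cpu map. The
point is that cleanup walks what was actually initialized, so a cpu going
offline in between cannot cause a thread to be lost.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical stand-in for the per-cpu workqueue structure. */
struct fake_cwq {
	bool thread_alive;		/* models cwq->thread != NULL */
};

/* Hypothetical stand-in for the workqueue itself. */
struct fake_wq {
	struct fake_cwq cwq[NR_CPUS];
	unsigned int initialized;	/* bit n set: cwq[n] was set up by us */
};

/*
 * Create a "thread" for every cpu set in 'online'.
 * Creation fails when it reaches 'fail_cpu' (pass -1 for "never fail").
 */
static int fake_create(struct fake_wq *wq, unsigned int online, int fail_cpu)
{
	int cpu;

	assert(wq != NULL);
	wq->initialized = 0;
	for (cpu = 0; cpu < NR_CPUS; cpu++)
		wq->cwq[cpu].thread_alive = false;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(online & (1u << cpu)))
			continue;
		if (cpu == fail_cpu)
			return -1;	/* models thread creation failing */
		wq->cwq[cpu].thread_alive = true;
		wq->initialized |= 1u << cpu;
	}
	return 0;
}

/* Cleanup is driven by wq->initialized, not by the current cpu mask. */
static void fake_cleanup(struct fake_wq *wq)
{
	int cpu;

	assert(wq != NULL);
	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (wq->initialized & (1u << cpu)) {
			wq->cwq[cpu].thread_alive = false;
			wq->initialized &= ~(1u << cpu);
		}
	}
}

/* Number of per-cpu threads still alive (0 means nothing was lost). */
static int fake_count_alive(const struct fake_wq *wq)
{
	int cpu, n = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (wq->cwq[cpu].thread_alive)
			n++;
	return n;
}
```

With cpus 0, 1, 2 "online" and creation failing on cpu 2, calling
fake_cleanup() stops the threads for cpus 0 and 1 regardless of what the
online mask looks like by then.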

Yes, this cleanup path might not look all that fancy (I didn't try),
but I do think that by not exposing a "partially-initialized object to
the outside world" (e.g. cpu-hotplug events won't see it), this code
would become more straightforward and less prone to errors/races.

e.g. all these "__create_workqueue_key() may race with cpu-hotplug" cases would be gone.
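
The "publish only once fully initialized" idea itself can be sketched like
this. Again a user-space model under stated assumptions: struct demo_wq,
the demo_workqueues list and both demo_*() helpers are hypothetical, and
the locking is only indicated in comments. It also presumes a cleanup path
that does not depend on the hotplug callbacks, as discussed above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a workqueue on the global list. */
struct demo_wq {
	int id;
	struct demo_wq *next;
};

/* Models the global list that cpu-hotplug callbacks walk. */
static struct demo_wq *demo_workqueues;

/*
 * Models the tail of the creation path: the wq goes on the global list
 * only if initialization succeeded (err == 0). In the kernel this would
 * have to happen inside the same cpu_add_remove_lock section that
 * created the per-cpu threads.
 */
static int demo_publish_if_ok(struct demo_wq *wq, int err)
{
	assert(wq != NULL);
	if (err)
		return err;	/* never visible to hotplug; caller cleans up */

	wq->next = demo_workqueues;
	demo_workqueues = wq;
	return 0;
}

/* Models a hotplug callback: counts the workqueues it would touch. */
static int demo_hotplug_visible(void)
{
	const struct demo_wq *wq;
	int n = 0;

	for (wq = demo_workqueues; wq != NULL; wq = wq->next)
		n++;
	return n;
}
```

A wq whose creation failed is never reachable from demo_workqueues, so the
modelled hotplug callback has nothing half-built to trip over.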

-- 
Best regards,
Dmitry Adamushko