linux-kernel - Re: [RFC PATCH 2/8] workqueue: Make create_worker() safe against prematurely wakeups

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <YvwQI/83USNc8610@slm.duckdns.org>
Date:   Tue, 16 Aug 2022 11:46:11 -1000
From:   Tejun Heo <tj@...nel.org>
To:     Lai Jiangshan <jiangshanlai@...il.com>
Cc:     linux-kernel@...r.kernel.org,
        Lai Jiangshan <jiangshan.ljs@...group.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        "Eric W. Biederman" <ebiederm@...ssion.com>,
        Petr Mladek <pmladek@...e.com>, Michal Hocko <mhocko@...e.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Wedson Almeida Filho <wedsonaf@...gle.com>,
        Waiman Long <longman@...hat.com>
Subject: Re: [RFC PATCH 2/8] workqueue: Make create_worker() safe against
 prematurely wakeups

On Thu, Aug 04, 2022 at 04:41:29PM +0800, Lai Jiangshan wrote:
...
> Any kthread is supposed to stay in TASK_UNINTERRUPTIBLE sleep
> until it is explicitly woken. But a spurious wakeup might
> break this expectation.

I'd rephrase the above. It's more that we can't assume that a sleeping task
will stay sleeping and should expect spurious wakeups.

> @@ -176,6 +176,7 @@ struct worker_pool {
>  						/* L: hash of busy workers */
>  
>  	struct worker		*manager;	/* L: purely informational */
> +	struct completion	created;	/* create_worker(): worker created */

Can we define sth like worker_create_args struct which contains a
completion, pointers to the worker and pool and use an on-stack instance to
carry the create parameters to the new worker thread? It's kinda odd to have
a persistent copy of completion in the pool

> @@ -1949,6 +1951,7 @@ static struct worker *create_worker(struct worker_pool *pool)
>  	else
>  		snprintf(id_buf, sizeof(id_buf), "u%d:%d", pool->id, id);
>  
> +	reinit_completion(&pool->created);

which keeps getting reinitialized.

> @@ -2383,10 +2380,17 @@ static int worker_thread(void *__worker)
>  	struct worker *worker = __worker;
>  	struct worker_pool *pool = worker->pool;
>  
> +	/* attach the worker to the pool */
> +	worker_attach_to_pool(worker, pool);

It's also odd for the new worker to have pool already set and then we attach
to that pool.

> @@ -37,8 +37,15 @@ struct worker {
>  	/* 64 bytes boundary on 64bit, 32 on 32bit */
>  
>  	struct task_struct	*task;		/* I: worker task */
> -	struct worker_pool	*pool;		/* A: the associated pool */
> -						/* L: for rescuers */
> +
> +	/*
> +	 * The associated pool, locking rules:
> +	 *   PF_WQ_WORKER: from the current worker
> +	 *   PF_WQ_WORKER && wq_pool_attach_mutex: from remote tasks
> +	 *   None: from the current worker when the worker is coming up
> +	 */
> +	struct worker_pool	*pool;

I have a difficult time understanding the above comment. Can you please
follow the same style as others?

I was hoping that this problem would be fixed through kthread changes but
that doesn't seem to have happened yet and given that we need to keep
modifying cpumasks dynamically anyway (e.g. for unbound pool config change),
solving it from wq side is fine too especially if we can leverage the same
code paths that the dynamic changes are using.

That said, some of complexities are from CPU hotplug messing with worker
cpumasks and wq trying to restore them and it seems likely that all these
will be simpler with the persistent cpumask that Waiman is working on. Lai,
can you please take a look at that patchset?

Thanks.

-- 
tejun