linux-kernel - Re: [RFC PATCH v4 05/13] workqueue, ktask: renice helper threads to prevent starvation

Message-ID: <20181113163400.GK2509588@devbig004.ftw2.facebook.com>
Date:   Tue, 13 Nov 2018 08:34:00 -0800
From:   Tejun Heo <tj@...nel.org>
To:     Daniel Jordan <daniel.m.jordan@...cle.com>
Cc:     linux-mm@...ck.org, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, aarcange@...hat.com,
        aaron.lu@...el.com, akpm@...ux-foundation.org,
        alex.williamson@...hat.com, bsd@...hat.com,
        darrick.wong@...cle.com, dave.hansen@...ux.intel.com,
        jgg@...lanox.com, jwadams@...gle.com, jiangshanlai@...il.com,
        mhocko@...nel.org, mike.kravetz@...cle.com,
        Pavel.Tatashin@...rosoft.com, prasad.singamsetty@...cle.com,
        rdunlap@...radead.org, steven.sistare@...cle.com,
        tim.c.chen@...el.com, vbabka@...e.cz
Subject: Re: [RFC PATCH v4 05/13] workqueue, ktask: renice helper threads to
 prevent starvation

Hello, Daniel.

On Mon, Nov 05, 2018 at 11:55:50AM -0500, Daniel Jordan wrote:
>  static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
> -			     bool from_cancel)
> +			     struct nice_work *nice_work, int flags)
>  {
>  	struct worker *worker = NULL;
>  	struct worker_pool *pool;
> @@ -2868,11 +2926,19 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
>  	if (pwq) {
>  		if (unlikely(pwq->pool != pool))
>  			goto already_gone;
> +
> +		/* not yet started, insert linked work before work */
> +		if (unlikely(flags & WORK_FLUSH_AT_NICE))
> +			insert_nice_work(pwq, nice_work, work);

So, I'm not sure this works that well.  For example, what if the work
item is waiting on other work items that are at lower priority?  Also,
in this case, it'd be a lot simpler to just dequeue the work item and
execute it synchronously.

>  	} else {
>  		worker = find_worker_executing_work(pool, work);
>  		if (!worker)
>  			goto already_gone;
>  		pwq = worker->current_pwq;
> +		if (unlikely(flags & WORK_FLUSH_AT_NICE)) {
> +			set_user_nice(worker->task, nice_work->nice);
> +			worker->flags |= WORKER_NICED;
> +		}
>  	}

I'm not sure about this.  Can you see whether canceling & executing
synchronously is enough to address the latency regression?
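
To make the "cancel and execute synchronously" idea concrete, here is a
minimal userspace sketch.  It is not kernel code and not the workqueue
API: struct toy_work, struct toy_queue and every toy_* function are
hypothetical names invented for illustration, and the model is
single-threaded with no locking.  It only shows the control flow: if the
work item has not started yet, the flusher unlinks it from the pending
list and runs it in its own context, so the item executes at the
caller's priority without renicing any worker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; NOT the kernel's struct work_struct. */
struct toy_work {
	void (*func)(struct toy_work *work);
	struct toy_work *next;	/* link in the pending list */
	bool pending;		/* queued but not yet started */
	int runs;		/* number of times func has executed */
};

/* Hypothetical FIFO of pending work items (no locking in this model). */
struct toy_queue {
	struct toy_work *head;
	struct toy_work *tail;
};

/* Append @w to the tail of @q and mark it pending. */
static void toy_queue_work(struct toy_queue *q, struct toy_work *w)
{
	assert(!w->pending);
	w->next = NULL;
	w->pending = true;
	if (q->tail)
		q->tail->next = w;
	else
		q->head = w;
	q->tail = w;
}

/* Unlink @w if it is still pending; true means the caller now owns it. */
static bool toy_cancel_pending(struct toy_queue *q, struct toy_work *w)
{
	struct toy_work **pp = &q->head;
	struct toy_work *prev = NULL;

	if (!w->pending)
		return false;
	while (*pp && *pp != w) {
		prev = *pp;
		pp = &(*pp)->next;
	}
	if (!*pp)
		return false;
	*pp = w->next;
	if (q->tail == w)
		q->tail = prev;
	w->next = NULL;
	w->pending = false;
	return true;
}

/*
 * Flush by stealing: if @w has not started, dequeue it and run it in the
 * caller's context.  Returns false when @w was not pending; a real flush
 * would then have to wait for the worker that is executing it.
 */
static bool toy_flush_sync(struct toy_queue *q, struct toy_work *w)
{
	if (!toy_cancel_pending(q, w))
		return false;
	w->func(w);
	return true;
}

/* Example work function: just count executions. */
static void toy_count_run(struct toy_work *w)
{
	w->runs++;
}
```

In this model, flushing an item queued behind another one runs only that
item and leaves the rest of the queue untouched.  The already-running
case, which is the second hunk quoted above, is not covered by the
sketch.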

Thanks.

-- 
tejun