lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 27 Jan 2023 13:31:54 +0000
From:   Tvrtko Ursulin <tvrtko.ursulin@...ux.intel.com>
To:     Michal Koutný <mkoutny@...e.com>
Cc:     Intel-gfx@...ts.freedesktop.org, dri-devel@...ts.freedesktop.org,
        cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
        Tejun Heo <tj@...nel.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Zefan Li <lizefan.x@...edance.com>,
        Dave Airlie <airlied@...hat.com>,
        Daniel Vetter <daniel.vetter@...ll.ch>,
        Rob Clark <robdclark@...omium.org>,
        Stéphane Marchesin <marcheu@...omium.org>,
        "T . J . Mercier" <tjmercier@...gle.com>, Kenny.Ho@....com,
        Christian König <christian.koenig@....com>,
        Brian Welty <brian.welty@...el.com>,
        Tvrtko Ursulin <tvrtko.ursulin@...el.com>
Subject: Re: [RFC 10/12] cgroup/drm: Introduce weight based drm cgroup control


On 27/01/2023 13:01, Michal Koutný wrote:
> On Thu, Jan 12, 2023 at 04:56:07PM +0000, Tvrtko Ursulin <tvrtko.ursulin@...ux.intel.com> wrote:
>> +static int drmcs_can_attach(struct cgroup_taskset *tset)
>> +{
>> +	int ret;
>> +
>> +	/*
>> +	 * As processes are getting moved between groups we need to ensure
>> +	 * both that the old group does not see a sudden downward jump in the
>> +	 * GPU utilisation, and that the new group does not see a sudden jump
>> +	 * up with all the GPU time clients belonging to the migrated process
>> +	 * have accumulated.
>> +	 *
>> +	 * To achieve that we suspend the scanner until the migration is
>> +	 * completed where the resume at the end ensures both groups start
>> +	 * observing GPU utilisation from a reset state.
>> +	 */
>> +
>> +	ret = mutex_lock_interruptible(&drmcg_mutex);
>> +	if (ret)
>> +		return ret;
>> +	start_suspend_scanning();
>> +	mutex_unlock(&drmcg_mutex);
>> +
>> +	finish_suspend_scanning();
> 
> Here's scanning suspension, communicated via
> 
> 	root_drmcs.scanning_suspended = true;
> 	root_drmcs.suspended_period_us = root_drmcs.period_us;
> 	root_drmcs.period_us = 0;
> 
> but I don't see those used in scan_worker() and the scanning traversal
> can apparently run concurrently with a task migration.

I think you missed the finish_suspend_scanning() part:

	if (root_drmcs.suspended_period_us)
		cancel_delayed_work_sync(&root_drmcs.scan_work);

So if scanning was in progress migration will wait until it finishes. 
And re-start only when migration is done (drmcs_attach), or it failed 
(drmcs_cancel_attach).

Not claiming I did not miss something because I was totally new with 
cgroup internals when I started working on this. So it is definitely 
useful to have more eyes looking.

>> [...]
>> +static bool
>> +__start_scanning(struct drm_cgroup_state *root, unsigned int period_us)
>> [...]
>> +	css_for_each_descendant_post(node, &root->css) {
>> [...]
>> +		active = drmcs_get_active_time_us(drmcs);
>> +		if (period_us && active > drmcs->prev_active_us)
>> +			drmcs->active_us += active - drmcs->prev_active_us;
>> +		drmcs->prev_active_us = active;
> 
> drmcs_get_active_time_us() could count a task's contribution here,
> the task would migrate to a different drmcs,
> and it'd be counted 2nd time.

Lets see.. __start_scanning() can be called from the worker, so max one 
instance at a time, no issue.

Then from resume scanning, so it is guaranteed worker is not running and 
can't restart since mutex guards the re-start.

Finally from drmcs_write_period_us() - yes there __start_scanning() can 
race with it being invoked by the worker - oops! However.. this is just 
a debugging aid as the cover letter explains. This file is not intended 
to be present in the final version, rather as per earlier discussion 
with Tejun the idea is to only have boot time option to control the 
functionality (enable/disable or period).

I will nevertheless try to fix this race up for the next posting to 
avoid further confusion!

Regards,

Tvrtko

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ