Message-ID: <14f8c395-9d13-dbaa-9180-46e0148556c5@amd.com>
Date:   Fri, 19 May 2023 17:35:53 +0530
From:   Raghavendra K T <raghavendra.kt@....com>
To:     Bharata B Rao <bharata@....com>, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Mel Gorman <mgorman@...e.de>,
        Andrew Morton <akpm@...ux-foundation.org>,
        David Hildenbrand <david@...hat.com>, rppt@...nel.org,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Aithal Srikanth <sraithal@....com>,
        kernel test robot <oliver.sang@...el.com>
Subject: Re: [RFC PATCH V2 1/1] sched/numa: Fix disjoint set vma scan
 regression

On 5/19/2023 1:26 PM, Bharata B Rao wrote:
> On 16-May-23 2:49 PM, Raghavendra K T wrote:
>>   With the numa scan enhancements [1], only the threads which had previously
[...]
>> -#define VMA_PID_RESET_PERIOD (4 * sysctl_numa_balancing_scan_delay)
>> +#define VMA_PID_RESET_PERIOD		(4 * sysctl_numa_balancing_scan_delay)
>> +#define DISJOINT_VMA_SCAN_RENEW_THRESH	16
>>   
>>   /*
>>    * The expensive part of numa migration is done from task_work context.
>> @@ -3058,6 +3072,8 @@ static void task_numa_work(struct callback_head *work)
>>   			/* Reset happens after 4 times scan delay of scan start */
>>   			vma->numab_state->next_pid_reset =  vma->numab_state->next_scan +
>>   				msecs_to_jiffies(VMA_PID_RESET_PERIOD);
>> +
>> +			WRITE_ONCE(vma->numab_state->scan_counter, 0);
>>   		}
>>   
>>   		/*
>> @@ -3068,6 +3084,13 @@ static void task_numa_work(struct callback_head *work)
>>   						vma->numab_state->next_scan))
>>   			continue;
>>   
>> +		/*
>> +		 * For long running tasks, renew the disjoint vma scanning
>> +		 * periodically.
>> +		 */
>> +		if (mm->numa_scan_seq && !(mm->numa_scan_seq % DISJOINT_VMA_SCAN_RENEW_THRESH))
> 
> Don't you need a READ_ONCE() accessor for mm->numa_scan_seq?
> 

Hello Bharata,

Yes, thanks for pointing that out. I did ensure that in V1, but somehow
left it out in V2 :(
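
For reference, a minimal userspace sketch of the renew check with the
load marked. The READ_ONCE() here is only a stand-in for the kernel
macro, struct mm_sketch is a hypothetical stand-in for the single
mm_struct field involved, and the helper name is made up for
illustration. It is not the V3 hunk.

```c
#include <stdbool.h>

/*
 * Userspace stand-in for the kernel's READ_ONCE(): force exactly one load
 * through a volatile-qualified lvalue. Relies on the GNU C __typeof__
 * extension (gcc/clang).
 */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

#define DISJOINT_VMA_SCAN_RENEW_THRESH	16

/* Hypothetical stand-in for the one mm_struct field used by the check. */
struct mm_sketch {
	int numa_scan_seq;
};

/*
 * Returns true when the disjoint vma scan should be renewed: the scan
 * sequence is non-zero and a multiple of the threshold. The field is
 * loaded once into a local, so both tests see the same value even if
 * another thread bumps numa_scan_seq concurrently.
 */
static bool disjoint_scan_renew_due(struct mm_sketch *mm)
{
	int seq = READ_ONCE(mm->numa_scan_seq);

	return seq && !(seq % DISJOINT_VMA_SCAN_RENEW_THRESH);
}
```

Reading into a local also avoids evaluating mm->numa_scan_seq twice in
the condition, which is what the unmarked V2 line does.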

On the other hand, I see that vma->numab_state->scan_counter does not need
READ_ONCE/WRITE_ONCE, since it is not modified outside this function
(i.e., all accesses to it happen after the cmpxchg above).
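
The reasoning can be sketched in userspace with C11 atomics. This is
only an illustration under assumptions: struct scan_sketch and
try_scan() are hypothetical names, plain integer time replaces jiffies,
and it does not show that the kernel path is free of overlapping
windows.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the fields discussed above. */
struct scan_sketch {
	atomic_ulong next_scan;		/* plays the role of the cmpxchg target */
	unsigned int scan_counter;	/* plain field, written only by the winner */
};

/*
 * Try to claim the scan window that is due at 'now'. Only the caller
 * whose compare-and-exchange succeeds goes on to touch scan_counter;
 * every other caller returns before reaching the plain access.
 */
static bool try_scan(struct scan_sketch *s, unsigned long now,
		     unsigned long period)
{
	unsigned long expected = atomic_load(&s->next_scan);

	if (now < expected)
		return false;	/* window not due yet */

	if (!atomic_compare_exchange_strong(&s->next_scan, &expected,
					    now + period))
		return false;	/* another thread claimed this window */

	s->scan_counter++;	/* reached only after winning the exchange */
	return true;
}
```

A caller that loses the exchange never reaches the increment, which is
the property the argument relies on.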

Also, thinking about it more, the DISJOINT_VMA_SCAN_RENEW_THRESH reset
change itself may need some correction, and does not seem to be strictly
necessary here. (I will post it separately, with more detail, as an
improvement for long-running benchmarks based on my experiments.)

I will wait a while for confirmation that this patch fixes the reported
regression, and/or for any better ideas or acks, and then repost.
