Message-ID: <877cm9so25.fsf@devkernel.io>
Date:   Wed, 22 Nov 2023 09:16:47 -0800
From:   Stefan Roesch <shr@...kernel.io>
To:     Stefan Roesch <shr@...kernel.io>
Cc:     David Hildenbrand <david@...hat.com>,
        kernel test robot <oliver.sang@...el.com>,
        oe-lkp@...ts.linux.dev, lkp@...el.com,
        linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Rik van Riel <riel@...riel.com>, linux-mm@...ck.org,
        ltp@...ts.linux.it
Subject: Re: [linus:master] [mm/ksm] 5e924ff54d: ltp.ksm01.fail


Stefan Roesch <shr@...kernel.io> writes:

> David Hildenbrand <david@...hat.com> writes:
>
>> On 16.11.23 05:39, kernel test robot wrote:
>>> hi, Stefan Roesch,
>>> we reported
>>> "[linux-next:master] [mm/ksm]  5e924ff54d: ltp.ksm01_1.fail"
>>> in
>>> https://lore.kernel.org/all/202311031548.66780ff5-oliver.sang@intel.com/
>>> when this commit is in linux-next/master.
>>> now we noticed this commit is merged in mainline, and we still observed
>>> same issue. just FYI.
>>> Hello,
>>> kernel test robot noticed "ltp.ksm01.fail" on:
>>> commit: 5e924ff54d088828794d9f1a4d5bf17808f7270e ("mm/ksm: add "smart" page
>>> scanning mode")
>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>>> [test failed on linus/master 3ca112b71f35dd5d99fc4571a56b5fc6f0c15814]
>>> [test failed on linux-next/master 8728c14129df7a6e29188a2e737b4774fb200953]
>>> in testcase: ltp
>>> version: ltp-x86_64-14c1f76-1_20230715
>>> with following parameters:
>>> 	disk: 1HDD
>>> 	test: mm-00/ksm01
>>> compiler: gcc-12
>>> test machine: 8 threads 1 sockets Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz (Kaby Lake) with 32G memory
>>> (please refer to attached dmesg/kmsg for entire log/backtrace)
>>> If you fix the issue in a separate patch/commit (i.e. not just a new version of
>>> the same patch/commit), kindly add following tags
>>> | Reported-by: kernel test robot <oliver.sang@...el.com>
>>> | Closes: https://lore.kernel.org/oe-lkp/202311161132.13d8ce5a-oliver.sang@intel.com
>>> Running tests.......
>>> <<<test_start>>>
>>> tag=ksm01 stime=1699563923
>>> cmdline="ksm01"
>>> contacts=""
>>> analysis=exit
>>> <<<test_output>>>
>>> tst_kconfig.c:87: TINFO: Parsing kernel config '/proc/config.gz'
>>> tst_test.c:1558: TINFO: Timeout per run is 0h 00m 30s
>>> mem.c:422: TINFO: wait for all children to stop.
>>> mem.c:388: TINFO: child 0 stops.
>>> mem.c:388: TINFO: child 1 stops.
>>> mem.c:388: TINFO: child 2 stops.
>>> mem.c:495: TINFO: KSM merging...
>>> mem.c:434: TINFO: resume all children.
>>> mem.c:422: TINFO: wait for all children to stop.
>>> mem.c:344: TINFO: child 1 continues...
>>> mem.c:347: TINFO: child 1 allocates 128 MB filled with 'a'
>>> mem.c:344: TINFO: child 2 continues...
>>> mem.c:347: TINFO: child 2 allocates 128 MB filled with 'a'
>>> mem.c:344: TINFO: child 0 continues...
>>> mem.c:347: TINFO: child 0 allocates 128 MB filled with 'c'
>>> mem.c:400: TINFO: child 1 stops.
>>> mem.c:400: TINFO: child 0 stops.
>>> mem.c:400: TINFO: child 2 stops.
>>> ksm_helper.c:36: TINFO: ksm daemon takes 2s to run two full scans
>>> mem.c:264: TINFO: check!
>>> mem.c:255: TPASS: run is 1.
>>> mem.c:255: TPASS: pages_shared is 2.
>>> ....
>>> mem.c:255: TPASS: pages_shared is 1.
>>> mem.c:255: TPASS: pages_sharing is 98302.
>>> mem.c:252: TFAIL: pages_volatile is not 0 but 1.     <-----
>>> mem.c:252: TFAIL: pages_unshared is not 1 but 0.     <-----
>>
>> @Stefan, is this simply related to the new scanning optimization (skip and
>> eventually not merge a page within the "2 scans" window, whereas previously
>> it would have gotten merged)?
>>
>> If so, we might just want to disable that optimization for that test case?
>>
>> Alternatively, maybe we have to wait for "more" scan cycles instead of only 2?
>
> I'd expect this is caused by "smart scan", where we can skip pages.
> The best is probably to disable the smart scan feature for this test.
> The smart scan feature can be disabled by:
>
>     echo 0 > /sys/kernel/mm/ksm/smart_scan
>
> I'll have a look at it today.
>

If I disable "smart scan", the test case completes successfully. The
failure is simply because, with the "smart scan" feature enabled, KSM can
"skip" a page during the test.

The easiest fix is to disable smart scan for the KSM test cases. I'll send
an LTP patch a bit later to address this issue.
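
For anyone who wants to reproduce this locally before the LTP patch is
available, here is a minimal sketch of the same idea in Python: save the
current value of the smart_scan knob, write 0, and restore the old value
afterwards. The helper name smart_scan_disabled and its knob parameter are
made up for illustration and are not part of LTP or the kernel; only the
sysfs path is taken from this thread. Writing to the real knob needs root.

```python
import os
from contextlib import contextmanager

# Path quoted earlier in this thread; it only exists on kernels that
# carry the "smart" page scanning commit.
SMART_SCAN = "/sys/kernel/mm/ksm/smart_scan"


@contextmanager
def smart_scan_disabled(knob=SMART_SCAN):
    """Hypothetical helper: write 0 to the knob for the duration of the
    with-block, then restore the previous value.

    Yields True if the knob was changed, False if the file does not exist.
    """
    if not os.path.exists(knob):
        # Older kernel without smart scan: nothing to disable.
        yield False
        return

    # Remember the current setting so it can be put back afterwards.
    with open(knob) as f:
        saved = f.read().strip()

    with open(knob, "w") as f:
        f.write("0")
    try:
        yield True
    finally:
        # Restore the original setting even if the body raised.
        with open(knob, "w") as f:
            f.write(saved)
```

Running the ksm01 binary inside "with smart_scan_disabled():" would then
exercise the test with the optimization off and leave the system setting as
it was found. The actual LTP fix may well look different.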
