Date:   Sat, 21 Dec 2019 15:59:24 -0800
From:   John Hubbard <>
To:     Leon Romanovsky <>
CC:     Jason Gunthorpe <>,
        Andrew Morton <>,
        Al Viro <>,
        Alex Williamson <>,
        Benjamin Herrenschmidt <>,
        Björn Töpel <>,
        Christoph Hellwig <>,
        Dan Williams <>,
        Daniel Vetter <>,
        Dave Chinner <>,
        David Airlie <>,
        "David S . Miller" <>,
        Ira Weiny <>, Jan Kara <>,
        Jens Axboe <>, Jonathan Corbet <>,
        Jérôme Glisse <>,
        Magnus Karlsson <>,
        Mauro Carvalho Chehab <>,
        Michael Ellerman <>,
        Michal Hocko <>,
        Mike Kravetz <>,
        Paul Mackerras <>,
        Shuah Khan <>,
        Vlastimil Babka <>, LKML <>,
        Maor Gottlieb <>
Subject: Re: [PATCH v11 00/25] mm/gup: track dma-pinned pages: FOLL_PIN

On 12/21/19 2:08 AM, Leon Romanovsky wrote:
> On Fri, Dec 20, 2019 at 03:54:55PM -0800, John Hubbard wrote:
>> On 12/20/19 10:29 AM, Leon Romanovsky wrote:
>> ...
>>>> $ ./
>>>> $ build/bin/
>>>> If you get things that far I think Leon can get a reproduction for you
>>> I'm not so optimistic about that.
>> OK, I'm going to proceed for now on the assumption that I've got an overflow
>> problem that happens when huge pages are pinned. If I can get more information,
>> great, otherwise it's probably enough.
>> One thing: for your repro, if you know the huge page size, and the system
>> page size for that case, that would really help. Also the number of pins per
>> page, more or less, that you'd expect. Because Jason says that only 2M huge
>> pages are used...
>> Because the other possibility is that the refcount really is going negative,
>> likely due to a mismatched pin/unpin somehow.
>> If there's not an obvious repro case available, but you do have one (is it easy
>> to repro, though?), then *if* you have the time, I could point you to a github
>> branch that reduces GUP_PIN_COUNTING_BIAS by, say, 4x, by applying this:
> I'll see what I can do this Sunday.
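
To make the suspected overflow concrete, here is a rough back-of-the-envelope sketch. It assumes (none of this is stated in the thread itself) that each pin of a subpage adds GUP_PIN_COUNTING_BIAS to the head page's signed 32-bit refcount, that the bias is 1024, and that the base page size is 4 KiB; the 2M huge page size is the one mentioned above. The function name is hypothetical.

```python
# Sketch only: estimates how many times every subpage of one huge page
# can be pinned before a signed 32-bit refcount on the head page overflows.
# Assumptions (not from the thread): bias = 1024, base page = 4 KiB,
# and every subpage pin is accounted on the head page.

INT_MAX = 2**31 - 1


def full_pins_before_overflow(huge_page_size, base_page_size, bias):
    """Return how many whole-huge-page pins fit in a signed 32-bit count."""
    subpages = huge_page_size // base_page_size
    # Pinning each subpage once adds `bias` per subpage to the head page.
    increment_per_full_pin = subpages * bias
    return INT_MAX // increment_per_full_pin


if __name__ == "__main__":
    two_mib = 2 * 1024 * 1024
    four_kib = 4 * 1024
    print(full_pins_before_overflow(two_mib, four_kib, 1024))
    # Same calculation with the bias reduced by 4x, as suggested above.
    print(full_pins_before_overflow(two_mib, four_kib, 1024 // 4))
```

Under these assumptions, reducing the bias by 4x gives roughly four times the headroom, so a failure that disappears with the smaller bias would point towards overflow rather than a mismatched pin/unpin.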

The other data point that might shed light on whether it's a pin/unpin mismatch
is the new vmstat items (this only works if the system is not actually crashing,
though). Check them like this:

$ grep foll_pin /proc/vmstat
nr_foll_pin_requested 16288188
nr_foll_pin_returned 16288188

...but OTOH, if you've got long-term pins, then those are *supposed* to be
mismatched, so it only really helps in between tests.
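
The same check can be scripted so that it is easy to run in between tests. This is a minimal sketch: the counter names are the ones shown in the output above and may differ in other versions of the patchset or kernel, and the helper names are hypothetical.

```python
# Sketch only: compares the FOLL_PIN vmstat counters shown above.
# The counter names come from this thread; treat them as an assumption
# on any other kernel. Helper names are made up for illustration.


def parse_vmstat(text):
    """Parse "name value" lines (the /proc/vmstat layout) into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def foll_pin_imbalance(counters,
                       requested_key="nr_foll_pin_requested",
                       returned_key="nr_foll_pin_returned"):
    """Return requested minus returned; 0 means the pins are balanced."""
    return counters[requested_key] - counters[returned_key]


def read_vmstat(path="/proc/vmstat"):
    """Read and parse the live counters from the running kernel."""
    with open(path) as f:
        return parse_vmstat(f.read())


if __name__ == "__main__":
    diff = foll_pin_imbalance(read_vmstat())
    print("outstanding FOLL_PIN pins:", diff)
```

A nonzero result while long-term pins are still held is expected; it is only a hint of a mismatch once everything that took pins has exited.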

John Hubbard
