Message-ID: <d77d1e86-ac99-8c18-658c-d8150a71b11e@de.ibm.com>
Date: Thu, 30 Apr 2020 20:12:00 +0200
From: Christian Borntraeger <borntraeger@...ibm.com>
To: Dave Hansen <dave.hansen@...el.com>,
Claudio Imbrenda <imbrenda@...ux.ibm.com>,
akpm@...ux-foundation.org, jack@...e.cz, kirill@...temov.name
Cc: david@...hat.com, aarcange@...hat.com, linux-mm@...ck.org,
frankja@...ux.ibm.com, sfr@...b.auug.org.au, jhubbard@...dia.com,
linux-kernel@...r.kernel.org, linux-s390@...r.kernel.org,
peterz@...radead.org, sean.j.christopherson@...el.com
Subject: Re: [PATCH v1 1/1] fs/splice: add missing callback for inaccessible pages

On 29.04.20 18:07, Dave Hansen wrote:
> On 4/28/20 3:50 PM, Claudio Imbrenda wrote:
>> If a page is inaccessible and it is used for things like sendfile, then
>> the content of the page is not always touched, and can be passed
>> directly to a driver, causing issues.
>>
>> This patch fixes the issue by adding a call to arch_make_page_accessible
>> in page_cache_pipe_buf_confirm.
>
> I spent about 5 minutes putting together a patch:
>
> https://sr71.net/~dave/intel/accessible.patch
You only set the page flag for compound pages. That of course leaves a big pile
of pages marked as not accessible, which explains the sendto trace and all kinds
of other random traces.
What do you see when you also call SetPageAccessible(page)
in the else branch of prep_new_page() (order == 0)?
(I get more than 10000 of these non-compound page allocations just during boot.)
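
To make the point concrete, here is a minimal userspace sketch of the flag
handling. It is not kernel code and not the contents of accessible.patch:
struct page_model, prep_new_page_model() and count_inaccessible_order0() are
hypothetical names, and the order0_fix switch only stands in for the extra
SetPageAccessible(page) suggested above.

```c
/* Userspace model only. struct page_model and the helpers below are
 * hypothetical stand-ins, not the kernel's struct page or prep_new_page(). */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct page_model {
    bool accessible;    /* models the debug "accessible" page flag */
};

/* Models the flag handling at allocation time. With order0_fix == false only
 * compound allocations get the flag. With order0_fix == true the else branch
 * sets it for all non-compound allocations as well. */
static void prep_new_page_model(struct page_model *page, unsigned int order,
                                bool gfp_comp, bool order0_fix)
{
    assert(page != NULL);
    page->accessible = false;        /* assumed flag state before prep */
    if (order && gfp_comp)
        page->accessible = true;     /* compound page path */
    else if (order0_fix)
        page->accessible = true;     /* suggested: SetPageAccessible(page) */
}

/* Counts how many of n order-0 allocations end up marked not accessible. */
static size_t count_inaccessible_order0(size_t n, bool order0_fix)
{
    size_t missing = 0;

    for (size_t i = 0; i < n; i++) {
        struct page_model page;

        prep_new_page_model(&page, 0, false, order0_fix);
        if (!page.accessible)
            missing++;
    }
    return missing;
}
```

In this model every order-0 allocation stays marked not accessible unless the
else branch also sets the flag, which is the effect described above.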
>
> It adds a page flag ("daccess") which starts out set. It clears the
> flag when the page is added to the page cache or mapped as anonymous.
> These are presumably the two most likely kinds of pages to be
> problematic. It re-sets the flag when it hits the new hook for s390:
> arch_make_page_accessible().
>
> It then patches the DMA mapping API. If a page gets to the DMA mapping
> API without being accessible, it hits a tracepoint.
>
> It goes boom shortly after hitting userspace underneath a sys_sendto().
> That code uses lib/iov_iter.c which does get_user_pages_fast() and
> apparently does not set FOLL_PIN, so never hits the s390 arch hooks.
>
> I hacked out the FOLL_PIN check and just universally call the hook for
> all gup_pte_range() calls. I think you'll need to do that as well. I
> don't think the assumption that FOLL_PIN always precedes I/O holds
> universally. Hacking out FOLL_PIN quiets down the warning spew quite a
> bit, but it still hits a few of them.
>
> Here's one example:
>
> 0) sd-reso-410 | | /* mm_accessible_error: ...
> sd-resolve-410 [000] .... 212.918838: <stack trace>
> => trace_event_raw_event_mm_accessible_error
> => check_page_accessible
> => e1000_xmit_frame
> => dev_hard_start_xmit
> => sch_direct_xmit
> => __qdisc_run
> => __dev_queue_xmit
> => ip_finish_output2
> => ip_output
> => ip_send_skb
> => udp_send_skb.isra.59
> => udp_sendmsg
> => ____sys_sendmsg
> => ___sys_sendmsg
> => __sys_sendmmsg
> => __x64_sys_sendmmsg
> => do_syscall_64
> => entry_SYSCALL_64_after_hwframe
>
> This is just from booting and sitting on an idle Ubuntu 16.04.6 system.
> I think the process in question here is the systemd resolver.
>
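
For reference, the debugging scheme described in the quoted text can be
modelled in userspace roughly as follows. This is a sketch under stated
assumptions, not the actual patch: every identifier ending in _model or
_MODEL is hypothetical, the error counter stands in for the tracepoint, and
the hook_always switch stands in for hacking out the FOLL_PIN check.

```c
/* Userspace model only. None of these helpers are kernel functions. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FOLL_PIN_MODEL 0x1u          /* stand-in for the kernel's FOLL_PIN */

struct page_model {
    bool accessible;                 /* models the "daccess" debug flag */
};

static unsigned int accessible_errors;   /* models tracepoint hits */

/* The flag starts out set. */
static void page_init_model(struct page_model *p)
{
    assert(p != NULL);
    p->accessible = true;
}

/* Adding to the page cache or mapping as anonymous clears the flag. */
static void mark_page_in_use_model(struct page_model *p)
{
    p->accessible = false;
}

/* The arch hook sets the flag again. */
static void arch_make_page_accessible_model(struct page_model *p)
{
    p->accessible = true;
}

/* GUP path: the hook runs only for FOLL_PIN, unless hook_always is set. */
static void gup_model(struct page_model *p, unsigned int flags, bool hook_always)
{
    if (hook_always || (flags & FOLL_PIN_MODEL))
        arch_make_page_accessible_model(p);
}

/* DMA mapping check: a page that is not accessible counts as an error. */
static void dma_map_model(const struct page_model *p)
{
    if (!p->accessible)
        accessible_errors++;
}

/* One pass through the modelled path; returns the number of errors seen. */
static unsigned int run_send_path_model(unsigned int gup_flags, bool hook_always)
{
    struct page_model page;

    accessible_errors = 0;
    page_init_model(&page);
    mark_page_in_use_model(&page);
    gup_model(&page, gup_flags, hook_always);
    dma_map_model(&page);
    return accessible_errors;
}
```

In this model a GUP call without FOLL_PIN reaches the DMA check with the flag
still clear, while calling the hook unconditionally avoids the error for that
path. It says nothing about the remaining traces, which come from paths the
model does not cover.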