Message-ID: <254f1c38-b2ce-4c83-a13d-54e7c0271dcb@linux.alibaba.com>
Date: Thu, 17 Oct 2024 10:46:32 +0800
From: Baolin Wang <baolin.wang@...ux.alibaba.com>
To: Kefeng Wang <wangkefeng.wang@...wei.com>, akpm@...ux-foundation.org,
 hughd@...gle.com
Cc: willy@...radead.org, david@...hat.com, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] mm: shmem: improve the tmpfs large folio read
 performance



On 2024/10/16 20:36, Kefeng Wang wrote:
> 
> 
> On 2024/10/16 18:09, Baolin Wang wrote:
>> tmpfs already supports PMD-sized large folios, but the tmpfs read
>> operation still copies data at PAGE_SIZE granularity, which is
>> inefficient. Change to copy data at the folio granularity, which
>> improves the read performance, and switch to folio-related functions.
>>
>> Using 'fio bs=64k' to read a 1G tmpfs file populated with 2M THPs, I
>> see about a 20% performance improvement, and no regression with bs=4k.
>> Before the patch:
>> READ: bw=10.0GiB/s
>>
>> After the patch:
>> READ: bw=12.0GiB/s
>>
>> Signed-off-by: Baolin Wang <baolin.wang@...ux.alibaba.com>
>> ---
>>   mm/shmem.c | 22 ++++++++++++----------
>>   1 file changed, 12 insertions(+), 10 deletions(-)
>>
>> diff --git a/mm/shmem.c b/mm/shmem.c
>> index edab02a26aac..7e79b6a96da0 100644
>> --- a/mm/shmem.c
>> +++ b/mm/shmem.c
>> @@ -3108,13 +3108,12 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>>       ssize_t retval = 0;
>>       index = iocb->ki_pos >> PAGE_SHIFT;
>> -    offset = iocb->ki_pos & ~PAGE_MASK;
>>       for (;;) {
>>           struct folio *folio = NULL;
>> -        struct page *page = NULL;
>>           unsigned long nr, ret;
>>           loff_t end_offset, i_size = i_size_read(inode);
>> +        size_t fsize;
>>           if (unlikely(iocb->ki_pos >= i_size))
>>               break;
>> @@ -3128,8 +3127,9 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>>           if (folio) {
>>               folio_unlock(folio);
>> -            page = folio_file_page(folio, index);
>> -            if (PageHWPoison(page)) {
>> +            if (folio_test_hwpoison(folio) ||
>> +                (folio_test_large(folio) &&
>> +                 folio_test_has_hwpoisoned(folio))) {
>>                   folio_put(folio);
>>                   error = -EIO;
>>                   break;
>> @@ -3147,7 +3147,12 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>>               break;
>>           }
>>           end_offset = min_t(loff_t, i_size, iocb->ki_pos + to->count);
>> -        nr = min_t(loff_t, end_offset - iocb->ki_pos, PAGE_SIZE - offset);
>> +        if (folio)
>> +            fsize = folio_size(folio);
>> +        else
>> +            fsize = PAGE_SIZE;
>> +        offset = iocb->ki_pos & (fsize - 1);
>> +        nr = min_t(loff_t, end_offset - iocb->ki_pos, fsize - offset);
>>           if (folio) {
>>               /*
>> @@ -3156,7 +3161,7 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
>>                * before reading the page on the kernel side.
>>                */
> 
> We'd better update all the comments from page to folio.

Ack.
