lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4350ee16-15d0-1d37-6074-cea5185aac51@redhat.com>
Date:   Wed, 2 Jun 2021 09:44:36 +0200
From:   David Hildenbrand <david@...hat.com>
To:     yong w <yongw.pur@...il.com>
Cc:     minchan@...nel.org, ngupta@...are.org, senozhatsky@...omium.org,
        axboe@...nel.dk, akpm@...ux-foundation.org,
        songmuchun@...edance.com, linux-kernel@...r.kernel.org,
        linux-block@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [RFC PATCH V1] zram:calculate available memory when zram is used

On 02.06.21 02:06, yong w wrote:
> Thanks for your reply!
> 
> Indeed, when zram is not triggered, MemAvalible will be greater than Memtotal.
> As you said, this modification will have program compatibility issues,
>   when there
> are some user space scripts use MemAvailable for calculation.
> 
> How about providing another proc interface?
> Such as /proc/xxx, When execute “cat /proc/xxx”, it returns "available
> + swapfree - swapfree * compress ratio" value.

Good question. What would be an appropriate name and description for 
this entry?

We do have in /proc/meminfo already:

MemTotal:
MemAvailable:
SwapTotal:
SwapFree:

I wonder if we could add an entry either for "swapfree - swapfree * 
compress" or "swapfree * compress". But even then, I struggle to find a 
fitting name. We could either define something like

"SwapAvailable:" An estimate of how much swap space is available for 
starting new applications. Note that SwapAvailable is different to 
SwapFree when swap compression, as implemented by zram, is being performed.

So a user can compute

"TotalAvailable = MemAvailable + SwapAvailable".

or if that doesn't make any sense:

"TotalAvailable:" An estimate of how much memory and swap space is 
available for starting new applications. Usually, the value corresponds 
to "MemAvailable + SwapFree", however, swap compression, as implemented 
by zram, might allow for making more efficient use of free swap space.


I'd prefer the first -- if it makes any sense.

You should CC linux-api on follow up patches.

> Thanks in advance!
> 
> 
> David Hildenbrand <david@...hat.com> 于2021年5月31日周一 下午4:17写道:
>>
>> On 30.05.21 19:18, yong w wrote:
>>> When zram is used, available+Swap free memory is obviously bigger than
>>> I actually can use, because zram can compress memory by compression
>>> algorithm and zram compressed data will occupy memory too.
>>>
>>> So, I count the compression rate of zram in the kernel. The available
>>> memory  is calculated as follows:
>>> available + swapfree - swapfree * compress ratio
>>> MemAvailable in /proc/meminfo returns available + zram will save space
>>>
>>
>> This will mean that we can easily have MemAvailable > MemTotal, right?
>> I'm not sure if there are some user space scripts that will be a little
>> disrupted by that. Like calculating "MemUnavailable = MemTotal -
>> MemAvailable".
>>
>> MemAvailable: "An estimate of how much memory is available for starting
>> new applications, without swapping"
>>
>> Although zram isn't "traditional swapping", there is still a performance
>> impact when having to go to zram because it adds an indirection and
>> requires (de)compression. Similar to having very fast swap space (like
>> PMEM). Let's not call something "memory" that doesn't have the same
>> semantics as real memory as in "MemTotal".
>>
>> This doesn't feel right.
>>
>>> Signed-off-by: wangyong <yongw.pur@...il.com <mailto:yongw.pur@...il.com>>
>>> ---
>>>    drivers/block/zram/zcomp.h    |  1 +
>>>    drivers/block/zram/zram_drv.c |  4 ++
>>>    drivers/block/zram/zram_drv.h |  1 +
>>>    fs/proc/meminfo.c             |  2 +-
>>>    include/linux/swap.h          | 10 +++++
>>>    mm/swapfile.c                 | 95
>>> +++++++++++++++++++++++++++++++++++++++++++
>>>    mm/vmscan.c                   |  1 +
>>>    7 files changed, 113 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/block/zram/zcomp.h b/drivers/block/zram/zcomp.h
>>> index 40f6420..deb2dbf 100644
>>> --- a/drivers/block/zram/zcomp.h
>>> +++ b/drivers/block/zram/zcomp.h
>>> @@ -40,4 +40,5 @@ int zcomp_decompress(struct zcomp_strm *zstrm,
>>>     const void *src, unsigned int src_len, void *dst);
>>>
>>>    bool zcomp_set_max_streams(struct zcomp *comp, int num_strm);
>>> +int get_zram_major(void);
>>>    #endif /* _ZCOMP_H_ */
>>> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
>>> index cf8deec..1c6cbd4 100644
>>> --- a/drivers/block/zram/zram_drv.c
>>> +++ b/drivers/block/zram/zram_drv.c
>>> @@ -59,6 +59,10 @@ static void zram_free_page(struct zram *zram, size_t
>>> index);
>>>    static int zram_bvec_read(struct zram *zram, struct bio_vec *bvec,
>>>     u32 index, int offset, struct bio *bio);
>>>
>>> +int get_zram_major(void)
>>> +{
>>> + return zram_major;
>>> +}
>>>
>>>    static int zram_slot_trylock(struct zram *zram, u32 index)
>>>    {
>>> diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
>>> index 6e73dc3..5d8701a 100644
>>> --- a/drivers/block/zram/zram_drv.h
>>> +++ b/drivers/block/zram/zram_drv.h
>>> @@ -88,6 +88,7 @@ struct zram_stats {
>>>     atomic64_t bd_reads; /* no. of reads from backing device */
>>>     atomic64_t bd_writes; /* no. of writes from backing device */
>>>    #endif
>>> + atomic_t min_compr_ratio;
>>>    };
>>>
>>>    struct zram {
>>> diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
>>> index 6fa761c..f7bf350 100644
>>> --- a/fs/proc/meminfo.c
>>> +++ b/fs/proc/meminfo.c
>>> @@ -57,7 +57,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
>>>
>>>     show_val_kb(m, "MemTotal:       ", i.totalram);
>>>     show_val_kb(m, "MemFree:        ", i.freeram);
>>> - show_val_kb(m, "MemAvailable:   ", available);
>>> + show_val_kb(m, "MemAvailable:   ", available + count_avail_swaps());
>>>     show_val_kb(m, "Buffers:        ", i.bufferram);
>>>     show_val_kb(m, "Cached:         ", cached);
>>>     show_val_kb(m, "SwapCached:     ", total_swapcache_pages());
>>> diff --git a/include/linux/swap.h b/include/linux/swap.h
>>> index 032485e..3225a2f 100644
>>> --- a/include/linux/swap.h
>>> +++ b/include/linux/swap.h
>>> @@ -514,6 +514,8 @@ extern int init_swap_address_space(unsigned int
>>> type, unsigned long nr_pages);
>>>    extern void exit_swap_address_space(unsigned int type);
>>>    extern struct swap_info_struct *get_swap_device(swp_entry_t entry);
>>>    sector_t swap_page_sector(struct page *page);
>>> +extern void update_zram_zstats(void);
>>> +extern u64 count_avail_swaps(void);
>>>
>>>    static inline void put_swap_device(struct swap_info_struct *si)
>>>    {
>>> @@ -684,6 +686,14 @@ static inline swp_entry_t get_swap_page(struct page
>>> *page)
>>>     return entry;
>>>    }
>>>
>>> +void update_zram_zstats(void)
>>> +{
>>> +}
>>> +
>>> +u64 count_avail_swaps(void)
>>> +{
>>> +}
>>> +
>>>    #endif /* CONFIG_SWAP */
>>>
>>>    #ifdef CONFIG_THP_SWAP
>>> diff --git a/mm/swapfile.c b/mm/swapfile.c
>>> index cbb4c07..93a9dcb 100644
>>> --- a/mm/swapfile.c
>>> +++ b/mm/swapfile.c
>>> @@ -44,6 +44,7 @@
>>>    #include <asm/tlbflush.h>
>>>    #include <linux/swapops.h>
>>>    #include <linux/swap_cgroup.h>
>>> +#include "../drivers/block/zram/zram_drv.h"
>>>
>>>    static bool swap_count_continued(struct swap_info_struct *, pgoff_t,
>>>     unsigned char);
>>> @@ -3408,6 +3409,100 @@ SYSCALL_DEFINE2(swapon, const char __user *,
>>> specialfile, int, swap_flags)
>>>     return error;
>>>    }
>>>
>>> +u64 count_avail_swap(struct swap_info_struct *si)
>>> +{
>>> + u64 result;
>>> + struct zram *z;
>>> + unsigned int free;
>>> + unsigned int ratio;
>>> +
>>> + result = 0;
>>> + if (!si)
>>> + return 0;
>>> +
>>> + //zram calculate available mem
>>> + if (si->flags & SWP_USED && si->swap_map) {
>>> + if (si->bdev->bd_disk->major == get_zram_major()) {
>>> + z = (struct zram *)si->bdev->bd_disk->private_data;
>>> + down_read(&z->init_lock);
>>> + ratio = atomic_read(&z->stats.min_compr_ratio);
>>> + free = (si->pages << (PAGE_SHIFT - 10))
>>> + - (si->inuse_pages << (PAGE_SHIFT - 10));
>>> + if (!ratio)
>>> + result += free / 2;
>>> + else
>>> + result = free * (100 - 10000 / ratio) / 100;
>>> + up_read(&z->init_lock);
>>> + }
>>> + } else
>>> + result += (si->pages << (PAGE_SHIFT - 10))
>>> + - (si->inuse_pages << (PAGE_SHIFT - 10));
>>> +
>>> + return result;
>>> +}
>>> +
>>> +u64 count_avail_swaps(void)
>>> +{
>>> + int type;
>>> + u64 result;
>>> + struct swap_info_struct *si;
>>> +
>>> + result = 0;
>>> + spin_lock(&swap_lock);
>>> + for (type = 0; type < nr_swapfiles; type++) {
>>> + si = swap_info[type];
>>> + spin_lock(&si->lock);
>>> + result += count_avail_swap(si);
>>> + spin_unlock(&si->lock);
>>> + }
>>> + spin_unlock(&swap_lock);
>>> +
>>> + return result;
>>> +}
>>> +
>>> +void update_zram_zstat(struct swap_info_struct *si)
>>> +{
>>> + struct zram *z;
>>> + struct zram_stats *stat;
>>> + int ratio;
>>> + u64 orig_size, compr_data_size;
>>> +
>>> + if (!si)
>>> + return;
>>> +
>>> + //update zram min compress ratio
>>> + if (si->flags & SWP_USED && si->swap_map) {
>>> + if (si->bdev->bd_disk->major == get_zram_major()) {
>>> + z = (struct zram *)si->bdev->bd_disk->private_data;
>>> + down_read(&z->init_lock);
>>> + stat = &z->stats;
>>> + ratio = atomic_read(&stat->min_compr_ratio);
>>> + orig_size = atomic64_read(&stat->pages_stored) << PAGE_SHIFT;
>>> + compr_data_size = atomic64_read(&stat->compr_data_size);
>>> + if (compr_data_size && (!ratio
>>> +     || ((orig_size * 100) / compr_data_size < ratio)))
>>> + atomic_set(&stat->min_compr_ratio,
>>> +    (orig_size * 100) / compr_data_size);
>>> + up_read(&z->init_lock);
>>> + }
>>> + }
>>> +}
>>> +
>>> +void update_zram_zstats(void)
>>> +{
>>> + int type;
>>> + struct swap_info_struct *si;
>>> +
>>> + spin_lock(&swap_lock);
>>> + for (type = 0; type < nr_swapfiles; type++) {
>>> + si = swap_info[type];
>>> + spin_lock(&si->lock);
>>> + update_zram_zstat(si);
>>> + spin_unlock(&si->lock);
>>> + }
>>> + spin_unlock(&swap_lock);
>>> +}
>>> +
>>>    void si_swapinfo(struct sysinfo *val)
>>>    {
>>>     unsigned int type;
>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>> index eb31452..ffaf59b 100644
>>> --- a/mm/vmscan.c
>>> +++ b/mm/vmscan.c
>>> @@ -4159,6 +4159,7 @@ static int kswapd(void *p)
>>>     alloc_order);
>>>     reclaim_order = balance_pgdat(pgdat, alloc_order,
>>>     highest_zoneidx);
>>> + update_zram_zstats();
>>>     if (reclaim_order < alloc_order)
>>>     goto kswapd_try_sleep;
>>>     }
>>> --
>>> 2.7.4
>>
>>
>> --
>> Thanks,
>>
>> David / dhildenb
>>
> 


-- 
Thanks,

David / dhildenb

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ