Message-Id: <E4DF0FFC-5A45-405D-A942-DD8572CC1BD1@javigon.com>
Date:   Thu, 31 Jan 2019 16:33:06 +0000
From:   Javier González <javier@...igon.com>
To:     Hans Holmberg <hans@...tronix.com>
Cc:     Matias Bjorling <mb@...htnvm.io>,
        Hans Holmberg <hans.holmberg@...xlabs.com>,
        linux-block@...r.kernel.org,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH V2] lightnvm: pblk: prevent stall due to wb threshold



> On 31 Jan 2019, at 11.41, Hans Holmberg <hans@...tronix.com> wrote:
> 
> Hi Javier!
> 
> How did you test this? I'm trying to add a test case to our testing framework.
> 
> This is what I ran in qemu, and I got a hang (with this version of the patch):
> 
> nvme lnvm create -d nvme0n1 -t pblk -n pblk0 -f -b 0 -e 0

I ran several small configurations without problems. Can you share the qemu configuration and version?

I’m travelling until Friday; I’ll get back to you over the weekend.

> 
> kernel log: [  116.381799] pblk pblk0: luns:1, lines:280, secs:212736,
> buf entries:128
> 
> # dd if=/dev/zero of=/dev/pblk0 oflag=direct bs=4k count=1
> 1+0 records in
> 1+0 records out
> 4096 bytes (4.1 kB, 4.0 KiB) copied, 0.000480941 s, 8.5 MB/s
> # dd if=/dev/zero of=/dev/pblk0 oflag=direct bs=64k count=1
> 1+0 records in
> 1+0 records out
> 65536 bytes (66 kB, 64 KiB) copied, 0.000477373 s, 137 MB/s
> # dd if=/dev/zero of=/dev/pblk0 oflag=direct bs=128k count=1
> 1+0 records in
> 1+0 records out
> 131072 bytes (131 kB, 128 KiB) copied, 0.000548722 s, 239 MB/s
> # dd if=/dev/zero of=/dev/pblk0 oflag=direct bs=256k count=1
> 1+0 records in
> 1+0 records out
> 262144 bytes (262 kB, 256 KiB) copied, 0.000718515 s, 365 MB/s
> # dd if=/dev/zero of=/dev/pblk0 oflag=direct bs=512k count=1
> <HANG>


> 
> 
>> On Wed, Jan 30, 2019 at 11:28 AM Javier González <javier@...igon.com> wrote:
>> 
>> In order to respect mw_cunits, pblk's write buffer maintains a
>> backpointer to protect data not yet persisted; when writing to the write
>> buffer, this backpointer defines a threshold that pblk's rate-limiter
>> enforces.
>> 
>> On small PU configurations, the following scenarios might take place: (i)
>> the threshold is larger than the write buffer and (ii) the threshold is
>> smaller than the write buffer, but larger than the maximum allowed
>> split bio - 256KB at this moment (Note that writes are not always
>> split - we only do this when the size of the buffer is smaller
>> than the bio). In both cases, pblk's rate-limiter prevents the I/O from
>> being written to the buffer, thus stalling.
>> 
>> This patch fixes the original backpointer implementation by considering
>> the threshold both on buffer creation and on the rate-limiter's path,
>> when bio_split is triggered (case (ii) above).
>> 
>> Fixes: 766c8ceb16fc ("lightnvm: pblk: guarantee that backpointer is respected on writer stall")
>> Signed-off-by: Javier González <javier@...igon.com>
>> ---
>> 
>>  Changes since V1:
>>    - Fix bad arithmetic in the rate-limiter max_io calculation (from
>>      Hans)
>> 
>> drivers/lightnvm/pblk-rb.c | 25 +++++++++++++++++++------
>> drivers/lightnvm/pblk-rl.c |  5 ++---
>> drivers/lightnvm/pblk.h    |  2 +-
>> 3 files changed, 22 insertions(+), 10 deletions(-)
>> 
>> diff --git a/drivers/lightnvm/pblk-rb.c b/drivers/lightnvm/pblk-rb.c
>> index d4ca8c64ee0f..a6133b50ed9c 100644
>> --- a/drivers/lightnvm/pblk-rb.c
>> +++ b/drivers/lightnvm/pblk-rb.c
>> @@ -45,10 +45,23 @@ void pblk_rb_free(struct pblk_rb *rb)
>> /*
>>  * pblk_rb_calculate_size -- calculate the size of the write buffer
>>  */
>> -static unsigned int pblk_rb_calculate_size(unsigned int nr_entries)
>> +static unsigned int pblk_rb_calculate_size(unsigned int nr_entries,
>> +                                          unsigned int threshold)
>> {
>> -       /* Alloc a write buffer that can at least fit 128 entries */
>> -       return (1 << max(get_count_order(nr_entries), 7));
>> +       unsigned int thr_sz = 1 << (get_count_order(threshold + NVM_MAX_VLBA));
>> +       unsigned int max_sz = max(thr_sz, nr_entries);
>> +       unsigned int max_io;
>> +
>> +       /* Alloc a write buffer that can (i) fit at least two split bios
>> +        * (considering max I/O size NVM_MAX_VLBA, and (ii) guarantee that the
>> +        * threshold will be respected
>> +        */
>> +       max_io = (1 << max((int)(get_count_order(max_sz)),
>> +                               (int)(get_count_order(NVM_MAX_VLBA << 1))));
>> +       if ((threshold + NVM_MAX_VLBA) >= max_io)
>> +               max_io <<= 1;
>> +
>> +       return max_io;
>> }
>> 
>> /*
>> @@ -67,12 +80,12 @@ int pblk_rb_init(struct pblk_rb *rb, unsigned int size, unsigned int threshold,
>>        unsigned int alloc_order, order, iter;
>>        unsigned int nr_entries;
>> 
>> -       nr_entries = pblk_rb_calculate_size(size);
>> +       nr_entries = pblk_rb_calculate_size(size, threshold);
>>        entries = vzalloc(array_size(nr_entries, sizeof(struct pblk_rb_entry)));
>>        if (!entries)
>>                return -ENOMEM;
>> 
>> -       power_size = get_count_order(size);
>> +       power_size = get_count_order(nr_entries);
>>        power_seg_sz = get_count_order(seg_size);
>> 
>>        down_write(&pblk_rb_lock);
>> @@ -149,7 +162,7 @@ int pblk_rb_init(struct pblk_rb *rb, unsigned int size, unsigned int threshold,
>>         * Initialize rate-limiter, which controls access to the write buffer
>>         * by user and GC I/O
>>         */
>> -       pblk_rl_init(&pblk->rl, rb->nr_entries);
>> +       pblk_rl_init(&pblk->rl, rb->nr_entries, threshold);
>> 
>>        return 0;
>> }
>> diff --git a/drivers/lightnvm/pblk-rl.c b/drivers/lightnvm/pblk-rl.c
>> index 76116d5f78e4..e9e0af0df165 100644
>> --- a/drivers/lightnvm/pblk-rl.c
>> +++ b/drivers/lightnvm/pblk-rl.c
>> @@ -207,7 +207,7 @@ void pblk_rl_free(struct pblk_rl *rl)
>>        del_timer(&rl->u_timer);
>> }
>> 
>> -void pblk_rl_init(struct pblk_rl *rl, int budget)
>> +void pblk_rl_init(struct pblk_rl *rl, int budget, int threshold)
>> {
>>        struct pblk *pblk = container_of(rl, struct pblk, rl);
>>        struct nvm_tgt_dev *dev = pblk->dev;
>> @@ -217,7 +217,6 @@ void pblk_rl_init(struct pblk_rl *rl, int budget)
>>        int sec_meta, blk_meta;
>>        unsigned int rb_windows;
>> 
>> -
>>        /* Consider sectors used for metadata */
>>        sec_meta = (lm->smeta_sec + lm->emeta_sec[0]) * l_mg->nr_free_lines;
>>        blk_meta = DIV_ROUND_UP(sec_meta, geo->clba);
>> @@ -234,7 +233,7 @@ void pblk_rl_init(struct pblk_rl *rl, int budget)
>>        /* To start with, all buffer is available to user I/O writers */
>>        rl->rb_budget = budget;
>>        rl->rb_user_max = budget;
>> -       rl->rb_max_io = budget >> 1;
>> +       rl->rb_max_io = budget - threshold;
>>        rl->rb_gc_max = 0;
>>        rl->rb_state = PBLK_RL_HIGH;
>> 
>> diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
>> index 72ae8755764e..a6386d5acd73 100644
>> --- a/drivers/lightnvm/pblk.h
>> +++ b/drivers/lightnvm/pblk.h
>> @@ -924,7 +924,7 @@ int pblk_gc_sysfs_force(struct pblk *pblk, int force);
>> /*
>>  * pblk rate limiter
>>  */
>> -void pblk_rl_init(struct pblk_rl *rl, int budget);
>> +void pblk_rl_init(struct pblk_rl *rl, int budget, int threshold);
>> void pblk_rl_free(struct pblk_rl *rl);
>> void pblk_rl_update_rates(struct pblk_rl *rl);
>> int pblk_rl_high_thrs(struct pblk_rl *rl);
>> --
>> 2.17.1
>> 
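
For anyone following the arithmetic in the quoted patch, here is a minimal user-space sketch of the new sizing logic. It is not the kernel code: sketch_count_order() is a hypothetical stand-in for the kernel's get_count_order(), the sketch_* names are made up for illustration, and NVM_MAX_VLBA is assumed to be 64 (check include/linux/lightnvm.h for your tree).

```c
#include <stdio.h>

/* Assumption: 64 logical blocks per vector command; verify for your tree. */
#define NVM_MAX_VLBA 64

/*
 * Hypothetical stand-in for the kernel's get_count_order(): smallest
 * order such that (1 << order) >= count, or -1 when count is 0.
 */
int sketch_count_order(unsigned int count)
{
	int order = 0;

	if (count == 0)
		return -1;

	count--;
	while (count) {
		order++;
		count >>= 1;
	}
	return order;
}

/*
 * Mirrors the patched pblk_rb_calculate_size(): a power-of-two buffer
 * that fits at least two maximum-size I/Os and leaves room for
 * threshold + NVM_MAX_VLBA entries.
 */
unsigned int sketch_rb_calculate_size(unsigned int nr_entries,
				      unsigned int threshold)
{
	unsigned int thr_sz = 1u << sketch_count_order(threshold + NVM_MAX_VLBA);
	unsigned int max_sz = thr_sz > nr_entries ? thr_sz : nr_entries;
	int size_order = sketch_count_order(max_sz);
	int min_order = sketch_count_order(NVM_MAX_VLBA << 1);
	unsigned int max_io;

	max_io = 1u << (size_order > min_order ? size_order : min_order);
	if ((threshold + NVM_MAX_VLBA) >= max_io)
		max_io <<= 1;

	return max_io;
}

/* Mirrors the patched rate-limiter line: rb_max_io = budget - threshold. */
int sketch_rb_max_io(int budget, int threshold)
{
	return budget - threshold;
}
```

As a usage example, sketch_rb_calculate_size(128, 64) returns 256 and sketch_rb_max_io(256, 64) returns 192. The 128 matches the "buf entries:128" line in the log above, but the threshold of 64 is a made-up value; the real threshold for that qemu configuration is not shown in this thread.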
