Blame SPECS/0177-features-shard-Fix-extra-unref-when-inode-object-is-.patch

Packit Service 89bf3d
From 4f0aa008ed393d7ce222c4ea4bd0fa6ed52b48f6 Mon Sep 17 00:00:00 2001
Packit Service 89bf3d
From: Krutika Dhananjay <kdhananj@redhat.com>
Packit Service 89bf3d
Date: Fri, 5 Apr 2019 12:29:23 +0530
Packit Service 89bf3d
Subject: [PATCH 177/178] features/shard: Fix extra unref when inode object is
Packit Service 89bf3d
 lru'd out and added back
Packit Service 89bf3d
Packit Service 89bf3d
Long tale of double unref! But do read...
Packit Service 89bf3d
Packit Service 89bf3d
In cases where a shard base inode is evicted from lru list while still
Packit Service 89bf3d
being part of fsync list but added back soon before its unlink, there
Packit Service 89bf3d
could be an extra inode_unref() leading to premature inode destruction
Packit Service 89bf3d
leading to crash.
Packit Service 89bf3d
Packit Service 89bf3d
One such specific case is the following -
Packit Service 89bf3d
Packit Service 89bf3d
Consider features.shard-deletion-rate = features.shard-lru-limit = 2.
Packit Service 89bf3d
This is an oversimplified example but explains the problem clearly.
Packit Service 89bf3d
Packit Service 89bf3d
First, a file is FALLOCATE'd to a size so that number of shards under
Packit Service 89bf3d
/.shard = 3 > lru-limit.
Packit Service 89bf3d
Shards 1, 2 and 3 need to be resolved. 1 and 2 are resolved first.
Packit Service 89bf3d
Resultant lru list:
Packit Service 89bf3d
                               1 -----> 2
Packit Service 89bf3d
refs on base inode -          (1)  +   (1) = 2
Packit Service 89bf3d
3 needs to be resolved. So 1 is lru'd out. Resultant lru list -
Packit Service 89bf3d
		               2 -----> 3
Packit Service 89bf3d
refs on base inode -          (1)  +   (1) = 2
Packit Service 89bf3d
Packit Service 89bf3d
Note that 1 is inode_unlink()d but not destroyed because there are
Packit Service 89bf3d
non-zero refs on it since it is still participating in this ongoing
Packit Service 89bf3d
FALLOCATE operation.
Packit Service 89bf3d
Packit Service 89bf3d
FALLOCATE is sent on all participant shards. In the cbk, all of them are
Packit Service 89bf3d
added to fync_list.
Packit Service 89bf3d
Resulting fsync list -
Packit Service 89bf3d
                               1 -----> 2 -----> 3 (order doesn't matter)
Packit Service 89bf3d
refs on base inode -          (1)  +   (1)  +   (1) = 3
Packit Service 89bf3d
Total refs = 3 + 2 = 5
Packit Service 89bf3d
Packit Service 89bf3d
Now an attempt is made to unlink this file. Background deletion is triggered.
Packit Service 89bf3d
The first $shard-deletion-rate shards need to be unlinked in the first batch.
Packit Service 89bf3d
So shards 1 and 2 need to be resolved. inode_resolve fails on 1 but succeeds
Packit Service 89bf3d
on 2 and so it's moved to tail of list.
Packit Service 89bf3d
lru list now -
Packit Service 89bf3d
                              3 -----> 2
Packit Service 89bf3d
No change in refs.
Packit Service 89bf3d
Packit Service 89bf3d
shard 1 is looked up. In lookup_cbk, it's linked and added back to lru list
Packit Service 89bf3d
at the cost of evicting shard 3.
Packit Service 89bf3d
lru list now -
Packit Service 89bf3d
                              2 -----> 1
Packit Service 89bf3d
refs on base inode:          (1)  +   (1) = 2
Packit Service 89bf3d
fsync list now -
Packit Service 89bf3d
                              1 -----> 2 (again order doesn't matter)
Packit Service 89bf3d
refs on base inode -         (1)  +   (1) = 2
Packit Service 89bf3d
Total refs = 2 + 2 = 4
Packit Service 89bf3d
After eviction, it is found 3 needs fsync. So fsync is wound, yet to be ack'd.
Packit Service 89bf3d
So it is still inode_link()d.
Packit Service 89bf3d
Packit Service 89bf3d
Now deletion of shards 1 and 2 completes. lru list is empty. Base inode unref'd and
Packit Service 89bf3d
destroyed.
Packit Service 89bf3d
In the next batched deletion, 3 needs to be deleted. It is inode_resolve()able.
Packit Service 89bf3d
It is added back to lru list but base inode passed to __shard_update_shards_inode_list()
Packit Service 89bf3d
is NULL since the inode is destroyed. But its ctx->inode still contains base inode ptr
Packit Service 89bf3d
from first addition to lru list for no additional ref on it.
Packit Service 89bf3d
lru list now -
Packit Service 89bf3d
                              3
Packit Service 89bf3d
refs on base inode -         (0)
Packit Service 89bf3d
Total refs on base inode = 0
Packit Service 89bf3d
Unlink is sent on 3. It completes. Now since the ctx contains ptr to base_inode and the
Packit Service 89bf3d
shard is part of lru list, base shard is unref'd leading to a crash.
Packit Service 89bf3d
Packit Service 89bf3d
FIX:
Packit Service 89bf3d
When shard is readded back to lru list, copy the base inode pointer as is into its inode ctx,
Packit Service 89bf3d
even if it is NULL. This is needed to prevent double unrefs at the time of deleting it.
Packit Service 89bf3d
Packit Service 89bf3d
Upstream patch:
Packit Service 89bf3d
> BUG: 1696136
Packit Service 89bf3d
> Upstream patch link: https://review.gluster.org/c/glusterfs/+/22517
Packit Service 89bf3d
> Change-Id: I99a44039da2e10a1aad183e84f644d63ca552462
Packit Service 89bf3d
> Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Packit Service 89bf3d
Packit Service 89bf3d
Change-Id: I99a44039da2e10a1aad183e84f644d63ca552462
Packit Service 89bf3d
Updates: bz#1694595
Packit Service 89bf3d
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Packit Service 89bf3d
Reviewed-on: https://code.engineering.redhat.com/gerrit/172803
Packit Service 89bf3d
Tested-by: RHGS Build Bot <nigelb@redhat.com>
Packit Service 89bf3d
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Packit Service 89bf3d
---
Packit Service 89bf3d
 .../bug-1696136-lru-limit-equals-deletion-rate.t   | 34 ++++++++++++++++++++++
Packit Service 89bf3d
 xlators/features/shard/src/shard.c                 |  6 ++--
Packit Service 89bf3d
 2 files changed, 36 insertions(+), 4 deletions(-)
Packit Service 89bf3d
 create mode 100644 tests/bugs/shard/bug-1696136-lru-limit-equals-deletion-rate.t
Packit Service 89bf3d
Packit Service 89bf3d
diff --git a/tests/bugs/shard/bug-1696136-lru-limit-equals-deletion-rate.t b/tests/bugs/shard/bug-1696136-lru-limit-equals-deletion-rate.t
Packit Service 89bf3d
new file mode 100644
Packit Service 89bf3d
index 0000000..3e4a65a
Packit Service 89bf3d
--- /dev/null
Packit Service 89bf3d
+++ b/tests/bugs/shard/bug-1696136-lru-limit-equals-deletion-rate.t
Packit Service 89bf3d
@@ -0,0 +1,34 @@
Packit Service 89bf3d
+#!/bin/bash
Packit Service 89bf3d
+
Packit Service 89bf3d
+. $(dirname $0)/../../include.rc
Packit Service 89bf3d
+. $(dirname $0)/../../volume.rc
Packit Service 89bf3d
+. $(dirname $0)/../../fallocate.rc
Packit Service 89bf3d
+
Packit Service 89bf3d
+cleanup
Packit Service 89bf3d
+
Packit Service 89bf3d
+TEST glusterd
Packit Service 89bf3d
+TEST pidof glusterd
Packit Service 89bf3d
+TEST $CLI volume create $V0 replica 3 $H0:$B0/${V0}{0,1,2}
Packit Service 89bf3d
+TEST $CLI volume set $V0 features.shard on
Packit Service 89bf3d
+TEST $CLI volume set $V0 features.shard-block-size 4MB
Packit Service 89bf3d
+TEST $CLI volume set $V0 features.shard-lru-limit 120
Packit Service 89bf3d
+TEST $CLI volume set $V0 features.shard-deletion-rate 120
Packit Service 89bf3d
+TEST $CLI volume set $V0 performance.write-behind off
Packit Service 89bf3d
+TEST $CLI volume start $V0
Packit Service 89bf3d
+
Packit Service 89bf3d
+TEST $GFS --volfile-id=$V0 --volfile-server=$H0 $M0
Packit Service 89bf3d
+
Packit Service 89bf3d
+TEST build_tester $(dirname $0)/bug-1696136.c -lgfapi -Wall -O2
Packit Service 89bf3d
+
Packit Service 89bf3d
+# Create a file
Packit Service 89bf3d
+TEST touch $M0/file1
Packit Service 89bf3d
+
Packit Service 89bf3d
+# Fallocate a 500M file. This will make sure number of participant shards are > lru-limit
Packit Service 89bf3d
+TEST $(dirname $0)/bug-1696136 $H0 $V0 "0" "0" "536870912" /file1 `gluster --print-logdir`/glfs-$V0.log
Packit Service 89bf3d
+
Packit Service 89bf3d
+EXPECT_WITHIN $UMOUNT_TIMEOUT "Y" force_umount $M0
Packit Service 89bf3d
+TEST $CLI volume stop $V0
Packit Service 89bf3d
+TEST $CLI volume delete $V0
Packit Service 89bf3d
+rm -f $(dirname $0)/bug-1696136
Packit Service 89bf3d
+
Packit Service 89bf3d
+cleanup
Packit Service 89bf3d
diff --git a/xlators/features/shard/src/shard.c b/xlators/features/shard/src/shard.c
Packit Service 89bf3d
index 3c4bcdc..c1799ad 100644
Packit Service 89bf3d
--- a/xlators/features/shard/src/shard.c
Packit Service 89bf3d
+++ b/xlators/features/shard/src/shard.c
Packit Service 89bf3d
@@ -689,8 +689,7 @@ __shard_update_shards_inode_list(inode_t *linked_inode, xlator_t *this,
Packit Service 89bf3d
             ctx->block_num = block_num;
Packit Service 89bf3d
             list_add_tail(&ctx->ilist, &priv->ilist_head);
Packit Service 89bf3d
             priv->inode_count++;
Packit Service 89bf3d
-            if (base_inode)
Packit Service 89bf3d
-                ctx->base_inode = inode_ref(base_inode);
Packit Service 89bf3d
+            ctx->base_inode = inode_ref(base_inode);
Packit Service 89bf3d
         } else {
Packit Service 89bf3d
             /*If on the other hand there is no available slot for this inode
Packit Service 89bf3d
              * in the list, delete the lru inode from the head of the list,
Packit Service 89bf3d
@@ -765,8 +764,7 @@ __shard_update_shards_inode_list(inode_t *linked_inode, xlator_t *this,
Packit Service 89bf3d
             else
Packit Service 89bf3d
                 gf_uuid_copy(ctx->base_gfid, gfid);
Packit Service 89bf3d
             ctx->block_num = block_num;
Packit Service 89bf3d
-            if (base_inode)
Packit Service 89bf3d
-                ctx->base_inode = inode_ref(base_inode);
Packit Service 89bf3d
+            ctx->base_inode = inode_ref(base_inode);
Packit Service 89bf3d
             list_add_tail(&ctx->ilist, &priv->ilist_head);
Packit Service 89bf3d
         }
Packit Service 89bf3d
     } else {
Packit Service 89bf3d
-- 
Packit Service 89bf3d
1.8.3.1
Packit Service 89bf3d