linux-kernel-test/fs
Tariq Saeed b1b1e15ef6 ocfs2: NFS hangs in __ocfs2_cluster_lock due to race with ocfs2_unblock_lock
NFS on a 2-node ocfs2 cluster, each node exporting a dir.  The lock causing
the hang is the global bit map inode lock.  Node 1 is master, has the
lock granted in PR mode; Node 2 is in the converting list (PR -> EX).
There are no holders of the lock on the master node so it should
downconvert to NL and grant EX to node 2 but that does not happen.
BLOCKED + QUEUED in lock res are set and it is on osb blocked list.
Threads are waiting in __ocfs2_cluster_lock on BLOCKED.  One thread
wants EX, rest want PR.  So it is as though the downconvert thread needs
to be kicked to complete the conv.

The hang is caused by an EX req coming into __ocfs2_cluster_lock on the
heels of a PR req after it sets BUSY (drops l_lock, releasing EX
thread), forcing the incoming EX to wait on BUSY without doing anything.
PR has called ocfs2_dlm_lock, which sets the node 1 lock from NL -> PR,
queues ast.

At this time, upconvert (PR -> EX) arrives from node 2, finds conflict
with node 1 lock in PR, so the lock res is put on dlm thread's dirty
list.

After return from ocfs2_dlm_lock, PR thread now waits behind EX on BUSY
till awoken by ast.

Now it is dlm_thread that serially runs dlm_shuffle_lists, ast, bast, in
that order.  dlm_shuffle_lists queues a bast on behalf of node 2 (which
will be run by dlm_thread right after the ast).  ast does its part, sets
UPCONVERT_FINISHING, clears BUSY and wakes its waiters.  Next,
dlm_thread runs bast.  It sets BLOCKED and kicks dc thread.  dc thread
runs ocfs2_unblock_lock, but since UPCONVERT_FINISHING is set, skips
doing anything and requeues.

Inside of __ocfs2_cluster_lock, since EX has been waiting on BUSY ahead
of PR, it wakes up first, finds BLOCKED set and skips doing anything but
clearing UPCONVERT_FINISHING (which was actually "meant" for the PR
thread), and this time waits on BLOCKED.  Next, the PR thread comes out
of wait but since UPCONVERT_FINISHING is not set, it skips updating the
l_ro_holders and goes straight to wait on BLOCKED.  So there, we have a
hang! Threads in __ocfs2_cluster_lock wait on BLOCKED, lock res in osb
blocked list.  Only when dc thread is awoken, it will run
ocfs2_unblock_lock and things will unhang.
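
The wake-up ordering described above can be replayed with a small model.
This is only a sketch under stated assumptions: the sk_* names, the level
numbering and the simplified compatibility check are hypothetical and are
not taken from the ocfs2 source.  It shows that when the EX waiter wakes
first, it consumes UPCONVERT_FINISHING and the PR waiter never becomes a
holder.

```c
#include <stdbool.h>

/* Hypothetical lock levels: only the ordering NL < PR < EX matters here. */
enum { SK_NL = 0, SK_PR = 1, SK_EX = 2 };

/* Hypothetical flag bits named after the flags in the description. */
enum { SK_BLOCKED = 1 << 0, SK_UPCONVERT_FINISHING = 1 << 1 };

struct sk_lockres {
	unsigned int flags;
	int level;       /* level currently granted to this node */
	int ro_holders;  /* stands in for l_ro_holders */
};

/*
 * What a thread does once it is woken from its wait on BUSY.
 * Returns true if it became a holder, false if it goes on to
 * wait on BLOCKED.
 */
static bool sk_wake_from_busy(struct sk_lockres *res, int want)
{
	bool skip_blocked = (res->flags & SK_UPCONVERT_FINISHING) != 0;
	bool blocked = (res->flags & SK_BLOCKED) != 0;
	bool granted = false;

	/* A compatible request may pass BLOCKED only while the flag is set. */
	if (want <= res->level && (skip_blocked || !blocked)) {
		if (want == SK_PR)
			res->ro_holders++;
		granted = true;
	}

	/* Whoever runs first clears the flag, holder or not. */
	res->flags &= ~SK_UPCONVERT_FINISHING;
	return granted;
}

/* Replay the state after ast and bast have run; return the PR holder count. */
static int sk_replay(bool ex_first)
{
	struct sk_lockres res = {
		.flags = SK_BLOCKED | SK_UPCONVERT_FINISHING,
		.level = SK_PR,
		.ro_holders = 0,
	};

	if (ex_first) {
		sk_wake_from_busy(&res, SK_EX);
		sk_wake_from_busy(&res, SK_PR);
	} else {
		sk_wake_from_busy(&res, SK_PR);
		sk_wake_from_busy(&res, SK_EX);
	}
	return res.ro_holders;
}
```

In this model sk_replay(true) leaves no holder while BLOCKED stays set,
which is the hang; sk_replay(false) leaves one PR holder.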

One way to fix this is to wake the dc thread, if BLOCKED is set, after
clearing UPCONVERT_FINISHING.
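
A minimal sketch of that idea, assuming a simplified model rather than
the real ocfs2 code (the sk_* names and the two-flag lock resource are
hypothetical): the dc thread side requeues while UPCONVERT_FINISHING is
set, and the locker side reports whether the dc thread needs a kick once
it has cleared the flag.

```c
#include <stdbool.h>

/* Hypothetical flag bits named after the flags in the description. */
enum { SK_BLOCKED = 1 << 0, SK_UPCONVERT_FINISHING = 1 << 1 };

struct sk_lockres {
	unsigned int flags;
};

/*
 * dc thread side, modelled on the behaviour described for
 * ocfs2_unblock_lock.  Returns true if the resource was requeued
 * without doing any work.
 */
static bool sk_unblock_lock(struct sk_lockres *res)
{
	if (res->flags & SK_UPCONVERT_FINISHING)
		return true;            /* upconvert still finishing: requeue */

	res->flags &= ~SK_BLOCKED;      /* downconvert done, waiters can go */
	return false;
}

/*
 * Locker side: clear the flag and tell the caller whether the dc
 * thread should be woken, because nothing else will run it again.
 */
static bool sk_clear_upconvert_finishing(struct sk_lockres *res)
{
	res->flags &= ~SK_UPCONVERT_FINISHING;
	return (res->flags & SK_BLOCKED) != 0;
}
```

With both flags set, the first sk_unblock_lock() call requeues; after
sk_clear_upconvert_finishing() asks for a kick, a second call clears
BLOCKED.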

Orabug: 20933419
Signed-off-by: Tariq Saeed <tariq.x.saeed@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Reviewed-by: Wengang Wang <wen.gang.wang@oracle.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Eric Ren <zren@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-01-21 17:20:51 -08:00
9p kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
adfs fs/adfs/adfs.h: tidy up comments 2016-01-20 17:09:18 -08:00
affs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
afs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
autofs4 switch ->get_link() to delayed_call, kill ->put_link() 2015-12-30 13:01:03 -05:00
befs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
bfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
btrfs Merge branch 'for-linus-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2016-01-18 12:44:40 -08:00
cachefiles convert a bunch of open-coded instances of memdup_user_nul() 2016-01-04 10:26:58 -05:00
ceph kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
cifs page-flags: define PG_locked behavior on compound pages 2016-01-15 17:56:32 -08:00
coda kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
configfs Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2016-01-20 17:20:53 -08:00
cramfs don't put symlink bodies in pagecache into highmem 2015-12-08 22:41:36 -05:00
debugfs debugfs: fix refcount imbalance in start_creating 2015-11-11 02:04:44 -05:00
devpts
dlm convert a bunch of open-coded instances of memdup_user_nul() 2016-01-04 10:26:58 -05:00
ecryptfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
efivarfs
efs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
exofs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
exportfs
ext2 kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
ext4 kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
f2fs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
fat fat: constify fatent_operations structures 2016-01-20 17:09:18 -08:00
freevxfs don't put symlink bodies in pagecache into highmem 2015-12-08 22:41:36 -05:00
fscache FS-Cache: Handle a write to the page immediately beyond the EOF marker 2015-11-11 02:11:02 -05:00
fuse Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse 2016-01-21 12:14:24 -08:00
gfs2 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2016-01-17 19:13:15 -08:00
hfs fs/hfs/catalog.c: use list_for_each_entry in hfs_cat_delete 2016-01-20 17:09:18 -08:00
hfsplus kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
hostfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
hpfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
hugetlbfs mm/hugetlbfs: unmap pages if page fault raced with hole punch 2016-01-15 17:56:32 -08:00
isofs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
jbd2 fs: use block_device name vsprintf helper 2016-01-06 13:03:18 -05:00
jffs2 kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
jfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
kernfs Revert "kernfs: do not account ino_ida allocations to memcg" 2016-01-14 16:00:49 -08:00
lockd lockd: constify nlmsvc_binding structure 2016-01-07 10:10:50 -05:00
logfs Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild 2016-01-20 09:45:43 -08:00
minix kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
ncpfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
nfs Merge branch 'akpm' (patches from Andrew) 2016-01-15 11:41:44 -08:00
nfs_common
nfsd Smaller bugfixes and cleanup, including a fix for a failures of 2016-01-15 12:49:44 -08:00
nilfs2 kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
nls
notify fsnotify: destroy marks with call_srcu instead of dedicated thread 2016-01-14 16:00:49 -08:00
ntfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
ocfs2 ocfs2: NFS hangs in __ocfs2_cluster_lock due to race with ocfs2_unblock_lock 2016-01-21 17:20:51 -08:00
omfs
openpromfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
overlayfs Merge branch 'akpm' (patches from Andrew) 2016-01-21 12:32:08 -08:00
proc thp: change pmd_trans_huge_lock() interface to return ptl 2016-01-21 17:20:51 -08:00
pstore pstore: fix code comment to match code 2015-11-02 13:41:52 -08:00
qnx4 kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
qnx6 kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
quota quota: constify qtree_fmt_operations structures 2016-01-04 10:58:35 +01:00
ramfs don't put symlink bodies in pagecache into highmem 2015-12-08 22:41:36 -05:00
reiserfs reiserfs: fix dereference of ERR_PTR 2016-01-21 17:20:51 -08:00
romfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
squashfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
sysfs platform/chrome: Branch for v4.4 2015-11-13 21:53:18 -08:00
sysv kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
tracefs tracefs: Fix refcount imbalance in start_creating() 2015-11-04 22:13:45 -05:00
ubifs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
udf Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2016-01-15 11:51:51 -08:00
ufs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
xfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
aio.c
anon_inodes.c
attr.c
bad_inode.c fs/bad_inode.c: is_bad_inode can be boolean 2015-12-06 21:17:14 -05:00
binfmt_aout.c
binfmt_elf_fdpic.c libnvdimm for 4.4: 2015-11-10 12:07:22 -08:00
binfmt_elf.c Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-11-11 09:45:24 -08:00
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c
binfmt_script.c
block_dev.c Merge branch 'for-4.5/core' of git://git.kernel.dk/linux-block 2016-01-19 15:03:34 -08:00
buffer.c fs: use block_device name vsprintf helper 2016-01-06 13:03:18 -05:00
char_dev.c
compat_binfmt_elf.c
compat_ioctl.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-01-12 17:11:47 -08:00
compat.c saner calling conventions for copy_mount_options() 2016-01-04 10:28:32 -05:00
coredump.c fs/coredump: prevent "" / "." / ".." core path components 2016-01-20 17:09:18 -08:00
dax.c dax: re-enable dax pmd mappings 2016-01-15 17:56:32 -08:00
dcache.c kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
dcookies.c
direct-io.c fix the regression from "direct-io: Fix negative return from dio read beyond eof" 2015-12-08 15:02:42 -05:00
drop_caches.c
eventfd.c Documentation: filesystem: Fix typo in fs/eventfd.c 2015-12-08 14:52:03 +01:00
eventpoll.c epoll: add EPOLLEXCLUSIVE flag 2016-01-20 17:09:18 -08:00
exec.c don't carry MAY_OPEN in op->acc_mode 2016-01-04 10:28:40 -05:00
fcntl.c fcntl: allow to set O_DIRECT flag on pipe 2016-01-09 02:55:37 -05:00
fhandle.c
file_table.c
file.c kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
filesystems.c
fs_pin.c
fs_struct.c
fs-writeback.c cgroup, memcg, writeback: drop spurious rcu locking around mem_cgroup_css_from_page() 2016-01-15 17:56:32 -08:00
inode.c kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
internal.h Merge branch 'for-linus' into work.misc 2016-01-08 21:20:11 -05:00
ioctl.c Merge branch 'work.copy_file_range' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-01-12 16:30:34 -08:00
Kconfig dax: re-enable dax pmd mappings 2016-01-15 17:56:32 -08:00
Kconfig.binfmt
libfs.c switch ->get_link() to delayed_call, kill ->put_link() 2015-12-30 13:01:03 -05:00
locks.c Merge branch 'work.copy_file_range' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-01-12 16:30:34 -08:00
Makefile ext4: promote ext4 over ext2 in the default probe order 2015-10-15 10:33:21 -04:00
mbcache.c
mount.h
mpage.c mm, fs: introduce mapping_gfp_constraint() 2015-11-06 17:50:42 -08:00
namei.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-01-12 17:11:47 -08:00
namespace.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-01-12 17:11:47 -08:00
no-block.c
nsfs.c
open.c don't carry MAY_OPEN in op->acc_mode 2016-01-04 10:28:40 -05:00
pipe.c fs/pipe.c: return error code rather than 0 in pipe_write() 2015-11-11 02:18:26 -05:00
pnode.c
pnode.h
posix_acl.c xattr handlers: Simplify list operation 2015-12-13 19:46:12 -05:00
proc_namespace.c vfs: show_vfsstat: remove redundant initialization and check of error code 2015-12-06 21:17:16 -05:00
read_write.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-01-12 17:11:47 -08:00
readdir.c
select.c poll: plug an unused argument to do_poll 2016-01-06 08:26:52 -05:00
seq_file.c fs, seqfile: always allow oom killer 2015-11-06 17:50:42 -08:00
signalfd.c
splice.c fs: __generic_file_splice_read retry lookup on AOP_TRUNCATED_PAGE 2016-01-09 02:55:35 -05:00
stack.c
stat.c fs/stat.c: drop the last new_valid_dev check 2016-01-16 11:17:23 -08:00
statfs.c
super.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial 2016-01-14 17:04:19 -08:00
sync.c fs/sync.c: make sync_file_range(2) use WB_SYNC_NONE writeback 2015-11-06 17:50:42 -08:00
timerfd.c
userfaultfd.c
utimes.c
xattr.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-01-12 17:11:47 -08:00