[PATCH] r/o bind mounts: track numbers of writers to mounts
This is the real meat of the entire series. It actually implements the tracking of the number of writers to a mount. However, it causes scalability problems because there can be hundreds of cpus doing open()/close() on files on the same mnt at the same time. Even an atomic_t in the mnt has massive scalaing problems because the cacheline gets so terribly contended. This uses a statically-allocated percpu variable. All want/drop operations are local to a cpu as long that cpu operates on the same mount, and there are no writer count imbalances. Writer count imbalances happen when a write is taken on one cpu, and released on another, like when an open/close pair is performed on two Upon a remount,ro request, all of the data from the percpu variables is collected (expensive, but very rare) and we determine if there are any outstanding writers to the mount. I've written a little benchmark to sit in a loop for a couple of seconds in several cpus in parallel doing open/write/close loops. http://sr71.net/~dave/linux/openbench.c The code in here is a a worst-possible case for this patch. It does opens on a _pair_ of files in two different mounts in parallel. This should cause my code to lose its "operate on the same mount" optimization completely. This worst-case scenario causes a 3% degredation in the benchmark. I could probably get rid of even this 3%, but it would be more complex than what I have here, and I think this is getting into acceptable territory. In practice, I expect writing more than 3 bytes to a file, as well as disk I/O to mask any effects that this has. (To get rid of that 3%, we could have an #defined number of mounts in the percpu variable. So, instead of a CPU getting operate only on percpu data when it accesses only one mount, it could stay on percpu data when it only accesses N or fewer mounts.) [AV] merged fix for __clear_mnt_mount() stepping on freed vfsmount Acked-by: Al Viro <viro@ZenIV.linux.org.uk> Signed-off-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Dave Hansen <haveblue@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
This commit is contained in:
@@ -14,6 +14,7 @@
|
||||
|
||||
#include <linux/types.h>
|
||||
#include <linux/list.h>
|
||||
#include <linux/nodemask.h>
|
||||
#include <linux/spinlock.h>
|
||||
#include <asm/atomic.h>
|
||||
|
||||
@@ -30,6 +31,7 @@ struct mnt_namespace;
|
||||
#define MNT_RELATIME 0x20
|
||||
|
||||
#define MNT_SHRINKABLE 0x100
|
||||
#define MNT_IMBALANCED_WRITE_COUNT 0x200 /* just for debugging */
|
||||
|
||||
#define MNT_SHARED 0x1000 /* if the vfsmount is a shared mount */
|
||||
#define MNT_UNBINDABLE 0x2000 /* if the vfsmount is a unbindable mount */
|
||||
@@ -62,6 +64,11 @@ struct vfsmount {
|
||||
int mnt_expiry_mark; /* true if marked for expiry */
|
||||
int mnt_pinned;
|
||||
int mnt_ghosts;
|
||||
/*
|
||||
* This value is not stable unless all of the mnt_writers[] spinlocks
|
||||
* are held, and all mnt_writer[]s on this mount have 0 as their ->count
|
||||
*/
|
||||
atomic_t __mnt_writers;
|
||||
};
|
||||
|
||||
static inline struct vfsmount *mntget(struct vfsmount *mnt)
|
||||
|
Reference in New Issue
Block a user