fs: provide rcu-walk aware permission i_ops
Signed-off-by: Nick Piggin <npiggin@kernel.dk>
This commit is contained in:
@@ -47,8 +47,8 @@ ata *);
|
||||
void * (*follow_link) (struct dentry *, struct nameidata *);
|
||||
void (*put_link) (struct dentry *, struct nameidata *, void *);
|
||||
void (*truncate) (struct inode *);
|
||||
int (*permission) (struct inode *, int, struct nameidata *);
|
||||
int (*check_acl)(struct inode *, int);
|
||||
int (*permission) (struct inode *, int, unsigned int);
|
||||
int (*check_acl)(struct inode *, int, unsigned int);
|
||||
int (*setattr) (struct dentry *, struct iattr *);
|
||||
int (*getattr) (struct vfsmount *, struct dentry *, struct kstat *);
|
||||
int (*setxattr) (struct dentry *, const char *,const void *,size_t,int);
|
||||
@@ -76,7 +76,7 @@ follow_link: no
|
||||
put_link: no
|
||||
truncate: yes (see below)
|
||||
setattr: yes
|
||||
permission: no
|
||||
permission: no (may not block if called in rcu-walk mode)
|
||||
check_acl: no
|
||||
getattr: no
|
||||
setxattr: yes
|
||||
|
@@ -316,11 +316,9 @@ The detailed design for rcu-walk is like this:
|
||||
|
||||
The cases where rcu-walk cannot continue are:
|
||||
* NULL dentry (ie. any uncached path element)
|
||||
* parent with d_inode->i_op->permission or ACLs
|
||||
* Following links
|
||||
|
||||
In future patches, permission checks become rcu-walk aware. It may be possible
|
||||
eventually to make following links rcu-walk aware.
|
||||
It may be possible eventually to make following links rcu-walk aware.
|
||||
|
||||
Uncached path elements will always require dropping to ref-walk mode, at the
|
||||
very least because i_mutex needs to be grabbed, and objects allocated.
|
||||
@@ -336,9 +334,49 @@ or stored into. The result is massive improvements in performance and
|
||||
scalability of path resolution.
|
||||
|
||||
|
||||
Interesting statistics
|
||||
======================
|
||||
|
||||
The following table gives rcu lookup statistics for a few simple workloads
|
||||
(2s12c24t Westmere, debian non-graphical system). Ungraceful are attempts to
|
||||
drop rcu that fail due to d_seq failure and requiring the entire path lookup
|
||||
again. Other cases are successful rcu-drops that are required before the final
|
||||
element, nodentry for missing dentry, revalidate for filesystem revalidate
|
||||
routine requiring rcu drop, permission for permission check requiring drop,
|
||||
and link for symlink traversal requiring drop.
|
||||
|
||||
rcu-lookups restart nodentry link revalidate permission
|
||||
bootup 47121 0 4624 1010 10283 7852
|
||||
dbench 25386793 0 6778659(26.7%) 55 549 1156
|
||||
kbuild 2696672 10 64442(2.3%) 108764(4.0%) 1 1590
|
||||
git diff 39605 0 28 2 0 106
|
||||
vfstest 24185492 4945 708725(2.9%) 1076136(4.4%) 0 2651
|
||||
|
||||
What this shows is that failed rcu-walk lookups, ie. ones that are restarted
|
||||
entirely with ref-walk, are quite rare. Even the "vfstest" case which
|
||||
specifically has concurrent renames/mkdir/rmdir/ creat/unlink/etc to excercise
|
||||
such races is not showing a huge amount of restarts.
|
||||
|
||||
Dropping from rcu-walk to ref-walk mean that we have encountered a dentry where
|
||||
the reference count needs to be taken for some reason. This is either because
|
||||
we have reached the target of the path walk, or because we have encountered a
|
||||
condition that can't be resolved in rcu-walk mode. Ideally, we drop rcu-walk
|
||||
only when we have reached the target dentry, so the other statistics show where
|
||||
this does not happen.
|
||||
|
||||
Note that a graceful drop from rcu-walk mode due to something such as the
|
||||
dentry not existing (which can be common) is not necessarily a failure of
|
||||
rcu-walk scheme, because some elements of the path may have been walked in
|
||||
rcu-walk mode. The further we get from common path elements (such as cwd or
|
||||
root), the less contended the dentry is likely to be. The closer we are to
|
||||
common path elements, the more likely they will exist in dentry cache.
|
||||
|
||||
|
||||
Papers and other documentation on dcache locking
|
||||
================================================
|
||||
|
||||
1. Scaling dcache with RCU (http://linuxjournal.com/article.php?sid=7124).
|
||||
|
||||
2. http://lse.sourceforge.net/locking/dcache/dcache.html
|
||||
|
||||
|
||||
|
@@ -379,4 +379,9 @@ where possible.
|
||||
the filesystem provides it), which requires dropping out of rcu-walk mode. This
|
||||
may now be called in rcu-walk mode (nd->flags & LOOKUP_RCU). -ECHILD should be
|
||||
returned if the filesystem cannot handle rcu-walk. See
|
||||
Documentation/filesystems/vfs.txt for more details.
|
||||
|
||||
permission and check_acl are inode permission checks that are called
|
||||
on many or all directory inodes on the way down a path walk (to check for
|
||||
exec permission). These must now be rcu-walk aware (flags & IPERM_RCU). See
|
||||
Documentation/filesystems/vfs.txt for more details.
|
||||
|
@@ -325,7 +325,8 @@ struct inode_operations {
|
||||
void * (*follow_link) (struct dentry *, struct nameidata *);
|
||||
void (*put_link) (struct dentry *, struct nameidata *, void *);
|
||||
void (*truncate) (struct inode *);
|
||||
int (*permission) (struct inode *, int, struct nameidata *);
|
||||
int (*permission) (struct inode *, int, unsigned int);
|
||||
int (*check_acl)(struct inode *, int, unsigned int);
|
||||
int (*setattr) (struct dentry *, struct iattr *);
|
||||
int (*getattr) (struct vfsmount *mnt, struct dentry *, struct kstat *);
|
||||
int (*setxattr) (struct dentry *, const char *,const void *,size_t,int);
|
||||
@@ -414,6 +415,13 @@ otherwise noted.
|
||||
permission: called by the VFS to check for access rights on a POSIX-like
|
||||
filesystem.
|
||||
|
||||
May be called in rcu-walk mode (flags & IPERM_RCU). If in rcu-walk
|
||||
mode, the filesystem must check the permission without blocking or
|
||||
storing to the inode.
|
||||
|
||||
If a situation is encountered that rcu-walk cannot handle, return
|
||||
-ECHILD and it will be called again in ref-walk mode.
|
||||
|
||||
setattr: called by the VFS to set attributes for a file. This method
|
||||
is called by chmod(2) and related system calls.
|
||||
|
||||
|
Reference in New Issue
Block a user