[PATCH] sys_sync_file_range()

Remove the recently-added LINUX_FADV_ASYNC_WRITE and LINUX_FADV_WRITE_WAIT
fadvise() additions, do it in a new sys_sync_file_range() syscall instead.
Reasons:

- It's more flexible.  Things which would require two or three syscalls with
  fadvise() can be done in a single syscall.

- Using fadvise() in this manner is something not covered by POSIX.

The patch wires up the syscall for x86.

The sycall is implemented in the new fs/sync.c.  The intention is that we can
move sys_fsync(), sys_fdatasync() and perhaps sys_sync() into there later.

Documentation for the syscall is in fs/sync.c.

A test app (sync_file_range.c) is in
http://www.zip.com.au/~akpm/linux/patches/stuff/ext3-tools.tar.gz.

The available-to-GPL-modules do_sync_file_range() is for knfsd: "A COMMIT can
say NFS_DATA_SYNC or NFS_FILE_SYNC.  I can skip the ->fsync call for
NFS_DATA_SYNC which is hopefully the more common."

Note: the `async' writeout mode SYNC_FILE_RANGE_WRITE will turn synchronous if
the queue is congested.  This is trivial to fix: add a new flag bit, set
wbc->nonblocking.  But I'm not sure that we want to expose implementation
details down to that level.

Note: it's notable that we can sync an fd which wasn't opened for writing.
Same with fsync() and fdatasync()).

Note: the code takes some care to handle attempts to sync file contents
outside the 16TB offset on 32-bit machines.  It makes such attempts appear to
succeed, for best 32-bit/64-bit compatibility.  Perhaps it should make such
requests fail...

Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This commit is contained in:
Andrew Morton
2006-03-31 02:30:42 -08:00
committed by Linus Torvalds
parent d6dfd1310d
commit f79e2abb9b
8 changed files with 177 additions and 28 deletions

View File

@ -35,17 +35,6 @@
*
* LINUX_FADV_ASYNC_WRITE: push some or all of the dirty pages at the disk.
*
* LINUX_FADV_WRITE_WAIT, LINUX_FADV_ASYNC_WRITE: push all of the currently
* dirty pages at the disk.
*
* LINUX_FADV_WRITE_WAIT, LINUX_FADV_ASYNC_WRITE, LINUX_FADV_WRITE_WAIT: push
* all of the currently dirty pages at the disk, wait until they have been
* written.
*
* It should be noted that none of these operations write out the file's
* metadata. So unless the application is strictly performing overwrites of
* already-instantiated disk blocks, there are no guarantees here that the data
* will be available after a crash.
*/
asmlinkage long sys_fadvise64_64(int fd, loff_t offset, loff_t len, int advice)
{
@ -129,15 +118,6 @@ asmlinkage long sys_fadvise64_64(int fd, loff_t offset, loff_t len, int advice)
invalidate_mapping_pages(mapping, start_index,
end_index);
break;
case LINUX_FADV_ASYNC_WRITE:
ret = __filemap_fdatawrite_range(mapping, offset, endbyte,
WB_SYNC_NONE);
break;
case LINUX_FADV_WRITE_WAIT:
ret = wait_on_page_writeback_range(mapping,
offset >> PAGE_CACHE_SHIFT,
endbyte >> PAGE_CACHE_SHIFT);
break;
default:
ret = -EINVAL;
}