linux-kernel-test/arch/powerpc/lib
Anton Blanchard 17968fbbd1 powerpc: 64bit optimised __clear_user
I noticed __clear_user high up in a profile of one of my RAID stress
tests. The testcase was doing a dd from /dev/zero which ends up
calling __clear_user.

__clear_user is basically a loop with a single 4 byte store, which
is horribly slow. We can do much better by aligning the destination
and doing 32 bytes of 8 byte stores in a loop.
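
As a rough illustration of the approach described above, here is a minimal userspace C sketch. It is not the patch itself: the real routine is PowerPC assembly and has to handle faults on user addresses, which this ignores. The names clear_bytes_sketch and store64_zero are hypothetical, made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: store 8 zero bytes at p. memcpy of a fixed
 * 8-byte value is the portable way to express a 64-bit store;
 * compilers typically turn it into a single store instruction.
 */
static inline void store64_zero(unsigned char *p)
{
    const uint64_t zero = 0;
    memcpy(p, &zero, sizeof(zero));
}

/*
 * Hypothetical userspace sketch of the idea: align the destination,
 * then clear 32 bytes per loop iteration using four 8-byte stores.
 * No fault handling, unlike the kernel's __clear_user.
 */
static void clear_bytes_sketch(void *dst, size_t n)
{
    unsigned char *p = dst;

    /* Head: byte stores until p is 8-byte aligned or n runs out. */
    while (n > 0 && ((uintptr_t)p & 7) != 0) {
        *p++ = 0;
        n--;
    }

    /* Main loop: 32 bytes per iteration as four 8-byte stores. */
    while (n >= 32) {
        store64_zero(p);
        store64_zero(p + 8);
        store64_zero(p + 16);
        store64_zero(p + 24);
        p += 32;
        n -= 32;
    }

    /* Leftover whole 8-byte words. */
    while (n >= 8) {
        store64_zero(p);
        p += 8;
        n -= 8;
    }

    /* Tail: remaining bytes, fewer than 8. */
    while (n > 0) {
        *p++ = 0;
        n--;
    }
}
```

Calling clear_bytes_sketch(buf + 3, 77) on a byte buffer exercises all four phases: an unaligned head, the 32-byte loop, leftover 8-byte words and a short tail.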

The following testcase was used to verify the patch:

http://ozlabs.org/~anton/junkcode/stress_clear_user.c

To show the improvement in performance I ran a dd from /dev/zero
to /dev/null on a POWER7 box:

Before:

# dd if=/dev/zero of=/dev/null bs=1M count=10000
10485760000 bytes (10 GB) copied, 3.72379 s, 2.8 GB/s

After:

# time dd if=/dev/zero of=/dev/null bs=1M count=10000
10485760000 bytes (10 GB) copied, 0.728318 s, 14.4 GB/s

Over 5x faster.
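
The figures above can be cross-checked from the byte count and elapsed times that dd printed. The sketch below assumes dd's decimal convention of 1 GB = 10^9 bytes; the function names gb_per_sec and speedup are hypothetical, made up for this check.

```c
/* Throughput in GB/s, using 1 GB = 10^9 bytes as dd reports it. */
static double gb_per_sec(double bytes, double seconds)
{
    return bytes / seconds / 1e9;
}

/* Ratio of elapsed times for the same amount of data. */
static double speedup(double before_seconds, double after_seconds)
{
    return before_seconds / after_seconds;
}
```

With the numbers from the two runs, gb_per_sec(10485760000.0, 3.72379) is about 2.8, gb_per_sec(10485760000.0, 0.728318) is about 14.4, and speedup(3.72379, 0.728318) is about 5.1.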

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-07-03 14:14:41 +10:00
alloc.c Disintegrate asm/system.h for PowerPC 2012-03-28 18:30:02 +01:00
checksum_32.S
checksum_64.S powerpc: Optimise 64bit csum_partial_copy_generic and add csum_and_copy_from_user 2010-09-02 14:07:30 +10:00
checksum_wrappers_64.c powerpc: various straight conversions from module.h --> export.h 2011-10-31 19:30:44 -04:00
code-patching.c powerpc: Have patch_instruction detect faults 2012-07-03 14:14:38 +10:00
copy_32.S powerpc: Fix incorrect .stabs entry for copy_32.S 2010-09-02 14:07:34 +10:00
copypage_64.S powerpc: Simplify 4k/64k copy_page logic 2011-05-19 14:30:42 +10:00
copyuser_64.S powerpc: Remove CONFIG_POWER4_ONLY 2012-04-30 15:37:26 +10:00
copyuser_power7_vmx.c Disintegrate asm/system.h for PowerPC 2012-03-28 18:30:02 +01:00
copyuser_power7.S powerpc: POWER7 optimised copy_to_user/copy_from_user using VMX 2011-12-19 14:40:40 +11:00
crtsavres.S powerpc: Fix module building for gcc 4.5 and 64 bit 2010-07-08 18:11:38 +10:00
devres.c powerpc: various straight conversions from module.h --> export.h 2011-10-31 19:30:44 -04:00
div64.S
feature-fixups-test.S powerpc: Ensure the else case of feature sections will fit 2011-01-21 14:08:33 +11:00
feature-fixups.c powerpc: Copy down exception vectors after feature fixups 2011-11-16 14:47:54 +11:00
hweight_64.S powerpc: Hardcode popcnt instructions for old assemblers 2010-12-09 15:35:30 +11:00
ldstfp.S powerpc: mtmsrd not defined 2010-09-02 14:07:34 +10:00
locks.c powerpc: Remove FW_FEATURE ISERIES from arch code 2012-03-21 11:16:11 +11:00
Makefile powerpc: 64bit optimised __clear_user 2012-07-03 14:14:41 +10:00
mem_64.S powerpc: Remove CONFIG_POWER4_ONLY 2012-04-30 15:37:26 +10:00
memcpy_64.S powerpc: Remove CONFIG_POWER4_ONLY 2012-04-30 15:37:26 +10:00
rheap.c powerpc: various straight conversions from module.h --> export.h 2011-10-31 19:30:44 -04:00
sstep.c Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc 2011-05-20 13:28:01 -07:00
string_64.S powerpc: 64bit optimised __clear_user 2012-07-03 14:14:41 +10:00
string.S powerpc: 64bit optimised __clear_user 2012-07-03 14:14:41 +10:00
usercopy_64.c