Convert ledtrig-ide-disk code to use the generic API for one-shot LED
blinking.
This patch changes slightly the behaviour of the trigger, as while the
original version kept the LED on under heavy activity, the new one keeps
a constant on-off blink at 1 / (2 * BLINK_DELAY) Hz.
(bryan.wu@canonical.com: remove 2 useless included header files)
Signed-off-by: Fabio Baltieri <fabio.baltieri@gmail.com>
Signed-off-by: Bryan Wu <bryan.wu@canonical.com>
Add two new functions, led_blink_set_oneshot and
led_trigger_blink_oneshot, to be used by triggers for one-shot blink of
led devices.
This is implemented extending the existing software-blink code, and uses
the same timer and handler function.
The behavior of the code is to do a blink-on, blink-off sequence when
the function is called, ignoring other calls until the sequence is
completed so that the leds keep blinking at constant rate if the
functions are called repeatedly.
This is meant to be used by drivers which needs to trigger on sporadic
event, but doesn't have clear busy/idle trigger points.
After the blink sequence the led remains off. This behavior can be
inverted setting the "invert" argument, which blink the led off, than on
and leave the led on after the sequence.
(bryan.wu@canonical.com: rebase to commit 'leds: don't disable blinking
when writing the same value to delay_on or delay_off')
Signed-off-by: Fabio Baltieri <fabio.baltieri@gmail.com>
Acked-by: Shuah Khan <shuahkhan@gmail.com>
Signed-off-by: Bryan Wu <bryan.wu@canonical.com>
On input packet processing, rt->rt_iif will be zero if we should
use skb->dev->ifindex.
Since we access rt->rt_iif consistently via inet_iif(), that is
the only spot whose interpretation have to adjust.
Signed-off-by: David S. Miller <davem@davemloft.net>
Make it follow device decapsulation, from things such as VLAN and
bonding.
The stuff that actually cares about pre-demuxed device pointers, is
handled by the "orig_dev" variable in __netif_receive_skb(). And
the only consumer of that is the po->origdev feature of AF_PACKET
sockets.
Signed-off-by: David S. Miller <davem@davemloft.net>
Use inet_iif() consistently, and for TCP record the input interface of
cached RX dst in inet sock.
rt->rt_iif is going to be encoded differently, so that we can
legitimately cache input routes in the FIB info more aggressively.
When the input interface is "use SKB device index" the rt->rt_iif will
be set to zero.
This forces us to move the TCP RX dst cache installation into the ipv4
specific code, and as well it should since doing the route caching for
ipv6 is pointless at the moment since it is not inspected in the ipv6
input paths yet.
Also, remove the unlikely on dst->obsolete, all ipv4 dsts have
obsolete set to a non-zero value to force invocation of the check
callback.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull arm-soc defconfig updates from Arnd Bergmann:
"These are changes to the default configuration files, to account for
kernel changes and new hardware."
* tag 'defconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: exynos_defconfig: enable more platforms in defconfig
ARM: imx_v4_v5_defconfig: update features
ARM: imx_v6_v7_defconfig: update features
ARM: mxs: defconfig: Enable CONFIG_COMMON_CLK_DEBUG
ARM: mxs_defconfig: Enable RTC driver
ARM: LPC32xx: Defconfig update
ARM: mxs_defconfig: Let AUART driver be built by default
ARM: mxs: Enable MACH_APX4DEVKIT
ARM: mxs: Let GPMI driver be built by default
ARM: tegra: defconfig updates
Pull support for three new arm SoC types from Arnd Bergmann:
- The mvebu platform includes Marvell's Armada XP and Armada 370 chips,
made by the mvebu business unit inside of Marvell. Since the same
group also made the older but similar platforms we call "orion5x",
"kirkwood", "mv78xx0" and "dove", we plan to move all of them into
the mach-mvebu directory in the future.
- socfpga is Altera's platform based on Cortex-A9 cores and a lot of
FPGA space. This is similar to the Xilinx zynq platform we already
support. The code is particularly clean, which is helped by the fact
that the hardware doesn't do much besides the parts that are expected
to get added in the FPGA.
- The OMAP subarchitecture gains support for the latest generation, the
OMAP5 based on the new Cortex-A15 core. Support is rather
rudimentary for now, but will be extended in the future.
* tag 'newsoc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (25 commits)
ARM: socfpga: initial support for Altera's SOCFPGA platform
arm: mvebu: generate DTBs for supported SoCs
ARM: mvebu: MPIC: read number of interrupts from control register
arm: mach-mvebu: add entry to MAINTAINERS
arm: mach-mvebu: add compilation/configuration change
arm: mach-mvebu: add defconfig
arm: mach-mvebu: add documentation for new device tree bindings
arm: mach-mvebu: add support for Armada 370 and Armada XP with DT
arm: mach-mvebu: add source files
arm: mach-mvebu: add header
clocksource: time-armada-370-xp: Marvell Armada 370/XP SoC timer driver
ARM: Kconfig update to support additional GPIOs in OMAP5
ARM: OMAP5: Add the build support
arm/dts: OMAP5: Add omap5 dts files
ARM: OMAP5: board-generic: Add device tree support
ARM: omap2+: board-generic: clean up the irq data from board file
ARM: OMAP5: Add SMP support
ARM: OMAP5: Add the WakeupGen IP updates
ARM: OMAP5: l3: Add l3 error handler support for omap5
ARM: OMAP5: gpmc: Update gpmc_init()
...
Conflicts:
Documentation/devicetree/bindings/arm/omap/omap.txt
arch/arm/mach-omap2/Makefile
drivers/clocksource/Kconfig
drivers/clocksource/Makefile
Pull arm-soc cleanups, part 2, from Arnd Bergmann:
"These omap cleanups have dependencies on earlier omap branches that in
turn depend on other cleanups, so they could not go into the same
branch."
* tag 'cleanup2' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: OMAP: sdrc: Fix the build break for OMAP4 only builds
ARM: OMAP2+: dmtimer: cleanup fclk usage
ARM: OMAP2+: Fix mismerge for omap_hwmod_get_main_clk() API
ARM: OMAP2+: Remove unnecessary ifdef around __omap2_set_globals
ARM: OMAP2+: am33xx: Change cpu_is_am33xx to soc_is_am33xx
ARM: OMAP2+: am33xx: Make am33xx as a separate class
ARM: OMAP2+: Move omap3 dpll ops to dpll3xxx.c
ARM: OMAP2+: All OMAP2PLUS uses omap-device.o target so add one entry
ARM: OMAP: dmtimer: use devm_ API and do some cleanup in probe()
ARM: OMAP2+: hwmod code: add support to set dmadisable in hwmod framework
ARM: OMAP2+: PRM/CM: Move the stubbed prm and cm functions to prcm.c file and make them __weak
ARM: OMAP2+: hwmod: add omap_hwmod_get_main_clk() API
ARM: OMAP3+: dpll: optimize noncore dpll locking logic
ARM: OMAP3: control: add definition for CONTROL_CAMERA_PHY_CTRL
ARM: OMAP2+: powerdomain code: Fix Wake-up power domain power status
ARM: OMAP4: clockdomain/CM code: Update supported transition modes
ARM: OMAP3/4: omap_hwmod: Add rstst_offs field to struct omap_hwmod_omap4_prcm
ARM: OMAP2+: hwmod: Add new sysc_type3 into omap_hwmod required for am33xx
Pull arm-soc timer updates from Arnd Bergmann:
"This contains two branches dealing with timers, one for the picoxcell
platform that is now using DT with the platform-independent
dw_apb_timer driver. The other change is for the omap-specific
dmtimer driver."
* tag 'timer' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
clocksource: dw_apb_timer: Add common DTS glue for dw_apb_timer
ARM: OMAP2+: Simplify dmtimer clock aliases
ARM: OMAP2+: Move dmtimer clock set function to dmtimer driver
ARM: OMAP1: Fix dmtimer support
ARM: OMAP: Add flag to indicate if a timer needs a manual reset
ARM: OMAP: Remove timer function pointer for context loss counter
ARM: OMAP: Remove loses_context variable from timer platform data
ARM: OMAP2+: Fix external clock support for dmtimers
ARM: OMAP2+: HWMOD: Correct timer device attributes
ARM: OMAP: Add DMTIMER capability variable to represent timer features
ARM: OMAP2+: Add dmtimer platform function to reserve systimers
ARM: OMAP2+: Remove unused max number of timers definition
ARM: OMAP: Remove unnecessary clk structure
Pull arm-soc spi updates from Arnd Bergmann:
"These changes conceptually belong into the spi tree, but we decided to
put them into arm-soc to better deal with interdependencies with other
platform specific patches that are already there."
* tag 'spi' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
spi/s3c64xx: Expand S3C64XX_SPI_{DE,}ACT macros at call sites
spi/s3c64xx: Convert to devm_request_and_ioremap()
spi/s3c64xx: Put the /CS GPIO into output mode
spi/s3c64xx: Fix handling of errors in gpio_request()
Pull arm-soc device tree description updates from Arnd Bergmann:
"This branch contains two kinds of updates: Some platforms in the
process of getting converted to device tree based booting, and the
platform specific patches necessary for that are included here.
Other platforms are already converted, so we just need to update the
actual device tree source files and the binding documents to add
support for new board and new drivers.
In the future we will probably separate those into two branches, and
in the long run, the plan is to move the device tree source files out
of the kernel repository, but that has to wait until we have completed
a much larger portion of the binding documents."
Fix up trivial conflicts in arch/arm/mach-imx/clk-imx6q.c due to newly
added clkdev registers next to a few removed unnecessary ones.
* tag 'dt' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (119 commits)
ARM: LPC32xx: Add PWM to base dts file
ARM: EXYNOS: mark the DMA channel binding for SPI as preliminary
ARM: dts: Add nodes for spi controllers for SAMSUNG EXYNOS5 platforms
ARM: EXYNOS: Enable platform support for SPI controllers for EXYNOS5
ARM: EXYNOS: Add spi clock support for EXYNOS5
ARM: dts: Add nodes for spi controllers for SAMSUNG EXYNOS4 platforms
ARM: EXYNOS: Enable platform support for SPI controllers for EXYNOX4
ARM: EXYNOS: Fix the incorrect hierarchy of spi controller bus clock
ARM: ux500: Remove PMU platform registration when booting with DT
ARM: ux500: Remove temporary snowball_of_platform_devs enablement structure
ARM: ux500: Ensure vendor specific properties have the vendor's identifier
pinctrl: pinctrl-nomadik: Append sleepmode property with vendor specific prefixes
ARM: ux500: Move rtc-pl031 registration to Device Tree when enabled
ARM: ux500: Enable the AB8500 RTC for all DT:ed DB8500 based devices
ARM: ux500: Correctly reference IRQs supplied by the AB8500 from Device Tree
ARM: ux500: Apply ab8500-debug node do the db8500 DT structure
ARM: ux500: Add a ab8500-usb Device Tree node for db8500 based devices
ARM: ux500: Add db8500 Device Tree node for misc/ab8500-pwm
ARM: ux500: Add db8500 Device Tree node for ab8500-sysctrl
ARM: ux500: Enable LED heartbeat functionality on Snowbal via DT
...
Pull samsung arm-soc dma changes from Arnd Bergmann:
"Some platforms are not yet converted to use the dmaengine framework,
including some of the samsung SoCs. In the meantime, we treat this as
platform code and merge the patches through the arm-soc tree."
* tag 'dma' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: SAMSUNG: Fix compiler warning in dma-ops.c file
ASoC: follow the updated samsung DMA common operations
spi/s3c64xx: Add the use of DMA config operation
ARM: SAMSUNG: Add config() function in DMA common operations
Pull arm soc-specific updates from Arnd Bergmann:
"This is stuff that does not fit well into another category and in
particular is not related to a particular board. The largest part in
here is extending the am33xx support in the omap platform."
Fix up trivial conflicts in arch/arm/mach-{imx/mach-mx35_3ds.c, tegra/Makefile}
* tag 'soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (74 commits)
ARM: LPC32xx: Add PWM support
ARM: LPC32xx: Add PWM clock
ARM: LPC32xx: Set system serial based on cpu unique id
ARM: vexpress: Config option for early printk console
ARM: vexpress: Add Device Tree for V2P-CA15_CA7 core tile
ARM: vexpress: Convert V2P-CA15 Device Tree to 64 bit addresses
ARM: vexpress: Add fixed regulator for SMSC
ARM: vexpress: Add missing SP804 interrupt in motherboard's DTS files
ARM: vexpress: Initial common clock support
ARM: SAMSUNG: Introduce Kconfig variable for Samsung custom clk API
ARM: EXYNOS: Add missing static storage class specifier in pmu.c file
ARM: EXYNOS: Make combiner_init function static
ARM: EXYNOS: Update HSOTG PHY clock setting for EXYNOS4X12
ARM: versatile: Make plat-versatile clock optional
ARM: vexpress: Check master site in daughterboard's sysctl operations
ARM: vexpress: remove automatic errata workaround selection
ARM: LPC32xx: Adjust to pl08x DMA interface changes
ARM: EXYNOS: Clear SYS_WDTRESET bit to use watchdog reset
ARM: imx: fix mx51 ehci setup errors
ARM: imx: make ehci power/oc polarities configurable
...
Pull general arm-soc cleanups from Arnd Bergmann:
"These are all boring changes, moving stuff around or renaming things
mostly, and also getting rid of stuff that is duplicate or should not
be there to start with. Platform-wise this is all over the place,
mainly omap, samsung, at91, imx and tegra."
Resolve trivial conflict in arch/arm/mach-omap2/clockdomains3xxx_data.c
* tag 'cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (67 commits)
ARM: clps711x: Remove the setting of the time
ARM: clps711x: Removed superfluous transform virt_to_bus and related functions
ARM: clps711x/p720t: Replace __initcall by .init_early call
ARM: S3C24XX: Remove unused GPIO definitions for Openmoko GTA02 board
ARM: S3C24XX: Remove unused GPIO definitions for port J
ARM: S3C24XX: Remove unused GPA, GPE, GPH bank GPIO aliases
ARM: S3C24XX: Convert the touchscreen setup code to common GPIO API
ARM: S3C24XX: Convert the PM code to gpiolib API
ARM: S3C24XX: Convert QT2410 board file to the gpiolib API
ARM: S3C24XX: Convert SMDK board file to the gpiolib API
ARM: S3C24XX: Free the backlight gpio requested in Mini2440 board code
ARM: imx: remove unused pdata from device macros
ARM: imx: Kconfig: Remove IMX_HAVE_PLATFORM_IMX_SSI from MACH_MX25_3DS
ARM: at91: fix new build errors
ARM: at91: add AIC5 support
ARM: at91: remove mach/irqs.h
ARM: at91: sparse irq support
ARM: at91: at91 based machines specify their own irq handler at run time
ARM: at91: remove static irq priorities for sam9x5
ARM: at91: add of irq priorities support
...
Pull non-critical arm-soc bug fixes from Arnd Bergmann:
"These were submitted as bug fixes before v3.5 but not considered
important enough to be included in it."
* tag 'fixes-non-critical' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (29 commits)
ARM:vt8500: Convert to use .restart and remove arch_reset()
ARM: davinci: da8xx: fix interrupt handling
ARM: OMAP2+: fix CONFIG_CPU_IDLE dependency on CONFIG_PM
ARM: mxs/tx28: fix odd include
ARM: OMAP: remove unused cpu detection macros
ARM: OMAP: fix typos related to OMAP330
ARM: OMAP7XX: Remove omap730.h and omap850.h
ARM: OMAP2+: fix naming collision of variable nr_irqs
ARM: OMAP: omap2plus_defconfig: Enable EXT4 support
ARM: OMAP depends on MMU
arm: omap3: am35x: Set proper powerdomain states
ARM: OMAP AM35x: clockdomain data: Fix clockdomain dependencies
ARM: OMAP AM35x: EMAC/MDIO integration: Add Davinci EMAC/MDIO hwmod support
ARM: OMAP: AM35xx: fix UART4 softreset
ARM: OMAP AM35xx: clock and hwmod data: fix UART4 data
ARM: OMAP AM35xx: clock and hwmod data: fix AM35xx HSOTGUSB hwmod
ARM: OMAP: Fix dts files w/ status property: "disable" -> "disabled"
ARM: OMAP: beagle: Set USB Host Port 1 to OMAP_USBHS_PORT_MODE_UNUSED
ARM: OMAP2: twl-common: Fix compiler warning
ARM: OMAP: fix the ads7846 init code
...
Pull UBI changes from Artem Bityutskiy:
"Change the default amount of eraseblocks which UBI reserves for bad
block handling from 1% to 2%, because 1% does not meet most modern
flash requirements. 1% was good enough in the past for old
high-quality SLCs, but nowadays 2% is much more appropriate.:
Other changes are clean-ups.
* tag 'upstream-3.6-rc1' of git://git.infradead.org/linux-ubi:
UBI: harmonize the update of ubi->beb_rsvd_pebs
UBI: trivial: fix comment of ubi_calculate_reserved function
UBI: fix spelling of detach in debug output
UBI: Change the default percentage of reserved PEB
Pull UBIFS updates from Artem Bityutskiy:
- Added another debugfs knob for forcing UBIFS R/O mode without
flushing caches or finishing commit or any other I/O operation. I've
originally added this knob in order to reproduce the free space fixup
bug (see commit c6727932cf: "UBIFS: fix a bug in empty space
fix-up") on nandsim.
Without this knob I would have to do real power-cuts, which would
make debugging much harder. Then I've decided to keep this knob
because it is also useful for UBIFS power-cut recovery end
error-paths testing.
- Well-spotted fix from Julia. This bug did not cause real troubles
for UBIFS, but nevertheless it could cause issues for someone trying
to modify the orphans handling code. Kudos to coccinelle!
- Minor cleanups.
* tag 'upstream-3.6-rc1' of git://git.infradead.org/linux-ubifs:
UBIFS: remove invalid reference to list iterator variable
UBIFS: simplify reply code a bit
UBIFS: add debugfs knob to switch to R/O mode
UBIFS: fix compilation warning
"smb2" makes me think of the SMB2.x protocol, which isn't at all what
this function is for...
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
There's a comment here about how we don't want to modify this length,
but nothing in this function actually does.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
struct file_lock is pretty large, so we really don't want that on the
stack in a potentially long call chain. Reorganize the arguments to
CIFSSMBPosixLock to eliminate the need for that.
Eliminate the get_flag and simply use a non-NULL pLockInfo to indicate
that this is a "get" operation. In order to do that, need to add a new
loff_t argument for the start_offset.
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Those macros add a newline on their own, so there's not any need to
embed one in the message itself.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Calling key_revoke here isn't ideal as further requests for the key will
end up returning -EKEYREVOKED until it gets purged from the cache. What we
really intend here is to force a new upcall on the next request_key.
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <smfrench@gmail.com>
While testing with my buffer read fio jobs[1], I find that btrfs does not
perform well enough.
Here is a scenario in fio jobs:
We have 4 threads, "t1 t2 t3 t4", starting to buffer read a same file,
and all of them will race on add_to_page_cache_lru(), and if one thread
successfully puts its page into the page cache, it takes the responsibility
to read the page's data.
And what's more, reading a page needs a period of time to finish, in which
other threads can slide in and process rest pages:
t1 t2 t3 t4
add Page1
read Page1 add Page2
| read Page2 add Page3
| | read Page3 add Page4
| | | read Page4
-----|------------|-----------|-----------|--------
v v v v
bio bio bio bio
Now we have four bios, each of which holds only one page since we need to
maintain consecutive pages in bio. Thus, we can end up with far more bios
than we need.
Here we're going to
a) delay the real read-page section and
b) try to put more pages into page cache.
With that said, we can make each bio hold more pages and reduce the number
of bios we need.
Here is some numbers taken from fio results:
w/o patch w patch
------------- -------- ---------------
READ: 745MB/s +25% 934MB/s
[1]:
[global]
group_reporting
thread
numjobs=4
bs=32k
rw=read
ioengine=sync
directory=/mnt/btrfs/
[READ]
filename=foobar
size=2000M
invalidate=1
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
For backref walking, we've introduce delayed ref's sequence. However,
it changes our preallocation behavior.
The story is that when we preallocate an extent and then mark it written
piece by piece, the ideal case should be that we don't need to COW the
extent, which is why we use 'preallocate'.
But we may not make use of preallocation, since when we check for cross refs on
the extent, we may have two ref entries which have the same content except
the sequence value, and we recognize them as cross refs and do COW to allocate
another extent.
So we end up with several pieces of space instead of an whole extent.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
There is a small window where an eb can have no IO bits set on it, which
could potentially result in extent_buffer_under_io() returning false when we
want it to return true, which could result in not fun things happening. So
in order to protect this case we need to hold the refs_lock when we make
this transition to make sure we get reliable results out of
extent_buffer_udner_io(). Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
This sounds sort of impossible but it is the only thing I can think of and
at the very least it is theoretically possible so here it goes.
If we are in try_release_extent_buffer we will check that the ref count on
the extent buffer is 1 and not under IO, and then go down and clear the tree
ref. If between this check and clearing the tree ref somebody else comes in
and grabs a ref on the eb and the marks it dirty before
try_release_extent_buffer() does it's tree ref clear we can end up with a
dirty eb that will be freed while it is still dirty which will result in a
panic. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
I noticed while looking at an extent_buffer race that we will
unconditionally return 1 if we get down to release_extent_buffer after
clearing the tree ref. However we can easily race in here and get a ref on
the eb and not actually free the eb. So make release_extent_buffer return 1
if it free'd the eb and 0 if not so we can be a little kinder to the vm.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Code is added to suppress the I/O stats printing at mount time if all
statistic values are zero.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
People complained about the annoying kernel log message
"btrfs: no dev_stats entry found ... (OK on first mount after mkfs)"
everytime a filesystem is mounted for the first time after running
mkfs. Since the distribution of the btrfs-progs is not synchronized
to the kernel version, mkfs like it is now will be used also in the
future. Then this message is not useful to find errors, it is just
annoying. This commit removes the printk().
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
BTRFS_SETGET_FUNCS macro is used to generate btrfs_set_foo() and
btrfs_foo() functions, which read and write specific fields in the
extent buffer.
The total number of set/get functions is ~200, but in fact we only
need 8 functions: 2 for u8 field, 2 for u16, 2 for u32 and 2 for u64.
It results in redunction of ~37K bytes.
text data bss dec hex filename
629661 12489 216 642366 9cd3e fs/btrfs/btrfs.o.orig
592637 12489 216 605342 93c9e fs/btrfs/btrfs.o
Signed-off-by: Li Zefan <lizefan@huawei.com>
The otime field is not zeroed, so users will see random otime in an old
filesystem with a new kernel which has otime support in the future.
The reserved bytes are also not zeroed, and we'll have compatibility
issue if we make use of those bytes.
Signed-off-by: Li Zefan <lizefan@huawei.com>
Inodes always allocate free space with BTRFS_BLOCK_GROUP_DATA type,
which means every inode has the same BTRFS_I(inode)->free_space pointer.
This shrinks struct btrfs_inode by 4 bytes (or 8 bytes on 64 bits).
Signed-off-by: Li Zefan <lizefan@huawei.com>
When calling btrfs_next_old_leaf, we were leaking an extent buffer in the
rare case of using the deadlock avoidance code needed for the tree mod log.
Signed-off-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
If a block group is ro, do not count its entries in when we dump space info.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Here is the whole story:
1)
A free space cache consists of two parts:
o free space cache inode, which is special becase it's stored in root tree.
o free space info, which is stored as the above inode's file data.
But we only build up another new inode and does not flush its free space info
onto disk when we _clear and setup_ free space cache, and this ends up with
that the block group cache's cache_state remains DC_SETUP instead of DC_WRITTEN.
And holding DC_SETUP means that we will not truncate this free space cache inode,
which means the disk offset of its file extent will remain _unchanged_ at least
until next transaction finishes committing itself.
2)
We can set a block group readonly when we relocate the block group.
However,
if the readonly block group covers the disk offset where our free space cache
inode is going to write, it will force the free space cache inode into
cow_file_range() and it'll end up hitting a BUG_ON.
3)
Due to the above analysis, we fix this bug by adding the missing dirty flag.
4)
However, it's not over, there is still another case, nospace_cache.
With nospace_cache, we do not want to set dirty flag, instead we just truncate
free space cache inode and bail out with setting cache state DC_WRITTEN.
We can benifit from it since it saves us another 'pre-allocation' part which
usually costs a lot.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
During disk balance, we prealloc new file extent for file data relocation,
but we may fail in 'no available space' case, and it leads to flipping btrfs
into readonly.
It is not necessary to bail out and abort transaction since we do have several
ways to rescue ourselves from ENOSPC case.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
Since root can be fetched via BTRFS_I macro directly, we can save an args
for btrfs_is_free_space_inode().
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
For btree inode, its root is also 'tree root', so btree inode can be
misunderstood as a free space inode.
We should add one more check for btree inode.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
From btree_read_extent_buffer_pages(), currently repair_io_failure()
can be called with mirror_num being zero when submit_one_bio() returned
an error before. This used to cause a BUG_ON(!mirror_num) in
repair_io_failure() and indeed this is not a case that needs the I/O
repair code to rewrite disk blocks.
This commit prevents calling repair_io_failure() in this case and thus
avoids the BUG_ON() and malfunction.
Signed-off-by: Stefan Behrens <sbehrens@giantdisaster.de>
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
So shrink_delalloc has grown all sorts of cruft over the years thanks to
many reworkings of how we track enospc. What happens now as we fill up the
disk is we will loop for freaking ever hoping to reclaim a arbitrary amount
of space of metadata, this was from when everybody flushed at the same time.
Now we only have people flushing one at a time. So instead of trying to
reclaim a huge amount of space, just try to flush a decent chunk of space,
and stop looping as soon as we have enough free space to satisfy our
reservation. This makes xfstests 224 go much faster. Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>
$ mkfs.btrfs /dev/sdb7
$ btrfstune -S1 /dev/sdb7
$ mount /dev/sdb7 /mnt/btrfs
mount: block device /dev/sdb7 is write-protected, mounting read-only
$ btrfs dev add /dev/sdb8 /mnt/btrfs/
Now we get a btrfs in which mnt flags has readonly but sb flags does
not. So for those ioctls that only check sb flags with MS_RDONLY, it
is going to be a problem.
Setting subvolume flags is such an ioctl, we should use mnt_want_write_file()
to check RO flags.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
mnt_want_write() and mnt_want_write_file() will check sb->s_flags with
MS_RDONLY, and we don't need to do it ourselves.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
Move check of write access to mount into upper functions so that we can
use mnt_want_write_file instead, which is faster than mnt_want_write.
Signed-off-by: Liu Bo <liubo2009@cn.fujitsu.com>
There is weird logic I had to put in place to make sure that when we were
adding csums that we'd used the delalloc block rsv instead of the global
block rsv. Part of this meant that we had to free up our transaction
reservation before we ran the delayed refs since csum deletion happens
during the delayed ref work. The problem with this is that when we release
a reservation we will add it to the global reserve if it is not full in
order to keep us going along longer before we have to force a transaction
commit. By releasing our reservation before we run delayed refs we don't
get the opportunity to drain down the global reserve for the work we did, so
we won't refill it as often. This isn't a problem per-se, it just results
in us possibly committing transactions more and more often, and in rare
cases could cause those WARN_ON()'s to pop in use_block_rsv because we ran
out of space in our block rsv.
This also helps us by holding onto space while the delayed refs run so we
don't end up with as many people trying to do things at the same time, which
again will help us not force commits or hit the use_block_rsv warnings.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fusionio.com>