The order of freeing the IRQ and freeing the device in firmware
in ibmveth_close can cause the adapter to become unusable after a
subsequent ibmveth_open. Only a reboot of the OS will make the
network device usable again. This is seen when cycling the adapter
up and down while there is network activity.
There is a window where an IRQ will be left unserviced (H_EOI will not
be called). The solution is to make a VIO_IRQ_DISABLE h_call, free the
device with firmware, and then call free_irq.
Signed-off-by: Robert Jennings <rcj@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If we fail to assign resources to a PCI BAR, this patch makes us try the
original address from BIOS rather than leaving it disabled.
Linux tries to make sure all PCI device BARs are inside the upstream
PCI host bridge or P2P bridge apertures, reassigning BARs if necessary.
Windows does similar reassignment.
Before this patch, if we could not move a BAR into an aperture, we left
the resource unassigned, i.e., at address zero. Windows leaves such BARs
at the original BIOS addresses, and this patch makes Linux do the same.
This is a bit ugly because we disable the resource long before we try to
reassign it, so we have to keep track of the BIOS BAR address somewhere.
For lack of a better place, I put it in the struct pci_dev.
I think it would be cleaner to attempt the assignment immediately when the
claim fails, so we could easily remember the original address. But we
currently claim motherboard resources in the middle, after attempting to
claim PCI resources and before assigning new PCI resources, and changing
that is a fairly big job.
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=16263
Reported-by: Andrew <nitr0@seti.kr.ua>
Tested-by: Andrew <nitr0@seti.kr.ua>
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
tracing: Add alignment to syscall metadata declarations
perf: Sync callchains with period based hits
perf: Resurrect flat callchains
perf: Version String fix, for fallback if not from git
perf: Version String fix, using kernel version
The rt2x00dev->intf_work workqueue is never initialized when a driver is
probed for a non-existent device (in this case rt2500usb). On such a
path we call rt2x00lib_remove_dev() to free any resources initialized
during the probe before we use INIT_WORK to initialize the workqueue.
This causes lockdep to get confused since the lock used in the workqueue
hasn't been initialized yet but is now being acquired during
cancel_work_sync() called by rt2x00lib_remove_dev().
Fix this by initializing the workqueue first before we attempt to probe
the device. This should make lockdep happy and avoid breaking any
assumptions about how the library cleans up after a probe fails.
phy0 -> rt2x00lib_probe_dev: Error - Failed to allocate device.
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
Pid: 2027, comm: modprobe Not tainted 2.6.35-rc5+ #60
Call Trace:
[<ffffffff8105fe59>] register_lock_class+0x152/0x31f
[<ffffffff81344a00>] ? usb_control_msg+0xd5/0x111
[<ffffffff81061bde>] __lock_acquire+0xce/0xcf4
[<ffffffff8105f6fd>] ? trace_hardirqs_off+0xd/0xf
[<ffffffff81492aef>] ? _raw_spin_unlock_irqrestore+0x33/0x41
[<ffffffff810628d5>] lock_acquire+0xd1/0xf7
[<ffffffff8104f037>] ? __cancel_work_timer+0x99/0x17e
[<ffffffff8104f06e>] __cancel_work_timer+0xd0/0x17e
[<ffffffff8104f037>] ? __cancel_work_timer+0x99/0x17e
[<ffffffff8104f136>] cancel_work_sync+0xb/0xd
[<ffffffffa0096675>] rt2x00lib_remove_dev+0x25/0xb0 [rt2x00lib]
[<ffffffffa0096bf7>] rt2x00lib_probe_dev+0x380/0x3ed [rt2x00lib]
[<ffffffff811d78a7>] ? __raw_spin_lock_init+0x31/0x52
[<ffffffffa00bbd2c>] ? T.676+0xe/0x10 [rt2x00usb]
[<ffffffffa00bbe4f>] rt2x00usb_probe+0x121/0x15e [rt2x00usb]
[<ffffffff813468bd>] usb_probe_interface+0x151/0x19e
[<ffffffff812ea08e>] driver_probe_device+0xa7/0x136
[<ffffffff812ea167>] __driver_attach+0x4a/0x66
[<ffffffff812ea11d>] ? __driver_attach+0x0/0x66
[<ffffffff812e96ca>] bus_for_each_dev+0x54/0x89
[<ffffffff812e9efd>] driver_attach+0x19/0x1b
[<ffffffff812e9b64>] bus_add_driver+0xb4/0x204
[<ffffffff812ea41b>] driver_register+0x98/0x109
[<ffffffff813465dd>] usb_register_driver+0xb2/0x173
[<ffffffffa00ca000>] ? rt2500usb_init+0x0/0x20 [rt2500usb]
[<ffffffffa00ca01e>] rt2500usb_init+0x1e/0x20 [rt2500usb]
[<ffffffff81000203>] do_one_initcall+0x6d/0x17a
[<ffffffff8106cae8>] sys_init_module+0x9c/0x1e0
[<ffffffff8100296b>] system_call_fastpath+0x16/0x1b
Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Strip the cap and dentry releases from replayed messages. They can
cause the shared state to get out of sync because they were generated
(with the request message) earlier, and no longer reflect the current
client state.
Signed-off-by: Sage Weil <sage@newdream.net>
Replayed rename operations (after an mds failure/recovery) were broken
because the request paths were regenerated from the dentry names, which
get mangled when d_move() is called.
Instead, resend the previous request message when replaying completed
operations. Just make sure the REPLAY flag is set and the target ino is
filled in.
This fixes problems with workloads doing renames when the MDS restarts,
where the rename operation appears to succeed, but on mds restart then
fails (leading to client confusion, app breakage, etc.).
Signed-off-by: Sage Weil <sage@newdream.net>
When I ran "perf kvm ... top", I encountered the following error output.
Error: perfcounter syscall returned with -1 (Too many open files)
Fatal: No CONFIG_PERF_EVENTS=y kernel support configured?
Looking into perf, I found perf opens too many directories at
initialization time, but forgets to close them. Here is the fix.
LKML-Reference: <4C230362.5080704@cn.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes:
GFS2: rename causes kernel Oops
GFS2: BUG in gfs2_adjust_quota
GFS2: Fix kernel NULL pointer dereference by dlm_astd
GFS2: recovery stuck on transaction lock
GFS2: O_TRUNC not working on stuffed files across cluster
Add a common early allocator function, in preparation for switching
over to LMB. When we do, this function will need to do a little more
than just allocating memory; we need it zero initialized too.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
The logic in this file is rather convoluted, but essentially:
1. region type 0 is SDRAM
2. referring to the code fragment
if (set_fbmem_region_type(&rg, OMAPFB_MEMTYPE_SDRAM,
sdram_start, sdram_size) < 0 ||
(rg.type != OMAPFB_MEMTYPE_SDRAM))
continue;
- if rg.type is not OMAPFB_MEMTYPE_SDRAM, set_fbmem_region_type()
returns zero immediately (since rg.type is non-zero), and so we
'continue'.
- if rg.type is OMAPFB_MEMTYPE_SDRAM, and rg.paddr is zero,
we fall through.
- if rg.type is OMAPFB_MEMTYPE_SDRAM, and the region lies within
SDRAM, we fall through.
- if rg.type is OMAPFB_MEMTYPE_SDRAM, and the region is not within
SDRAM, we 'continue'.
3. check_fbmem_region seems unnecessary.
- we know rg.type is OMAPFB_MEMTYPE_SDRAM
- we can check rg.size independently
- bootmem_reserve() can check for overlapping reservations itself
- we've already validated that the requested region lies within SDRAM.
4. avoid BUG()ing if the region entry is already set; print an error,
and mark the configuration invalid - at least we'll continue booting
so the error message has a chance of being logged/visible via serial
console.
With these changes in place, it makes the code much easier to understand
and hence easier to convert to LMB.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Move the platform specific bootmem memory reservations out of
arch/arm/mm/mmu.c into their respective platform files.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Everything should now be using sparsemem rather than discontigmem, so
remove the code supporting discontigmem from ARM.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Rather than storing the minimum size of the vmalloc area, store the
maximum permitted address of the vmalloc area instead.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Guest can trigger packet truncation by posting
a very short buffer and disabling buffer merging.
Convert pr_err to pr_debug to avoid log from filling
up when this happens.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
No need to take address, w90p910_ts is already a pointer.
Signed-off-by: Wan ZongShun <mcuos.com@gmail.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
This was detected using two mcast router tables. The
pimreg for the second interface did not have a specific
mrule, so packets received by it were handled by the
default table, which had nothing configured.
This caused the ipmr_fib_lookup to fail, causing
the memory leak.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hists that have been filtered, because they don't have callchains
matching the parent filter, won't be printed. As such,
hist_entry__snprintf() returns 0 for them, but we don't control
this value and we always print the buffer, which might be
untouched and then only made of random stack garbage.
Not only does it paint the screen with barf, it also prints
the callchains for these hists, even though they have been filtered,
since the hist has been filtered as well.
We need to check the return value of hist_entry__snprintf() and
ignore the hist if it is 0, which means it didn't get any callchain
matching the parent filter. This fixes the barf and the undesired
callchains.
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
The asics in question have the following requirements with regard to
their gart setups:
1. The GART aperture size has to be in the form of 2^X bytes, where X is from 25 to 31
2. The GART aperture MC base has to be aligned to a boundary equal to the size of the
aperture.
3. The GART page table has to be aligned to the boundary equal to the size of the table.
4. The GART page table size is: table_entry_size * (aperture_size / page_size)
5. The GART page table has to be allocated in non-paged, non-cached, contiguous system
memory.
This patch takes care 2. The rest should already be handled properly.
This fixes a regression noticed by: Torsten Kaiser <just.for.lkml@googlemail.com>
Tested-by: Torsten Kaiser <just.for.lkml@googlemail.com>
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
OCFS2 uses t_commit trigger to compute and store checksum of the just
committed blocks. When a buffer has b_frozen_data, checksum is computed
for it instead of b_data but this can result in an old checksum being
written to the filesystem in the following scenario:
1) transaction1 is opened
2) handle1 is opened
3) journal_access(handle1, bh)
- This sets jh->b_transaction to transaction1
4) modify(bh)
5) journal_dirty(handle1, bh)
6) handle1 is closed
7) start committing transaction1, opening transaction2
8) handle2 is opened
9) journal_access(handle2, bh)
- This copies off b_frozen_data to make it safe for transaction1 to commit.
jh->b_next_transaction is set to transaction2.
10) jbd2_journal_write_metadata() checksums b_frozen_data
11) the journal correctly writes b_frozen_data to the disk journal
12) handle2 is closed
- There was no dirty call for the bh on handle2, so it is never queued for
any more journal operation
13) Checkpointing finally happens, and it just spools the bh via normal buffer
writeback. This will write b_data, which was never triggered on and thus
contains a wrong (old) checksum.
This patch fixes the problem by calling the trigger at the moment data is
frozen for journal commit - i.e., either when b_frozen_data is created by
do_get_write_access or just before we write a buffer to the log if
b_frozen_data does not exist. We also rename the trigger to t_frozen as
that better describes when it is called.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
For migration, we are waiting for DLM_LOCK_RES_MIGRATING flag to be set
before sending DLM_MIG_LOCKRES_MSG message to the target. We are using
dlm_migration_can_proceed() for that purpose. However, if the node is
down, dlm_migration_can_proceed() will also return "go ahead". In this
rare case, the DLM_LOCK_RES_MIGRATING flag might not be set yet. Remove
the BUG_ON() that trips over this condition.
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
During CoW, the pages after i_size don't contain valid data, so there's
no need to read and duplicate them.
Signed-off-by: Tao Ma <tao.ma@oracle.com>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
commit 30a564be (x86, hpet: Restrict read back to affected ATI
chipset) restricted the workaround for the HPET bug to SMX00
chipsets. This was reasonable as those were the only ones against
which we ever got a bug report.
Stephan Wolf reported now that this patch breaks his IXP400 based
machine. Though it's confirmed to work on other IXP400 based systems.
To error out on the safe side, we force the HPET readback workaround
for all ATI SMbus class chipsets.
Reported-by: Stephan Wolf <stephan@letzte-bankreihe.de>
LKML-Reference: <alpine.LFD.2.00.1007142134140.3321@localhost.localdomain>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Stephan Wolf <stephan@letzte-bankreihe.de>
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
We flush under vq mutex when changing backends.
This creates a deadlock as workqueue being flushed
needs this lock as well.
https://bugzilla.redhat.com/show_bug.cgi?id=612421
Drop the vq mutex before flush: we have the device mutex
which is sufficient to prevent another ioctl from touching
the vq.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This adds platform data for the PL022 to the ARM Versatile reference
design, and adds the necessary clock definition.
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This adds platform data for the PL022 to the ARM RealView reference
designs, adds the necessary clock definition and fixes a badly
defined IRQ line on the PB1176.
Signed-off-by: Linus Walleij <linus.walleij@stericsson.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch fixes a kernel Oops in the GFS2 rename code.
The problem was in the way the gfs2 directory code was trying
to re-use sentinel directory entries.
In the failing case, gfs2's rename function was renaming a
file to another name that had the same non-trivial length.
The file being renamed happened to be the first directory
entry on the leaf block.
First, the rename code (gfs2_rename in ops_inode.c) found the
original directory entry and decided it could do its job by
simply replacing the directory entry with another. Therefore
it determined correctly that no block allocations were needed.
Next, the rename code deleted the old directory entry prior to
replacing it with the new name. Therefore, the soon-to-be
replaced directory entry was temporarily made into a directory
entry "sentinel" or a place holder at the start of a leaf block.
Lastly, it went to re-add the replacement directory entry in
that leaf block. However, when gfs2_dirent_find_space was
looking for space in the leaf block, it used the wrong value
for the sentinel. That threw off its calculations so later
it decides it can't really re-use the sentinel and therefore
must allocate a new leaf block. But because it previously decided
to re-use the directory entry, it didn't waste the time to
grab a new block allocation for the inode. Therefore, the
inode's i_alloc pointer was still NULL and it crashes trying to
reference it.
In the case of sentinel directory entries, the entire dirent is
reused, not just the "free space" portion of it, and therefore
the function gfs2_dirent_find_space should use the value 0
rather than GFS2_DIRENT_SIZE(0) for the actual dirent size.
Fixing this calculation enables the reproducer programs to work
properly.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
HighMem pages on i686 do not get mapped to the buffer_heads and this was
causing a NULL pointer dereference when we were trying to memset page buffers
to zero.
We now use zero_user() that kmaps the page and directly manipulates page data.
This patch also fixes a boundary condition that was incorrect.
Signed-off-by: Abhi Das <adas@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
This patch fixes a problem in an error path when looking
up dinodes. There are two sister-functions, gfs2_inode_lookup
and gfs2_process_unlinked_inode. Both functions acquire and
hold the i_iopen glock for the dinode being looked up. The last
thing they try to do is hold the i_gl glock for the dinode.
If that glock fails for some reason, the error path was
incorrectly calling gfs2_glock_put for the i_iopen glock twice.
This resulted in the glock being prematurely freed. The
"minimum hold time" usually kept the glock in memory, but the
lock interface to dlm (aka lock_dlm) freed its memory for the
glock. In some circumstances, it would cause dlm's dlm_astd daemon
to try to call the bast function for the freed lock_dlm memory,
which resulted in a NULL pointer dereference.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
This patch fixes bugzilla bug #590878: GFS2: recovery stuck on
transaction lock. We set the frozen flag on the glock when we receive
a completion that cannot be delivered due to blocked locks. At that
point we check to see whether the first waiting holder has the noexp
flag set. If the noexp lock is queued later, then we need to unfreeze
the glock at that point in time, namely, in the glock work function.
This patch was originally written by Steve Whitehouse, but since
he's on holiday, I'm submitting it. It's been well tested with a
complex recovery test called revolver.
Signed-off-by: Steve Whitehouse <swhiteho@redhat.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
This patch replaces a statement that got dropped out by accident.
Without the patch, truncates on stuffed (very small) files cause
those files to have an unpredictable size.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Fix problem in reading the tx_queue recorded in a socket. In
dev_pick_tx, the TX queue is read by doing a check with
sk_tx_queue_recorded on the socket, followed by a sk_tx_queue_get.
The problem is that there is not mutual exclusion across these
calls in the socket so it it is possible that the queue in the
sock can be invalidated after sk_tx_queue_recorded is called so
that sk_tx_queue get returns -1, which sets 65535 in queue_index
and thus dev_pick_tx returns 65536 which is a bogus queue and
can cause crash in dev_queue_xmit.
We fix this by only calling sk_tx_queue_get which does the proper
checks. The interface is that sk_tx_queue_get returns the TX queue
if the sock argument is non-NULL and TX queue is recorded, else it
returns -1. sk_tx_queue_recorded is no longer used so it can be
completely removed.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When configuring DMVPN (GRE + openNHRP) and a GRE remote
address is configured a kernel Oops is observed. The
obserseved Oops is caused by a NULL header_ops pointer
(neigh->dev->header_ops) in neigh_update_hhs() when
void (*update)(struct hh_cache*, const struct net_device*, const unsigned char *)
= neigh->dev->header_ops->cache_update;
is executed. The dev associated with the NULL header_ops is
the GRE interface. This patch guards against the
possibility that header_ops is NULL.
This Oops was first observed in kernel version 2.6.26.8.
Signed-off-by: Doug Kehn <rdkehn@yahoo.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/fsl-booke: Fix address issue when using relocatable kernels
powerpc/cpm1: Mark micropatch code/data static and __init
powerpc/cpm1: Fix build with various CONFIG_*_UCODE_PATCH combinations
powerpc/cpm: Reintroduce global spi_pram struct (fixes build issue)
commit fc6055a5ba (net: Introduce skb_orphan_try()) added early
orphaning of skbs.
This unfortunately added a performance regression in skb_tx_hash() in
case of stacked devices (bonding, vlans, ...)
Since skb->sk is now NULL, we cannot access sk->sk_hash anymore to
spread tx packets to multiple NIC queues on multiqueue devices.
skb_tx_hash() in this case only uses skb->protocol, same value for all
flows.
skb_orphan_try() can copy sk->sk_hash into skb->rxhash and skb_tx_hash()
can use this saved sk_hash value to compute its internal hash value.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
rfs: call sock_rps_record_flow() in tcp_splice_read()
call sock_rps_record_flow() in tcp_splice_read(), so the applications using
splice(2) or sendfile(2) can utilize RFS.
Signed-off-by: Changli Gao <xiaosuo@gmail.com>
----
net/ipv4/tcp.c | 1 +
1 file changed, 1 insertion(+)
Signed-off-by: David S. Miller <davem@davemloft.net>
xfrm_resolve_and_create_bundle() assumed that, if policies indicated
presence of xfrms, bundle template resolution would always return
some xfrms. This is not true for 'use' level policies which can
result in no xfrm's being applied if there is no suitable xfrm states.
This fixes a crash by this incorrect assumption.
Reported-by: George Spelvin <linux@horizon.com>
Bisected-by: George Spelvin <linux@horizon.com>
Tested-by: George Spelvin <linux@horizon.com>
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
"hostap: Protect against initialization interrupt" (which reinstated
"wireless: hostap, fix oops due to early probing interrupt")
reintroduced Bug 16111. This is because hostap_pci wasn't setting
dev->base_addr, which is now checked in prism2_interrupt. As a result,
initialization was failing for PCI-based hostap devices. This corrects
that oversight.
Signed-off-by: John W. Linville <linville@tuxdriver.com>