This patch introduces IFLA_NUM_TX_QUEUES and IFLA_NUM_RX_QUEUES by
which userspace can set number of rx and/or tx queues to be allocated
for newly created netdevice.
This overrides ops->get_num_[tr]x_queues()
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Also cut out unused function parameters and possible err in return
value.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
In netif_copy_real_num_queues() the return value of
netif_set_real_num_tx_queues() should be checked.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Modern TCP stack highly depends on tcp_write_timer() having a small
latency, but current implementation doesn't exactly meet the
expectations.
When a timer fires but finds the socket is owned by the user, it rearms
itself for an additional delay hoping next run will be more
successful.
tcp_write_timer() for example uses a 50ms delay for next try, and it
defeats many attempts to get predictable TCP behavior in term of
latencies.
Use the recently introduced tcp_release_cb(), so that the user owning
the socket will call various handlers right before socket release.
This will permit us to post a followup patch to address the
tcp_tso_should_defer() syndrome (some deferred packets have to wait
RTO timer to be transmitted, while cwnd should allow us to send them
sooner)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Nandita Dukkipati <nanditad@google.com>
Cc: H.K. Jerry Chu <hkchu@google.com>
Cc: John Heffner <johnwheffner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a missing roundup_pow_of_two(), since tcpmhash_entries is not
guaranteed to be a power of two.
Uses hash_32() instead of custom hash.
tcpmhash_entries should be an unsigned int.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marc Kleine-Budde says:
====================
the fifth pull request for upcoming v3.6 net-next cleans up and
improves the janz-ican3 driver (6 patches by Ira W. Snyder, one by me).
A patch by Steffen Trumtrar adds imx53 support to the flexcan driver.
And another patch by me, which marks the bit timing constant in the CAN
drivers as "const".
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The Janz VMOD-ICAN3 hardware has support for one shot packet
transmission. This means that a packet will be attempted to be sent
once, with no automatic retries.
The SocketCAN core has a controller-wide setting for this mode:
CAN_CTRLMODE_ONE_SHOT. The Janz VMOD-ICAN3 hardware supports this flag
on a per-packet level, but the SocketCAN core does not.
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
If the bus error quota is set to infinite and the host CPU cannot keep
up, the Janz VMOD-ICAN3 firmware will stop responding to control
messages until the controller is reset.
The firmware will automatically stop sending bus error messages when the
quota is reached, and will only resume sending bus error messages when
the quota is re-set to a positive value.
This limitation is worked around by setting the bus error quota to one
message, and then re-setting the quota to one message every time a bus
error message is received. By doing this, the firmware never stops
responding to control messages. The CAN bus can be reset without a
hard-reset of the controller card.
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The Janz VMOD-ICAN3 firmware does not support any sort of TX-done
notification or interrupt. The driver previously used the hardware
loopback to attempt to work around this deficiency, but this caused all
sockets to receive all messages, even if CAN_RAW_RECV_OWN_MSGS is off.
Using the new function ican3_cmp_echo_skb(), we can drop the loopback
messages and return the original skbs. This fixes the issues with
CAN_RAW_RECV_OWN_MSGS.
A private skb queue is used to store the echo skbs. This avoids the need
for any index management.
Due to a lack of TX-error interrupts, bus errors are permanently
enabled, and are used as a TX-error notification. This is used to drop
an echo skb when transmission fails. Bus error packets are not generated
if the user has not enabled bus error reporting.
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The error and byte counter statistics were being incremented
incorrectly. For example, a TX error would be counted both in tx_errors
and rx_errors.
This corrects the problem so that tx_errors and rx_errors are only
incremented for errors caused by packets sent to the bus. Error packets
generated by the driver are not counted.
The byte counters are only increased for packets which are actually
transmitted or received from the bus. Error packets generated by the
driver are not counted.
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch cleans up the ICAN3 to Linux CAN frame and vice versa
conversion functions:
- RX: Use get_can_dlc() to limit the dlc value.
- RX+TX: Don't copy the whole frame, only copy the amount of bytes
specified in cf->can_dlc.
Acked-by: Ira W. Snyder <iws@ovro.caltech.edu>
Tested-by: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
We can't guarantee that REQ_DISCARD on dm-mirror zeroes the data even if
the underlying disks support zero on discard. So this patch sets
ti->discard_zeroes_data_unsupported.
For example, if the mirror is in the process of resynchronizing, it may
happen that kcopyd reads a piece of data, then discard is sent on the
same area and then kcopyd writes the piece of data to another leg.
Consequently, the data is not zeroed.
The flag was made available by commit 983c7db347
(dm crypt: always disable discard_zeroes_data).
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
When process_discard receives a partial discard that doesn't cover a
full block, it sends this discard down to that block. Unfortunately, the
block can be shared and the discard would corrupt the other snapshots
sharing this block.
This patch detects block sharing and ends the discard with success when
sending it to the shared block.
The above change means that if the device supports discard it can't be
guaranteed that a discard request zeroes data. Therefore, we set
ti->discard_zeroes_data_unsupported.
Thin target discard support with this bug arrived in commit
104655fd4d (dm thin: support discards).
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
This patch fixes a crash when a discard request is sent during mirror
recovery.
Firstly, some background. Generally, the following sequence happens during
mirror synchronization:
- function do_recovery is called
- do_recovery calls dm_rh_recovery_prepare
- dm_rh_recovery_prepare uses a semaphore to limit the number
simultaneously recovered regions (by default the semaphore value is 1,
so only one region at a time is recovered)
- dm_rh_recovery_prepare calls __rh_recovery_prepare,
__rh_recovery_prepare asks the log driver for the next region to
recover. Then, it sets the region state to DM_RH_RECOVERING. If there
are no pending I/Os on this region, the region is added to
quiesced_regions list. If there are pending I/Os, the region is not
added to any list. It is added to the quiesced_regions list later (by
dm_rh_dec function) when all I/Os finish.
- when the region is on quiesced_regions list, there are no I/Os in
flight on this region. The region is popped from the list in
dm_rh_recovery_start function. Then, a kcopyd job is started in the
recover function.
- when the kcopyd job finishes, recovery_complete is called. It calls
dm_rh_recovery_end. dm_rh_recovery_end adds the region to
recovered_regions or failed_recovered_regions list (depending on
whether the copy operation was successful or not).
The above mechanism assumes that if the region is in DM_RH_RECOVERING
state, no new I/Os are started on this region. When I/O is started,
dm_rh_inc_pending is called, which increases reg->pending count. When
I/O is finished, dm_rh_dec is called. It decreases reg->pending count.
If the count is zero and the region was in DM_RH_RECOVERING state,
dm_rh_dec adds it to the quiesced_regions list.
Consequently, if we call dm_rh_inc_pending/dm_rh_dec while the region is
in DM_RH_RECOVERING state, it could be added to quiesced_regions list
multiple times or it could be added to this list when kcopyd is copying
data (it is assumed that the region is not on any list while kcopyd does
its jobs). This results in memory corruption and crash.
There already exist bypasses for REQ_FLUSH requests: REQ_FLUSH requests
do not belong to any region, so they are always added to the sync list
in do_writes. dm_rh_inc_pending does not increase count for REQ_FLUSH
requests. In mirror_end_io, dm_rh_dec is never called for REQ_FLUSH
requests. These bypasses avoid the crash possibility described above.
These bypasses were improperly implemented for REQ_DISCARD when
the mirror target gained discard support in commit
5fc2ffeabb (dm raid1: support discard).
In do_writes, REQ_DISCARD requests is always added to the sync queue and
immediately dispatched (even if the region is in DM_RH_RECOVERING). However,
dm_rh_inc and dm_rh_dec is called for REQ_DISCARD resusts. So it violates the
rule that no I/Os are started on DM_RH_RECOVERING regions, and causes the list
corruption described above.
This patch changes it so that REQ_DISCARD requests follow the same path
as REQ_FLUSH. This avoids the crash.
Reference: https://bugzilla.redhat.com/837607
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@kernel.org
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
hid-picolcd and hid-wiimote do not allow any of hidinput, hiddev or hidraw
to claim the device but still want to remain on the bus. Hence, if a
driver uses the raw_event callback but no other listener claimed the
device, we still leave it on the bus as the driver handles everything by
itself. It thus becomes its own listener.
Under some circumstances (eg., hidinput_connect() fails and raw_event set)
a device may be left on the bus even though it requires external
listeners. But then if hidinput_connect() fails there are bigger issues
than a device that is left unhandled. So we can safely use this heuristic
to avoid adding another flag for special devices like hid-picolcd and
hid-wiimote.
This also removes the ugly hack from hid-picolcd as this is no longer
required.
Signed-off-by: David Herrmann <dh.herrmann@googlemail.com>
Acked-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
LPC32xx SoC has a 128 bits unique id that can be used as a system
serial number, if none has been provided by atags or dt.
Signed-off-by: Alexandre Pereira da Silva <aletes.xgr@gmail.com>
Signed-off-by: Roland Stigge <stigge@antcom.de>
The commit which added the janz-ican3 driver and commit
3ccd4c61 "can: Unify droping of invalid tx skbs and netdev stats" were
committed into mainline Linux during the same merge window.
Therefore, the addition of this code to the janz-ican3 driver was
forgotten. This patch adds the expected code.
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The code which used this variable was removed during review, before the
driver was added to mainline Linux. It is now dead code, and can be
removed.
Signed-off-by: Ira W. Snyder <iws@ovro.caltech.edu>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch adds support for a second clock to the flexcan driver. On
modern freescale ARM cores like the imx53 and imx6q two clocks ("ipg"
and "per") must be enabled in order to access the CAN core.
In the original driver, the clock was requested without specifying the
connection id, further all mainline ARM archs with flexcan support
(imx28, imx25, imx35) register their flexcan clock without a
connection id, too.
This patch first renames the existing clk variable to clk_ipg and
converts it to devm for easier error handling. The connection id "ipg"
is added to the devm_clk_get() call. Then a second clock "per" is
requested. As all archs don't specify a connection id, both clk_get
return the same clock. This ensures compatibility to existing flexcan
support and adds support for imx53 at the same time.
After this patch hits mainline, the archs may give their existing
flexcan clock the "ipg" connection id and implement a dummy "per"
clock.
This patch has been tested on imx28 (unmodified clk tree) and on imx53
with a seperate "ipg" and "per" clock.
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Cc: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Acked-by: Hui Wang <jason77.wang@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
This patch marks the bittiming_const pointer as in the struct can_pric as
"const". This allows us to mark the struct can_bittiming_const in the CAN
drivers as "const", too.
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
TI LP8788 PMU has 4 BUCKS and 22 LDOs.
The voltage of BUCK1 and BUCK2 can be controlled by external gpios.
And some LDOs also can be enabled by external gpios.
The regmap interface is used for regulator operations.
Signed-off-by: Milo(Woogyom) Kim <milo.kim@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
The policy might have been changed since last call of target().
Thus, using cpufreq_frequency_table_target(), which depends on
policy to find the corresponding index from a frequency, may return
inconsistent index for freqs.old. Thus, old_index should be
calculated not based on the current policy.
We have been observing such issue when scaling_min/max_freq were
updated and sometimes cuased system lockups deu to incorrectly
configured voltages.
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
This patch (as1597) fixes some of the error paths in usbhid's suspend
routine. The driver was not careful to restart everything that might
have been stopped, in cases where a suspend failed.
For example, once the HID_SUSPENDED flag is set, an output report
submission would not restart the corresponding URB queue. If a
suspend fails, it's therefore necessary to check whether the queues
need to be restarted.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This patch (as1596) improves the queue-restart logic in usbhid by
checking to see if the device is suspended or a reset is about to
occur. There's no point submitting an URB if either of those is
true.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This patch (as1595) improves the usbhid driver by using the
HID_SUSPENDED bitflag to indicate that the device is suspended rather
than using HID_REPORTED_IDLE, which the patch removes.
Since HID_SUSPENDED was not being used for anything, and since the
name "HID_REPORTED_IDLE" doesn't convey much meaning, the end result
is easier to read and understand.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This patch (as1594) simplifies the usbhid driver by inlining a couple
of routines. As a result of an earlier patch, irq_out_pump_restart()
and ctrl_pump_restart() are each used in only one place. Since they
don't really do what their names say, and since they each involve only
about two lines of actual code, there's no reason to keep them as
separate functions.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This patch (as1593) fixes some logic errors in the usbhid driver
relating to runtime PM. The driver does not balance its calls to
usb_autopm_get_interface_async() and usb_autopm_put_interface_async().
For example, when the control queue is restarted the driver does a
_get. But the resume won't happen immediately, so the driver leaves
the queue stopped. When the resume does occur, the queue is restarted
and a second _get occurs, with no balancing _put.
The patch fixes the problem by rearranging the logic for restarting
the queues. All the _get/_put calls and bitflag settings in
__usbhid_submit_report() are moved into the queue-restart routines. A
balancing _put call is added for the case where the queue is still
suspended. A call to irq_out_pump_restart(), which doesn't take all
the right actions for restarting the irq-OUT queue, is replaced by a
call to usbhid_restart_out_queue(), which does. Similarly for
ctrl_pump_restart().
Finally, new code is added to prevent an autosuspend from happening
every time an URB is cancelled, and the comments explaining what
happens when an URB needs to be cancelled are expanded and clarified.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
This patch (as1592) fixes an obscure problem in the usbhid driver.
Under some circumstances, a control or interrupt-OUT URB can be
submitted twice. This will happen if the first submission fails; the
queue pointers aren't updated, so the next time the queue is restarted
the same URB will be submitted again.
The problem is that raw_report gets deallocated during the first
submission. The second submission will then dereference and try to
free an already-freed region of memory. The patch fixes the problem
by setting raw_report to NULL when it is deallocated and checking for
NULL before dereferencing it.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
CC: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
The current virtual timer interface is inherently per-cpu and hard to
use. The sole user of the interface is appldata which uses it to execute
a function after a specific amount of cputime has been used over all cpus.
Rework the virtual timer interface to hook into the cputime accounting.
This makes the interface independent from the CPU timer interrupts, and
makes the virtual timers global as opposed to per-cpu.
Overall the code is greatly simplified. The downside is that the accuracy
is not as good as the original implementation, but it is still good enough
for appldata.
Reviewed-by: Jan Glauber <jang@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Remove the file name from the comment at top of many files. In most
cases the file name was wrong anyway, so it's rather pointless.
Also unify the IBM copyright statement. We did have a lot of sightly
different statements and wanted to change them one after another
whenever a file gets touched. However that never happened. Instead
people start to take the old/"wrong" statements to use as a template
for new files.
So unify all of them in one go.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
If lgr has not been initialized, the lgr_info_log() function currently
crashes because 'lgr_page' is not allocated. To fix this 'lgr_page'
is allocated statically now.
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
It is very common for the end of the file to be unaligned on
stripe size. But since we know it's beyond file's end then
the XOR should be preformed with all zeros.
Old code used to just read zeros out of the OSD devices, which is a great
waist. But what scares me more about this situation is that, we now have
pages attached to the file's mapping that are beyond i_size. I don't
like the kind of bugs this calls for.
Fix both birds, by returning a global zero_page, if offset is beyond
i_size.
TODO:
Change the API to ->__r4w_get_page() so a NULL can be
returned without being considered as error, since XOR API
treats NULL entries as zero_pages.
[Bug since 3.2. Should apply the same way to all Kernels since]
CC: Stable Tree <stable@kernel.org>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
The read-4-write pages are locked in address ascending order.
But where unlocked in a way easiest for coding. Fix that,
locks should be released in opposite order of locking, .i.e
descending address order.
I have not hit this dead-lock. It was found by inspecting the
dbug print-outs. I suspect there is an higher lock at caller that
protects us, but fix it regardless.
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Do to OOM situations the ore might fail to allocate all resources
needed for IO of the full request. If some progress was possible
it would proceed with a partial/short request, for the sake of
forward progress.
Since this crashes NFS-core and exofs is just fine without it just
remove this contraption, and fail.
TODO:
Support real forward progress with some reserved allocations
of resources, such as mem pools and/or bio_sets
[Bug since 3.2 Kernel]
CC: Stable Tree <stable@kernel.org>
CC: Benny Halevy <bhalevy@tonian.com>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
In RAID_5/6 We used to not permit an IO that it's end
byte is not stripe_size aligned and spans more than one stripe.
.i.e the caller must check if after submission the actual
transferred bytes is shorter, and would need to resubmit
a new IO with the remainder.
Exofs supports this, and NFS was supposed to support this
as well with it's short write mechanism. But late testing has
exposed a CRASH when this is used with none-RPC layout-drivers.
The change at NFS is deep and risky, in it's place the fix
at ORE to lift the limitation is actually clean and simple.
So here it is below.
The principal here is that in the case of unaligned IO on
both ends, beginning and end, we will send two read requests
one like old code, before the calculation of the first stripe,
and also a new site, before the calculation of the last stripe.
If any "boundary" is aligned or the complete IO is within a single
stripe. we do a single read like before.
The code is clean and simple by splitting the old _read_4_write
into 3 even parts:
1._read_4_write_first_stripe
2. _read_4_write_last_stripe
3. _read_4_write_execute
And calling 1+3 at the same place as before. 2+3 before last
stripe, and in the case of all in a single stripe then 1+2+3
is preformed additively.
Why did I not think of it before. Well I had a strike of
genius because I have stared at this code for 2 years, and did
not find this simple solution, til today. Not that I did not try.
This solution is much better for NFS than the previous supposedly
solution because the short write was dealt with out-of-band after
IO_done, which would cause for a seeky IO pattern where as in here
we execute in order. At both solutions we do 2 separate reads, only
here we do it within a single IO request. (And actually combine two
writes into a single submission)
NFS/exofs code need not change since the ORE API communicates the new
shorter length on return, what will happen is that this case would not
occur anymore.
hurray!!
[Stable this is an NFS bug since 3.2 Kernel should apply cleanly]
CC: Stable Tree <stable@kernel.org>
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>