]> git.hungrycats.org Git - linux/log
linux
9 years agoLinux 3.18.20 v3.18.20
Sasha Levin [Fri, 7 Aug 2015 19:08:04 +0000 (15:08 -0400)]
Linux 3.18.20

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoblk-mq: fix CPU hotplug handling
Ming Lei [Tue, 21 Apr 2015 02:00:20 +0000 (10:00 +0800)]
blk-mq: fix CPU hotplug handling

[ Upstream commit 2a34c0872adf252f23a6fef2d051a169ac796cef ]

hctx->tags has to be set as NULL in case that it is to be unmapped
no matter if set->tags[hctx->queue_num] is NULL or not in blk_mq_map_swqueue()
because shared tags can be freed already from another request queue.

The same situation has to be considered during handling CPU online too.
Unmapped hw queue can be remapped after CPU topo is changed, so we need
to allocate tags for the hw queue in blk_mq_map_swqueue(). Then tags
allocation for hw queue can be removed in hctx cpu online notifier, and it
is reasonable to do that after mapping is updated.

Cc: <stable@vger.kernel.org>
Reported-by: Dongsu Park <dongsu.park@profitbricks.com>
Tested-by: Dongsu Park <dongsu.park@profitbricks.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoSCSI: add 1024 max sectors black list flag
Mike Christie [Tue, 21 Apr 2015 03:42:24 +0000 (22:42 -0500)]
SCSI: add 1024 max sectors black list flag

[ Upstream commit 35e9a9f93994d7f7d12afa41169c7ba05513721b ]

This works around a issue with qnap iscsi targets not handling large IOs
very well.

The target returns:

VPD INQUIRY: Block limits page (SBC)
  Maximum compare and write length: 1 blocks
  Optimal transfer length granularity: 1 blocks
  Maximum transfer length: 4294967295 blocks
  Optimal transfer length: 4294967295 blocks
  Maximum prefetch, xdread, xdwrite transfer length: 0 blocks
  Maximum unmap LBA count: 8388607
  Maximum unmap block descriptor count: 1
  Optimal unmap granularity: 16383
  Unmap granularity alignment valid: 0
  Unmap granularity alignment: 0
  Maximum write same length: 0xffffffff blocks
  Maximum atomic transfer length: 0
  Atomic alignment: 0
  Atomic transfer length granularity: 0

and it is *sometimes* able to handle at least one IO of size up to 8 MB. We
have seen in traces where it will sometimes work, but other times it
looks like it fails and it looks like it returns failures if we send
multiple large IOs sometimes. Also it looks like it can return 2 different
errors. It will sometimes send iscsi reject errors indicating out of
resources or it will send invalid cdb illegal requests check conditions.
And then when it sends iscsi rejects it does not seem to handle retries
when there are command sequence holes, so I could not just add code to
try and gracefully handle that error code.

The problem is that we do not have a good contact for the company,
so we are not able to determine under what conditions it returns
which error and why it sometimes works.

So, this patch just adds a new black list flag to set targets like this to
the old max safe sectors of 1024. The max_hw_sectors changes added in 3.19
caused this regression, so I also ccing stable.

Reported-by: Christian Hesse <list@eworm.de>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Cc: stable@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoFix firmware loader uevent buffer NULL pointer dereference
Linus Torvalds [Thu, 9 Jul 2015 18:20:01 +0000 (11:20 -0700)]
Fix firmware loader uevent buffer NULL pointer dereference

[ Upstream commit 6f957724b94cb19f5c1c97efd01dd4df8ced323c ]

The firmware class uevent function accessed the "fw_priv->buf" buffer
without the proper locking and testing for NULL.  This is an old bug
(looks like it goes back to 2012 and commit 1244691c73b2: "firmware
loader: introduce firmware_buf"), but for some reason it's triggering
only now in 4.2-rc1.

Shuah Khan is trying to bisect what it is that causes this to trigger
more easily, but in the meantime let's just fix the bug since others are
hitting it too (at least Ingo reports having seen it as well).

Reported-and-tested-by: Shuah Khan <shuahkh@osg.samsung.com>
Acked-by: Ming Lei <ming.lei@canonical.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoarm64: Don't report clear pmds and puds as huge
Christoffer Dall [Wed, 1 Jul 2015 12:08:31 +0000 (14:08 +0200)]
arm64: Don't report clear pmds and puds as huge

[ Upstream commit fd28f5d439fca77348c129d5b73043a56f8a0296 ]

The current pmd_huge() and pud_huge() functions simply check if the table
bit is not set and reports the entries as huge in that case.  This is
counter-intuitive as a clear pmd/pud cannot also be a huge pmd/pud, and
it is inconsistent with at least arm and x86.

To prevent others from making the same mistake as me in looking at code
that calls these functions and to fix an issue with KVM on arm64 that
causes memory corruption due to incorrect page reference counting
resulting from this mistake, let's change the behavior.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Fixes: 084bd29810a5 ("ARM64: mm: HugeTLB support.")
Cc: <stable@vger.kernel.org> # 3.11+
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years ago9p: don't leave a half-initialized inode sitting around
Al Viro [Sun, 12 Jul 2015 14:34:29 +0000 (10:34 -0400)]
9p: don't leave a half-initialized inode sitting around

[ Upstream commit 0a73d0a204a4a04a1e110539c5a524ae51f91d6d ]

Cc: stable@vger.kernel.org # all branches
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years ago9p: forgetting to cancel request on interrupted zero-copy RPC
Al Viro [Sat, 4 Jul 2015 20:04:19 +0000 (16:04 -0400)]
9p: forgetting to cancel request on interrupted zero-copy RPC

[ Upstream commit a84b69cb6e0a41e86bc593904faa6def3b957343 ]

If we'd already sent a request and decide to abort it, we *must*
issue TFLUSH properly and not just blindly reuse the tag, or
we'll get seriously screwed when response eventually arrives
and we confuse it for response to later request that had reused
the same tag.

Cc: stable@vger.kernel.org # v3.2 and later
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoNFS: Fix size of NFSACL SETACL operations
Chuck Lever [Tue, 26 May 2015 15:53:52 +0000 (11:53 -0400)]
NFS: Fix size of NFSACL SETACL operations

[ Upstream commit d683cc49daf7c5afca8cd9654aaa1bf63cdf2ad9 ]

When encoding the NFSACL SETACL operation, reserve just the estimated
size of the ACL rather than a fixed maximum. This eliminates needless
zero padding on the wire that the server ignores.

Fixes: ee5dc7732bd5 ('NFS: Fix "kernel BUG at fs/nfs/nfs3xdr.c:1338!"')
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agowatchdog: omap: assert the counter being stopped before reprogramming
Uwe Kleine-König [Wed, 29 Apr 2015 18:38:46 +0000 (20:38 +0200)]
watchdog: omap: assert the counter being stopped before reprogramming

[ Upstream commit 530c11d432727c697629ad5f9d00ee8e2864d453 ]

The omap watchdog has the annoying behaviour that writes to most
registers don't have any effect when the watchdog is already running.
Quoting the AM335x reference manual:

To modify the timer counter value (the WDT_WCRR register),
prescaler ratio (the WDT_WCLR[4:2] PTV bit field), delay
configuration value (the WDT_WDLY[31:0] DLY_VALUE bit field), or
the load value (the WDT_WLDR[31:0] TIMER_LOAD bit field), the
watchdog timer must be disabled by using the start/stop sequence
(the WDT_WSPR register).

Currently the timer is stopped in the .probe callback but still there
are possibilities that yield to a situation where omap_wdt_start is
entered with the timer running (e.g. when /dev/watchdog is closed
without stopping and then reopened). In such a case programming the
timeout silently fails!

To circumvent this stop the timer before reprogramming.

Assuming one of the first things the watchdog user does is setting the
timeout explicitly nothing too bad should happen because this explicit
setting works fine.

Fixes: 7768a13c252a ("[PATCH] OMAP: Add Watchdog driver support")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoof: return NUMA_NO_NODE from fallback of_node_to_nid()
Konstantin Khlebnikov [Wed, 8 Apr 2015 16:59:20 +0000 (19:59 +0300)]
of: return NUMA_NO_NODE from fallback of_node_to_nid()

[ Upstream commit c8fff7bc5bba6bd59cad40441c189c4efe7190f6 ]

Node 0 might be offline as well as any other numa node,
in this case kernel cannot handle memory allocation and crashes.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Fixes: 0c3f061c195c ("of: implement of_node_to_nid as a weak function")
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoUSB: usbfs: allow URBs to be reaped after disconnection
Alan Stern [Thu, 29 Jan 2015 16:29:13 +0000 (11:29 -0500)]
USB: usbfs: allow URBs to be reaped after disconnection

[ Upstream commit 3f2cee73b650921b2e214bf487b2061a1c266504 ]

The usbfs API has a peculiar hole: Users are not allowed to reap their
URBs after the device has been disconnected.  There doesn't seem to be
any good reason for this; it is an ad-hoc inconsistency.

The patch allows users to issue the USBDEVFS_REAPURB and
USBDEVFS_REAPURBNDELAY ioctls (together with their 32-bit counterparts
on 64-bit systems) even after the device is gone.  If no URBs are
pending for a disconnected device then the ioctls will return -ENODEV
rather than -EAGAIN, because obviously no new URBs will ever be able
to complete.

The patch also adds a new capability flag for
USBDEVFS_GET_CAPABILITIES to indicate that the reap-after-disconnect
feature is supported.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Chris Dickens <christopher.a.dickens@gmail.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Greg Kroah-Hartman <greg@kroah.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agodell-laptop: Fix allocating & freeing SMI buffer page
Pali Rohár [Tue, 23 Jun 2015 08:11:19 +0000 (10:11 +0200)]
dell-laptop: Fix allocating & freeing SMI buffer page

[ Upstream commit b8830a4e71b15d0364ac8e6c55301eea73f211da ]

This commit fix kernel crash when probing for rfkill devices in dell-laptop
driver failed. Function free_page() was incorrectly used on struct page *
instead of virtual address of SMI buffer.

This commit also simplify allocating page for SMI buffer by using
__get_free_page() function instead of sequential call of functions
alloc_page() and page_address().

Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Cc: stable@vger.kernel.org
Signed-off-by: Darren Hart <dvhart@linux.intel.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agomac80211: prevent possible crypto tx tailroom corruption
Michal Kazior [Fri, 22 May 2015 08:22:40 +0000 (10:22 +0200)]
mac80211: prevent possible crypto tx tailroom corruption

[ Upstream commit ab499db80fcf07c18e4053f91a619500f663e90e ]

There was a possible race between
ieee80211_reconfig() and
ieee80211_delayed_tailroom_dec(). This could
result in inability to transmit data if driver
crashed during roaming or rekeying and subsequent
skbs with insufficient tailroom appeared.

This race was probably never seen in the wild
because a device driver would have to crash AND
recover within 0.5s which is very unlikely.

I was able to prove this race exists after
changing the delay to 10s locally and crashing
ath10k via debugfs immediately after GTK
rekeying. In case of ath10k the counter went below
0. This was harmless but other drivers which
actually require tailroom (e.g. for WEP ICV or
MMIC) could end up with the counter at 0 instead
of >0 and introduce insufficient skb tailroom
failures because mac80211 would not resize skbs
appropriately anymore.

Fixes: 8d1f7ecd2af5 ("mac80211: defer tailroom counter manipulation when roaming")
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agosecurity_syslog() should be called once only
Vasily Averin [Thu, 25 Jun 2015 22:01:44 +0000 (15:01 -0700)]
security_syslog() should be called once only

[ Upstream commit d194e5d666225b04c7754471df0948f645b6ab3a ]

The final version of commit 637241a900cb ("kmsg: honor dmesg_restrict
sysctl on /dev/kmsg") lost few hooks, as result security_syslog() are
processed incorrectly:

- open of /dev/kmsg checks syslog access permissions by using
  check_syslog_permissions() where security_syslog() is not called if
  dmesg_restrict is set.

- syslog syscall and /proc/kmsg calls do_syslog() where security_syslog
  can be executed twice (inside check_syslog_permissions() and then
  directly in do_syslog())

With this patch security_syslog() is called once only in all
syslog-related operations regardless of dmesg_restrict value.

Fixes: 637241a900cb ("kmsg: honor dmesg_restrict sysctl on /dev/kmsg")
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Josh Boyer <jwboyer@redhat.com>
Cc: Eric Paris <eparis@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years ago__bitmap_parselist: fix bug in empty string handling
Chris Metcalf [Thu, 25 Jun 2015 22:02:08 +0000 (15:02 -0700)]
__bitmap_parselist: fix bug in empty string handling

[ Upstream commit 2528a8b8f457d7432552d0e2b6f0f4046bb702f4 ]

bitmap_parselist("", &mask, nmaskbits) will erroneously set bit zero in
the mask.  The same bug is visible in cpumask_parselist() since it is
layered on top of the bitmask code, e.g.  if you boot with "isolcpus=",
you will actually end up with cpu zero isolated.

The bug was introduced in commit 4b060420a596 ("bitmap, irq: add
smp_affinity_list interface to /proc/irq") when bitmap_parselist() was
generalized to support userspace as well as kernelspace.

Fixes: 4b060420a596 ("bitmap, irq: add smp_affinity_list interface to /proc/irq")
Signed-off-by: Chris Metcalf <cmetcalf@ezchip.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agommc: card: Fixup request missing in mmc_blk_issue_rw_rq
Ding Wang [Mon, 18 May 2015 12:14:15 +0000 (20:14 +0800)]
mmc: card: Fixup request missing in mmc_blk_issue_rw_rq

[ Upstream commit 29535f7b797df35cc9b6b3bca635591cdd3dd2a8 ]

The current handler of MMC_BLK_CMD_ERR in mmc_blk_issue_rw_rq function
may cause new coming request permanent missing when the ongoing
request (previoulsy started) complete end.

The problem scenario is as follows:
(1) Request A is ongoing;
(2) Request B arrived, and finally mmc_blk_issue_rw_rq() is called;
(3) Request A encounters the MMC_BLK_CMD_ERR error;
(4) In the error handling of MMC_BLK_CMD_ERR, suppose mmc_blk_cmd_err()
    end request A completed and return zero. Continue the error handling,
    suppose mmc_blk_reset() reset device success;
(5) Continue the execution, while loop completed because variable ret
    is zero now;
(6) Finally, mmc_blk_issue_rw_rq() return without processing request B.

The process related to the missing request may wait that IO request
complete forever, possibly crashing the application or hanging the system.

Fix this issue by starting new request when reset success.

Signed-off-by: Ding Wang <justin.wang@spreadtrum.com>
Fixes: 67716327eec7 ("mmc: block: add eMMC hardware reset support")
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoiser-target: Fix possible deadlock in RDMA_CM connection error
Sagi Grimberg [Sun, 29 Mar 2015 12:52:04 +0000 (15:52 +0300)]
iser-target: Fix possible deadlock in RDMA_CM connection error

[ Upstream commit 4a579da2586bd3b79b025947ea24ede2bbfede62 ]

Before we reach to connection established we may get an
error event. In this case the core won't teardown this
connection (never established it), so we take care of freeing
it ourselves.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Cc: <stable@vger.kernel.org> # v3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoiscsi-target: Convert iscsi_thread_set usage to kthread.h
Nicholas Bellinger [Fri, 27 Feb 2015 06:19:15 +0000 (22:19 -0800)]
iscsi-target: Convert iscsi_thread_set usage to kthread.h

[ Upstream commit 88dcd2dab5c23b1c9cfc396246d8f476c872f0ca ]

This patch converts iscsi-target code to use modern kthread.h API
callers for creating RX/TX threads for each new iscsi_conn descriptor,
and releasing associated RX/TX threads during connection shutdown.

This is done using iscsit_start_kthreads() -> kthread_run() to start
new kthreads from within iscsi_post_login_handler(), and invoking
kthread_stop() from existing iscsit_close_connection() code.

Also, convert iscsit_logout_post_handler_closesession() code to use
cmpxchg when determing when iscsit_cause_connection_reinstatement()
needs to sleep waiting for completion.

Reported-by: Sagi Grimberg <sagig@mellanox.com>
Tested-by: Sagi Grimberg <sagig@mellanox.com>
Cc: Slava Shwartsman <valyushash@gmail.com>
Cc: <stable@vger.kernel.org> # v3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoACPICA: Tables: Fix an issue that FACS initialization is performed twice
Lv Zheng [Wed, 1 Jul 2015 06:43:26 +0000 (14:43 +0800)]
ACPICA: Tables: Fix an issue that FACS initialization is performed twice

[ Upstream commit c04be18448355441a0c424362df65b6422e27bda ]

ACPICA commit 90f5332a15e9d9ba83831ca700b2b9f708274658

This patch adds a new FACS initialization flag for acpi_tb_initialize().
acpi_enable_subsystem() might be invoked several times in OS bootup process,
and we don't want FACS initialization to be invoked twice. Lv Zheng.

Link: https://github.com/acpica/acpica/commit/90f5332a
Cc: All applicable <stable@vger.kernel.org> # All applicable
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoBtrfs: fix memory leak in the extent_same ioctl
Filipe Manana [Fri, 3 Jul 2015 07:36:11 +0000 (08:36 +0100)]
Btrfs: fix memory leak in the extent_same ioctl

[ Upstream commit 497b4050e0eacd4c746dd396d14916b1e669849d ]

We were allocating memory with memdup_user() but we were never releasing
that memory. This affected pretty much every call to the ioctl, whether
it deduplicated extents or not.

This issue was reported on IRC by Julian Taylor and on the mailing list
by Marcel Ritter, credit goes to them for finding the issue.

Reported-by: Julian Taylor <jtaylor.debian@googlemail.com>
Reported-by: Marcel Ritter <ritter.marcel@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoBtrfs: use kmem_cache_free when freeing entry in inode cache
Filipe Manana [Sat, 13 Jun 2015 05:52:56 +0000 (06:52 +0100)]
Btrfs: use kmem_cache_free when freeing entry in inode cache

[ Upstream commit c3f4a1685bb87e59c886ee68f7967eae07d4dffa ]

The free space entries are allocated using kmem_cache_zalloc(),
through __btrfs_add_free_space(), therefore we should use
kmem_cache_free() and not kfree() to avoid any confusion and
any potential problem. Looking at the kfree() definition at
mm/slab.c it has the following comment:

  /*
   * (...)
   *
   * Don't free memory not originally allocated by kmalloc()
   * or you will run into trouble.
   */

So better be safe and use kmem_cache_free().

Cc: stable@vger.kernel.org
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agomd: fix a build warning
Firo Yang [Thu, 11 Jun 2015 01:41:10 +0000 (09:41 +0800)]
md: fix a build warning

[ Upstream commit 4e023612325a9034a542bfab79f78b1fe5ebb841 ]

Warning like this:

drivers/md/md.c: In function "update_array_info":
drivers/md/md.c:6394:26: warning: logical not is only applied
to the left hand side of comparison [-Wlogical-not-parentheses]
      !mddev->persistent  != info->not_persistent||

Fix it as Neil Brown said:
mddev->persistent != !info->not_persistent ||

Signed-off-by: Firo Yang <firogm@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agotracing: Have branch tracer use recursive field of task struct
Steven Rostedt (Red Hat) [Tue, 7 Jul 2015 19:05:03 +0000 (15:05 -0400)]
tracing: Have branch tracer use recursive field of task struct

[ Upstream commit 6224beb12e190ff11f3c7d4bf50cb2922878f600 ]

Fengguang Wu's tests triggered a bug in the branch tracer's start up
test when CONFIG_DEBUG_PREEMPT set. This was because that config
adds some debug logic in the per cpu field, which calls back into
the branch tracer.

The branch tracer has its own recursive checks, but uses a per cpu
variable to implement it. If retrieving the per cpu variable calls
back into the branch tracer, you can see how things will break.

Instead of using a per cpu variable, use the trace_recursion field
of the current task struct. Simply set a bit when entering the
branch tracing and clear it when leaving. If the bit is set on
entry, just don't do the tracing.

There's also the case with lockdep, as the local_irq_save() called
before the recursion can also trigger code that can call back into
the function. Changing that to a raw_local_irq_save() will protect
that as well.

This prevents the recursion and the inevitable crash that follows.

Link: http://lkml.kernel.org/r/20150630141803.GA28071@wfg-t540p.sh.intel.com
Cc: stable@vger.kernel.org # 3.10+
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agotracing/filter: Do not WARN on operand count going below zero
Steven Rostedt (Red Hat) [Thu, 25 Jun 2015 22:02:29 +0000 (18:02 -0400)]
tracing/filter: Do not WARN on operand count going below zero

[ Upstream commit b4875bbe7e68f139bd3383828ae8e994a0df6d28 ]

When testing the fix for the trace filter, I could not come up with
a scenario where the operand count goes below zero, so I added a
WARN_ON_ONCE(cnt < 0) to the logic. But there is legitimate case
that it can happen (although the filter would be wrong).

 # echo '>' > /sys/kernel/debug/events/ext4/ext4_truncate_exit/filter

That is, a single operation without any operands will hit the path
where the WARN_ON_ONCE() can trigger. Although this is harmless,
and the filter is reported as a error. But instead of spitting out
a warning to the kernel dmesg, just fail nicely and report it via
the proper channels.

Link: http://lkml.kernel.org/r/558C6082.90608@oracle.com
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Cc: stable@vger.kernel.org # 2.6.33+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agolibata: force disable trim for SuperSSpeed S238
Arne Fitzenreiter [Wed, 15 Jul 2015 11:54:37 +0000 (13:54 +0200)]
libata: force disable trim for SuperSSpeed S238

[ Upstream commit cda57b1b05cf7b8b99ab4b732bea0b05b6c015cc ]

This device loses blocks, often the partition table area, on trim.
Disable TRIM.
http://pcengines.ch/msata16a.htm

Signed-off-by: Arne Fitzenreiter <arne_f@ipfire.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agolibata: add ATA_HORKAGE_NOTRIM
Arne Fitzenreiter [Wed, 15 Jul 2015 11:54:36 +0000 (13:54 +0200)]
libata: add ATA_HORKAGE_NOTRIM

[ Upstream commit 71d126fd28de2d4d9b7b2088dbccd7ca62fad6e0 ]

Some devices lose data on TRIM whether queued or not.  This patch adds
a horkage to disable TRIM.

tj: Collapsed unnecessary if() nesting.

Signed-off-by: Arne Fitzenreiter <arne_f@ipfire.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoevm: labeling pseudo filesystems exception
Mimi Zohar [Tue, 21 Apr 2015 17:59:31 +0000 (13:59 -0400)]
evm: labeling pseudo filesystems exception

[ Upstream commit 5101a1850bb7ccbf107929dee9af0cd2f400940f ]

To prevent offline stripping of existing file xattrs and relabeling of
them at runtime, EVM allows only newly created files to be labeled.  As
pseudo filesystems are not persistent, stripping of xattrs is not a
concern.

Some LSMs defer file labeling on pseudo filesystems.  This patch
permits the labeling of existing files on pseudo files systems.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoKEYS: ensure we free the assoc array edit if edit is valid
Colin Ian King [Mon, 27 Jul 2015 14:23:43 +0000 (15:23 +0100)]
KEYS: ensure we free the assoc array edit if edit is valid

[ Upstream commit HEAD ]

commit ca4da5dd1f99fe9c59f1709fb43e818b18ad20e0 upstream.

__key_link_end is not freeing the associated array edit structure
and this leads to a 512 byte memory leak each time an identical
existing key is added with add_key().

The reason the add_key() system call returns okay is that
key_create_or_update() calls __key_link_begin() before checking to see
whether it can update a key directly rather than adding/replacing - which
it turns out it can.  Thus __key_link() is not called through
__key_instantiate_and_link() and __key_link_end() must cancel the edit.

CVE-2015-1333

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit c9cd9b18dac801040ada16562dc579d5ac366d75)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agodrm: add a check for x/y in drm_mode_setcrtc
Zhao Junwang [Tue, 7 Jul 2015 09:08:35 +0000 (17:08 +0800)]
drm: add a check for x/y in drm_mode_setcrtc

[ Upstream commit 01447e9f04ba1c49a9534ae6a5a6f26c2bb05226 ]

legacy setcrtc ioctl does take a 32 bit value which might indeed
overflow

the checks of crtc_req->x > INT_MAX and crtc_req->y > INT_MAX aren't
needed any more with this

v2: -polish the annotation according to Daniel's comment

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Zhao Junwang <zhjwpku@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agodrm/radeon: add a dpm quirk for Sapphire Radeon R9 270X 2GB GDDR5
Alex Deucher [Fri, 10 Jul 2015 01:08:17 +0000 (21:08 -0400)]
drm/radeon: add a dpm quirk for Sapphire Radeon R9 270X 2GB GDDR5

[ Upstream commit 5dfc71bc44d91d1620505c064fa22b0b3db58a9d ]

bug:
https://bugs.freedesktop.org/show_bug.cgi?id=76490

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agommc: block: Add missing mmc_blk_put() in power_ro_lock_show()
Tomas Winkler [Thu, 16 Jul 2015 13:50:45 +0000 (15:50 +0200)]
mmc: block: Add missing mmc_blk_put() in power_ro_lock_show()

[ Upstream commit 9098f84cced870f54d8c410dd2444cfa61467fa0 ]

Enclosing mmc_blk_put() is missing in power_ro_lock_show() sysfs handler,
let's add it.

Fixes: add710eaa886 ("mmc: boot partition ro lock support")
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agodm btree: silence lockdep lock inversion in dm_btree_del()
Joe Thornber [Fri, 3 Jul 2015 13:51:32 +0000 (14:51 +0100)]
dm btree: silence lockdep lock inversion in dm_btree_del()

[ Upstream commit 1c7518794a3647eb345d59ee52844e8a40405198 ]

Allocate memory using GFP_NOIO when deleting a btree.  dm_btree_del()
can be called via an ioctl and we don't want to recurse into the FS or
block layer.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agodm btree remove: fix bug in redistribute3
Dennis Yang [Fri, 26 Jun 2015 14:25:48 +0000 (15:25 +0100)]
dm btree remove: fix bug in redistribute3

[ Upstream commit 4c7e309340ff85072e96f529582d159002c36734 ]

redistribute3() shares entries out across 3 nodes.  Some entries were
being moved the wrong way, breaking the ordering.  This manifested as a
BUG() in dm-btree-remove.c:shift() when entries were removed from the
btree.

For additional context see:
https://www.redhat.com/archives/dm-devel/2015-May/msg00113.html

Signed-off-by: Dennis Yang <shinrairis@gmail.com>
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agousb: xhci: Bugfix for NULL pointer deference in xhci_endpoint_init() function
AMAN DEEP [Tue, 21 Jul 2015 14:20:27 +0000 (17:20 +0300)]
usb: xhci: Bugfix for NULL pointer deference in xhci_endpoint_init() function

[ Upstream commit 3496810663922617d4b706ef2780c279252ddd6a ]

virt_dev->num_cached_rings counts on freed ring and is not updated
correctly. In xhci_free_or_cache_endpoint_ring() function, the free ring
is added into cache and then num_rings_cache is incremented as below:
virt_dev->ring_cache[rings_cached] =
virt_dev->eps[ep_index].ring;
virt_dev->num_rings_cached++;
here, free ring pointer is added to a current index and then
index is incremented.
So current index always points to empty location in the ring cache.
For getting available free ring, current index should be decremented
first and then corresponding ring buffer value should be taken from ring
cache.

But In function xhci_endpoint_init(), the num_rings_cached index is
accessed before decrement.
virt_dev->eps[ep_index].new_ring =
virt_dev->ring_cache[virt_dev->num_rings_cached];
virt_dev->ring_cache[virt_dev->num_rings_cached] = NULL;
virt_dev->num_rings_cached--;
This is bug in manipulating the index of ring cache.
And it should be as below:
virt_dev->num_rings_cached--;
virt_dev->eps[ep_index].new_ring =
virt_dev->ring_cache[virt_dev->num_rings_cached];
virt_dev->ring_cache[virt_dev->num_rings_cached] = NULL;

Cc: <stable@vger.kernel.org>
Signed-off-by: Aman Deep <aman.deep@samsung.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoUSB: serial: Destroy serial_minors IDR on module exit
Johannes Thumshirn [Wed, 8 Jul 2015 15:26:37 +0000 (17:26 +0200)]
USB: serial: Destroy serial_minors IDR on module exit

[ Upstream commit d23f47d4927fd2f61b3a754d83c7bcec215b5cfe ]

Destroy serial_minors IDR on module exit, reclaiming the allocated memory.

This was detected by the following semantic patch (written by Luis
Rodriguez <mcgrof@suse.com>)

<SmPL>
@ defines_module_init @
declarer name module_init, module_exit;
declarer name DEFINE_IDR;
identifier init;
@@

module_init(init);

@ defines_module_exit @
identifier exit;
@@

module_exit(exit);

@ declares_idr depends on defines_module_init && defines_module_exit @
identifier idr;
@@

DEFINE_IDR(idr);

@ on_exit_calls_destroy depends on declares_idr && defines_module_exit @
identifier declares_idr.idr, defines_module_exit.exit;
@@

exit(void)
{
 ...
 idr_destroy(&idr);
 ...
}

@ missing_module_idr_destroy depends on declares_idr && defines_module_exit && !on_exit_calls_destroy @
identifier declares_idr.idr, defines_module_exit.exit;
@@

exit(void)
{
 ...
 +idr_destroy(&idr);
}
</SmPL>

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: stable <stable@vger.kernel.org> # v3.11
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoUSB: option: add 2020:4000 ID
Claudio Cappelli [Wed, 10 Jun 2015 18:38:30 +0000 (20:38 +0200)]
USB: option: add 2020:4000 ID

[ Upstream commit f6d7fb37f92622479ef6da604f27561f5045ba1e ]

Add device Olivetti Olicard 300 (Network Connect: MT6225) - IDs 2020:4000.

T:  Bus=01 Lev=02 Prnt=04 Port=00 Cnt=01 Dev#= 10 Spd=480 MxCh= 0
D:  Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=2020 ProdID=4000 Rev=03.00
S:  Manufacturer=Network Connect
S:  Product=MT6225
C:  #Ifs= 7 Cfg#= 1 Atr=a0 MxPwr=500mA
I:  If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
I:  If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=02 Prot=01 Driver=option
I:  If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#= 6 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage

Signed-off-by: Claudio Cappelli <claudio.cappelli.linux@gmail.com>
Suggested-by: Lars Melin <larsm17@gmail.com>
Cc: stable <stable@vger.kernel.org>
[johan: amend commit message with devices info ]
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoUSB: cp210x: add ID for Aruba Networks controllers
Peter Sanford [Fri, 26 Jun 2015 00:40:05 +0000 (17:40 -0700)]
USB: cp210x: add ID for Aruba Networks controllers

[ Upstream commit f98a7aa81eeeadcad25665c3501c236d531d4382 ]

Add the USB serial console device ID for Aruba Networks 7xxx series
controllers which have a USB port for their serial console.

Signed-off-by: Peter Sanford <peter@sanford.io>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agousb: musb: host: rely on port_mode to call musb_start()
Felipe Balbi [Tue, 2 Jun 2015 18:03:36 +0000 (13:03 -0500)]
usb: musb: host: rely on port_mode to call musb_start()

[ Upstream commit be9d39881fc4fa39a64b6eed6bab5d9ee5125344 ]

Currently, we're calling musb_start() twice for DRD ports
in some situations. This has been observed to cause enumeration
issues after suspend/resume cycles with AM335x.

In order to fix the problem, we just have to fix the check
on musb_has_gadget() so that it only returns true if
current mode is Host and ignore the fact that we have or
not a gadget driver loaded.

Fixes: ae44df2e21b5 (usb: musb: call musb_start() only once in OTG mode)
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: <stable@vger.kernel.org> # v3.11+
Tested-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoUSB: devio: fix a condition in async_completed()
Dan Carpenter [Mon, 18 May 2015 12:29:51 +0000 (15:29 +0300)]
USB: devio: fix a condition in async_completed()

[ Upstream commit 83ed07c5db71bc02bd646d6eb60b48908235cdf9 ]

Static checkers complain that the current condition is never true.  It
seems pretty likely that it's a typo and "URB" was intended instead of
"USB".

Fixes: 3d97ff63f899 ('usbdevfs: Use scatter-gather lists for large bulk transfers')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agousb: dwc3: Reset the transfer resource index on SET_INTERFACE
John Youn [Mon, 17 Sep 2001 07:00:00 +0000 (00:00 -0700)]
usb: dwc3: Reset the transfer resource index on SET_INTERFACE

[ Upstream commit aebda618718157a69c0dc0adb978d69bc2b8723c ]

This fixes an issue introduced in commit b23c843992b6 (usb: dwc3:
gadget: fix DEPSTARTCFG for non-EP0 EPs) that made sure we would
only use DEPSTARTCFG once per SetConfig.

The trick is that we should use one DEPSTARTCFG per SetConfig *OR*
SetInterface. SetInterface was completely missed from the original
patch.

This problem became aparent after commit 76e838c9f776 (usb: dwc3:
gadget: return error if command sent to DEPCMD register fails)
added checking of the return status of device endpoint commands.

'Set Endpoint Transfer Resource' command was caught failing
occasionally. This is because the Transfer Resource
Index was not getting reset during a SET_INTERFACE request.

Finally, to fix the issue, was we have to do is make sure that
our start_config_issued flag gets reset whenever we receive a
SetInterface request.

To verify the problem (and its fix), all we have to do is run
test 9 from testusb with 'testusb -t 9 -s 2048 -a -c 5000'.

Tested-by: Huang Rui <ray.huang@amd.com>
Tested-by: Subbaraya Sundeep Bhatta <subbaraya.sundeep.bhatta@xilinx.com>
Fixes: b23c843992b6 (usb: dwc3: gadget: fix DEPSTARTCFG for non-EP0 EPs)
Cc: <stable@vger.kernel.org> # v3.2+
Signed-off-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agolibata: increase the timeout when setting transfer mode
Mikulas Patocka [Wed, 8 Jul 2015 17:06:12 +0000 (13:06 -0400)]
libata: increase the timeout when setting transfer mode

[ Upstream commit d531be2ca2f27cca5f041b6a140504999144a617 ]

I have a ST4000DM000 disk. If Linux is booted while the disk is spun down,
the command that sets transfer mode causes the disk to spin up. The
spin-up takes longer than the default 5s timeout, so the command fails and
timeout is reported.

Fix this by increasing the timeout to 15s, which is enough for the disk to
spin up.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agolibata: add ATA_HORKAGE_BROKEN_FPDMA_AA quirk for HP 250GB SATA disk VB0250EAVER
Aleksei Mamlin [Wed, 1 Jul 2015 10:48:30 +0000 (13:48 +0300)]
libata: add ATA_HORKAGE_BROKEN_FPDMA_AA quirk for HP 250GB SATA disk VB0250EAVER

[ Upstream commit 08c85d2a599d967ede38a847f5594447b6100642 ]

Enabling AA on HP 250GB SATA disk VB0250EAVER causes errors:

[    3.788362] ata3.00: failed to enable AA (error_mask=0x1)
[    3.789243] ata3.00: failed to enable AA (error_mask=0x1)

Add the ATA_HORKAGE_BROKEN_FPDMA_AA for this specific harddisk.

tj: Collected FPDMA_AA entries and updated comment.

Signed-off-by: Aleksei Mamlin <mamlinav@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoASoC: imx-wm8962: Add a missing error check
Dan Carpenter [Wed, 10 Jun 2015 15:37:23 +0000 (18:37 +0300)]
ASoC: imx-wm8962: Add a missing error check

[ Upstream commit 474ff0ae23b834e9fc18374d14bb5f3e7b3828b4 ]

My static checker complains that:

sound/soc/fsl/imx-wm8962.c:196 imx_wm8962_probe() warn:
we tested 'ret' before and it was 'false'

The intent was that we use "ret" to check imx_audmux_v2_configure_port().

Fixes: 8de2ae2a7f1f ('ASoC: fsl: add imx-wm8962 machine driver')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Otherwise, Acked-by: Nicolin Chen <nicoleotsuka@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoiio: adc: at91_adc: allow to use full range of startup time
Jan Leupold [Wed, 17 Jun 2015 16:21:36 +0000 (18:21 +0200)]
iio: adc: at91_adc: allow to use full range of startup time

[ Upstream commit 815983e9afcb21552e1b9d77e3755d394cbf5249 ]

The DT-Property "atmel,adc-startup-time" is stored in an u8 for a microsecond
value. When trying to increase the value of STARTUP in Register AT91_ADC_MR
some higher values can't be reached.

Change the type in function parameter and private structure field from u8 to
u32.

Signed-off-by: Jan Leupold <leupold@rsi-elektrotechnik.de>
[nicolas.ferre@atmel.com: change commit message, increase u16 to u32 for startup time]
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoiio: tmp006: Check channel info on write
Peter Meerwald [Sun, 21 Jun 2015 21:50:21 +0000 (23:50 +0200)]
iio: tmp006: Check channel info on write

[ Upstream commit f451957dafe712ab2cd4b7ca8ba9197798115fbe ]

only SAMP_FREQ is writable

Will lead to SAMP_FREQ being written by any attempt to write
to the other exported attributes and hence a rather unexpected
result!

Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoiio: DAC: ad5624r_spi: fix bit shift of output data value
JM Friedt [Fri, 19 Jun 2015 12:48:06 +0000 (14:48 +0200)]
iio: DAC: ad5624r_spi: fix bit shift of output data value

[ Upstream commit 09f4dcc9443c4731b6669066bd1bc4312ef17bcc ]

The value sent on the SPI bus is shifted by an erroneous number of bits.
The shift value was already computed in the iio_chan_spec structure and
hence subtracting this argument to 16 yields an erroneous data position
in the SPI stream.

Signed-off-by: JM Friedt <jmfriedt@femto-st.fr>
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoext4: replace open coded nofail allocation in ext4_free_blocks()
Michal Hocko [Sun, 5 Jul 2015 16:33:44 +0000 (12:33 -0400)]
ext4: replace open coded nofail allocation in ext4_free_blocks()

[ Upstream commit 7444a072c387a93ebee7066e8aee776954ab0e41 ]

ext4_free_blocks is looping around the allocation request and mimics
__GFP_NOFAIL behavior without any allocation fallback strategy. Let's
remove the open coded loop and replace it with __GFP_NOFAIL. Without the
flag the allocator has no way to find out never-fail requirement and
cannot help in any way.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoext4: correctly migrate a file with a hole at the beginning
Eryu Guan [Sat, 4 Jul 2015 04:03:44 +0000 (00:03 -0400)]
ext4: correctly migrate a file with a hole at the beginning

[ Upstream commit 8974fec7d72e3e02752fe0f27b4c3719c78d9a15 ]

Currently ext4_ind_migrate() doesn't correctly handle a file which
contains a hole at the beginning of the file.  This caused the migration
to be done incorrectly, and then if there is a subsequent following
delayed allocation write to the "hole", this would reclaim the same data
blocks again and results in fs corruption.

  # assmuing 4k block size ext4, with delalloc enabled
  # skip the first block and write to the second block
  xfs_io -fc "pwrite 4k 4k" -c "fsync" /mnt/ext4/testfile

  # converting to indirect-mapped file, which would move the data blocks
  # to the beginning of the file, but extent status cache still marks
  # that region as a hole
  chattr -e /mnt/ext4/testfile

  # delayed allocation writes to the "hole", reclaim the same data block
  # again, results in i_blocks corruption
  xfs_io -c "pwrite 0 4k" /mnt/ext4/testfile
  umount /mnt/ext4
  e2fsck -nf /dev/sda6
  ...
  Inode 53, i_blocks is 16, should be 8.  Fix? no
  ...

Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoext4: be more strict when migrating to non-extent based file
Eryu Guan [Sat, 4 Jul 2015 03:56:50 +0000 (23:56 -0400)]
ext4: be more strict when migrating to non-extent based file

[ Upstream commit d6f123a9297496ad0b6335fe881504c4b5b2a5e5 ]

Currently the check in ext4_ind_migrate() is not enough before doing the
real conversion:

a) delayed allocated extents could bypass the check on eh->eh_entries
   and eh->eh_depth

This can be demonstrated by this script

  xfs_io -fc "pwrite 0 4k" -c "pwrite 8k 4k" /mnt/ext4/testfile
  chattr -e /mnt/ext4/testfile

where testfile has two extents but still be converted to non-extent
based file format.

b) only extent length is checked but not the offset, which would result
   in data lose (delalloc) or fs corruption (nodelalloc), because
   non-extent based file only supports at most (12 + 2^10 + 2^20 + 2^30)
   blocks

This can be demostrated by

  xfs_io -fc "pwrite 5T 4k" /mnt/ext4/testfile
  chattr -e /mnt/ext4/testfile
  sync

If delalloc is enabled, dmesg prints
  EXT4-fs warning (device dm-4): ext4_block_to_path:105: block 1342177280 > max in inode 53
  EXT4-fs (dm-4): Delayed block allocation failed for inode 53 at logical offset 1342177280 with max blocks 1 with error 5
  EXT4-fs (dm-4): This should not happen!! Data will be lost

If delalloc is disabled, e2fsck -nf shows corruption
  Inode 53, i_size is 5497558142976, should be 4096.  Fix? no

Fix the two issues by

a) forcing all delayed allocation blocks to be allocated before checking
   eh->eh_depth and eh->eh_entries
b) limiting the last logical block of the extent is within direct map

Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoext4: fix reservation release on invalidatepage for delalloc fs
Lukas Czerner [Sat, 4 Jul 2015 01:13:55 +0000 (21:13 -0400)]
ext4: fix reservation release on invalidatepage for delalloc fs

[ Upstream commit 9705acd63b125dee8b15c705216d7186daea4625 ]

On delalloc enabled file system on invalidatepage operation
in ext4_da_page_release_reservation() we want to clear the delayed
buffer and remove the extent covering the delayed buffer from the extent
status tree.

However currently there is a bug where on the systems with page size >
block size we will always remove extents from the start of the page
regardless where the actual delayed buffers are positioned in the page.
This leads to the errors like this:

EXT4-fs warning (device loop0): ext4_da_release_space:1225:
ext4_da_release_space: ino 13, to_free 1 with only 0 reserved data
blocks

This however can cause data loss on writeback time if the file system is
in ENOSPC condition because we're releasing reservation for someones
else delayed buffer.

Fix this by only removing extents that corresponds to the part of the
page we want to invalidate.

This problem is reproducible by the following fio receipt (however I was
only able to reproduce it with fio-2.1 or older.

[global]
bs=8k
iodepth=1024
iodepth_batch=60
randrepeat=1
size=1m
directory=/mnt/test
numjobs=20
[job1]
ioengine=sync
bs=1k
direct=1
rw=randread
filename=file1:file2
[job2]
ioengine=libaio
rw=randwrite
direct=1
filename=file1:file2
[job3]
bs=1k
ioengine=posixaio
rw=randwrite
direct=1
filename=file1:file2
[job5]
bs=1k
ioengine=sync
rw=randread
filename=file1:file2
[job7]
ioengine=libaio
rw=randwrite
filename=file1:file2
[job8]
ioengine=posixaio
rw=randwrite
filename=file1:file2
[job10]
ioengine=mmap
rw=randwrite
bs=1k
filename=file1:file2
[job11]
ioengine=mmap
rw=randwrite
direct=1
filename=file1:file2

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agospi: pl022: Specify 'num-cs' property as required in devicetree binding
Ezequiel Garcia [Mon, 11 May 2015 15:20:18 +0000 (12:20 -0300)]
spi: pl022: Specify 'num-cs' property as required in devicetree binding

[ Upstream commit ea6055c46eda1e19e02209814955e13f334bbe1b ]

Since commit 39a6ac11df65 ("spi/pl022: Devicetree support w/o platform data")
the 'num-cs' parameter cannot be passed through platform data when probing
with devicetree. Instead, it's a required devicetree property.

Fix the binding documentation so the property is properly specified.

Fixes: 39a6ac11df65 ("spi/pl022: Devicetree support w/o platform data")
Signed-off-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoRevert "can: fix loss of CAN frames in raw_rcv"
Sasha Levin [Tue, 4 Aug 2015 17:32:39 +0000 (13:32 -0400)]
Revert "can: fix loss of CAN frames in raw_rcv"

This reverts commit c215cf258214858a5a6c3e63cd7ee78b92d210b2.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoLinux 3.18.19 v3.18.19
Sasha Levin [Tue, 21 Jul 2015 01:13:10 +0000 (21:13 -0400)]
Linux 3.18.19

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agox86/xen: allow privcmd hypercalls to be preempted
David Vrabel [Thu, 19 Feb 2015 15:23:17 +0000 (15:23 +0000)]
x86/xen: allow privcmd hypercalls to be preempted

[ Upstream commit db0e2aa47f0a45fc56592c25ad370f8836cbb128 ]

Hypercalls submitted by user space tools via the privcmd driver can
take a long time (potentially many 10s of seconds) if the hypercall
has many sub-operations.

A fully preemptible kernel may deschedule such as task in any upcall
called from a hypercall continuation.

However, in a kernel with voluntary or no preemption, hypercall
continuations in Xen allow event handlers to be run but the task
issuing the hypercall will not be descheduled until the hypercall is
complete and the ioctl returns to user space.  These long running
tasks may also trigger the kernel's soft lockup detection.

Add xen_preemptible_hcall_begin() and xen_preemptible_hcall_end() to
bracket hypercalls that may be preempted.  Use these in the privcmd
driver.

When returning from an upcall, call xen_maybe_preempt_hcall() which
adds a schedule point if if the current task was within a preemptible
hypercall.

Since _cond_resched() can move the task to a different CPU, clear and
set xen_in_preemptible_hcall around the call.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agortnl: restore notifications for deleted interfaces
Nicolas Dichtel [Thu, 16 Jul 2015 09:31:43 +0000 (11:31 +0200)]
rtnl: restore notifications for deleted interfaces

The commit 984ff7a3e060 is an upstream backport. In fact, it depends on
commit 395eea6ccf2b ("rtnetlink: delay RTM_DELLINK notification until after ndo_uninit()")
which has not been backported in 3.18.y.

Before commit 395eea6ccf2b, rollback_registered_many() uses rtmsg_ifinfo().
The call to this function is done with dev->reg_state set to
NETREG_UNREGISTERING, thus testing this reg_state in rtmsg_ifinfo() is
wrong.

This patch partially reverts commit 984ff7a3e060.

Fixes: 984ff7a3e060 ("rtnl/bond: don't send rtnl msg for unregistered iface")
Reported-by: Kristian Evensen <kristian.evensen@gmail.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoInput: synaptics - add min/max quirk for Lenovo S540
Peter Hutterer [Mon, 8 Jun 2015 17:17:32 +0000 (10:17 -0700)]
Input: synaptics - add min/max quirk for Lenovo S540

[ Upstream commit 7f2ca8b55aeff1fe51ed3570200ef88a96060917 ]

https://bugzilla.redhat.com/show_bug.cgi?id=1223051#c2

Cc: stable@vger.kernel.org
Tested-by: tommy.gagnes@gmail.com
Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoRevert "Input: synaptics - add min/max quirk for Lenovo S540"
Sasha Levin [Tue, 14 Jul 2015 18:58:30 +0000 (14:58 -0400)]
Revert "Input: synaptics - add min/max quirk for Lenovo S540"

This reverts commit e73268af3e137b1f69d0f82d80aa88e0e8c7cbbe.

Seems like an incorrect merge on my end, will pick the upstream
version again next.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoRevert "nfs: take extra reference to fl->fl_file when running a LOCKU operation"
Sasha Levin [Mon, 13 Jul 2015 12:56:15 +0000 (08:56 -0400)]
Revert "nfs: take extra reference to fl->fl_file when running a LOCKU operation"

This reverts commit ed7f7f145ec1445a130513db9ad8f1547f77a578.

Reverting from stable tree as fix was found to be buggy. New fix pending.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agofs/ufs: restore s_lock mutex_init()
Fabian Frederick [Wed, 17 Jun 2015 16:15:45 +0000 (18:15 +0200)]
fs/ufs: restore s_lock mutex_init()

[ Upstream commit e4f95517f18271b1da36cfc5d700e46844396d6e ]

Add last missing line in commit "cdd9eefdf905"
("fs/ufs: restore s_lock mutex")

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoufs: Fix possible deadlock when looking up directories
Jan Kara [Tue, 2 Jun 2015 09:26:34 +0000 (11:26 +0200)]
ufs: Fix possible deadlock when looking up directories

[ Upstream commit 514d748f69c97a51a2645eb198ac5c6218f22ff9 ]

Commit e4502c63f56aeca88 (ufs: deal with nfsd/iget races) made ufs
create inodes with I_NEW flag set. However ufs_mkdir() never cleared
this flag. Thus if someone ever tried to lookup the directory by inode
number, he would deadlock waiting for I_NEW to be cleared. Luckily this
mostly happens only if the filesystem is exported over NFS since
otherwise we have the inode attached to dentry and don't look it up by
inode number. In rare cases dentry can get freed without inode being
freed and then we'd hit the deadlock even without NFS export.

Fix the problem by clearing I_NEW before instantiating new directory
inode.

Fixes: e4502c63f56aeca887ced37f24e0def1ef11cec8
Reported-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoufs: Fix warning from unlock_new_inode()
Jan Kara [Mon, 1 Jun 2015 12:52:04 +0000 (14:52 +0200)]
ufs: Fix warning from unlock_new_inode()

[ Upstream commit 12ecbb4b1d765a5076920999298d9625439dbe58 ]

Commit e4502c63f56aeca88 (ufs: deal with nfsd/iget races) introduced
unlock_new_inode() call into ufs_add_nondir(). However that function
gets called also from ufs_link() which hands it already initialized
inode and thus unlock_new_inode() complains. The problem is harmless but
annoying.

Fix the problem by opencoding necessary stuff in ufs_link()

Fixes: e4502c63f56aeca887ced37f24e0def1ef11cec8
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agovfs: Ignore unlocked mounts in fs_fully_visible
Eric W. Biederman [Wed, 7 Jan 2015 14:10:09 +0000 (08:10 -0600)]
vfs: Ignore unlocked mounts in fs_fully_visible

[ Upstream commit c89d4319ae55186496c43b7a6e510aa1d09dd387 ]

commit ceeb0e5d39fcdf4dca2c997bf225c7fc49200b37 upstream.

Limit the mounts fs_fully_visible considers to locked mounts.
Unlocked can always be unmounted so considering them adds hassle
but no security benefit.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoKVM: x86: properly restore LVT0
Radim Krčmář [Tue, 30 Jun 2015 20:19:17 +0000 (22:19 +0200)]
KVM: x86: properly restore LVT0

[ Upstream commit 58382447b9a9989da551a7b17e72756f6e238bb8 ]

commit db1385624c686fe99fe2d1b61a36e1537b915d08 upstream.

Legacy NMI watchdog didn't work after migration/resume, because
vapics_in_nmi_mode was left at 0.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoKVM: s390: virtio-ccw: don't overwrite config space values
Cornelia Huck [Mon, 29 Jun 2015 14:44:01 +0000 (16:44 +0200)]
KVM: s390: virtio-ccw: don't overwrite config space values

[ Upstream commit 431dae778aea4eed31bd12e5ee82edc571cd4d70 ]

Eric noticed problems with vhost-scsi and virtio-ccw: vhost-scsi
complained about overwriting values in the config space, which
was triggered by a broken implementation of virtio-ccw's config
get/set routines. It was probably sheer luck that we did not hit
this before.

When writing a value to the config space, the WRITE_CONF ccw will
always write from the beginning of the config space up to and
including the value to be set. If the config space up to the value
has not yet been retrieved from the device, however, we'll end up
overwriting values. Keep track of the known config space and update
if needed to avoid this.

Moreover, READ_CONF will only read the number of bytes it has been
instructed to retrieve, so we must not copy more than that to the
buffer, or we might overwrite trailing values.

Reported-by: Eric Farman <farman@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Eric Farman <farman@linux.vnet.ibm.com>
Tested-by: Eric Farman <farman@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoselinux: fix setting of security labels on NFS
J. Bruce Fields [Thu, 4 Jun 2015 19:57:25 +0000 (15:57 -0400)]
selinux: fix setting of security labels on NFS

[ Upstream commit 9fc2b4b436cff7d8403034676014f1be9d534942 ]

Before calling into the filesystem, vfs_setxattr calls
security_inode_setxattr, which ends up calling selinux_inode_setxattr in
our case.  That returns -EOPNOTSUPP whenever SBLABEL_MNT is not set.
SBLABEL_MNT was supposed to be set by sb_finish_set_opts, which sets it
only if selinux_is_sblabel_mnt returns true.

The selinux_is_sblabel_mnt logic was broken by eadcabc697e9 "SELinux: do
all flags twiddling in one place", which didn't take into the account
the SECURITY_FS_USE_NATIVE behavior that had been introduced for nfs
with eb9ae686507b "SELinux: Add new labeling type native labels".

This caused setxattr's of security labels over NFSv4.2 to fail.

Cc: stable@kernel.org # 3.13
Cc: Eric Paris <eparis@redhat.com>
Cc: David Quigley <dpquigl@davequigley.com>
Reported-by: Richard Chan <rc556677@outlook.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: added the stable dependency]
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agousb: gadget: f_fs: add extra check before unregister_gadget_item
Rui Miguel Silva [Wed, 20 May 2015 13:52:40 +0000 (14:52 +0100)]
usb: gadget: f_fs: add extra check before unregister_gadget_item

[ Upstream commit f14e9ad17f46051b02bffffac2036486097de19e ]

ffs_closed can race with configfs_rmdir which will call config_item_release, so
add an extra check to avoid calling the unregister_gadget_item with an null
gadget item.

Signed-off-by: Rui Miguel Silva <rui.silva@linaro.org>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoperf/x86: Honor the architectural performance monitoring version
Palik, Imre [Mon, 8 Jun 2015 12:46:49 +0000 (14:46 +0200)]
perf/x86: Honor the architectural performance monitoring version

[ Upstream commit 2c33645d366d13b969d936b68b9f4875b1fdddea ]

Architectural performance monitoring, version 1, doesn't support fixed counters.

Currently, even if a hypervisor advertises support for architectural
performance monitoring version 1, perf may still try to use the fixed
counters, as the constraints are set up based on the CPU model.

This patch ensures that perf honors the architectural performance monitoring
version returned by CPUID, and it only uses the fixed counters for version 2
and above.

(Some of the ideas in this patch came from Peter Zijlstra.)

Signed-off-by: Imre Palik <imrep@amazon.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Anthony Liguori <aliguori@amazon.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1433767609-1039-1-git-send-email-imrep.amz@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoperf: Fix ring_buffer_attach() RCU sync, again
Oleg Nesterov [Sat, 30 May 2015 20:04:25 +0000 (22:04 +0200)]
perf: Fix ring_buffer_attach() RCU sync, again

[ Upstream commit 2f993cf093643b98477c421fa2b9a98dcc940323 ]

While looking for other users of get_state/cond_sync. I Found
ring_buffer_attach() and it looks obviously buggy?

Don't we need to ensure that we have "synchronize" _between_
list_del() and list_add() ?

IOW. Suppose that ring_buffer_attach() preempts right_after
get_state_synchronize_rcu() and gp completes before spin_lock().

In this case cond_synchronize_rcu() does nothing and we reuse
->rb_entry without waiting for gp in between?

It also moves the ->rcu_pending check under "if (rb)", to make it
more readable imo.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave@stgolabs.net
Cc: der.herr@hofr.at
Cc: josh@joshtriplett.org
Cc: tj@kernel.org
Fixes: b69cf53640da ("perf: Fix a race between ring_buffer_detach() and ring_buffer_attach()")
Link: http://lkml.kernel.org/r/20150530200425.GA15748@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agox86/boot: Fix overflow warning with 32-bit binutils
Borislav Petkov [Fri, 19 Jun 2015 11:49:06 +0000 (13:49 +0200)]
x86/boot: Fix overflow warning with 32-bit binutils

[ Upstream commit 04c17341b42699a5859a8afa05e64ba08a4e5235 ]

When building the kernel with 32-bit binutils built with support
only for the i386 target, we get the following warning:

  arch/x86/kernel/head_32.S:66: Warning: shift count out of range (32 is not between 0 and 31)

The problem is that in that case, binutils' internal type
representation is 32-bit wide and the shift range overflows.

In order to fix this, manipulate the shift expression which
creates the 4GiB constant to not overflow the shift count.

Suggested-by: Michael Matz <matz@suse.de>
Reported-and-tested-by: Enrico Mioso <mrkiko.rs@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agovfs: Remove incorrect debugging WARN in prepend_path
Eric W. Biederman [Sun, 24 May 2015 14:25:00 +0000 (09:25 -0500)]
vfs: Remove incorrect debugging WARN in prepend_path

[ Upstream commit 93e3bce6287e1fb3e60d3324ed08555b5bbafa89 ]

The warning message in prepend_path is unclear and outdated.  It was
added as a warning that the mechanism for generating names of pseudo
files had been removed from prepend_path and d_dname should be used
instead.  Unfortunately the warning reads like a general warning,
making it unclear what to do with it.

Remove the warning.  The transition it was added to warn about is long
over, and I added code several years ago which in rare cases causes
the warning to fire on legitimate code, and the warning is now firing
and scaring people for no good reason.

Cc: stable@vger.kernel.org
Reported-by: Ivan Delalande <colona@arista.com>
Reported-by: Omar Sandoval <osandov@osandov.com>
Fixes: f48cfddc6729e ("vfs: In d_path don't call d_dname on a mount point")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoKVM: x86: make vapics_in_nmi_mode atomic
Radim Krčmář [Wed, 1 Jul 2015 13:31:49 +0000 (15:31 +0200)]
KVM: x86: make vapics_in_nmi_mode atomic

[ Upstream commit 42720138b06301cc8a7ee8a495a6d021c4b6a9bc ]

Writes were a bit racy, but hard to turn into a bug at the same time.
(Particularly because modern Linux doesn't use this feature anymore.)

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
[Actually the next patch makes it much, much easier to trigger the race
 so I'm including this one for stable@ as well. - Paolo]
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoarm: KVM: force execution of HCPTR access on VM exit
Marc Zyngier [Mon, 16 Mar 2015 10:59:43 +0000 (10:59 +0000)]
arm: KVM: force execution of HCPTR access on VM exit

[ Upstream commit 85e84ba31039595995dae80b277378213602891b ]

On VM entry, we disable access to the VFP registers in order to
perform a lazy save/restore of these registers.

On VM exit, we restore access, test if we did enable them before,
and save/restore the guest/host registers if necessary. In this
sequence, the FPEXC register is always accessed, irrespective
of the trapping configuration.

If the guest didn't touch the VFP registers, then the HCPTR access
has now enabled such access, but we're missing a barrier to ensure
architectural execution of the new HCPTR configuration. If the HCPTR
access has been delayed/reordered, the subsequent access to FPEXC
will cause a trap, which we aren't prepared to handle at all.

The same condition exists when trapping to enable VFP for the guest.

The fix is to introduce a barrier after enabling VFP access. In the
vmexit case, it can be relaxed to only takes place if the guest hasn't
accessed its view of the VFP registers, making the access to FPEXC safe.

The set_hcptr macro is modified to deal with both vmenter/vmexit and
vmtrap operations, and now takes an optional label that is branched to
when the guest hasn't touched the VFP registers.

Reported-by: Vikram Sethi <vikrams@codeaurora.org>
Cc: stable@kernel.org # v3.9+
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoiommu/amd: Handle large pages correctly in free_pagetable
Joerg Roedel [Thu, 18 Jun 2015 08:48:34 +0000 (10:48 +0200)]
iommu/amd: Handle large pages correctly in free_pagetable

[ Upstream commit 0b3fff54bc01e8e6064d222a33e6fa7adabd94cd ]

Make sure that we are skipping over large PTEs while walking
the page-table tree.

Cc: stable@kernel.org
Fixes: 5c34c403b723 ("iommu/amd: Fix memory leak in free_pagetable")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoarm64/kvm: Fix assembler compatibility of macros
Geoff Levand [Fri, 31 Oct 2014 23:06:47 +0000 (23:06 +0000)]
arm64/kvm: Fix assembler compatibility of macros

[ Upstream commit 286fb1cc32b11c18da3573a8c8c37a4f9da16e30 ]

Some of the macros defined in kvm_arm.h are useful in assembly files, but are
not compatible with the assembler.  Change any C language integer constant
definitions using appended U, UL, or ULL to the UL() preprocessor macro.  Also,
add a preprocessor include of the asm/memory.h file which defines the UL()
macro.

Fixes build errors like these when using kvm_arm.h in assembly
source files:

  Error: unexpected characters following instruction at operand 3 -- `and x0,x1,#((1U<<25)-1)'

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoKVM: nSVM: Check for NRIPS support before updating control field
Bandan Das [Thu, 11 Jun 2015 06:05:33 +0000 (02:05 -0400)]
KVM: nSVM: Check for NRIPS support before updating control field

[ Upstream commit f104765b4f81fd74d69e0eb161e89096deade2db ]

If hardware doesn't support DecodeAssist - a feature that provides
more information about the intercept in the VMCB, KVM decodes the
instruction and then updates the next_rip vmcb control field.
However, NRIP support itself depends on cpuid Fn8000_000A_EDX[NRIPS].
Since skip_emulated_instruction() doesn't verify nrip support
before accepting control.next_rip as valid, avoid writing this
field if support isn't present.

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoARM: clk-imx6q: refine sata's parent
Sebastien Szymanski [Wed, 20 May 2015 14:30:37 +0000 (16:30 +0200)]
ARM: clk-imx6q: refine sata's parent

[ Upstream commit 3a3ac04f0e0ef48c2d30ae6df7b8906421c2b35d ]

commit da946aeaeadcd24ff0cda9984c6fb8ed2bfd462a upstream.

According to IMX6D/Q RM, table 18-3, sata clock's parent is ahb, not ipg.

Signed-off-by: Sebastien Szymanski <sebastien.szymanski@armadeus.com>
Reviewed-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
[dirk.behme: Adjust moved file]
Signed-off-by: Dirk Behme <dirk.behme@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoBtrfs: make xattr replace operations atomic
Filipe Manana [Sun, 9 Nov 2014 08:38:39 +0000 (08:38 +0000)]
Btrfs: make xattr replace operations atomic

[ Upstream commit 02590fd855d1690568b2fa439c942e933221b57a ]

commit 5f5bc6b1e2d5a6f827bc860ef2dc5b6f365d1339 upstream.

Replacing a xattr consists of doing a lookup for its existing value, delete
the current value from the respective leaf, release the search path and then
finally insert the new value. This leaves a time window where readers (getxattr,
listxattrs) won't see any value for the xattr. Xattrs are used to store ACLs,
so this has security implications.

This change also fixes 2 other existing issues which were:

*) Deleting the old xattr value without verifying first if the new xattr will
   fit in the existing leaf item (in case multiple xattrs are packed in the
   same item due to name hash collision);

*) Returning -EEXIST when the flag XATTR_CREATE is given and the xattr doesn't
   exist but we have have an existing item that packs muliple xattrs with
   the same name hash as the input xattr. In this case we should return ENOSPC.

A test case for xfstests follows soon.

Thanks to Alexandre Oliva for reporting the non-atomicity of the xattr replace
implementation.

Reported-by: Alexandre Oliva <oliva@gnu.org>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agox86/microcode/intel: Guard against stack overflow in the loader
Quentin Casasnovas [Tue, 3 Feb 2015 12:00:22 +0000 (13:00 +0100)]
x86/microcode/intel: Guard against stack overflow in the loader

[ Upstream commit f84598bd7c851f8b0bf8cd0d7c3be0d73c432ff4 ]

mc_saved_tmp is a static array allocated on the stack, we need to make
sure mc_saved_count stays within its bounds, otherwise we're overflowing
the stack in _save_mc(). A specially crafted microcode header could lead
to a kernel crash or potentially kernel execution.

Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1422964824-22056-1-git-send-email-quentin.casasnovas@oracle.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agonetfilter: nf_tables: allow to change chain policy without hook if it exists
Pablo Neira Ayuso [Tue, 17 Mar 2015 12:21:42 +0000 (13:21 +0100)]
netfilter: nf_tables: allow to change chain policy without hook if it exists

[ Upstream commit d6b6cb1d3e6f78d55c2d4043d77d0d8def3f3b99 ]

If there's an existing base chain, we have to allow to change the
default policy without indicating the hook information.

However, if the chain doesn't exists, we have to enforce the presence of
the hook attribute.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agonetfilter: nft_compat: set IP6T_F_PROTO flag if protocol is set
Pablo Neira Ayuso [Sat, 21 Mar 2015 18:25:05 +0000 (19:25 +0100)]
netfilter: nft_compat: set IP6T_F_PROTO flag if protocol is set

[ Upstream commit 749177ccc74f9c6d0f51bd78a15c652a2134aa11 ]

ip6tables extensions check for this flag to restrict match/target to a
given protocol. Without this flag set, SYNPROXY6 returns an error.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agonetfilter: Zero the tuple in nfnl_cthelper_parse_tuple()
Ian Wilson [Thu, 12 Mar 2015 09:37:58 +0000 (09:37 +0000)]
netfilter: Zero the tuple in nfnl_cthelper_parse_tuple()

[ Upstream commit 78146572b9cd20452da47951812f35b1ad4906be ]

nfnl_cthelper_parse_tuple() is called from nfnl_cthelper_new(),
nfnl_cthelper_get() and nfnl_cthelper_del().  In each case they pass
a pointer to an nf_conntrack_tuple data structure local variable:

    struct nf_conntrack_tuple tuple;
    ...
    ret = nfnl_cthelper_parse_tuple(&tuple, tb[NFCTH_TUPLE]);

The problem is that this local variable is not initialized, and
nfnl_cthelper_parse_tuple() only initializes two fields: src.l3num and
dst.protonum.  This leaves all other fields with undefined values
based on whatever is on the stack:

    tuple->src.l3num = ntohs(nla_get_be16(tb[NFCTH_TUPLE_L3PROTONUM]));
    tuple->dst.protonum = nla_get_u8(tb[NFCTH_TUPLE_L4PROTONUM]);

The symptom observed was that when the rpc and tns helpers were added
then traffic to port 1536 was being sent to user-space.

Signed-off-by: Ian Wilson <iwilson@brocade.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agosb_edac: Fix erroneous bytes->gigabytes conversion
Jim Snow [Tue, 18 Nov 2014 13:51:09 +0000 (14:51 +0100)]
sb_edac: Fix erroneous bytes->gigabytes conversion

[ Upstream commit ece9210859abb2cc0126f8adbfbbdee668d4906a ]

commit 8c009100295597f23978c224aec5751a365bc965 upstream.

Signed-off-by: Jim Snow <jim.snow@intel.com>
Signed-off-by: Lukasz Anaczkowski <lukasz.anaczkowski@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
[ vlee: Backported to 3.14. Adjusted context. ]
Signed-off-by: Vinson Lee <vlee@twitter.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agotracing: Have filter check for balanced ops
Steven Rostedt [Mon, 15 Jun 2015 21:50:25 +0000 (17:50 -0400)]
tracing: Have filter check for balanced ops

[ Upstream commit 2cf30dc180cea808077f003c5116388183e54f9e ]

When the following filter is used it causes a warning to trigger:

 # cd /sys/kernel/debug/tracing
 # echo "((dev==1)blocks==2)" > events/ext4/ext4_truncate_exit/filter
-bash: echo: write error: Invalid argument
 # cat events/ext4/ext4_truncate_exit/filter
((dev==1)blocks==2)
^
parse_error: No error

 ------------[ cut here ]------------
 WARNING: CPU: 2 PID: 1223 at kernel/trace/trace_events_filter.c:1640 replace_preds+0x3c5/0x990()
 Modules linked in: bnep lockd grace bluetooth  ...
 CPU: 3 PID: 1223 Comm: bash Tainted: G        W       4.1.0-rc3-test+ #450
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
  0000000000000668 ffff8800c106bc98 ffffffff816ed4f9 ffff88011ead0cf0
  0000000000000000 ffff8800c106bcd8 ffffffff8107fb07 ffffffff8136b46c
  ffff8800c7d81d48 ffff8800d4c2bc00 ffff8800d4d4f920 00000000ffffffea
 Call Trace:
  [<ffffffff816ed4f9>] dump_stack+0x4c/0x6e
  [<ffffffff8107fb07>] warn_slowpath_common+0x97/0xe0
  [<ffffffff8136b46c>] ? _kstrtoull+0x2c/0x80
  [<ffffffff8107fb6a>] warn_slowpath_null+0x1a/0x20
  [<ffffffff81159065>] replace_preds+0x3c5/0x990
  [<ffffffff811596b2>] create_filter+0x82/0xb0
  [<ffffffff81159944>] apply_event_filter+0xd4/0x180
  [<ffffffff81152bbf>] event_filter_write+0x8f/0x120
  [<ffffffff811db2a8>] __vfs_write+0x28/0xe0
  [<ffffffff811dda43>] ? __sb_start_write+0x53/0xf0
  [<ffffffff812e51e0>] ? security_file_permission+0x30/0xc0
  [<ffffffff811dc408>] vfs_write+0xb8/0x1b0
  [<ffffffff811dc72f>] SyS_write+0x4f/0xb0
  [<ffffffff816f5217>] system_call_fastpath+0x12/0x6a
 ---[ end trace e11028bd95818dcd ]---

Worse yet, reading the error message (the filter again) it says that
there was no error, when there clearly was. The issue is that the
code that checks the input does not check for balanced ops. That is,
having an op between a closed parenthesis and the next token.

This would only cause a warning, and fail out before doing any real
harm, but it should still not caues a warning, and the error reported
should work:

 # cd /sys/kernel/debug/tracing
 # echo "((dev==1)blocks==2)" > events/ext4/ext4_truncate_exit/filter
-bash: echo: write error: Invalid argument
 # cat events/ext4/ext4_truncate_exit/filter
((dev==1)blocks==2)
^
parse_error: Meaningless filter expression

And give no kernel warning.

Link: http://lkml.kernel.org/r/20150615175025.7e809215@gandalf.local.home
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: stable@vger.kernel.org # 2.6.31+
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agofs: Fix S_NOSEC handling
Jan Kara [Thu, 21 May 2015 14:05:52 +0000 (16:05 +0200)]
fs: Fix S_NOSEC handling

[ Upstream commit 2426f3910069ed47c0cc58559a6d088af7920201 ]

file_remove_suid() could mistakenly set S_NOSEC inode bit when root was
modifying the file. As a result following writes to the file by ordinary
user would avoid clearing suid or sgid bits.

Fix the bug by checking actual mode bits before setting S_NOSEC.

CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agocan: fix loss of CAN frames in raw_rcv
Oliver Hartkopp [Sun, 21 Jun 2015 16:50:44 +0000 (18:50 +0200)]
can: fix loss of CAN frames in raw_rcv

[ Upstream commit 36c01245eb8046c16eee6431e7dbfbb302635fa8 ]

As reported by Manfred Schlaegl here

   http://marc.info/?l=linux-netdev&m=143482089824232&w=2

commit 514ac99c64b "can: fix multiple delivery of a single CAN frame for
overlapping CAN filters" requires the skb->tstamp to be set to check for
identical CAN skbs.

As net timestamping is influenced by several players (netstamp_needed and
netdev_tstamp_prequeue) Manfred missed a proper timestamp which leads to
CAN frame loss.

As skb timestamping became now mandatory for CAN related skbs this patch
makes sure that received CAN skbs always have a proper timestamp set.
Maybe there's a better solution in the future but this patch fixes the
CAN frame loss so far.

Reported-by: Manfred Schlaegl <manfred.schlaegl@gmx.at>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agofs/ufs: restore s_lock mutex
Fabian Frederick [Wed, 10 Jun 2015 00:09:32 +0000 (10:09 +1000)]
fs/ufs: restore s_lock mutex

[ Upstream commit cdd9eefdf905e92e7fc6cc393314efe68dc6ff66 ]

Commit 0244756edc4b98c ("ufs: sb mutex merge + mutex_destroy") generated
deadlocks in read/write mode on mkdir.

This patch partially reverts it keeping fixes by Andrew Morton and
mutex_destroy()

[AV: fixed a missing bit in ufs_remount()]

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Reported-by: Ian Campbell <ian.campbell@citrix.com>
Suggested-by: Jan Kara <jack@suse.cz>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Cc: Alexey Khoroshilov <khoroshilov@ispras.ru>
Cc: Roger Pau Monne <roger.pau@citrix.com>
Cc: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agofs/ufs: revert "ufs: fix deadlocks introduced by sb mutex merge"
Fabian Frederick [Wed, 10 Jun 2015 00:09:32 +0000 (10:09 +1000)]
fs/ufs: revert "ufs: fix deadlocks introduced by sb mutex merge"

[ Upstream commit 13b987ea275840d74d9df9a44326632fab1894da ]

This reverts commit 9ef7db7f38d0 ("ufs: fix deadlocks introduced by sb
mutex merge") That patch tried to solve commit 0244756edc4b98c ("ufs: sb
mutex merge + mutex_destroy") which is itself partially reverted due to
multiple deadlocks.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Suggested-by: Jan Kara <jack@suse.cz>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Cc: Alexey Khoroshilov <khoroshilov@ispras.ru>
Cc: Roger Pau Monne <roger.pau@citrix.com>
Cc: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agonetfilter: nfnetlink_cthelper: Remove 'const' and '&' to avoid warnings
Chen Gang [Wed, 24 Dec 2014 15:04:54 +0000 (23:04 +0800)]
netfilter: nfnetlink_cthelper: Remove 'const' and '&' to avoid warnings

[ Upstream commit b18c5d15e8714336365d9d51782d5b53afa0443c ]

The related code can be simplified, and also can avoid related warnings
(with allmodconfig under parisc):

    CC [M]  net/netfilter/nfnetlink_cthelper.o
  net/netfilter/nfnetlink_cthelper.c: In function ‘nfnl_cthelper_from_nlattr’:
  net/netfilter/nfnetlink_cthelper.c:97:9: warning: passing argument 1 o ‘memcpy’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-array-qualifiers]
    memcpy(&help->data, nla_data(attr), help->helper->data_len);
           ^
  In file included from include/linux/string.h:17:0,
                   from include/uapi/linux/uuid.h:25,
                   from include/linux/uuid.h:23,
                   from include/linux/mod_devicetable.h:12,
                   from ./arch/parisc/include/asm/hardware.h:4,
                   from ./arch/parisc/include/asm/processor.h:15,
                   from ./arch/parisc/include/asm/spinlock.h:6,
                   from ./arch/parisc/include/asm/atomic.h:21,
                   from include/linux/atomic.h:4,
                   from ./arch/parisc/include/asm/bitops.h:12,
                   from include/linux/bitops.h:36,
                   from include/linux/kernel.h:10,
                   from include/linux/list.h:8,
                   from include/linux/module.h:9,
                   from net/netfilter/nfnetlink_cthelper.c:11:
  ./arch/parisc/include/asm/string.h:8:8: note: expected ‘void *’ but argument is of type ‘const char (*)[]’
   void * memcpy(void * dest,const void *src,size_t count);
          ^

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@soleta.eu>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agoconfig: Enable NEED_DMA_MAP_STATE by default when SWIOTLB is selected
Konrad Rzeszutek Wilk [Fri, 17 Apr 2015 19:04:48 +0000 (15:04 -0400)]
config: Enable NEED_DMA_MAP_STATE by default when SWIOTLB is selected

[ Upstream commit a6dfa128ce5c414ab46b1d690f7a1b8decb8526d ]

A huge amount of NIC drivers use the DMA API, however if
compiled under 32-bit an very important part of the DMA API can
be ommitted leading to the drivers not working at all
(especially if used with 'swiotlb=force iommu=soft').

As Prashant Sreedharan explains it: "the driver [tg3] uses
DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of
the dma "mapping" and dma_unmap_addr() to get the "mapping"
value. On most of the platforms this is a no-op, but ... with
"iommu=soft and swiotlb=force" this house keeping is required,
... otherwise we pass 0 while calling pci_unmap_/pci_dma_sync_
instead of the DMA address."

As such enable this even when using 32-bit kernels.

Reported-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Prashant Sreedharan <prashant@broadcom.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Chan <mchan@broadcom.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: boris.ostrovsky@oracle.com
Cc: cascardo@linux.vnet.ibm.com
Cc: david.vrabel@citrix.com
Cc: sanjeevb@broadcom.com
Cc: siva.kallam@broadcom.com
Cc: vyasevich@gmail.com
Cc: xen-devel@lists.xensource.com
Link: http://lkml.kernel.org/r/20150417190448.GA9462@l.oracle.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agokprobes/x86: Return correct length in __copy_instruction()
Eugene Shatokhin [Tue, 17 Mar 2015 10:09:18 +0000 (19:09 +0900)]
kprobes/x86: Return correct length in __copy_instruction()

[ Upstream commit c80e5c0c23ce2282476fdc64c4b5e3d3a40723fd ]

On x86-64, __copy_instruction() always returns 0 (error) if the
instruction uses %rip-relative addressing. This is because
kernel_insn_init() is called the second time for 'insn' instance
in such cases and sets all its fields to 0.

Because of this, trying to place a kprobe on such instruction
will fail, register_kprobe() will return -EINVAL.

This patch fixes the problem.

Signed-off-by: Eugene Shatokhin <eugene.shatokhin@rosalab.ru>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Link: http://lkml.kernel.org/r/20150317100918.28349.94654.stgit@localhost.localdomain
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agomnt: Modify fs_fully_visible to deal with locked ro nodev and atime
Eric W. Biederman [Sat, 9 May 2015 04:49:47 +0000 (23:49 -0500)]
mnt: Modify fs_fully_visible to deal with locked ro nodev and atime

[ Upstream commit 8c6cf9cc829fcd0b179b59f7fe288941d0e31108 ]

Ignore an existing mount if the locked readonly, nodev or atime
attributes are less permissive than the desired attributes
of the new mount.

On success ensure the new mount locks all of the same readonly, nodev and
atime attributes as the old mount.

The nosuid and noexec attributes are not checked here as this change
is destined for stable and enforcing those attributes causes a
regression in lxc and libvirt-lxc where those applications will not
start and there are no known executables on sysfs or proc and no known
way to create exectuables without code modifications

Cc: stable@vger.kernel.org
Fixes: e51db73532955 ("userns: Better restrictions on when proc and sysfs can be mounted")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agotty/serial: at91: RS485 mode: 0 is valid for delay_rts_after_send
Nicolas Ferre [Mon, 11 May 2015 11:00:31 +0000 (13:00 +0200)]
tty/serial: at91: RS485 mode: 0 is valid for delay_rts_after_send

[ Upstream commit 8687634b7908c42eb700e0469e110e02833611d1 ]

In RS485 mode, we may want to set the delay_rts_after_send value to 0.
In the datasheet, the 0 value is said to "disable" the Transmitter Timeguard but
this is exactly the expected behavior if we want no delay...

Moreover, if the value was set to non-zero value by device-tree or earlier
ioctl command, it was impossible to change it back to zero.

Reported-by: Sami Pietikäinen <Sami.Pietikainen@wapice.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: stable@vger.kernel.org # 3.2+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
9 years agomnt: Refactor the logic for mounting sysfs and proc in a user namespace
Eric W. Biederman [Sat, 9 May 2015 04:22:29 +0000 (23:22 -0500)]
mnt: Refactor the logic for mounting sysfs and proc in a user namespace

[ Upstream commit 1b852bceb0d111e510d1a15826ecc4a19358d512 ]

Fresh mounts of proc and sysfs are a very special case that works very
much like a bind mount.  Unfortunately the current structure can not
preserve the MNT_LOCK... mount flags.  Therefore refactor the logic
into a form that can be modified to preserve those lock bits.

Add a new filesystem flag FS_USERNS_VISIBLE that requires some mount
of the filesystem be fully visible in the current mount namespace,
before the filesystem may be mounted.

Move the logic for calling fs_fully_visible from proc and sysfs into
fs/namespace.c where it has greater access to mount namespace state.

Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoLinux 3.18.18 v3.18.18
Sasha Levin [Fri, 10 Jul 2015 00:46:48 +0000 (20:46 -0400)]
Linux 3.18.18

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agommc: sdhci-pxav3: do the mbus window configuration after enabling clocks
Thomas Petazzoni [Wed, 31 Dec 2014 10:54:10 +0000 (11:54 +0100)]
mmc: sdhci-pxav3: do the mbus window configuration after enabling clocks

[ upstream commit aa8165f914420f143476305a01894b017d3abe6b ]

In commit 5491ce3f79ee ("mmc: sdhci-pxav3: add support for the Armada
38x SDHCI controller"), the sdhci-pxav3 driver was extended to include
support for the SDHCI controller found in the Armada 38x
processor. This mainly involved adding some MBus window related
configuration.

However, this configuration is currently done too early in ->probe():
it is done before clocks are enabled, while this configuration
involves touching the registers of the controller, which will hang the
SoC if the clock is disabled. It wasn't noticed until now because the
bootloader typically leaves gatable clocks enabled, but in situations
where we have a deferred probe (due to a CD GPIO that cannot be taken,
for example), then the probe will be re-tried later, after a clock
disable has been done in the exit path of the failed probe attempt of
the device. This second probe() will hang the system due to the clock
being disabled.

This can for example be produced on Armada 385 GP, which has a CD GPIO
connected to an I2C PCA9555. If the driver for the PCA9555 is not
compiled into the kernel, then we will have the following sequence of
events:

  1. The SDHCI probes
  2. It does the MBus configuration (which works, because the clock is
     left enabled by the bootloader)
  3. It enables the clock
  4. It tries to get the CD GPIO, which fails due to the driver being
     missing, so -EPROBE_DEFER is returned.
  5. Before returning -EPROBE_DEFER, the driver cleans up what was
     done, which includes disabling the clock.
  6. Later on, the SDHCI probe is tried again.
  7. It does the MBus configuration, which hangs because the clock is
     no longer enabled.

This commit does the obvious fix of doing the MBus configuration after
the clock has been enabled by the driver.

Fixes: 5491ce3f79ee ("mmc: sdhci-pxav3: add support for the Armada 38x SDHCI controller")
Cc: <stable@vger.kernel.org> # v3.15+
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
[jogo: rebased onto 3.18.17]
Signed-off-by: Jonas Gorski <jogo@openwrt.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agosctp: Fix race between OOTB responce and route removal
Alexander Sverdlin [Mon, 29 Jun 2015 08:41:03 +0000 (10:41 +0200)]
sctp: Fix race between OOTB responce and route removal

[ Upstream commit 29c4afc4e98f4dc0ea9df22c631841f9c220b944 ]

There is NULL pointer dereference possible during statistics update if the route
used for OOTB responce is removed at unfortunate time. If the route exists when
we receive OOTB packet and we finally jump into sctp_packet_transmit() to send
ABORT, but in the meantime route is removed under our feet, we take "no_route"
path and try to update stats with IP_INC_STATS(sock_net(asoc->base.sk), ...).

But sctp_ootb_pkt_new() used to prepare responce packet doesn't call
sctp_transport_set_owner() and therefore there is no asoc associated with this
packet. Probably temporary asoc just for OOTB responces is overkill, so just
introduce a check like in all other places in sctp_packet_transmit(), where
"asoc" is dereferenced.

To reproduce this, one needs to
0. ensure that sctp module is loaded (otherwise ABORT is not generated)
1. remove default route on the machine
2. while true; do
     ip route del [interface-specific route]
     ip route add [interface-specific route]
   done
3. send enough OOTB packets (i.e. HB REQs) from another host to trigger ABORT
   responce

On x86_64 the crash looks like this:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp]
PGD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: ...
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O    4.0.5-1-ARCH #1
Hardware name: ...
task: ffffffff818124c0 ti: ffffffff81800000 task.ti: ffffffff81800000
RIP: 0010:[<ffffffffa05ec9ac>]  [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp]
RSP: 0018:ffff880127c037b8  EFLAGS: 00010296
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000015ff66b480
RDX: 00000015ff66b400 RSI: ffff880127c17200 RDI: ffff880123403700
RBP: ffff880127c03888 R08: 0000000000017200 R09: ffffffff814625af
R10: ffffea00047e4680 R11: 00000000ffffff80 R12: ffff8800b0d38a28
R13: ffff8800b0d38a28 R14: ffff8800b3e88000 R15: ffffffffa05f24e0
FS:  0000000000000000(0000) GS:ffff880127c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 00000000c855b000 CR4: 00000000000007f0
Stack:
 ffff880127c03910 ffff8800b0d38a28 ffffffff8189d240 ffff88011f91b400
 ffff880127c03828 ffffffffa05c94c5 0000000000000000 ffff8800baa1c520
 0000000000000000 0000000000000001 0000000000000000 0000000000000000
Call Trace:
 <IRQ>
 [<ffffffffa05c94c5>] ? sctp_sf_tabort_8_4_8.isra.20+0x85/0x140 [sctp]
 [<ffffffffa05d6b42>] ? sctp_transport_put+0x52/0x80 [sctp]
 [<ffffffffa05d0bfc>] sctp_do_sm+0xb8c/0x19a0 [sctp]
 [<ffffffff810b0e00>] ? trigger_load_balance+0x90/0x210
 [<ffffffff810e0329>] ? update_process_times+0x59/0x60
 [<ffffffff812c7a40>] ? timerqueue_add+0x60/0xb0
 [<ffffffff810e0549>] ? enqueue_hrtimer+0x29/0xa0
 [<ffffffff8101f599>] ? read_tsc+0x9/0x10
 [<ffffffff8116d4b5>] ? put_page+0x55/0x60
 [<ffffffff810ee1ad>] ? clockevents_program_event+0x6d/0x100
 [<ffffffff81462b68>] ? skb_free_head+0x58/0x80
 [<ffffffffa029a10b>] ? chksum_update+0x1b/0x27 [crc32c_generic]
 [<ffffffff81283f3e>] ? crypto_shash_update+0xce/0xf0
 [<ffffffffa05d3993>] sctp_endpoint_bh_rcv+0x113/0x280 [sctp]
 [<ffffffffa05dd4e6>] sctp_inq_push+0x46/0x60 [sctp]
 [<ffffffffa05ed7a0>] sctp_rcv+0x880/0x910 [sctp]
 [<ffffffffa05ecb50>] ? sctp_packet_transmit_chunk+0xb0/0xb0 [sctp]
 [<ffffffffa05ecb70>] ? sctp_csum_update+0x20/0x20 [sctp]
 [<ffffffff814b05a5>] ? ip_route_input_noref+0x235/0xd30
 [<ffffffff81051d6b>] ? ack_ioapic_level+0x7b/0x150
 [<ffffffff814b27be>] ip_local_deliver_finish+0xae/0x210
 [<ffffffff814b2e15>] ip_local_deliver+0x35/0x90
 [<ffffffff814b2a15>] ip_rcv_finish+0xf5/0x370
 [<ffffffff814b3128>] ip_rcv+0x2b8/0x3a0
 [<ffffffff81474193>] __netif_receive_skb_core+0x763/0xa50
 [<ffffffff81476c28>] __netif_receive_skb+0x18/0x60
 [<ffffffff81476cb0>] netif_receive_skb_internal+0x40/0xd0
 [<ffffffff814776c8>] napi_gro_receive+0xe8/0x120
 [<ffffffffa03946aa>] rtl8169_poll+0x2da/0x660 [r8169]
 [<ffffffff8147896a>] net_rx_action+0x21a/0x360
 [<ffffffff81078dc1>] __do_softirq+0xe1/0x2d0
 [<ffffffff8107912d>] irq_exit+0xad/0xb0
 [<ffffffff8157d158>] do_IRQ+0x58/0xf0
 [<ffffffff8157b06d>] common_interrupt+0x6d/0x6d
 <EOI>
 [<ffffffff810e1218>] ? hrtimer_start+0x18/0x20
 [<ffffffffa05d65f9>] ? sctp_transport_destroy_rcu+0x29/0x30 [sctp]
 [<ffffffff81020c50>] ? mwait_idle+0x60/0xa0
 [<ffffffff810216ef>] arch_cpu_idle+0xf/0x20
 [<ffffffff810b731c>] cpu_startup_entry+0x3ec/0x480
 [<ffffffff8156b365>] rest_init+0x85/0x90
 [<ffffffff818eb035>] start_kernel+0x48b/0x4ac
 [<ffffffff818ea120>] ? early_idt_handlers+0x120/0x120
 [<ffffffff818ea339>] x86_64_start_reservations+0x2a/0x2c
 [<ffffffff818ea49c>] x86_64_start_kernel+0x161/0x184
Code: 90 48 8b 80 b8 00 00 00 48 89 85 70 ff ff ff 48 83 bd 70 ff ff ff 00 0f 85 cd fa ff ff 48 89 df 31 db e8 18 63 e7 e0 48 8b 45 80 <48> 8b 40 20 48 8b 40 30 48 8b 80 68 01 00 00 65 48 ff 40 78 e9
RIP  [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp]
 RSP <ffff880127c037b8>
CR2: 0000000000000020
---[ end trace 5aec7fd2dc983574 ]---
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
drm_kms_helper: panic occurred, switching back to text console
---[ end Kernel panic - not syncing: Fatal exception in interrupt

Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agobnx2x: fix lockdep splat
Eric Dumazet [Fri, 26 Jun 2015 05:32:29 +0000 (07:32 +0200)]
bnx2x: fix lockdep splat

[ Upstream commit d53c66a5b80698620f7c9ba2372fff4017e987b8 ]

Michel reported following lockdep splat

[   44.718117] INFO: trying to register non-static key.
[   44.723081] the code is fine but needs lockdep annotation.
[   44.728559] turning off the locking correctness validator.
[   44.734036] CPU: 8 PID: 5483 Comm: ethtool Not tainted 4.1.0
[   44.770289] Call Trace:
[   44.772741]  [<ffffffff816eb1cd>] dump_stack+0x4c/0x65
[   44.777879]  [<ffffffff8111d921>] ? console_unlock+0x1f1/0x510
[   44.783708]  [<ffffffff811121f5>] __lock_acquire+0x1d05/0x1f10
[   44.789538]  [<ffffffff8111370a>] ? mark_held_locks+0x6a/0x90
[   44.795276]  [<ffffffff81113835>] ? trace_hardirqs_on_caller+0x105/0x1d0
[   44.801967]  [<ffffffff8111390d>] ? trace_hardirqs_on+0xd/0x10
[   44.807793]  [<ffffffff811330fa>] ? hrtimer_try_to_cancel+0x4a/0x250
[   44.814142]  [<ffffffff81112ba6>] lock_acquire+0xb6/0x290
[   44.819537]  [<ffffffff810d6675>] ? flush_work+0x5/0x280
[   44.824844]  [<ffffffff810d66ad>] flush_work+0x3d/0x280
[   44.830061]  [<ffffffff810d6675>] ? flush_work+0x5/0x280
[   44.835366]  [<ffffffff816f3c43>] ? schedule_hrtimeout_range+0x13/0x20
[   44.841889]  [<ffffffff8112ec9b>] ? usleep_range+0x4b/0x50
[   44.847365]  [<ffffffff8111370a>] ? mark_held_locks+0x6a/0x90
[   44.853102]  [<ffffffff810d8585>] ? __cancel_work_timer+0x105/0x1c0
[   44.859359]  [<ffffffff81113835>] ? trace_hardirqs_on_caller+0x105/0x1d0
[   44.866045]  [<ffffffff810d851f>] __cancel_work_timer+0x9f/0x1c0
[   44.872048]  [<ffffffffa0010982>] ? bnx2x_func_stop+0x42/0x90 [bnx2x]
[   44.878481]  [<ffffffff810d8670>] cancel_work_sync+0x10/0x20
[   44.884134]  [<ffffffffa00259e5>] bnx2x_chip_cleanup+0x245/0x730 [bnx2x]
[   44.890829]  [<ffffffff8110ce02>] ? up+0x32/0x50
[   44.895439]  [<ffffffff811306b5>] ? del_timer_sync+0x5/0xd0
[   44.901005]  [<ffffffffa005596d>] bnx2x_nic_unload+0x20d/0x8e0 [bnx2x]
[   44.907527]  [<ffffffff811f1aef>] ? might_fault+0x5f/0xb0
[   44.912921]  [<ffffffffa005851c>] bnx2x_reload_if_running+0x2c/0x50 [bnx2x]
[   44.919879]  [<ffffffffa005a3c5>] bnx2x_set_ringparam+0x2b5/0x460 [bnx2x]
[   44.926664]  [<ffffffff815d498b>] dev_ethtool+0x55b/0x1c40
[   44.932148]  [<ffffffff815dfdc7>] ? rtnl_lock+0x17/0x20
[   44.937364]  [<ffffffff815e7f8b>] dev_ioctl+0x17b/0x630
[   44.942582]  [<ffffffff815abf8d>] sock_do_ioctl+0x5d/0x70
[   44.947972]  [<ffffffff815ac013>] sock_ioctl+0x73/0x280
[   44.953192]  [<ffffffff8124c1c8>] do_vfs_ioctl+0x88/0x5b0
[   44.958587]  [<ffffffff8110d0b3>] ? up_read+0x23/0x40
[   44.963631]  [<ffffffff812584cc>] ? __fget_light+0x6c/0xa0
[   44.969105]  [<ffffffff8124c781>] SyS_ioctl+0x91/0xb0
[   44.974149]  [<ffffffff816f4dd7>] system_call_fastpath+0x12/0x6f

As bnx2x_init_ptp() is only called if bp->flags contains PTP_SUPPORTED,
we also need to guard bnx2x_stop_ptp() with same condition, otherwise
ptp_task workqueue is not initialized and kernel barfs on
cancel_work_sync()

Fixes: eeed018cbfa30 ("bnx2x: Add timestamping and PTP hardware clock support")
Reported-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Michal Kalderon <Michal.Kalderon@qlogic.com>
Cc: Ariel Elior <Ariel.Elior@qlogic.com>
Cc: Yuval Mintz <Yuval.Mintz@qlogic.com>
Cc: David Decotigny <decot@google.com>
Acked-by: Sony Chacko <sony.chacko@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agonet: phy: fix phy link up when limiting speed via device tree
Mugunthan V N [Thu, 25 Jun 2015 16:51:02 +0000 (22:21 +0530)]
net: phy: fix phy link up when limiting speed via device tree

[ Upstream commit eb686231fce3770299760f24fdcf5ad041f44153 ]

When limiting phy link speed using "max-speed" to 100mbps or less on a
giga bit phy, phy never completes auto negotiation and phy state
machine is held in PHY_AN. Fixing this issue by comparing the giga
bit advertise though phydev->supported doesn't have it but phy has
BMSR_ESTATEN set. So that auto negotiation is restarted as old and
new advertise are different and link comes up fine.

Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agonet/mlx4_en: Wake TX queues only when there's enough room
Ido Shamay [Thu, 25 Jun 2015 08:29:42 +0000 (11:29 +0300)]
net/mlx4_en: Wake TX queues only when there's enough room

[ Upstream commit 488a9b48e398b157703766e2cd91ea45ac6997c5 ]

Indication of a single completed packet, marked by txbbs_skipped
being bigger then zero, in not enough in order to wake up a
stopped TX queue. The completed packet may contain a single TXBB,
while next packet to be sent (after the wake up) may have multiple
TXBBs (LSO/TSO packets for example), causing overflow in queue followed
by WQE corruption and TX queue timeout.
Instead, wake the stopped queue only when there's enough room for the
worst case (maximum sized WQE) packet that we should need to handle after
the queue is opened again.

Also created an helper routine - mlx4_en_is_tx_ring_full, which checks
if the current TX ring is full or not. It provides better code readability
and removes code duplication.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agotcp: Do not call tcp_fastopen_reset_cipher from interrupt context
Christoph Paasch [Thu, 18 Jun 2015 16:15:34 +0000 (09:15 -0700)]
tcp: Do not call tcp_fastopen_reset_cipher from interrupt context

[ Upstream commit dfea2aa654243f70dc53b8648d0bbdeec55a7df1 ]

tcp_fastopen_reset_cipher really cannot be called from interrupt
context. It allocates the tcp_fastopen_context with GFP_KERNEL and
calls crypto_alloc_cipher, which allocates all kind of stuff with
GFP_KERNEL.

Thus, we might sleep when the key-generation is triggered by an
incoming TFO cookie-request which would then happen in interrupt-
context, as shown by enabling CONFIG_DEBUG_ATOMIC_SLEEP:

[   36.001813] BUG: sleeping function called from invalid context at mm/slub.c:1266
[   36.003624] in_atomic(): 1, irqs_disabled(): 0, pid: 1016, name: packetdrill
[   36.004859] CPU: 1 PID: 1016 Comm: packetdrill Not tainted 4.1.0-rc7 #14
[   36.006085] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[   36.008250]  00000000000004f2 ffff88007f8838a8 ffffffff8171d53a ffff880075a084a8
[   36.009630]  ffff880075a08000 ffff88007f8838c8 ffffffff810967d3 ffff88007f883928
[   36.011076]  0000000000000000 ffff88007f8838f8 ffffffff81096892 ffff88007f89be00
[   36.012494] Call Trace:
[   36.012953]  <IRQ>  [<ffffffff8171d53a>] dump_stack+0x4f/0x6d
[   36.014085]  [<ffffffff810967d3>] ___might_sleep+0x103/0x170
[   36.015117]  [<ffffffff81096892>] __might_sleep+0x52/0x90
[   36.016117]  [<ffffffff8118e887>] kmem_cache_alloc_trace+0x47/0x190
[   36.017266]  [<ffffffff81680d82>] ? tcp_fastopen_reset_cipher+0x42/0x130
[   36.018485]  [<ffffffff81680d82>] tcp_fastopen_reset_cipher+0x42/0x130
[   36.019679]  [<ffffffff81680f01>] tcp_fastopen_init_key_once+0x61/0x70
[   36.020884]  [<ffffffff81680f2c>] __tcp_fastopen_cookie_gen+0x1c/0x60
[   36.022058]  [<ffffffff816814ff>] tcp_try_fastopen+0x58f/0x730
[   36.023118]  [<ffffffff81671788>] tcp_conn_request+0x3e8/0x7b0
[   36.024185]  [<ffffffff810e3872>] ? __module_text_address+0x12/0x60
[   36.025327]  [<ffffffff8167b2e1>] tcp_v4_conn_request+0x51/0x60
[   36.026410]  [<ffffffff816727e0>] tcp_rcv_state_process+0x190/0xda0
[   36.027556]  [<ffffffff81661f97>] ? __inet_lookup_established+0x47/0x170
[   36.028784]  [<ffffffff8167c2ad>] tcp_v4_do_rcv+0x16d/0x3d0
[   36.029832]  [<ffffffff812e6806>] ? security_sock_rcv_skb+0x16/0x20
[   36.030936]  [<ffffffff8167cc8a>] tcp_v4_rcv+0x77a/0x7b0
[   36.031875]  [<ffffffff816af8c3>] ? iptable_filter_hook+0x33/0x70
[   36.032953]  [<ffffffff81657d22>] ip_local_deliver_finish+0x92/0x1f0
[   36.034065]  [<ffffffff81657f1a>] ip_local_deliver+0x9a/0xb0
[   36.035069]  [<ffffffff81657c90>] ? ip_rcv+0x3d0/0x3d0
[   36.035963]  [<ffffffff81657569>] ip_rcv_finish+0x119/0x330
[   36.036950]  [<ffffffff81657ba7>] ip_rcv+0x2e7/0x3d0
[   36.037847]  [<ffffffff81610652>] __netif_receive_skb_core+0x552/0x930
[   36.038994]  [<ffffffff81610a57>] __netif_receive_skb+0x27/0x70
[   36.040033]  [<ffffffff81610b72>] process_backlog+0xd2/0x1f0
[   36.041025]  [<ffffffff81611482>] net_rx_action+0x122/0x310
[   36.042007]  [<ffffffff81076743>] __do_softirq+0x103/0x2f0
[   36.042978]  [<ffffffff81723e3c>] do_softirq_own_stack+0x1c/0x30

This patch moves the call to tcp_fastopen_init_key_once to the places
where a listener socket creates its TFO-state, which always happens in
user-context (either from the setsockopt, or implicitly during the
listen()-call)

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Fixes: 222e83d2e0ae ("tcp: switch tcp_fastopen key generation to net_get_random_once")
Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>