git.hungrycats.org Git

Btrfs: Fix a jump typo of nodatasum_case to avoid wrong WARN_ON()

if (sctx->is_dev_replace && !is_metadata && !have_csum) {
    ...
    goto nodatasum_case;
}
...
nodatasum_case:
    WARN_ON(sctx->is_dev_replace);

In above code, nodatasum_case marker should be moved after
WARN_ON().

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit b25c94c580c27bb6cff1e2f478746e77119215ad)

Btrfs: add ref_count and free function for btrfs_bio

1: ref_count is simple than current RBIO_HOLD_BBIO_MAP_BIT flag
to keep btrfs_bio's memory in raid56 recovery implement.
2: free function for bbio will make code clean and flexible, plus
forced data type checking in compile.

Changelog v1->v2:
Rename following by David Sterba's suggestion:
put_btrfs_bio() -> btrfs_put_bio()
get_btrfs_bio() -> btrfs_get_bio()
bbio->ref_count -> bbio->refs

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit 6e9606d2a2dce098c1739fb3cd82a1c34fd73d3a)

Btrfs: Make raid_map array be inlined in btrfs_bio structure

It can make code more simple and clear, we need not care about
free bbio and raid_map together.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit 8e5cfb55d3f7dc764cd7f4c966d4c2687eaf7569)

Btrfs: sort raid_map before adding tgtdev stripes

It can avoid complex calculation of real stripes in sort,
moreover, we can clean up code of sorting tgtdev_map because it
will be in order initially.

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit cc7539edea6dd02536d56f0a3405b8bb7ae24168)

Btrfs: fix a out-of-bound access of raid_map

We add the number of stripes on target devices into bbio->num_stripes
if we are under device replacement, and we just sort the raid_map of
those stripes that not on the target devices, so if when we need
real raid_map, we need skip the stripes on the target devices.

Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com>
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit e34c330d639177bbb345bf2bde16613b00cc6e6b)

Btrfs: fix fsync log replay for inodes with a mix of regular refs and extrefs

If we have an inode with a large number of hard links, some of which may
be extrefs, turn a regular ref into an extref, fsync the inode and then
replay the fsync log (after a crash/reboot), we can endup with an fsync
log that makes the replay code always fail with -EOVERFLOW when processing
the inode's references.

This is easy to reproduce with the test case I made for xfstests. Its steps
are the following:

   _scratch_mkfs "-O extref" >> $seqres.full 2>&1
   _init_flakey
   _mount_flakey

   # Create a test file with 3001 hard links. This number is large enough to
   # make btrfs start using extrefs at some point even if the fs has the maximum
   # possible leaf/node size (64Kb).
   echo "hello world" > $SCRATCH_MNT/foo
   for i in `seq 1 3000`; do
       ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_`printf "%04d" $i`
   done

   # Make sure all metadata and data are durably persisted.
   sync

   # Now remove one link, add a new one with a new name, add another new one with
   # the same name as the one we just removed and fsync the inode.
   rm -f $SCRATCH_MNT/foo_link_0001
   ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_3001
   ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_0001
   rm -f $SCRATCH_MNT/foo_link_0002
   ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_3002
   ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_3003
   $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo

   # Simulate a crash/power loss. This makes sure the next mount
   # will see an fsync log and will replay that log.

   _load_flakey_table $FLAKEY_DROP_WRITES
   _unmount_flakey

   _load_flakey_table $FLAKEY_ALLOW_WRITES
   _mount_flakey

   # Check that the number of hard links is correct, we are able to remove all
   # the hard links and read the file's data. This is just to verify we don't
   # get stale file handle errors (due to dangling directory index entries that
   # point to inodes that no longer exist).
   echo "Link count: $(stat --format=%h $SCRATCH_MNT/foo)"
   [ -f $SCRATCH_MNT/foo ] || echo "Link foo is missing"
   for ((i = 1; i <= 3003; i++)); do
       name=foo_link_`printf "%04d" $i`
       if [ $i -eq 2 ]; then
           [ -f $SCRATCH_MNT/$name ] && echo "Link $name found"
       else
           [ -f $SCRATCH_MNT/$name ] || echo "Link $name is missing"
       fi
   done
   rm -f $SCRATCH_MNT/foo_link_*
   cat $SCRATCH_MNT/foo
   rm -f $SCRATCH_MNT/foo

   status=0
   exit

The fix is simply to correct the overflow condition when overwriting a
reference item because it was wrong, trying to increase the item in the
fs/subvol tree by an impossible amount. Also ensure that we don't insert
one normal ref and one ext ref for the same dentry - this happened because
processing a dir index entry from the parent in the log happened when
the normal ref item was full, which made the logic insert an extref and
later when the normal ref had enough room, it would be inserted again
when processing the ref item from the child inode in the log.

This issue has been present since the introduction of the extrefs feature
(2012).

A test case for xfstests follows soon. This test only passes if the previous
patch titled "Btrfs: fix fsync when extend references are added to an inode"
is applied too.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit df8d116ffa379f3ef09d7ff28da0f0c921cc9fa1)

Btrfs: fix fsync when extend references are added to an inode

If we added an extended reference to an inode and fsync'ed it, the log
replay code would make our inode have an incorrect link count, which
was lower then the expected/correct count.
This resulted in stale directory index entries after deleting some of
the hard links, and any access to the dangling directory entries resulted
in -ESTALE errors because the entries pointed to inode items that don't
exist anymore.

This is easy to reproduce with the test case I made for xfstests, and
the bulk of that test is:

    _scratch_mkfs "-O extref" >> $seqres.full 2>&1
    _init_flakey
    _mount_flakey

    # Create a test file with 3001 hard links. This number is large enough to
    # make btrfs start using extrefs at some point even if the fs has the maximum
    # possible leaf/node size (64Kb).
    echo "hello world" > $SCRATCH_MNT/foo
    for i in `seq 1 3000`; do
        ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_`printf "%04d" $i`
    done

    # Make sure all metadata and data are durably persisted.
    sync

    # Add one more link to the inode that ends up being a btrfs extref and fsync
    # the inode.
    ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link_3001
    $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo

    # Simulate a crash/power loss. This makes sure the next mount
    # will see an fsync log and will replay that log.

    _load_flakey_table $FLAKEY_DROP_WRITES
    _unmount_flakey

    _load_flakey_table $FLAKEY_ALLOW_WRITES
    _mount_flakey

    # Now after the fsync log replay btrfs left our inode with a wrong link count N,
    # which was smaller than the correct link count M (N < M).
    # So after removing N hard links, the remaining M - N directory entries were
    # still visible to user space but it was impossible to do anything with them
    # because they pointed to an inode that didn't exist anymore. This resulted in
    # stale file handle errors (-ESTALE) when accessing those dentries for example.
    #
    # So remove all hard links except the first one and then attempt to read the
    # file, to verify we don't get an -ESTALE error when accessing the inodel
    #
    # The btrfs fsck tool also detected the incorrect inode link count and it
    # reported an error message like the following:
    #
    # root 5 inode 257 errors 2001, no inode item, link count wrong
    #   unresolved ref dir 256 index 2978 namelen 13 name foo_link_2976 filetype 1 errors 4, no inode ref
    #
    # The fstests framework automatically calls fsck after a test is run, so we
    # don't need to call fsck explicitly here.

    rm -f $SCRATCH_MNT/foo_link_*
    cat $SCRATCH_MNT/foo

    status=0
    exit

So make sure an fsync always flushes the delayed inode item, so that the
fsync log contains it (needed in order to trigger the link count fixup
code) and fix the extref counting function, which always return -ENOENT
to its caller (and made it assume there were always 0 extrefs).

This issue has been present since the introduction of the extrefs feature
(2012).

A test case for xfstests follows soon.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit 2c2c452b0cafdc27442796c526ac929443654b0b)

Btrfs: fix directory inconsistency after fsync log replay

If we have an inode (file) with a link count greater than 1, remove
one of its hard links, fsync the inode, power fail/crash and then
replay the fsync log on the next mount, we end up getting the parent
directory's metadata inconsistent - its i_size still reflects the
deleted hard link and has dangling index entries (with no matching
inode reference entries). This prevents the directory from ever being
deletable, as its i_size can never decrease to BTRFS_EMPTY_DIR_SIZE
even if all of its children inodes are deleted, and the dangling index
entries can never be removed (as they point to an inode that does not
exist anymore).

This is easy to reproduce with the following excerpt from the test case
for xfstests that I just made:

    _scratch_mkfs >> $seqres.full 2>&1

    _init_flakey
    _mount_flakey

    # Create a test file with 2 hard links in the same directory.
    mkdir -p $SCRATCH_MNT/a/b
    echo "hello world" > $SCRATCH_MNT/a/b/foo
    ln $SCRATCH_MNT/a/b/foo $SCRATCH_MNT/a/b/bar

    # Make sure all metadata and data are durably persisted.
    sync

    # Now remove one of the hard links and fsync the inode.
    rm -f $SCRATCH_MNT/a/b/bar
    $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/a/b/foo

    # Simulate a crash/power loss. This makes sure the next mount
    # will see an fsync log and will replay that log.

    _load_flakey_table $FLAKEY_DROP_WRITES
    _unmount_flakey

    _load_flakey_table $FLAKEY_ALLOW_WRITES
    _mount_flakey

    # Remove the last hard link of the file and attempt to remove its parent
    # directory - this failed in btrfs because the fsync log and replay code
    # didn't decrement the parent directory's i_size and left dangling directory
    # index entries - this made the btrfs rmdir implementation always fail with
    # the error -ENOTEMPTY.
    #
    # The dangling directory index entries were visible to user space, but it was
    # impossible to do anything on them (unlink, open, read, write, stat, etc)
    # because the inode they pointed to did not exist anymore.
    #
    # The parent directory's metadata inconsistency (stale index entries) was
    # also detected by btrfs' fsck tool, which is run automatically by the fstests
    # framework when the test finishes. The error message reported by fsck was:
    #
    # root 5 inode 259 errors 2001, no inode item, link count wrong
    #   unresolved ref dir 258 index 3 namelen 3 name bar filetype 1 errors 4, no inode ref
    #
    rm -f $SCRATCH_MNT/a/b/*
    rmdir $SCRATCH_MNT/a/b
    rmdir $SCRATCH_MNT/a

To fix this just make sure that after an unlink, if the inode is fsync'ed,
he parent inode is fully logged in the fsync log.

A test case for xfstests follows soon.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit d36808e0d4fc4802892dbd1dba964912cddf1323)

Btrfs: lookup for block group only if needed when freeing a tree block

Very often our extent buffer's header generation doesn't match the current
transaction's id or it is also referenced by other trees (snapshots), so
we don't need the corresponding block group cache object. Therefore only
search for it if we are going to use it, so we avoid an unnecessary search
in the block groups rbtree (and acquiring and releasing its spinlock).

Freeing a tree block is performed when COWing or deleting a node/leaf,
which implies we are holding the node/leaf's parent node lock, therefore
reducing the amount of time spent when freeing a tree block helps reducing
the amount of time we are holding the parent node's lock.

For example, for a run of xfstests/generic/083, the block group cache
object was needed only 682 times for a total of 226691 calls to free
a tree block.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit 6219872dc6e56529159f04e73587ed0fcd63eb20)

btrfs: remove a no-op unfreeze superbock callback

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit 730a78c741df4eaa24589015191f03c2d1240091)

btrfs: switch extent_state state to unsigned

Currently there's a 4B hole in the structure between refs and state and there
are only 16 bits used so we can make it unsigned. This will get a better
packing and may save some stack space for local variables.

The size of extent_state gets reduced by 8B and there are usually a lot
of slab objects.

struct extent_state {
u64                        start;                /*     0     8 */
u64                        end;                  /*     8     8 */
struct rb_node             rb_node;              /*    16    24 */
wait_queue_head_t          wq;                   /*    40    24 */
/* --- cacheline 1 boundary (64 bytes) --- */
atomic_t                   refs;                 /*    64     4 */

/* XXX 4 bytes hole, try to pack */

long unsigned int          state;                /*    72     8 */
u64                        private;              /*    80     8 */

/* size: 88, cachelines: 2, members: 7 */
/* sum members: 84, holes: 1, sum holes: 4 */
/* last cacheline: 24 bytes */
};

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit 9ee49a047dc53fd21808cbb7f3b6a3345463e834)

btrfs: update message levels after checksum errors

The errors are worth noting and might get missed with INFO level.

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit f0954c66377b1610a512fd05f4d33afa54441c3b)

btrfs: update message levels during failed mount

All error conditions from open_ctree shall be ERR. Warning would
suggest that something's wrong and we can continue.

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit aa8ee31209d644a3f4716cb0c43aba7889fbef31)

btrfs: update message levels for errors

Several messages that point to some internal problem, level INFO is
wrong here.

Signed-off-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit 68b663d13ceba777ee6e250154f0e3ab28bd1181)

Btrfs: fix setup_leaf_for_split() to avoid leaf corruption

We were incorrectly detecting when the target key didn't exist anymore
after releasing the path and re-searching the tree. This could make
us split or duplicate (btrfs_split_item() and btrfs_duplicate_item()
are its only callers at the moment) an item when we should not.

For the case of duplicating an item, we currently only duplicate
checksum items (csum tree) and file extent items (fs/subvol trees).
For the checksum items we end up overriding the item completely,
but for file extent items we update only some of their fields in
the copy (done in __btrfs_drop_extents), which means we can end up
having a logical corruption for some values.

Also for the case where we duplicate a file extent item it will make
us produce a leaf with a wrong key order, as btrfs_duplicate_item()
advances us to the next slot and then its caller sets a smaller key
on the new item at that slot (like in __btrfs_drop_extents() e.g.).
Alternatively if the tree search in setup_leaf_for_split() leaves
with path->slots[0] == btrfs_header_nritems(path->nodes[0]), we end
up accessing beyond the leaf's end (when we check if the item's size
has changed) and make our caller insert an item at the invalid slot
btrfs_header_nritems(path->nodes[0]) + 1, causing an invalid memory
access if the leaf is full or nearly full.

This issue has been present since the introduction of this function
in 2009:

Btrfs: Add btrfs_duplicate_item
commit ad48fd754676bfae4139be1a897b1ea58f9aaf21

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit a8df6fe666f9a3a7c08df70f94351f967462dd95)

Btrfs: track dirty block groups on their own list

Currently any time we try to update the block groups on disk we will walk _all_
block groups and check for the ->dirty flag to see if it is set.  This function
can get called several times during a commit.  So if you have several terabytes
of data you will be a very sad panda as we will loop through _all_ of the block
groups several times, which makes the commit take a while which slows down the
rest of the file system operations.

This patch introduces a dirty list for the block groups that we get added to
when we dirty the block group for the first time.  Then we simply update any
block groups that have been dirtied since the last time we called
btrfs_write_dirty_block_groups.  This allows us to clean up how we write the
free space cache out so it is much cleaner.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit ce93ec548cfa02f9cd6b70d546d5f36f4d160f57)

Btrfs: change how we track dirty roots

I've been overloading root->dirty_list to keep track of dirty roots and which
roots need to have their commit roots switched at transaction commit time.  This
could cause us to lose an update to the root which could corrupt the file
system.  To fix this use a state bit to know if the root is dirty, and if it
isn't set we go ahead and move the root to the dirty list.  This way if we
re-dirty the root after adding it to the switch_commit list we make sure to
update it.  This also makes it so that the extent root is always the last root
on the dirty list to try and keep the amount of churn down at this point in the
commit.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
(cherry picked from commit e7070be198b34c26f39bd9010a29ce6462dc4f3e)

fs: export inode_to_bdi and use it in favor of mapping->backing_dev_info

Now that we got rid of the bdi abuse on character devices we can always use
sb->s_bdi to get at the backing_dev_info for a file, except for the block
device special case. Export inode_to_bdi and replace uses of
mapping->backing_dev_info with it to prepare for the removal of
mapping->backing_dev_info.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
(cherry picked from commit de1414a654e66b81b5348dbc5259ecf2fb61655e)

btrfs: expand btrfs_find_item if found_key is NULL

If the found_key is NULL, then btrfs_find_item becomes a verbose wrapper
for simple btrfs_search_slot.

After we've removed all such callers, passing a NULL key is not valid
anymore.

Signed-off-by: David Sterba <dsterba@suse.cz>
(cherry picked from commit 1d4c08e0a60be356134d0c466744d0d4e16ebab0)

btrfs: simplify insert_orphan_item

We can search and add the orphan item in one go,
btrfs_insert_orphan_item will find out if the item already exists.

Signed-off-by: David Sterba <dsterba@suse.cz>
(cherry picked from commit 9c4f61f01d269815bb7c37be3ede59c5587747c6)

btrfs: cleanup, remove inode_ref_info helper

A simple wrapper around btrfs_find_item.

Signed-off-by: David Sterba <dsterba@suse.cz>
(cherry picked from commit c234a24de91837db6f00aeebe28e3daddc53c1b7)

btrfs: cleanup, remove inode_item_info helper

It's only a simple wrapper around btrfs_find_item, the locally defined
key is not used.

Signed-off-by: David Sterba <dsterba@suse.cz>
(cherry picked from commit 14692cc150d3ce10ea8766ccb2a8f483b77b49f0)

rcu: Make SRCU optional by using CONFIG_SRCU

SRCU is not necessary to be compiled by default in all cases. For tinification
efforts not compiling SRCU unless necessary is desirable.

The current patch tries to make compiling SRCU optional by introducing a new
Kconfig option CONFIG_SRCU which is selected when any of the components making
use of SRCU are selected.

If we do not select CONFIG_SRCU, srcu.o will not be compiled at all.

   text    data     bss     dec     hex filename
   2007       0       0    2007     7d7 kernel/rcu/srcu.o

Size of arch/powerpc/boot/zImage changes from

   text    data     bss     dec     hex filename
831552   64180   23944  919676   e087c arch/powerpc/boot/zImage : before
829504   64180   23952  917636   e0084 arch/powerpc/boot/zImage : after

so the savings are about ~2000 bytes.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
CC: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
CC: Josh Triplett <josh@joshtriplett.org>
CC: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
[ paulmck: resolve conflict due to removal of arch/ia64/kvm/Kconfig. ]
(cherry picked from commit 83fe27ea531161a655f02dc7732d14cfaa27fd5d)

zygo: turn off MMC debug, it floods the world with printks
(cherry picked from commit c1e395c05c5ae5677b2ce68ffab95785d4b17db7)

Conflicts:
.config

Merge tag 'v3.19.2' into zygo-3.19.2-zb64

This is the 3.19.2 stable release

# gpg: Signature made Wed Mar 18 09:11:58 2015 EDT using RSA key ID 6092693E
# gpg: Can't check signature: public key not found

Linux 3.19.2

Revert "netfilter: xt_recent: relax ip_pkt_list_tot restrictions"

This reverts commit abc86d0f99242b7f142b7cb8f90e30081dd3c256 as it is
broken in 3.19 and is easier to revert here than try to fix it.

Reported-by: Florian Westphal <fw@strlen.de
Reported-by: David Miller <davem@redhat.com>
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org

cxl: Add missing return statement after handling AFU errror

commit a6130ed253a931d2169c26ab0958d81b0dce4d6e upstream.

We were missing a return statement in the PSL interrupt handler in the
case of an AFU error, which would trigger an "Unhandled CXL PSL IRQ"
warning. We do actually handle these type of errors (by notifying
userspace), so add the missing return IRQ_HANDLED so we don't throw
unecessary warnings.

Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cxl: Fix device_node reference counting

commit 6f963ec2d6bf2476a16799eece920acb2100ff1c upstream.

When unbinding and rebinding the driver on a system with a card in PHB0, this
error condition is reached after a few attempts:

ERROR: Bad of_node_put() on /pciex@3fffe40000000
CPU: 0 PID: 3040 Comm: bash Not tainted 3.18.0-rc3-12545-g3627ffe #152
Call Trace:
[c000000721acb5c0] [c00000000086ef94] .dump_stack+0x84/0xb0 (unreliable)
[c000000721acb640] [c00000000073a0a8] .of_node_release+0xd8/0xe0
[c000000721acb6d0] [c00000000044bc44] .kobject_release+0x74/0xe0
[c000000721acb760] [c0000000007394fc] .of_node_put+0x1c/0x30
[c000000721acb7d0] [c000000000545cd8] .cxl_probe+0x1a98/0x1d50
[c000000721acb900] [c0000000004845a0] .local_pci_probe+0x40/0xc0
[c000000721acb980] [c000000000484998] .pci_device_probe+0x128/0x170
[c000000721acba30] [c00000000052400c] .driver_probe_device+0xac/0x2a0
[c000000721acbad0] [c000000000522468] .bind_store+0x108/0x160
[c000000721acbb70] [c000000000521448] .drv_attr_store+0x38/0x60
[c000000721acbbe0] [c000000000293840] .sysfs_kf_write+0x60/0xa0
[c000000721acbc50] [c000000000292500] .kernfs_fop_write+0x140/0x1d0
[c000000721acbcf0] [c000000000208648] .vfs_write+0xd8/0x260
[c000000721acbd90] [c000000000208b18] .SyS_write+0x58/0x100
[c000000721acbe30] [c000000000009258] syscall_exit+0x0/0x98

We are missing a call to of_node_get(). pnv_pci_to_phb_node() should
call of_node_get() otherwise np's reference count isn't incremented and
it might go away. Rename pnv_pci_to_phb_node() to pnv_pci_get_phb_node()
so it's clear it calls of_node_get().

Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cxl: Use image state defaults for reloading FPGA

commit 4beb5421babee1204757b877622830c6aa31be6d upstream.

Select defaults such that a PERST causes flash image reload. Select which
image based on what the card is set up to load.

CXL_VSEC_PERST_LOADS_IMAGE selects whether PERST assertion causes flash image
load.

CXL_VSEC_PERST_SELECT_USER selects which image is loaded on the next PERST.

cxl_update_image_control writes these bits into the VSEC.

Signed-off-by: Ryan Grimm <grimm@linux.vnet.ibm.com>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

clk-gate: fix bit # check in clk_register_gate()

commit 2e9dcdae4068460c45a308dd891be5248260251c upstream.

In case CLK_GATE_HIWORD_MASK flag is passed to clk_register_gate(), the bit #
should be no higher than 15, however the corresponding check is obviously off-
by-one.

Fixes: 045779942c04 ("clk: gate: add CLK_GATE_HIWORD_MASK")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Michael Turquette <mturquette@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sched/autogroup: Fix failure to set cpu.rt_runtime_us

commit 1fe89e1b6d270aa0d3452c60d38461ea589594e3 upstream.

Because task_group() uses a cache of autogroup_task_group(), whose
output depends on sched_class, switching classes can generate
problems.

In particular, when started as fair, the cache points to the
autogroup, so when switching to RT the tg_rt_schedulable() test fails
for every cpu.rt_{runtime,period}_us change because now the autogroup
has tasks and no runtime.

Furthermore, going back to the previous semantics of varying
task_group() with sched_class has the down-side that the sched_debug
output varies as well, even though the task really is in the
autogroup.

Therefore add an autogroup exception to tg_has_rt_tasks() -- such that
both (all) task_group() usages in sched/core now have one. And remove
all the remnants of the variable task_group() output.

Reported-by: Zefan Li <lizefan@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Stefan Bader <stefan.bader@canonical.com>
Fixes: 8323f26ce342 ("sched: Fix race in task_group()")
Link: http://lkml.kernel.org/r/20150209112237.GR5029@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

vmstat: do not use deferrable delayed work for vmstat_update

commit ba4877b9ca51f80b5d30f304a46762f0509e1635 upstream.

Vinayak Menon has reported that an excessive number of tasks was throttled
in the direct reclaim inside too_many_isolated() because NR_ISOLATED_FILE
was relatively high compared to NR_INACTIVE_FILE. However it turned out
that the real number of NR_ISOLATED_FILE was 0 and the per-cpu
vm_stat_diff wasn't transferred into the global counter.

vmstat_work which is responsible for the sync is defined as deferrable
delayed work which means that the defined timeout doesn't wake up an idle
CPU. A CPU might stay in an idle state for a long time and general effort
is to keep such a CPU in this state as long as possible which might lead
to all sorts of troubles for vmstat consumers as can be seen with the
excessive direct reclaim throttling.

This patch basically reverts 39bf6270f524 ("VM statistics: Make timer
deferrable") but it shouldn't cause any problems for idle CPUs because
only CPUs with an active per-cpu drift are woken up since 7cc36bbddde5
("vmstat: on-demand vmstat workers v8") and CPUs which are idle for a
longer time shouldn't have per-cpu drift.

Fixes: 39bf6270f524 (VM statistics: Make timer deferrable)
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Reported-by: Vinayak Menon <vinmenon@codeaurora.org>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov@parallels.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Minchan Kim <minchan@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

pinctrl: imx25: fix numbering for pins

commit 34027ca2bbc6043fea8fc5c4a82670518b6be7df upstream.

The pin id for a given tuple listed in a fsl,pins property is calculated
by dividing the first entry (which is also a register offset) by 4.
As the first available register is at offset 0x8 and configures the pad
MX25_PAD_A10 the right id for this pin is 2. All other pins are off by
one, too.

This patch drops the definition MX25_PAD_RESERVE1 (together with its
only use) and decrements all following values by 1.

Fixes: b4a87c9b966f ("pinctrl: pinctrl-imx: add imx25 pinctrl driver")
Signed-off-by: Uwe Kleine-KÃ¶nig <u.kleine-koenig@pengutronix.de>
Tested-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

pinctrl: pinctrl-imx: don't use invalid value of conf_reg

commit 4ff0f034e95d65f8f063a362dfcf86e986377a82 upstream.

The right check for conf_reg to be invalid it testing against -1 not 0
as is done in the rest of the driver.

This fixes an oops that can be triggered by:

cat /sys/kernel/debug/pinctrl/43fac000.iomuxc/*

Fixes: ae75ff814538 ("pinctrl: pinctrl-imx: add imx pinctrl core driver")
Signed-off-by: Uwe Kleine-KÃ¶nig <u.kleine-koenig@pengutronix.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ath5k: fix spontaneus AR5312 freezes

commit 8bfae4f9938b6c1f033a5159febe97e441d6d526 upstream.

Sometimes while CPU have some load and ath5k doing the wireless
interface reset the whole WiSoC completely freezes. Set of tests shows
that using atomic delay function while we wait interface reset helps to
avoid such freezes.

The easiest way to reproduce this issue: create a station interface,
start continous scan with wpa_supplicant and load CPU by something. Or
just create multiple station interfaces and put them all in continous
scan.

This patch partially reverts the commit 1846ac3dbec0 ("ath5k: Use
usleep_range where possible"), which replaces initial udelay()
by usleep_range().

I do not know actual source of this issue, but all looks like that HW
freeze is caused by transaction on internal SoC bus, while wireless
block is in reset state.

Also I should note that I do not know how many chips are affected, but I
did not see this issue with chips, other than AR5312.

CC: Jiri Slaby <jirislaby@gmail.com>
CC: Nick Kossifidis <mickflemm@gmail.com>
CC: Luis R. Rodriguez <mcgrof@do-not-panic.com>
Fixes: 1846ac3dbec0 ("ath5k: Use usleep_range where possible")
Reported-by: Christophe Prevotaux <c.prevotaux@rural-networks.com>
Tested-by: Christophe Prevotaux <c.prevotaux@rural-networks.com>
Tested-by: Eric Bree <ebree@nltinc.com>
Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

GFS2: Fix crash during ACL deletion in acl max entry check in gfs2_set_acl()

commit 278702074ff77b1a3fa2061267997095959f5e2c upstream.

Fixes: e01580bf9e ("gfs2: use generic posix ACL infrastructure")
Reported-by: Eric Meddaugh <etmsys@rit.edu>
Tested-by: Eric Meddaugh <etmsys@rit.edu>
Signed-off-by: Andrew Elble <aweits@rit.edu>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

of/pci: Free resources on failure in of_pci_get_host_bridge_resources()

commit d2be00c0fb5ae0794deffcdb0425cd5a8d823db0 upstream.

In the function of_pci_get_host_bridge_resources() if the parsing of ranges
fails, previously allocated resources inclusive of bus_range are not freed
and are not expected to be freed by the function caller on error return.

This patch fixes the issues by adding code that properly frees resources
and bus_range before exiting the function with an error return value.

Fixes: cbe4097f8ae6 ("of/pci: Add support for parsing PCI host bridge resources from DT")
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Liviu Dudau <liviu.dudau@arm.com>
CC: Arnd Bergmann <arnd@arndb.de>
CC: Rob Herring <robh+dt@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sched: Fix hrtick_start() on UP

commit 868933359a3bdda25b562e9d41bce7071edc1b08 upstream.

The commit 177ef2a6315e ("sched/deadline: Fix a precision problem in
the microseconds range") forgot to change the UP version of
hrtick_start(), do so now.

Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Fixes: 177ef2a6315e ("sched/deadline: Fix a precision problem in the microseconds range")
[ Fixed the changelog. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@arm.com>
Cc: Kirill Tkhai <ktkhai@parallels.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1416962647-76792-7-git-send-email-wanpeng.li@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

coresight-etm: unlock on error paths in mode_store()

commit 6ad1095990328e7e4b3a0e260825ad4b6406785a upstream.

There are some missing unlocks on the error paths.

Fixes: a939fc5a71ad ('coresight-etm: add CoreSight ETM/PTM driver')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

stable_kernel_rules: reorganize and update submission options

commit 5de61e7aa1ba9ac3c7edbea375da2bc8eb1a89ae upstream.

The current organization of Documentation/stable_kernel_rules.txt
doesn't clearly differentiate the mutually exclusive options for
submission to the -stable review process. As I understand it, patches
are not actually required to be mailed directly to
stable@vger.kernel.org, but the instructions do not make this clear.

Also, there are some established processes that are not listed --
specifically, what I call Option 2 below.

This patch updates and reorganizes a bit, to make things clearer.

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ASoC: rt5670: Set RT5670_IRQ_CTRL1 non volatile

commit 850529249d7cce02e9bfae9476d09c8c51410d28 upstream.

RT5670_IRQ_CTRL1(0xbd) is a non volatile register. And we need to
restore its value after suspend/resume.

Signed-off-by: Bard Liao <bardliao@realtek.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ASoC: omap-pcm: Correct dma mask

commit d51199a83a2cf82a291d19ee852c44caa511427d upstream.

DMA_BIT_MASK of 64 is not valid dma address mask for OMAPs, it should be
set to 32.
The 64 was introduced by commit (in 2009):
a152ff24b978 ASoC: OMAP: Make DMA 64 aligned

But the dma_mask and coherent_dma_mask can not be used to specify alignment.

Fixes: a152ff24b978 (ASoC: OMAP: Make DMA 64 aligned)
Reported-by: Grygorii Strashko <Grygorii.Strashko@linaro.org>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

NFSv4: Don't call put_rpccred() under the rcu_read_lock()

commit 7c0af9ffb7bb4e5355470fa60b3eb711ddf226fa upstream.

put_rpccred() can sleep.

Fixes: 8f649c3762547 ("NFSv4: Fix the locking in nfs_inode_reclaim_delegation()")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

NFS: Don't invalidate a submounted dentry in nfs_prime_dcache()

commit 6c441c254eea2354d686be7f5544bcd79fb6a61f upstream.

If we're traversing a directory which contains a submounted filesystem,
or one that has a referral, the NFS server that is processing the READDIR
request will often return information for the underlying (mounted-on)
directory. It may, or may not, also return filehandle information.

If this happens, and the lookup in nfs_prime_dcache() returns the
dentry for the submounted directory, the filehandle comparison will
fail, and we call d_invalidate(). Post-commit 8ed936b5671bf
("vfs: Lazily remove mounts on unlinked files and directories."), this
means the entire subtree is unmounted.

The following minimal patch addresses this problem by punting on
the invalidation if there is a submount.

Kudos to Neil Brown <neilb@suse.de> for having tracked down this
issue (see link).

Reported-by: Nix <nix@esperi.org.uk>
Link: http://lkml.kernel.org/r/87iofju9ht.fsf@spindle.srvr.nix
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ACPI / LPSS: provide con_id for the clkdev

commit fcf0789a96777d79d20290e08bf43943a5619387 upstream.

Commit 7d78cbefaa (serial: 8250_dw: add ability to handle
the peripheral clock) introduces handling for a second clk
to 8250_dw.c which is the driver also for LPSS UART. The
second clk forces us to provide identifier (con_id) for the
clkdev we create.

This fixes an issue where 8250_dw.c is getting the same
handler for both clocks.

Fixes: 7d78cbefaa (serial: 8250_dw: add ability to handle the peripheral clock)
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ACPI / video: Load the module even if ACPI is disabled

commit 6e17cb12881ba8d5e456b89f072dc6b70048af36 upstream.

i915.ko depends upon the acpi/video.ko module and so refuses to load if
ACPI is disabled at runtime if for example the BIOS is broken beyond
repair. acpi/video provides an optional service for i915.ko and so we
should just allow the modules to load, but do no nothing in order to let
the machines boot correctly.

Reported-by: Bill Augur <bill-auger@programmer.net>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jani Nikula <jani.nikula@intel.com>
Acked-by: Aaron Lu <aaron.lu@intel.com>
[ rjw: Fixed up the new comment in acpi_video_init() ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

eCryptfs: don't pass fs-specific ioctl commands through

commit 6d65261a09adaa374c05de807f73a144d783669e upstream.

eCryptfs can't be aware of what to expect when after passing an
arbitrary ioctl command through to the lower filesystem. The ioctl
command may trigger an action in the lower filesystem that is
incompatible with eCryptfs.

One specific example is when one attempts to use the Btrfs clone
ioctl command when the source file is in the Btrfs filesystem that
eCryptfs is mounted on top of and the destination fd is from a new file
created in the eCryptfs mount. The ioctl syscall incorrectly returns
success because the command is passed down to Btrfs which thinks that it
was able to do the clone operation. However, the result is an empty
eCryptfs file.

This patch allows the trim, {g,s}etflags, and {g,s}etversion ioctl
commands through and then copies up the inode metadata from the lower
inode to the eCryptfs inode to catch any changes made to the lower
inode's metadata. Those five ioctl commands are mostly common across all
filesystems but the whitelist may need to be further pruned in the
future.

https://bugzilla.kernel.org/show_bug.cgi?id=93691
https://launchpad.net/bugs/1305335

Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Cc: Rocko <rockorequin@hotmail.com>
Cc: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

efi/libstub: Fix boundary checking in efi_high_alloc()

commit 7ed620bb343f434f8a85f830020c04988df2a140 upstream.

While adding support loading kernel and initrd above 4G to grub2 in legacy
mode, I was referring to efi_high_alloc().
That will allocate buffer for kernel and then initrd, and initrd will
use kernel buffer start as limit.

During testing found two buffers will be overlapped when initrd size is
very big like 400M.

It turns out efi_high_alloc() boundary checking is not right.
end - size will be the new start, and should not compare new
start with max, we need to make sure end is smaller than max.

[ Basically, with the current efi_high_alloc() code it's possible to
  allocate memory above 'max', because efi_high_alloc() doesn't check
  that the tail of the allocation is below 'max'.

  If you have an EFI memory map with a single entry that looks like so,

   [0xc0000000-0xc0004000]

  And want to allocate 0x3000 bytes below 0xc0003000 the current code
  will allocate [0xc0001000-0xc0004000], not [0xc0000000-0xc0003000]
  like you would expect. - Matt ]

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

efi: Small leak on error in runtime map code

commit 86d68a58d00db3770735b5919ef2c6b12d7f06f3 upstream.

The "> 0" here should ">= 0" so we free map_entries[0].

Fixes: 926172d46038 ('efi: Export EFI runtime memory mapping to sysfs')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nfsd: fix clp->cl_revoked list deletion causing softlock in nfsd

commit c876486be17aeefe0da569f3d111cbd8de6f675d upstream.

commit 2d4a532d385f ("nfsd: ensure that clp->cl_revoked list is
protected by clp->cl_lock") removed the use of the reaplist to
clean out clp->cl_revoked. It failed to change list_entry() to
walk clp->cl_revoked.next instead of reaplist.next

Fixes: 2d4a532d385f ("nfsd: ensure that clp->cl_revoked list is protected by clp->cl_lock")
Reported-by: Eric Meddaugh <etmsys@rit.edu>
Tested-by: Eric Meddaugh <etmsys@rit.edu>
Signed-off-by: Andrew Elble <aweits@rit.edu>
Reviewed-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

reservation: Remove shadowing local variable 'ret'

commit 4eb2440ed60fb5793f7aa6da89b3d517cc59de43 upstream.

It was causing the return value of fence_is_signaled to be ignored, making
reservation objects signal too early.

Reviewed-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/i915: Check for driver readyness before handling an underrun interrupt

commit 54fc7c1c961cb39edfe31f8a3f5ba6414e134b37 upstream.

When we takeover from the BIOS and install our interrupt handler, the
BIOS may have left us a few surprises in the form of spontaneous
interrupts. (This is especially likely on hardware like 965gm where
display fifo underruns are continuous and the GMCH cannot filter that
interrupt souce.) As we enable our IRQ early so that we can use it
during hardware probing, our interrupt handler must be prepared to
handle a few sources prior to being fully configured. As such, we need
to add a simple is-ready check prior to dereferencing our KMS state for
reporting underruns.

Reported-by: Rob Clark <rclark@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1193972
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
[Jani: dropped the extra !]
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/i915: avoid processing spurious/shared interrupts in low-power states

commit 2dd2a883aad7c852400027c2261bcab69d9e238e upstream.

Atm, it's possible that the interrupt handler is called when the device
is in D3 or some other low-power state. It can be due to another device
that is still in D0 state and shares the interrupt line with i915, or on
some platforms there could be spurious interrupts even without sharing
the interrupt line. The latter case was reported by Klaus Ethgen using a
Lenovo x61p machine (gen 4). He noticed this issue via a system
suspend/resume hang and bisected it to the following commit:

commit e11aa362308f5de467ce355a2a2471321b15a35c
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date:   Wed Jun 18 09:52:55 2014 -0700

    drm/i915: use runtime irq suspend/resume in freeze/thaw

This is a problem, since in low-power states IIR will always read
0xffffffff resulting in an endless IRQ servicing loop.

Fix this by handling interrupts only when the driver explicitly enables
them and so it's guaranteed that the interrupt registers return a valid
value.

Note that this issue existed even before the above commit, since during
runtime suspend/resume we never unregistered the handler.

v2:
- clarify the purpose of smp_mb() vs. synchronize_irq() in the
  code comment (Chris)

v3:
- no need for an explicit smp_mb(), we can assume that synchronize_irq()
  and the mmio read/writes in the install hooks provide for this (Daniel)
- remove code comment as the remaining synchronize_irq() is self
  explanatory (Daniel)

v4:
- drm_irq_uninstall() implies synchronize_irq(), so no need to call it
  explicitly (Daniel)

Reference: https://lkml.org/lkml/2015/2/11/205
Reported-and-bisected-by: Klaus Ethgen <Klaus@Ethgen.ch>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/i915: Dell Chromebook 11 has PWM backlight

commit cf6f0af9fbdd90b81af14fa6375387131cd8adf1 upstream.

Add quirk for Dell Chromebook 11 backlight.

Reported-and-tested-by: Owen Garland <garland.owen@gmail.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=93451
Acked-by: Damien Lespiau <damien.lespiau@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/i915: Check obj->vma_list under the struct_mutex

commit 6c31a614c43ae274546f736b2a33363e149c3dc2 upstream.

When we walk the list of vma, or even for protecting against concurrent
framebuffer creation, we must hold the struct_mutex or else a second
thread can corrupt the list as we walk it.

Fixes regression from
commit d7f46fc4e7323887494db13f063a8e59861fefb0
Author: Ben Widawsky <benjamin.widawsky@intel.com>
Date: Fri Dec 6 14:10:55 2013 -0800

drm/i915: Make pin count per VMA

References: https://bugs.freedesktop.org/show_bug.cgi?id=89085
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/i915/bdw: PCI IDs ending in 0xb are ULT.

commit 0dc6f20b9803f09726bbb682649d35cda8ef5b5d upstream.

When reviewing patch that fixes VGA on BDW Halo Jani noticed that
we also had other ULT IDs that weren't listed there.

So this follow-up patch add these pci-ids as halo and fix comments
on i915_pciids.h

Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon: fix 1 RB harvest config setup for TN/RL

commit dbfb00c3e7e18439f2ebf67fe99bf7a50b5bae1e upstream.

The logic was reversed from what the hw actually exposed.
Fixes graphics corruption in certain harvest configurations.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon: use drm_mode_vrefresh() rather than mode->vrefresh

commit 3d2d98ee1af0cf6eebfbd6bff4c17d3601ac1284 upstream.

Just in case it hasn't been calculated for the mode.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

drm/radeon: enable native backlight control on old macs

commit 7a26f9ad1b5badfd0200ce2262ad696e2a6b7fbb upstream.

Commit b7bc596ebbe0 ("drm/radeon: disable native
backlight control on pre-r6xx asics (v2)") accidently
broke backlight control on old mac laptops that use the
on-GPU backlight controller.

Signed-off-by: Nathan-J. Hirschauer <nathanhi@deepserve.info>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

HID: wacom: Report ABS_MISC event for Cintiq Companion Hybrid

commit 33e5df0e0e32027866e9fb00451952998fc957f2 upstream.

It appears that the Cintiq Companion Hybrid does not send an ABS_MISC event to
userspace when any of its ExpressKeys are pressed. This is not strictly
necessary now that the pad exists on its own device, but should be fixed for
consistency's sake.

Traditionally both the stylus and pad shared the same device node, and
xf86-input-wacom would use ABS_MISC for disambiguation. Not sending this causes
the Hybrid to behave incorrectly with xf86-input-wacom beginning with its
8f44f3 commit.

Signed-off-by: Jason Gerecke <killertofu@gmail.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

HID: fixup the conflicting keyboard mappings quirk

commit 8e7b341037db1835ee6eea64663013cbfcf33575 upstream.

The ignore check that got added in 6ce901eb61 ("HID: input: fix confusion
on conflicting mappings") needs to properly check for VARIABLE reports
as well (ARRAY reports should be ignored), otherwise legitimate keyboards
might break.

Fixes: 6ce901eb61 ("HID: input: fix confusion on conflicting mappings")
Reported-by: Fredrik Hallenberg <megahallon@gmail.com>
Reported-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

HID: input: fix confusion on conflicting mappings

commit 6ce901eb61aa30ba8565c62049ee80c90728ef14 upstream.

On an PC-101/103/104 keyboard (American layout) the 'Enter' key and its
neighbours look like this:

           +---+ +---+ +-------+
           | 1 | | 2 | |   5   |
           +---+ +---+ +-------+
             +---+ +-----------+
             | 3 | |     4     |
             +---+ +-----------+

On a PC-102/105 keyboard (European layout) it looks like this:

           +---+ +---+ +-------+
           | 1 | | 2 | |       |
           +---+ +---+ +-+  4  |
             +---+ +---+ |     |
             | 3 | | 5 | |     |
             +---+ +---+ +-----+

(Note that the number of keys is the same, but key '5' is moved down and
the shape of key '4' is changed. Keys '1' to '3' are exactly the same.)

The keys 1-4 report the same scan-code in HID in both layouts, even though
the keysym they produce is usually different depending on the XKB-keymap
used by user-space.
However, key '5' (US 'backslash'/'pipe') reports 0x31 for the upper layout
and 0x32 for the lower layout, as defined by the HID spec. This is highly
confusing as the linux-input API uses a single keycode for both.

So far, this was never a problem as there never has been a keyboard with
both of those keys present at the same time. It would have to look
something like this:

           +---+ +---+ +-------+
           | 1 | | 2 | |  x31  |
           +---+ +---+ +-------+
             +---+ +---+ +-----+
             | 3 | |x32| |  4  |
             +---+ +---+ +-----+

HID can represent such a keyboard, but the linux-input API cannot.
Furthermore, any user-space mapping would be confused by this and,
luckily, no-one ever produced such hardware.

Now, the HID input layer fixed this mess by mapping both 0x31 and 0x32 to
the same keycode (KEY_BACKSLASH==0x2b). As only one of both physical keys
is present on a hardware, this works just fine.

Lets introduce hardware-vendors into this:
------------------------------------------

Unfortunately, it seems way to expensive to produce a different device for
American and European layouts. Therefore, hardware-vendors put both keys,
(0x31 and 0x32) on the same keyboard, but only one of them is hooked up
to the physical button, the other one is 'dead'.
This means, they can use the same hardware, with a different button-layout
and automatically produce the correct HID events for American *and*
European layouts. This is unproblematic for normal keyboards, as the
'dead' key will never report any KEY-DOWN events. But RollOver keyboards
send the whole matrix on each key-event, allowing n-key roll-over mode.
This means, we get a 0x31 and 0x32 event on each key-press. One of them
will always be 0, the other reports the real state. As we map both to the
same keycode, we will get spurious key-events, even though the real
key-state never changed.

The easiest way would be to blacklist 'dead' keys and never handle those.
We could simply read the 'country' tag of USB devices and blacklist either
key according to the layout. But... hardware vendors... want the same
device for all countries and thus many of them set 'country' to 0 for all
devices. Meh..

So we have to deal with this properly. As we cannot know which of the keys
is 'dead', we either need a heuristic and track those keys, or we simply
make use of our value-tracking for HID fields. We simply ignore HID events
for absolute data if the data didn't change. As HID tracks events on the
HID level, we haven't done the keycode translation, yet. Therefore, the
'dead' key is tracked independently of the real key, therefore, any events
on it will be ignored.

This patch simply discards any HID events for absolute data if it didn't
change compared to the last report. We need to ignore relative and
buffered-byte reports for obvious reasons. But those cannot be affected by
this bug, so we're fine.

Preferably, we'd do this filtering on the HID-core level. But this might
break a lot of custom drivers, if they do not follow the HID specs.
Therefore, we do this late in hid-input just before we inject it into the
input layer (which does the exact same filtering, but on the keycode
level).

If this turns out to break some devices, we might have to limit filtering
to EV_KEY events. But lets try to do the Right Thing first, and properly
filter any absolute data that didn't change.

This patch is tagged for 'stable' as it fixes a lot of n-key RollOver
hardware. We might wanna wait with backporting for a while, before we know
it doesn't break anything else, though.

Reported-by: Adam Goode <adam@spicenitz.org>
Reported-by: Fredrik Hallenberg <megahallon@gmail.com>
Tested-by: Fredrik Hallenberg <megahallon@gmail.com>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: comedi: cb_pcidas64: fix incorrect AI range code handling

commit be8e89087ec2d2c8a1ad1e3db64bf4efdfc3c298 upstream.

The hardware range code values and list of valid ranges for the AI
subdevice is incorrect for several supported boards.  The hardware range
code values for all boards except PCI-DAS4020/12 is determined by
calling `ai_range_bits_6xxx()` based on the maximum voltage of the range
and whether it is bipolar or unipolar, however it only returns the
correct hardware range code for the PCI-DAS60xx boards.  For
PCI-DAS6402/16 (and /12) it returns the wrong code for the unipolar
ranges.  For PCI-DAS64/Mx/16 it returns the wrong code for all the
ranges and the comedi range table is incorrect.

Change `ai_range_bits_6xxx()` to use a look-up table pointed to by new
member `ai_range_codes` of `struct pcidas64_board` to map the comedi
range table indices to the hardware range codes.  Use a new comedi range
table for the PCI-DAS64/Mx/16 boards (and the commented out variants).

Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

firmware: dmi_scan: Fix dmi scan to handle "End of Table" structure

commit ce204e9a4bd82e9e6e7479bca8057e45aaac5c42 upstream.

The dmi-sysfs should create "End of Table" entry, that is type 127. But
after adding initial SMBIOS v3 support fc43026278b2 ("dmi: add support
for SMBIOS 3.0 64-bit entry point") the 127-0 entry is not handled any
more, as result it's not created in dmi sysfs for instance. This is
important because the size of whole DMI table must correspond to sum of
all DMI entry sizes.

So move the end-of-table check after it's handled by dmi_table.

Reviewed-by: Ard Biesheuvel <ard@linaro.org>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

firmware: dmi_scan: Fix dmi_len type

commit 6d9ff473317245e3e5cd9922b4520411c2296388 upstream.

According to SMBIOSv3 specification the length of DMI table can be
up to 32bits wide. So use appropriate type to avoid overflow.

It's obvious that dmi_num theoretically can be more than u16 also,
so it's can be changed to u32 or at least it's better to use int
instead of u16, but on that moment I cannot imagine dmi structure
count more than 65535 and it can require changing type of vars that
work with it. So I didn't correct it.

Acked-by: Ard Biesheuvel <ard@linaro.org>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dm snapshot: fix a possible invalid memory access on unload

commit 22aa66a3ee5b61e0f4a0bfeabcaa567861109ec3 upstream.

When the snapshot target is unloaded, snapshot_dtr() waits until
pending_exceptions_count drops to zero. Then, it destroys the snapshot.
Therefore, the function that decrements pending_exceptions_count
should not touch the snapshot structure after the decrement.

pending_complete() calls free_pending_exception(), which decrements
pending_exceptions_count, and then it performs up_write(&s->lock) and it
calls retry_origin_bios() which dereferences s->origin. These two
memory accesses to the fields of the snapshot may touch the dm_snapshot
struture after it is freed.

This patch moves the call to free_pending_exception() to the end of
pending_complete(), so that the snapshot will not be destroyed while
pending_complete() is in progress.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dm: fix a race condition in dm_get_md

commit 2bec1f4a8832e74ebbe859f176d8a9cb20dd97f4 upstream.

The function dm_get_md finds a device mapper device with a given dev_t,
increases the reference count and returns the pointer.

dm_get_md calls dm_find_md, dm_find_md takes _minor_lock, finds the
device, tests that the device doesn't have DMF_DELETING or DMF_FREEING
flag, drops _minor_lock and returns pointer to the device. dm_get_md then
calls dm_get. dm_get calls BUG if the device has the DMF_FREEING flag,
otherwise it increments the reference count.

There is a possible race condition - after dm_find_md exits and before
dm_get is called, there are no locks held, so the device may disappear or
DMF_FREEING flag may be set, which results in BUG.

To fix this bug, we need to call dm_get while we hold _minor_lock. This
patch renames dm_find_md to dm_get_md and changes it so that it calls
dm_get while holding the lock.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dm io: reject unsupported DISCARD requests with EOPNOTSUPP

commit 37527b869207ad4c208b1e13967d69b8bba1fbf9 upstream.

I created a dm-raid1 device backed by a device that supports DISCARD
and another device that does NOT support DISCARD with the following
dm configuration:

#  echo '0 2048 mirror core 1 512 2 /dev/sda 0 /dev/sdb 0' | dmsetup create moo
# lsblk -D
NAME         DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
sda                 0        4K       1G         0
`-moo (dm-0)        0        4K       1G         0
sdb                 0        0B       0B         0
`-moo (dm-0)        0        4K       1G         0

Notice that the mirror device /dev/mapper/moo advertises DISCARD
support even though one of the mirror halves doesn't.

If I issue a DISCARD request (via fstrim, mount -o discard, or ioctl
BLKDISCARD) through the mirror, kmirrord gets stuck in an infinite
loop in do_region() when it tries to issue a DISCARD request to sdb.
The problem is that when we call do_region() against sdb, num_sectors
is set to zero because q->limits.max_discard_sectors is zero.
Therefore, "remaining" never decreases and the loop never terminates.

To fix this: before entering the loop, check for the combination of
REQ_DISCARD and no discard and return -EOPNOTSUPP to avoid hanging up
the mirror device.

This bug was found by the unfortunate coincidence of pvmove and a
discard operation in the RHEL 6.5 kernel; upstream is also affected.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Acked-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dm mirror: do not degrade the mirror on discard error

commit f2ed51ac64611d717d1917820a01930174c2f236 upstream.

It may be possible that a device claims discard support but it rejects
discards with -EOPNOTSUPP. It happens when using loopback on ext2/ext3
filesystem driven by the ext4 driver. It may also happen if the
underlying devices are moved from one disk on another.

If discard error happens, we reject the bio with -EOPNOTSUPP, but we do
not degrade the array.

This patch fixes failed test shell/lvconvert-repair-transient.sh in the
lvm2 testsuite if the testsuite is extracted on an ext2 or ext3
filesystem and it is being driven by the ext4 driver.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

staging: comedi: comedi_compat32.c: fix COMEDI_CMD copy back

commit 42b8ce6f55facfa101462e694d33fc6bca471138 upstream.

`do_cmd_ioctl()` in "comedi_fops.c" handles the `COMEDI_CMD` ioctl.
This returns `-EAGAIN` if it has copied a modified `struct comedi_cmd`
back to user-space.  (This occurs when the low-level Comedi driver's
`do_cmdtest()` handler returns non-zero to indicate a problem with the
contents of the `struct comedi_cmd`, or when the `struct comedi_cmd` has
the `CMDF_BOGUS` flag set.)

`compat_cmd()` in "comedi_compat32.c" handles the 32-bit compatible
version of the `COMEDI_CMD` ioctl.  Currently, it never copies a 32-bit
compatible version of `struct comedi_cmd` back to user-space, which is
at odds with the way the regular `COMEDI_CMD` ioctl is handled.  To fix
it, change `compat_cmd()` to copy a 32-bit compatible version of the
`struct comedi_cmd` back to user-space when the main ioctl handler
returns `-EAGAIN`.

Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Reviewed-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sunxi: clk: Set sun6i-pll1 n_start = 1

commit 76820fcf7aa5a418b69cb7bed31b62d1feb1d6ad upstream.

For all pll-s on sun6i n == 0 means use a multiplier of 1, rather then 0 as
it means on sun4i / sun5i / sun7i. n_start = 1 is already correctly set
for sun6i pll6, but was missing for pll1, this commit fixes this.

Cc: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

clk: Fix debugfs clk removal before inited

commit 52bba9809a954d72bc77773bd560b9724b495eb7 upstream.

Some of the clks can be registered & unregistered before the clk related debugfs
entries are initialized at late_initcall. In the unregister path checking for only
dentry before clk_debug_init() would lead dangling pointers in the debug clk list,
because the list is already populated in register path and the clk pointer freed in
unregister path.
The side effect of not removing it from the list is either a null pointer
dereference or if lucky to boot the system, the number of clk entries in
debugfs disappear.

We could add more checks like if (inited && !clk->dentry) but just removing
the check for dentry made more sense as debugfs_remove_recursive() seems to be
safe with null pointers. This will ensure that the unregistering clk would be
removed from the debug list in all the code paths.

Without this patch kernel would crash with log:
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = c0204000
[00000000] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in:
CPU: 1 PID: 1 Comm: swapper/0 Tainted: G    B          3.19.0-rc3-00007-g412f9ba-dirty #840
Hardware name: Qualcomm (Flattened Device Tree)
task: ed948000 ti: ed944000 task.ti: ed944000
PC is at strlen+0xc/0x40
LR is at __create_file+0x64/0x1dc
pc : [<c04ee604>]    lr : [<c049f1c4>]    psr: 60000013
sp : ed945e40  ip : ed945e50  fp : ed945e4c
r10: 00000000  r9 : c1006094  r8 : 00000000
r7 : 000041ed  r6 : 00000000  r5 : ed4af998  r4 : c11b5e28
r3 : 00000000  r2 : ed945e38  r1 : a0000013  r0 : 00000000
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
Control: 10c5787d  Table: 8020406a  DAC: 00000015
Process swapper/0 (pid: 1, stack limit = 0xed944248)
Stack: (0xed945e40 to 0xed946000)
5e40: ed945e7c ed945e50 c049f1c4 c04ee604 c0fc2fa4 00000000 ecb748c0 c11c2b80
5e60: c0beec04 0000011c c0fc2fa4 00000000 ed945e94 ed945e80 c049f3e0 c049f16c
5e80: 00000000 00000000 ed945eac ed945e98 c08cbc50 c049f3c0 ecb748c0 c11c2b80
5ea0: ed945ed4 ed945eb0 c0fc3080 c08cbc30 c0beec04 c107e1d8 ecdf0600 c107e1d8
5ec0: c107e1d8 ecdf0600 ed945f54 ed945ed8 c0208ed4 c0fc2fb0 c026a784 c04ee628
5ee0: ed945f0c ed945ef0 c0f5d600 c04ee604 c0f5d5ec ef7fcc7d c0b40ecc 0000011c
5f00: ed945f54 ed945f10 c026a994 c0f5d5f8 c04ecc00 00000007 ef7fcc95 00000007
5f20: c0e90744 c0dd0884 ed945f54 c106cde0 00000007 c117f8c0 0000011c c0f5d5ec
5f40: c1006094 c100609c ed945f94 ed945f58 c0f5de34 c0208e50 00000007 00000007
5f60: c0f5d5ec be9b5ae0 00000000 c117f8c0 c0af1680 00000000 00000000 00000000
5f80: 00000000 00000000 ed945fac ed945f98 c0af169c c0f5dd2c ed944000 00000000
5fa0: 00000000 ed945fb0 c020f298 c0af168c 00000000 00000000 00000000 00000000
5fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
5fe0: 00000000 00000000 00000000 00000000 00000013 00000000 ebcc6d33 bfffca73
[<c04ee604>] (strlen) from [<c049f1c4>] (__create_file+0x64/0x1dc)
[<c049f1c4>] (__create_file) from [<c049f3e0>] (debugfs_create_dir+0x2c/0x34)
[<c049f3e0>] (debugfs_create_dir) from [<c08cbc50>] (clk_debug_create_one+0x2c/0x16c)
[<c08cbc50>] (clk_debug_create_one) from [<c0fc3080>] (clk_debug_init+0xdc/0x144)
[<c0fc3080>] (clk_debug_init) from [<c0208ed4>] (do_one_initcall+0x90/0x1e0)
[<c0208ed4>] (do_one_initcall) from [<c0f5de34>] (kernel_init_freeable+0x114/0x1e0)
[<c0f5de34>] (kernel_init_freeable) from [<c0af169c>] (kernel_init+0x1c/0xfc)
[<c0af169c>] (kernel_init) from [<c020f298>] (ret_from_fork+0x14/0x3c)
Code: c0b40ecc e1a0c00d e92dd800 e24cb004 (e5d02000)
---[ end trace b940e45b5e25c1e7 ]---

Fixes: 6314b6796e3c "clk: Don't hold prepare_lock across debugfs creation"
Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Michael Turquette <mturquette@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

clk: zynq: Force CPU_2X clock to be ungated

commit 3dccfecdb867fe35b305a4e493ef5652b7d9d4cb upstream.

The CPU_2X clock does not have a classical in-kernel user, but is,
amongst other things, required for OCM and debug access. Make sure this
clock is not mistakenly disabled during boot up by enabling it in the
platform's clock driver.

Fixes: 0ee52b157b8e 'clk: zynq: Add clock controller driver'
Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: Michael Turquette <mturquette@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

fixed invalid assignment of 64bit mask to host dma_boundary for scatter gather segment boundary limit.

commit f76a610a8b4b6280eaedf48f3af9d5d74e418b66 upstream.

In reference to bug https://bugzilla.redhat.com/show_bug.cgi?id=1097141
Assert is seen with AMD cpu whenever calling pci_alloc_consistent.

[ 29.406183] ------------[ cut here ]------------
[ 29.410505] kernel BUG at lib/iommu-helper.c:13!

Signed-off-by: Minh Tran <minh.tran@emulex.com>
Fixes: 6733b39a1301b0b020bbcbf3295852e93e624cb1
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

wd719x: add missing .module to wd719x_template

commit 2ecf8e0ae28cb22d434e628c351c6193fd75fafa upstream.

wd719x_template is missing the .module field, causing module refcount
not to work, allowing to rmmod the driver while in use (mounted filesystem),
causing an oops.

Set .module to THIS_MODULE to fix the problem.

Signed-off-by: Ondrej Zary <linux@rainbow-software.org>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

nilfs2: fix potential memory overrun on inode

commit 957ed60b53b519064a54988c4e31e0087e47d091 upstream.

Each inode of nilfs2 stores a root node of a b-tree, and it turned out to
have a memory overrun issue:

Each b-tree node of nilfs2 stores a set of key-value pairs and the number
of them (in "bn_nchildren" member of nilfs_btree_node struct), as well as
a few other "bn_*" members.

Since the value of "bn_nchildren" is used for operations on the key-values
within the b-tree node, it can cause memory access overrun if a large
number is incorrectly set to "bn_nchildren".

For instance, nilfs_btree_node_lookup() function determines the range of
binary search with it, and too large "bn_nchildren" leads
nilfs_btree_node_get_key() in that function to overrun.

As for intermediate b-tree nodes, this is prevented by a sanity check
performed when each node is read from a drive, however, no sanity check
has been done for root nodes stored in inodes.

This patch fixes the issue by adding missing sanity check against b-tree
root nodes so that it's called when on-memory inodes are read from ifile,
inode metadata file.

Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/core: When marshaling ucma path from user-space, clear unused fields

commit c2be9dc0e0fa59cc43c2c7084fc42b430809a0fe upstream.

When marshaling a user path to the kernel struct ib_sa_path, we need
to zero smac and dmac and set the vlan id to the "no vlan" value.

This is to ensure that Ethernet attributes are not used with
InfiniBand QPs.

Fixes: dd5f03beb4f7 ("IB/core: Ethernet L2 attributes in verbs/cm structures")
Signed-off-by: Ilya Nelkenbaum <ilyan@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/core: Properly handle registration of on-demand paging MRs after dereg

commit 4fc701ead77ede96df3e8b3de13fdf2b1326ee5b upstream.

When the last on-demand paging MR is released the notifier count is
left non-zero so that concurrent page faults will have to abort. If a
new MR is then registered, the counter is reset. However, the decision
is made to put the new MR in the list waiting for the notifier count
to reach zero, before the counter is reset. An invalidation or another
MR registration can release the MR to handle page faults, but without
such an event the MR can wait forever.

The patch fixes this issue by adding a check whether the MR is the
first on-demand paging MR when deciding whether it is ready to handle
page faults. If it is the first MR, we know that there are no mmu
notifiers running in parallel to the registration.

Fixes: 882214e2b128 ("IB/core: Implement support for MMU notifiers regarding on demand paging regions")
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Shachar Raindel <raindel@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/core: Fix deadlock on uverbs modify_qp error flow

commit 0fb8bcf022f19a375d7c4bd79ac513da8ae6d78b upstream.

The deadlock occurs in __uverbs_modify_qp: we take a lock (idr_read_qp)
and in case of failure in ib_resolve_eth_l2_attrs we don't release
it (put_qp_read). Fix that.

Fixes: ed4c54e5b4ba ("IB/core: Resolve Ethernet L2 addresses when modifying QP")
Signed-off-by: Moshe Lazer <moshel@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/mlx4: Fix wrong usage of IPv4 protocol for multicast attach/detach

commit e9a7faf11af94957e5107b40af46c2e329541510 upstream.

The MLX4_PROT_IB_IPV4 protocol should only be used with RoCEv2 and such.
Removing this wrong usage allows to run multicast applications over RoCE.

Fixes: d487ee77740c ("IB/mlx4: Use IBoE (RoCE) IP based GIDs in the port GID table")
Reported-by: Carol Soto <clsoto@linux.vnet.ibm.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/mlx4: Fix memory leak in __mlx4_ib_modify_qp

commit bede98e781747623ae170667694a71ef19c6ba7f upstream.

In case handle_eth_ud_smac_index fails, we need to free the allocated resources.

Fixes: 2f5bb473681b ("mlx4: Add ref counting to port MAC table for RoCE")
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/mlx5: Fix error code in get_port_caps()

commit f614fc15ae39ceb531586e3969f2b99fd23182a0 upstream.

The current code returns success when kmalloc() fails. It should
return an error code, -ENOMEM.

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/iser: Use correct dma direction when unmapping SGs

commit c6c95ef4cec680f7a10aa425a9970744b35b6489 upstream.

We always unmap SGs with the same direction instead of unmapping
with the direction the mapping was done, fix that.

Fixes: 9a8b08fad2ef ("IB/iser: Generalize iser_unmap_task_data and [...]")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/iser: Fix memory regions possible leak

commit 6606e6a2ff2710b473838b291dc533cd8fc1471f upstream.

When teardown process starts during live IO, we need to keep the
memory regions pool (frmr/fmr) until all in-flight tasks are properly
released, since each task may return a memory region to the pool. In
order to do this, we pass a destroy flag to iser_free_ib_conn_res to
indicate we can destroy the device and the memory regions
pool. iser_conn_release will pass it as true and also DEVICE_REMOVAL
event (we need to let the device to properly remove).

Also, Since we conditionally call iser_free_rx_descriptors,
remove the extra check on iser_conn->rx_descs.

Fixes: 5426b1711fd0 ("IB/iser: Collapse cleanup and disconnect handlers")
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

IB/qib: Do not write EEPROM

commit 18c0b82a3e4501511b08d0e8676fb08ac08734a3 upstream.

This changeset removes all the code that allows the driver to write to
the EEPROM and update the recorded error counters and power on hours.

These two stats are unused and writing them exposes a timing risk
which could leave the EEPROM in a bad state preventing further normal
operation of the HCA.

Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

sg: fix read() error reporting

commit 3b524a683af8991b4eab4182b947c65f0ce1421b upstream.

Fix SCSI generic read() incorrectly returning success after detecting an
error.

Signed-off-by: Tony Battersby <tonyb@cybernetics.com>
Acked-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

locking/rtmutex: Avoid a NULL pointer dereference on deadlock

commit 8d1e5a1a1ccf5ae9d8a5a0ee7960202ccb0c5429 upstream.

With task_blocks_on_rt_mutex() returning early -EDEADLK we never
add the waiter to the waitqueue. Later, we try to remove it via
remove_waiter() and go boom in rt_mutex_top_waiter() because
rb_entry() gives a NULL pointer.

( Tested on v3.18-RT where rtmutex is used for regular mutex and I
tried to get one twice in a row. )

Not sure when this started but I guess 397335f004f4 ("rtmutex: Fix
deadlock detector for real") or commit 3d5c9340d194 ("rtmutex:
Handle deadlock detection smarter").

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1424187823-19600-1-git-send-email-bigeasy@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda - One more Dell macine needs DELL1_MIC_NO_PRESENCE quirk

commit 70658b99490dd86cfdbf4fca117bbe2ef9a80d03 upstream.

BugLink: https://bugs.launchpad.net/bugs/1428947
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: oxfw: fix a condition and return code in start_stream()

commit f2b14c0bc510c6a8f67a4f36049deefe5d99a537 upstream.

The amdtp_stream_wait_callback() doesn't return minus value and
the return code is not for error code.

This commit fixes with a propper condition and an error code.

Fixes: f3699e2c7745 ('ALSA: oxfw: Change the way to start stream')
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda - Disable runtime PM for Panther Point again

commit de5d0ad506cb10ab143e2ffb9def7607e3671f83 upstream.

This is essentially a partial revert of the commit [b1920c21102a:
'ALSA: hda - Enable runtime PM on Panther Point']. There was a bug
report showing the HD-audio bus hang during runtime PM on HP Spectre
XT.

Reported-by: Dang Sananikone <dang.sananikone@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda: controller code - do not export static functions

commit 37ed398839fa3e0d2de77925097db7a370abb096 upstream.

It is a bad idea to export static functions. GCC for some platforms
shows errors like:

error: __ksymtab_azx_get_response causes a section type conflict

Signed-off-by: Jaroslav Kysela <perex@perex.cz>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: fireworks/bebob/dice/oxfw: make it possible to shutdown safely

commit dec84316dd53c90e93b4ee849483bd4bd1e9a585 upstream.

A part of these drivers, especially BeBoB driver, are programmed to wait
some events. Thus the drivers should not destroy any data in .remove()
context.

This commit moves some destructors from 'struct fw_driver.remove()' to
'struct snd_card.private_free()' to shutdown safely.

Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: fireworks/bebob/dice/oxfw: allow stream destructor after releasing runtime

commit d23c2cc4485d10f0cedfef99dd2961d9652b1b3f upstream.

Currently stream destructor in each driver has a problem to be called in
a context in which sound card object is released, because the destructors
call amdtp_stream_pcm_abort() and touch PCM runtime data.

The PCM runtime data is destroyed in application's context with
snd_pcm_close(), on the other hand PCM substream data is destroyed after
sound card object is released, in most case after all of ALSA character
devices are released. When PCM runtime is destroyed and PCM substream is
remained, amdtp_stream_pcm_abort() touches PCM runtime data and causes
Null-pointer-dereference.

This commit changes stream destructors and allows each driver to call
it after releasing runtime.

Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: firewire-lib: remove reference counting

commit c6f224dc20ad959175c2dfec70b5a61c6503a793 upstream.

AMDTP helper functions increment/decrement reference counter for an
instance of FireWire unit, while it's complicated for each driver to
process error state.

In previous commit, each driver has the role of reference counting. This
commit removes this role from the helper function.

Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: fireworks/bebob/dice/oxfw: add reference-counting for FireWire unit

commit 12ed719291a953d443921f9cdb0ffee41066c340 upstream.

Fireworks and Dice drivers try to touch instances of FireWire unit after
sound card object is released, while references to the unit is decremented
in .remove(). When unplugging during streaming, sound card object is
released after .remove(), thus Fireworks and Dice drivers causes GPF or
Null-pointer-dereferencing to application processes because an instance of
FireWire unit was already released.

This commit adds reference-counting for FireWire unit in drivers to allow
them to touch an instance of FireWire unit after .remove(). In most case,
any operations after .remove() may be failed safely.

Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: hda - Add pin configs for ASUS mobo with IDT 92HD73XX codec

commit 6426460e5d87810e042962281fe3c1e8fc256162 upstream.

BIOS doesn't seem to set up pins for 5.1 and the SPDIF out, so we need
to give explicitly here.

Reported-and-tested-by: Misan Thropos <misanthropos@gmx.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

ALSA: pcm: Don't leave PREPARED state after draining

commit 70372a7566b5e552dbe48abdac08c275081d8558 upstream.

When a PCM draining is performed to an empty stream that has been
already in PREPARED state, the current code just ignores and leaves as
it is, although the drain is supposed to set all such streams to SETUP
state. This patch covers that overlooked case.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

serial: 8250: Revert "tty: serial: 8250_core: read only RX if there is something in the FIFO"

commit ca8bb4aefb932e3da105f28cbfba36d57a931081 upstream.

This reverts commit 0aa525d11859c1a4d5b78fdc704148e2ae03ae13.

The conditional RX-FIFO read seems to cause spurious interrupts and we
see just:
|serial8250: too much work for irq29

The previous behaviour was "default" for decades and Marvell's 88f6282 SoC
might not be the only that relies on it. Therefore the Omap fix is
reverted for now.

Fixes: 0aa525d11859 ("tty: serial: 8250_core: read only RX if there is
something in the FIFO")
Reported-By: Nicolas Schichan <nschichan@freebox.fr>
Debuged-By: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

tty: fix up atime/mtime mess, take four

commit f0bf0bd07943bfde8f5ac39a32664810a379c7d3 upstream.

This problem was taken care of three times already in
* b0de59b5733d18b0d1974a060860a8b5c1b36a2e (TTY: do not update
  atime/mtime on read/write),
* 37b7f3c76595e23257f61bd80b223de8658617ee (TTY: fix atime/mtime
  regression), and
* b0b885657b6c8ef63a46bc9299b2a7715d19acde (tty: fix up atime/mtime
  mess, take three)

But it still misses one point. As John Paul correctly points out, we
do not care about setting date. If somebody ever changes wall
time backwards (by mistake for example), tty timestamps are never
updated until the original wall time passes.

So check the absolute difference of times and if it large than "8
seconds or so", always update the time. That means we will update
immediatelly when changing time. Ergo, CAP_SYS_TIME can foul the
check, but it was always that way.

Thanks John for serving me this so nicely debugged.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Reported-by: John Paul Perry <john_paul.perry@alcatel-lucent.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>