The clock enable/disable codes for alarm have been removed from
commit 24e1455493da ("drivers/rtc/rtc-s3c.c: delete duplicate clock
control") and the clocks are disabled even if alarm is set, so alarm
interrupt can't happen.
The s3c_rtc_setaie function can be called several times with 'enabled'
argument having same value, so it needs to check whether clocks are
enabled or not.
Commit 718ba5b87343, moved the responsibility for unlocking the socket to
xs_tcp_setup_socket, meaning that the socket will be unlocked before we
know that it has finished trying to connect. The following patch is based on
an initial patch by Russell King to ensure that we delay clearing the
XPRT_CONNECTING flag until we either know that we failed to initiate
a connection attempt, or the connection attempt itself failed.
Fixes: 718ba5b87343 ("SUNRPC: Add helpers to prevent socket create from racing") Reported-by: Russell King <linux@arm.linux.org.uk> Reported-by: Russell King <rmk+kernel@arm.linux.org.uk> Tested-by: Russell King <rmk+kernel@arm.linux.org.uk> Tested-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
It is rather pointless to test the value of transport->inet after
calling xs_reset_transport(), since it will always be zero, and
so we will never see any exponential back off behaviour.
Also don't force early connections for SOFTCONN tasks. If the server
disconnects us, we should respect the exponential backoff.
`perf stat -e sunrpc:svc_xprt_do_enqueue true` results in
Warning: unknown op '->'
Warning: [sunrpc:svc_xprt_do_enqueue] unknown op '->'
Similar warning for svc_handle_xprt as well.
Actually TP_printk() should never dereference an address saved in the ring
buffer that points somewhere in the kernel. There's no guarantee that that
object still exists (with the exception of static strings).
Therefore change all the arguments for TP_printk(), so that it references
values existing in the ring buffer only.
While doing that, also fix another possible bug when argument xprt could be
NULL and TP_fast_assign() tries to access it's elements.
Signed-off-by: Pratyush Anand <panand@redhat.com> Reviewed-by: Jeff Layton <jeff.layton@primarydata.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Fixes: 83a712e0afef "sunrpc: add some tracepoints around ..." Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
According to RFC5661 Section 18.2.4, CLOSE is supposed to return
the zero stateid. This means that nfs_clear_open_stateid_locked()
cannot assume that the result stateid will always match the 'other'
field of the existing open stateid when trying to determine a race
with a parallel OPEN.
Instead, we look at the argument, and check for matches.
If the ctime or mtime or change attribute have changed because
of an operation we initiated, we should make sure that we force
an attribute update. However we do not want to mark the page cache
for revalidation.
We should ensure that we always set the pgio_header's error field
if a READ or WRITE RPC call returns an error. The current code depends
on 'hdr->good_bytes' always being initialised to a large value, which
is not always done correctly by callers.
When this happens, applications may end up missing important errors.
- Switch back to using list_for_each_entry(). Fixes an incorrect test
for list NULL termination.
- Do not assume that lists are sorted.
- Finally, consider an existing entry to match if it consists of a subset
of the addresses in the new entry.
Chuck reports seeing cases where a GETATTR that happens to race
with an asynchronous WRITE is overriding the file size, despite
the attribute barrier being set by the writeback code.
The culprit turns out to be the check in nfs_ctime_need_update(),
which sees that the ctime is newer than the cached ctime, and
assumes that it is safe to override the attribute barrier.
This patch removes that override, and ensures that attribute
barriers are always respected.
NFSv4 takes O_EXCL as a sign that a setattr command should be sent,
probably to reset the timestamps.
When it was an O_RDONLY open, the SETATTR command does not
identify any actual attributes to change.
If no delegation was provided to the open, the SETATTR uses the
all-zeros stateid and the request is accepted (at least by the
Linux NFS server - no harm, no foul).
If a read-delegation was provided, this is used in the SETATTR
request, and a Netapp filer will justifiably claim
NFS4ERR_BAD_STATEID, which the Linux client takes as a sign
to retry - indefinitely.
So only treat O_EXCL specially if O_CREAT was also given.
It's possible that a DELEGRETURN could race with (e.g.) client expiry,
in which case we could end up putting the delegation hash reference more
than once.
Have unhash_delegation_locked return a bool that indicates whether it
was already unhashed. In the case of destroy_delegation we only
conditionally put the hash reference if that returns true.
The other callers of unhash_delegation_locked call it while walking
list_heads that shouldn't yet be detached. If we find that it doesn't
return true in those cases, then throw a WARN_ON as that indicates that
we have a partially hashed delegation, and that something is likely very
wrong.
Tested-by: Andrew W Elble <aweits@rit.edu> Tested-by: Anna Schumaker <Anna.Schumaker@netapp.com> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When an open or lock stateid is hashed, we take an extra reference to
it. When we unhash it, we drop that reference. The code however does
not properly account for the case where we have two callers concurrently
trying to unhash the stateid. This can lead to list corruption and the
hash reference being put more than once.
Fix this by having unhash_ol_stateid use list_del_init on the st_perfile
list_head, and then testing to see if that list_head is empty before
releasing the hash reference. This means that some of the unhashing
wrappers now become bool return functions so we can test to see whether
the stateid was unhashed before we put the reference.
Reported-by: Andrew W Elble <aweits@rit.edu> Tested-by: Andrew W Elble <aweits@rit.edu> Reported-by: Anna Schumaker <Anna.Schumaker@netapp.com> Tested-by: Anna Schumaker <Anna.Schumaker@netapp.com> Signed-off-by: Jeff Layton <jeff.layton@primarydata.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently we'll respond correctly to a request for either
FS_LAYOUT_TYPES or LAYOUT_TYPES, but not to a request for both
attributes simultaneously.
Signed-off-by: Kinglong Mee <kinglongmee@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
pNFS writes don't return attributes, however that doesn't mean that we
should ignore the fact that they may be extending the file. This patch
ensures that if a write is seen to extend the file, then we always set
an attribute barrier, and update the cached file size.
While we are committing a transaction, it's possible the previous one is
still finishing its commit and therefore we wait for it to finish first.
However we were not checking if that previous transaction ended up getting
aborted after we waited for it to commit, so we ended up committing the
current transaction which can lead to fs corruption because the new
superblock can point to trees that have had one or more nodes/leafs that
were never durably persisted.
The following sequence diagram exemplifies how this is possible:
btrfs_write_and_wait_transaction() continues
writing ebs
--> fails writing eb X, we abort transaction N
and set bit BTRFS_FS_STATE_ERROR on
fs_info->fs_state, so no new transactions
can start after setting that bit
cleanup_transaction()
btrfs_cleanup_one_transaction()
wakes up task at CPU 1
continues, doesn't abort because
cur_trans->aborted (transaction N + 1)
is zero, and no checks for bit
BTRFS_FS_STATE_ERROR in fs_info->fs_state
are made
btrfs_write_and_wait_transaction(trans, root);
--> succeeds, no errors during writeback
write_ctree_super(trans, root, 0);
--> succeeds
--> we have now a superblock that points us
to some root that uses eb X, which was
never written to disk
In this scenario future attempts to read eb X from disk results in an
error message like "parent transid verify failed on X wanted Y found Z".
So fix this by aborting the current transaction if after waiting for the
previous transaction we verify that it was aborted.
Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: Josef Bacik <jbacik@fb.com> Reviewed-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There was a race condition where during cleanup/release operation
on-going streaming would cause a kernel panic because the hardware
module was disabled prematurely with IRQ still pending.
Fixes: 417d2e507edc ("[media] media: platform: add VPFE capture driver support for AM437X") Signed-off-by: Benoit Parrot <bparrot@ti.com> Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Upon a S_FMT the input/requested frame size and pixel format is
overwritten by the current sub-device settings.
Fix this so application can actually set the frame size and format.
Commit 813f5c0ac5cc ("media: Change media device link_notify behaviour")
modified the media controller link setup notification API and updated the
OMAP3 ISP driver accordingly. As a side effect it introduced a bug by
turning power on after setting the link instead of before. This results in
sub-devices not being powered down in some cases when they should be. Fix
it.
The input_dev is already gone when the rc device is being unregistered
so checking for its presence only means that no remove uevent will be
generated.
Commit c48a11c7ad26 ("netvm: propagate page->pfmemalloc to skb") added
checks for page->pfmemalloc to __skb_fill_page_desc():
if (page->pfmemalloc && !page->mapping)
skb->pfmemalloc = true;
It assumes page->mapping == NULL implies that page->pfmemalloc can be
trusted. However, __delete_from_page_cache() can set set page->mapping
to NULL and leave page->index value alone. Due to being in union, a
non-zero page->index will be interpreted as true page->pfmemalloc.
So the assumption is invalid if the networking code can see such a page.
And it seems it can. We have encountered this with a NFS over loopback
setup when such a page is attached to a new skbuf. There is no copying
going on in this case so the page confuses __skb_fill_page_desc which
interprets the index as pfmemalloc flag and the network stack drops
packets that have been allocated using the reserves unless they are to
be queued on sockets handling the swapping which is the case here and
that leads to hangs when the nfs client waits for a response from the
server which has been dropped and thus never arrive.
The struct page is already heavily packed so rather than finding another
hole to put it in, let's do a trick instead. We can reuse the index
again but define it to an impossible value (-1UL). This is the page
index so it should never see the value that large. Replace all direct
users of page->pfmemalloc by page_is_pfmemalloc which will hide this
nastiness from unspoiled eyes.
The information will get lost if somebody wants to use page->index
obviously but that was the case before and the original code expected
that the information should be persisted somewhere else if that is
really needed (e.g. what SLAB and SLUB do).
[akpm@linux-foundation.org: fix blooper in slub] Fixes: c48a11c7ad26 ("netvm: propagate page->pfmemalloc to skb") Signed-off-by: Michal Hocko <mhocko@suse.com> Debugged-by: Vlastimil Babka <vbabka@suse.com> Debugged-by: Jiri Bohac <jbohac@suse.com> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: David Miller <davem@davemloft.net> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The system has non continuous RAM address:
BIOS-e820: [mem 0x0000001300000000-0x0000001cffffffff] usable
BIOS-e820: [mem 0x0000001d70000000-0x0000001ec7ffefff] usable
BIOS-e820: [mem 0x0000001f00000000-0x0000002bffffffff] usable
BIOS-e820: [mem 0x0000002c18000000-0x0000002d6fffefff] usable
BIOS-e820: [mem 0x0000002e00000000-0x00000039ffffffff] usable
So there are start sections in memory block not present. For example:
memory block : [0x2c18000000, 0x2c20000000) 512M
first three sections are not present.
The current register_mem_sect_under_node() assume first section is
present, but memory block section number range [start_section_nr,
end_section_nr] would include not present section.
For arch that support vmemmap, we don't setup memmap for struct page
area within not present sections area.
So skip the pfn range that belong to absent section.
[akpm@linux-foundation.org: simplification]
[rientjes@google.com: more simplification] Fixes: bdee237c0343 ("x86: mm: Use 2GB memory block size on large memory x86-64 systems") Fixes: 982792c782ef ("x86, mm: probe memory block size for generic x86 64bit") Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: David Rientjes <rientjes@google.com> Reported-by: Tony Luck <tony.luck@intel.com> Tested-by: Tony Luck <tony.luck@intel.com> Cc: Greg KH <greg@kroah.com> Cc: Ingo Molnar <mingo@elte.hu> Tested-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
With the radeon driver loaded the HP Compaq dc5750
Small Form Factor machine fails to resume from suspend.
Adding a quirk similar to other devices avoids
the problem and the system resumes properly.
Signed-off-by: Jeffery Miller <jmiller@neverware.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This might lead to local privilege escalation (code execution as
kernel) for systems where the following conditions are met:
- CONFIG_CIFS_SMB2 and CONFIG_CIFS_POSIX are enabled
- a cifs filesystem is mounted where:
- the mount option "vers" was used and set to a value >=2.0
- the attacker has write access to at least one file on the filesystem
To attack this, an attacker would have to guess the target_tcon
pointer (but guessing wrong doesn't cause a crash, it just returns an
error code) and win a narrow race.
Signed-off-by: Jann Horn <jann@thejh.net> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If we had secondary hash flag set, we ended up modifying hash value in
the updatepp code path. Hence with a failed updatepp we will be using
a wrong hash value for the following hash insert. Fix this by
recomputing hash before insert.
Without this patch we can end up with using wrong slot number in linux
pte. That can result in us missing an hash pte update or invalidate
which can cause memory corruption or even machine check.
Fixes: 6d492ecc6489 ("powerpc/THP: Add code to handle HPTE faults for hugepages") Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Reviewed-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The kernel does it, not the boot wrapper, which breaks with some
cross compilers that still default to ABI v1.
Fixes: 147c05168fc8 ("powerpc/boot: Add support for 64bit little endian wrapper") Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
vmx-crypto driver make use of some VSX instructions which are
only available if VSX is enabled. Running in cases where VSX
are not enabled vmx-crypto fails in a VSX exception.
In order to fix this enable_kernel_vsx() was added to turn on
VSX instructions for vmx-crypto.
Signed-off-by: Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
enable_kernel_vsx() function was commented since anything was using
it. However, vmx-crypto driver uses VSX instructions which are
only available if VSX is enable. Otherwise it rises an exception oops.
This patch uncomment enable_kernel_vsx() routine and makes it available.
Signed-off-by: Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The EPOW interrupt handler uses rtas_get_sensor(), which in turn
uses rtas_busy_delay() to wait for RTAS becoming ready in case it
is necessary. But rtas_busy_delay() is annotated with might_sleep()
and thus may not be used by interrupts handlers like the EPOW handler!
This leads to the following BUG when CONFIG_DEBUG_ATOMIC_SLEEP is
enabled:
Fix this issue by introducing a new rtas_get_sensor_fast() function
that does not use rtas_busy_delay() - and thus can only be used for
sensors that do not cause a BUSY condition - known as "fast" sensors.
The EPOW sensor is defined to be "fast" in sPAPR - mpe.
Fixes: 587f83e8dd50 ("powerpc/pseries: Use rtas_get_sensor in RAS code") Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The powerpc kernel can be built to have either a 4K PAGE_SIZE or a 64K
PAGE_SIZE.
However when built with a 4K PAGE_SIZE there is an additional config
option which can be enabled, PPC_HAS_HASH_64K, which means the kernel
also knows how to hash a 64K page even though the base PAGE_SIZE is 4K.
This is used in one obscure configuration, to support 64K pages for SPU
local store on the Cell processor when the rest of the kernel is using
4K pages.
In this configuration, pte_pagesize_index() is defined to just pass
through its arguments to get_slice_psize(). However pte_pagesize_index()
is called for both user and kernel addresses, whereas get_slice_psize()
only knows how to handle user addresses.
This has been broken forever, however until recently it happened to
work. That was because in get_slice_psize() the large kernel address
would cause the right shift of the slice mask to return zero.
However in commit 7aa0727f3302 ("powerpc/mm: Increase the slice range to
64TB"), the get_slice_psize() code was changed so that instead of a
right shift we do an array lookup based on the address. When passed a
kernel address this means we index way off the end of the slice array
and return random junk.
That is only fatal if we happen to hit something non-zero, but when we
do return a non-zero value we confuse the MMU code and eventually cause
a check stop.
This fix is ugly, but simple. When we're called for a kernel address we
return 4K, which is always correct in this configuration, otherwise we
use the slice mask.
Fixes: 7aa0727f3302 ("powerpc/mm: Increase the slice range to 64TB") Reported-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The config space of some PCI devices can't be accessed when their
PEs are in frozen state. Otherwise, fenced PHB might be seen.
Those PEs are identified with flag EEH_PE_CFG_RESTRICTED, meaing
EEH_PE_CFG_BLOCKED is set automatically when the PE is put to
frozen state (EEH_PE_ISOLATED). eeh_slot_error_detail() restores
PCI device BARs with eeh_pe_restore_bars(), which then calls
eeh_ops->restore_config() to reinitialize the PCI device in
(OPAL) firmware. eeh_ops->restore_config() produces PCI config
access that causes fenced PHB. The problem was reported on below
adapter:
In the complete hotplug case, EEH PEs are supposed to be released
and set to NULL. Normally, this is done by eeh_remove_device(),
which is called from pcibios_release_device().
However, if something is holding a kref to the device, it will not
be released, and the PE will remain. eeh_add_device_late() has
a check for this which will explictly destroy the PE in this case.
This check in eeh_add_device_late() occurs after a call to
eeh_ops->probe(). On PowerNV, probe is a pointer to pnv_eeh_probe(),
which will exit without probing if there is an existing PE.
This means that on PowerNV, devices with outstanding krefs will not
be rediscovered by EEH correctly after a complete hotplug. This is
affecting CXL (CAPI) devices in the field.
Put the probe after the kref check so that the PE is destroyed
and affected devices are correctly rediscovered by EEH.
Fixes: d91dafc02f42 ("powerpc/eeh: Delay probing EEH device during hotplug") Cc: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Daniel Axtens <dja@axtens.net> Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit cca87d30 ("powerpc/pci: Refactor pci_dn") introduced pdn
list for SRIOV VFs. It means the pdn is be put into the child list
of its parent pdn when the pdn is created. When doing PCI hot
unplugging on pSeries, the PCI device node as well as its pdn are
released through procfs entry "powerpc/ofdt". Some one else grabs
the memory chunk of the pdn and update it accordingly. At the same
time, the pdn is still tracked in the child list of parent pdn. It
leads to corrupted child list in the parent pdn.
This fixes above issue by removing the pdn from the child list of
its parent pdn when the device node is detached from the system.
Note the pdn is free'd when the device node is released if the
device node is dynamic one. Otherwise, the device node as well
as the pdn won't be released.
The M3800 is very minor workstation variant of the XPS 15 which has
already been patched for this issue. I figured it's probably more
important for this version of the laptop to be patched than the
regular XPS as Dell sells is pre-configured with Ubuntu to be used as
a Linux workstation. I have tested the patch on my the hardware on
Linux 4.2.0.
Dell laptop has a series model to use the same codec but different subsystem ID.
At the same time they happens the white noise by login screen and headphone;
for fixing them together, I only can add these IDs to FIXUP function ALC292_FIXUP_DISABLE_AAMIX,
then try to solve such the similar issues.
According to the bug report, FSC Amilo laptops with ALC880 can detect
the headphone jack but currently the driver disables it. It's partly
intentionally, as non-working jack detect was reported in the past.
Let's enable now.
We've got bug reports showing the old systemd-logind (at least
system-210) aborting unexpectedly, and this turned out to be because
of an invalid error code from close() call to evdev devices. close()
is supposed to return only either EINTR or EBADFD, while the device
returned ENODEV. logind was overreacting to it and decided to kill
itself when an unexpected error code was received. What a tragedy.
The bad error code comes from flush fops, and actually evdev_flush()
returns ENODEV when device is disconnected or client's access to it is
revoked. But in these cases the fact that flush did not actually happen is
not an error, but rather normal behavior. For non-disconnected devices
result of flush is also not that interesting as there is no potential of
data loss and even if it fails application has no way of handling the
error. Because of that we are better off always returning success from
evdev_flush().
Also returning EINTR from flush()/close() is discouraged (as it is not
clear how application should handle this error), so let's stop taking
evdev->mutex interruptibly.
When running a guest with the architected timer disabled (with QEMU and
the kernel_irqchip=off option, for example), it is important to make
sure the timer gets turned off. Otherwise, the guest may try to
enable it anyway, leading to a screaming HW interrupt.
The fix is to unconditionally turn off the virtual timer on guest
exit.
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When restoring the system register state for an AArch32 guest at EL2,
writes to DACR32_EL2 may not be correctly synchronised by Cortex-A57,
which can lead to the guest effectively running with junk in the DACR
and running into unexpected domain faults.
This patch works around the issue by re-ordering our restoration of the
AArch32 register aliases so that they happen before the AArch64 system
registers. Ensuring that the registers are restored in this order
guarantees that they will be correctly synchronised by the core.
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Until b26e5fdac43c ("arm/arm64: KVM: introduce per-VM ops"),
kvm_vgic_map_resources() used to include a check on irqchip_in_kernel(),
and vgic_v2_map_resources() still has it.
But now vm_ops are not initialized until we call kvm_vgic_create().
Therefore kvm_vgic_map_resources() can being called without a VGIC,
and we die because vm_ops.map_resources is NULL.
Fixing this restores QEMU's kernel-irqchip=off option to a working state,
allowing to use GIC emulation in userspace.
Cortex-A53 processors <= r0p4 are affected by erratum #843419 which can
lead to a memory access using an incorrect address in certain sequences
headed by an ADRP instruction.
There is a linker fix to generate veneers for ADRP instructions, but
this doesn't work for kernel modules which are built as unlinked ELF
objects.
This patch adds a new config option for the erratum which, when enabled,
builds kernel modules with the mcmodel=large flag. This uses absolute
addressing for all kernel symbols, thereby removing the use of ADRP as
a PC-relative form of addressing. The ADRP relocs are removed from the
module loader so that we fail to load any potentially affected modules.
When entering the kernel at EL2, we fail to initialise the MDCR_EL2
register which controls debug access and PMU capabilities at EL1.
This patch ensures that the register is initialised so that all traps
are disabled and all the PMU counters are available to the host. When a
guest is scheduled, KVM takes care to configure trapping appropriately.
Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When saving/restoring the VFP registers from a compat (AArch32)
signal frame, we rely on the compat registers forming a prefix of the
native register file and therefore make use of copy_{to,from}_user to
transfer between the native fpsimd_state and the compat_vfp_sigframe.
Unfortunately, this doesn't work so well in a big-endian environment.
Our fpsimd save/restore code operates directly on 128-bit quantities
(Q registers) whereas the compat_vfp_sigframe represents the registers
as an array of 64-bit (D) registers. The architecture packs the compat D
registers into the Q registers, with the least significant bytes holding
the lower register. Consequently, we need to swap the 64-bit halves when
converting between these two representations on a big-endian machine.
This patch replaces the __copy_{to,from}_user invocations in our
compat VFP signal handling code with explicit __put_user loops that
operate on 64-bit values and swap them accordingly.
The linear region size of a 39-bit VA kernel is only 256 GB, which
may be insufficient to cover all of system RAM, even on platforms
that have much less than 256 GB of memory but which is laid out
very sparsely.
So make sure we clip the memory we will not be able to map before
installing it into the memblock memory table, by setting
MAX_MEMBLOCK_ADDR accordingly.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Stuart Yoder <stuart.yoder@freescale.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When parsing the memory nodes to populate the memblock memory
table, we check against high and low limits and clip any memory
that exceeds either one of them.
However, for arm64, the high limit of (phys_addr_t)~0 is not very
meaningful, since phys_addr_t is 64 bits (i.e., no limit) but there
may be other constraints that limit the memory ranges that we can
support.
So rename MAX_PHYS_ADDR to MAX_MEMBLOCK_ADDR (for clarity) and only
define it if the arch does not supply a definition of its own.
Acked-by: Rob Herring <robh@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Stuart Yoder <stuart.yoder@freescale.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a task calls execve(), its FP/SIMD state is flushed so that
none of the original program state is observeable by the incoming
program.
However, since this flushing consists of setting the in-memory copy
of the FP/SIMD state to all zeroes, the CPU field is set to CPU 0 as
well, which indicates to the lazy FP/SIMD preserve/restore code that
the FP/SIMD state does not need to be reread from memory if the task
is scheduled again on CPU 0 without any other tasks having entered
userland (or used the FP/SIMD in kernel mode) on the same CPU in the
mean time. If this happens, the FP/SIMD state of the old program will
still be present in the registers when the new program starts.
So set the CPU field to the invalid value of NR_CPUS when performing
the flush, by calling fpsimd_flush_task_state().
Reported-by: Chunyan Zhang <chunyan.zhang@spreadtrum.com> Reported-by: Janet Liu <janet.liu@spreadtrum.com> Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
f6e63f9 ext4: fold ext4_nojournal_sops into ext4_sops bb04457 ext4: support freezing ext2 (nojournal) file systems 9ca9238 ext4: Use separate super_operations structure for no_journal filesystems
ext4 started setting needs_recovery on filesystems without journals
when they are unfrozen. This makes no sense, and in fact confuses
blkid to the point where it doesn't recognize the filesystem at all.
(freeze ext2; unfreeze ext2; run blkid; see no output; run dumpe2fs,
see needs_recovery set on fs w/ no journal).
To fix this, don't manipulate the INCOMPAT_RECOVER feature on
filesystems without journals.
Reported-by: Stu Mark <smark@datto.com> Reviewed-by: Jan Kara <jack@suse.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Currently the first thing we do in cxl_probe is to grab a reference
on the pci device. Later on, we call device_register on our adapter.
In our remove path, we call device_unregister, but we never call
pci_dev_put. We therefore leak the device every time we do a
reflash.
device_register/unregister is sufficient to hold the reference.
Therefore, drop the call to pci_dev_get.
Here's why this is safe.
The proposed cxl_probe(pdev) calls cxl_adapter_init:
a) init calls cxl_adapter_alloc, which creates a struct cxl,
conventionally called adapter. This struct contains a
device entry, adapter->dev.
b) init calls cxl_configure_adapter, where we set
adapter->dev.parent = &dev->dev (here dev is the pci dev)
So at this point, the cxl adapter's device's parent is the PCI
device that I want to be refcounted properly.
c) init calls cxl_register_adapter
*) cxl_register_adapter calls device_register(&adapter->dev)
So now we're in device_register, where dev is the adapter device, and
we want to know if the PCI device is safe after we return.
device_register(&adapter->dev) calls device_initialize() and then
device_add().
device_add() does a get_device(). device_add() also explicitly grabs
the device's parent, and calls get_device() on it:
parent = get_device(dev->parent);
So therefore, device_register() takes a lock on the parent PCI dev,
which is what pci_dev_get() was guarding. pci_dev_get() can therefore
be safely removed.
Fixes: f204e0b8cedd ("cxl: Driver code for powernv PCIe based cards for userspace access") Signed-off-by: Daniel Axtens <dja@axtens.net> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
cxl_reset currently PERSTs the slot, and then repeatedly tries to
read MMIO space in order to kick off EEH.
There are 2 problems with this: it's unnecessary, and it's racy.
It's unnecessary because the PERST will bring down the PHB link.
That will be picked up by the CAPP, which will send out an HMI.
Skiboot, noticing an HMI from the CAPP, will send an OPAL
notification to the kernel, which will trigger EEH recovery.
It's also racy: the EEH recovery triggered by the CAPP will
eventually cause the MMIO space to have its mapping invalidated
and the pointer NULLed out. This races with our attempt to read
the MMIO space. This is causing OOPSes in testing.
Simply drop all the attempts to force EEH detection, and trust
that Skiboot will send the notification and that we'll act on it.
The Skiboot code to send the EEH notification has been in Skiboot
for as long as CAPP recovery has been supported, so we don't need
to worry about breaking obscure setups with ancient firmware.
Cc: Ryan Grimm <grimm@linux.vnet.ibm.com> Fixes: 62fa19d4b4fd ("cxl: Add ability to reset the card") Signed-off-by: Daniel Axtens <dja@axtens.net> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We already set a station to be associated when peering completes, both
in user space and in the kernel. Thus we should always have an
associated sta before sending data frames to that station.
Failure to check assoc state can cause crashes in the lower-level driver
due to transmitting unicast data frames before driver sta structures
(e.g. ampdu state in ath9k) are initialized. This occurred when
forwarding in the presence of fixed mesh paths: frames were transmitted
to stations with whom we hadn't yet completed peering.
Reported-by: Alexis Green <agreen@cococorp.com> Tested-by: Jesse Jones <jjones@cococorp.com> Signed-off-by: Bob Copeland <me@bobcopeland.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit c8a34581ec09 ("MIPS: Emulate the BC1{EQ,NE}Z FPU instructions")
added support for emulating the new R6 BC1{EQ,NE}Z branches but it missed
the case where the instruction that caused the exception was not on a DS.
The temperature registers appear to report values in degrees Celsius
while the hwmon API mandates values to be exposed in millidegrees
Celsius. Do the conversion so that the values reported by "sensors"
are correct.
Fixes: aed93e0bf493 ("tg3: Add hwmon support for temperature") Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Prashant Sreedharan <prashant@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When initializing igb driver (e.g. 82576, I350), IGB_FLAG_QUEUE_PAIRS is
set if adapter->rss_queues exceeds half of max_rss_queues in
igb_init_queue_configuration().
On the other hand, IGB_FLAG_QUEUE_PAIRS is not set even if the number of
queues exceeds half of max_combined in igb_set_channels() when changing
the number of queues by "ethtool -L".
In this case, if numvecs is larger than MAX_MSIX_ENTRIES (10), the size
of adapter->msix_entries[], an overflow can occur in
igb_set_interrupt_capability(), which in turn leads to an oops.
Fix this problem as follows:
- When changing the number of queues by "ethtool -L", set
IGB_FLAG_QUEUE_PAIRS in the same way as initializing igb driver.
- When increasing the size of q_vector, reallocate it appropriately.
(With IGB_FLAG_QUEUE_PAIRS set, the size of q_vector gets larger.)
Another possible way to fix this problem is to cap the queues at its
initial number, which is the number of the initial online cpus. But this
is not the optimal way because we cannot increase queues when another
cpu becomes online.
Note that before commit cd14ef54d25b ("igb: Change to use statically
allocated array for MSIx entries"), this problem did not cause oops
but just made the number of queues become 1 because of entering msi_only
mode in igb_set_interrupt_capability().
Fixes: 907b7835799f ("igb: Add ethtool support to configure number of channels") Signed-off-by: Shota Suzuki <suzuki_shota_t3@lab.ntt.co.jp> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In routine _rtl8821ae_set_media_status(), an incorrect mask results in a test
for AP status to always be false. Similar bugs were fixed in rtl8192cu and
rtl8192de, but this instance was missed at that time.
Reported-by: David Binderman <dcb314@hotmail.com> Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Cc: David Binderman <dcb314@hotmail.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In the logic in the initial commit of unshare made creating a new
thread group for a process, contingent upon creating a new memory
address space for that process. That is wrong. Two separate
processes in different thread groups can share a memory address space
and clone allows creation of such proceses.
This is significant because it was observed that mm_users > 1 does not
mean that a process is multi-threaded, as reading /proc/PID/maps
temporarily increments mm_users, which allows other processes to
(accidentally) interfere with unshare() calls.
Correct the check in check_unshare_flags() to test for
!thread_group_empty() for CLONE_THREAD, CLONE_SIGHAND, and CLONE_VM.
For sighand->count > 1 for CLONE_SIGHAND and CLONE_VM.
For !current_is_single_threaded instead of mm_users > 1 for CLONE_VM.
By using the correct checks in unshare this removes the possibility of
an accidental denial of service attack.
Additionally using the correct checks in unshare ensures that only an
explicit unshare(CLONE_VM) can possibly trigger the slow path of
current_is_single_threaded(). As an explict unshare(CLONE_VM) is
pointless it is not expected there are many applications that make
that call.
Fixes: b2e0d98705e60e45bbb3c0032c48824ad7ae0704 userns: Implement unshare of the user namespace Reported-by: Ricky Zhou <rickyz@chromium.org> Reported-by: Kees Cook <keescook@chromium.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There may be lots of pending requests so that the buffer of PAGE_SIZE
can't hold them at all.
One typical example is scsi-mq, the queue depth(.can_queue) of
scsi_host and blk-mq is quite big but scsi_device's queue_depth
is a bit small(.cmd_per_lun), then it is quite easy to have lots
of pending requests in hw queue.
This patch fixes the following warning and the related memory
destruction.
st21nfca_hci_load_session() calls kfree_skb() on unitialized
variables skb_pipe_info and skb_pipe_list if the call to
nfc_hci_connect_gate() failed. Reword the error path to not use
these variables when they are not initialized. While at it, there
seemed to be a memory leak because skb_pipe_info was only freed
once, after the for-loop, even though several ones were created
by nfc_hci_send_cmd.
Fixes: ec03ff1a8f9a
("NFC: st21nfca: Remove skb_pipe_list and skb_pipe_info
useless allocation")
Acked-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The broken build is caused by the commit fe4407c0dc58
("ARM: rockchip: fix the CPU soft reset").
Signed-off-by: Caesar Wang <wxt@rock-chips.com>
The breakage was a result of it being wrongly merged in my branch with
the cache invalidation rework from Russell 02b4e2756e01c
("ARM: v7 setup function should invalidate L1 cache").
Many file systems that implement the show_options hook fail to correctly
escape their output which could lead to unescaped characters (e.g. new
lines) leaking into /proc/mounts and /proc/[pid]/mountinfo files. This
could lead to confusion, spoofed entries (resulting in things like
systemd issuing false d-bus "mount" notifications), and who knows what
else. This looks like it would only be the root user stepping on
themselves, but it's possible weird things could happen in containers or
in other situations with delegated mount privileges.
Here's an example using overlay with setuid fusermount trusting the
contents of /proc/mounts (via the /etc/mtab symlink). Imagine the use
of "sudo" is something more sneaky:
This fixes the problem by adding new seq_show_option and
seq_show_option_n helpers, and updating the vulnerable show_option
handlers to use them as needed. Some, like SELinux, need to be open
coded due to unusual existing escape mechanisms.
[akpm@linux-foundation.org: add lost chunk, per Kees]
[keescook@chromium.org: seq_show_option should be using const parameters] Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: Jan Kara <jack@suse.com> Acked-by: Paul Moore <paul@paul-moore.com> Cc: J. R. Okajima <hooanon05g@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Before the make_empty_dir_inode calls were introduce into proc, sysfs,
and sysctl those directories when stated reported an i_size of 0.
make_empty_dir_inode started reporting an i_size of 2. At least one
userspace application depended on stat returning i_size of 0. So
modify make_empty_dir_inode to cause an i_size of 0 to be reported for
these directories.
The unregister path of platform_device is broken. On registration, it
will register all resources with either a parent already set, or
type==IORESOURCE_{IO,MEM}. However, on unregister it will release
everything with type==IORESOURCE_{IO,MEM}, but ignore the others. There
are also cases where resources don't get registered in the first place,
like with devices created by of_platform_populate()*.
Fix the unregister path to be symmetrical with the register path by
checking the parent pointer instead of the type field to decide which
resources to unregister. This is safe because the upshot of the
registration path algorithm is that registered resources have a parent
pointer, and non-registered resources do not.
* It can be argued that of_platform_populate() should be registering
it's resources, and they argument has some merit. However, there are
quite a few platforms that end up broken if we try to do that due to
overlapping resources in the device tree. Until that is fixed, we need
to solve the immediate problem.
Cc: Pantelis Antoniou <pantelis.antoniou@konsulko.com> Cc: Wolfram Sang <wsa@the-dreams.de> Cc: Rob Herring <robh@kernel.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com> Signed-off-by: Grant Likely <grant.likely@linaro.org> Tested-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com> Tested-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nick Meier reported a regression with HyperV that "
After rebooting the VM, the following messages are logged in syslog
when trying to load the tulip driver:
tulip: Linux Tulip drivers version 1.1.15 (Feb 27, 2007)
tulip: 0000:00:0a.0: PCI INT A: failed to register GSI
tulip: Cannot enable tulip board #0, aborting
tulip: probe of 0000:00:0a.0 failed with error -16
Errors occur in 3.19.0 kernel
Works in 3.17 kernel.
"
According to the ACPI dump file posted by Nick at
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1440072
And in DSDT table, we have _PRT method to define PCI interrupts, which
eventually goes to:
Name (PRSA, ResourceTemplate ()
{
IRQ (Level, ActiveLow, Shared, )
{3,4,5,7,9,10,11,12,14,15}
})
Name (PRSB, ResourceTemplate ()
{
IRQ (Level, ActiveLow, Shared, )
{3,4,5,7,9,10,11,12,14,15}
})
Name (PRSC, ResourceTemplate ()
{
IRQ (Level, ActiveLow, Shared, )
{3,4,5,7,9,10,11,12,14,15}
})
Name (PRSD, ResourceTemplate ()
{
IRQ (Level, ActiveLow, Shared, )
{3,4,5,7,9,10,11,12,14,15}
})
According to the MADT and DSDT tables, IRQ 9 may be used for:
1) ACPI SCI in level, high mode
2) PCI legacy IRQ in level, low mode
So there's a conflict in polarity setting for IRQ 9.
Prior to commit cd68f6bd53cf ("x86, irq, acpi: Get rid of special
handling of GSI for ACPI SCI"), ACPI SCI is handled specially and
there's no check for conflicts between ACPI SCI and PCI legagy IRQ.
And it seems that the HyperV hypervisor doesn't make use of the
polarity configuration in IOAPIC entry, so it just works.
Commit cd68f6bd53cf gets rid of the specially handling of ACPI SCI,
and then the pin attribute checking code discloses the conflicts
between ACPI SCI and PCI legacy IRQ on HyperV virtual machine,
and rejects the request to assign IRQ9 to PCI devices.
So penalize legacy IRQ used by ACPI SCI and mark it unusable if ACPI
SCI attributes conflict with PCI IRQ attributes.
Please refer to following links for more information:
https://bugzilla.kernel.org/show_bug.cgi?id=101301
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1440072
Fixes: cd68f6bd53cf ("x86, irq, acpi: Get rid of special handling of GSI for ACPI SCI") Reported-and-tested-by: Nick Meier <nmeier@microsoft.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The watchdog irq is actually SPI 79, which translates to the original
111 in the manual where the SPI irqs start at 32.
The current dw_wdt driver does not use the irq at all, so this issue
never surfaced. Nevertheless fix this for a time we want to use the irq.
We need different orderings when turning a core on and turning a core
off. In one case we need to assert reset before turning power off.
In ther other case we need to turn power on and the deassert reset.
In general, the correct flow is:
CPU off:
reset_control_assert
regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), BIT(pd))
wait_for_power_domain_to_turn_off
CPU on:
regmap_update_bits(pmu, PMU_PWRDN_CON, BIT(pd), 0)
wait_for_power_domain_to_turn_on
reset_control_deassert
This is needed for stressing CPU up/down, as per:
cd /sys/devices/system/cpu/
for i in $(seq 10000); do
echo "================= $i ============"
for j in $(seq 100); do
while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat cpu3/online)" != "000"" ]]
echo 0 > cpu1/online
echo 0 > cpu2/online
echo 0 > cpu3/online
done
while [[ "$(cat cpu1/online)$(cat cpu2/online)$(cat cpu3/online)" != "111" ]]; do
echo 1 > cpu1/online
echo 1 > cpu2/online
echo 1 > cpu3/online
done
done
done
The following is reproducable log:
[34466.186812] PM: noirq suspend of devices complete after 0.669 msecs
[34466.186824] Disabling non-boot CPUs ...
[34466.187509] CPU1: shutdown
[34466.188672] CPU2: shutdown
[34473.736627] Kernel panic - not syncing:Watchdog detected hard LOCKUP on cpu 0
.......
or others similar log:
.......
[ 4072.454453] CPU1: shutdown
[ 4072.504436] CPU2: shutdown
[ 4072.554426] CPU3: shutdown
[ 4072.577827] CPU1: Booted secondary processor
[ 4072.582611] CPU2: Booted secondary processor
<hang>
Tested by cpu up/down scripts, the results told us need delay more time
before write the sram. The wait time is affected by many aspects
(e.g: cpu frequency, bootrom frequency, sram frequency, bus speed, ...).
Although the cpus other than cpu0 will write the sram, the speedy is
no the same as cpu0, if the cpu0 early wake up, perhaps the other cpus
can't startup. As we know, the cpu0 can wake up when the cpu1/2/3 write
the 'sram+4/8' and send the sev.
Anyway.....
At the moment, 1ms delay will be happy work for cpu up/down scripts test.
Signed-off-by: Caesar Wang <wxt@rock-chips.com> Reviewed-by: Doug Anderson <dianders@chromium.org> Reviewed-by: Kever Yang <kever.yang@rock-chips.com> Fixes: 3ee851e212d0 ("ARM: rockchip: add basic smp support for rk3288") Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Legacy IPs like PWMSS, present under l4per2_7xx_clkdm, cannot support
smart-idle when its clock domain is in HW_AUTO on DRA7 SoCs. Hence,
program clock domain to SW_WKUP.
Signed-off-by: Vignesh R <vigneshr@ti.com> Acked-by: Tero Kristo <t-kristo@ti.com> Reviewed-by: Paul Walmsley <paul@pwsan.com> Signed-off-by: Paul Walmsley <paul@pwsan.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
After the commit abc0b1447d49 ("drm: Perform basic sanity checks on
probed modes"), proper clock-frequency becomes mandatory for
validating the mode of panel. The display does not work if there is
no mode validated. Also, this clock-frequency must be set
appropriately for getting required frame rate.
Fixes: abc0b1447d49 ("drm: Perform basic sanity checks on probed modes") Signed-off-by: Hyungwon Hwang <human.hwang@samsung.com> Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Sigend-off-by: Kukjin Kim <kgene@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since v3.18, attempts to deliver IRQ0 are rejected, breaking orion5x.
Fix this by increasing all interrupts by one, as did 5d6bed2a9c8b for
dove. Also, force MULTI_IRQ_HANDLER for all orion platforms (including
dove) as the specific handler is needed to shift back IRQ numbers by
one.
[gregory.clement@free-electrons.com]: moved the select
MULTI_IRQ_HANDLER from PLAT_ORION_LEGACY to ARCH_ORION5X as it broke
the build for dove.
Fixes: a71b092a9c68 ("ARM: Convert handle_IRQ to use __handle_domain_irq") Signed-off-by: Benjamin Cama <benoar@dolka.fr> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Tested-by: Detlef Vollmann <dv@vollmann.ch> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the internal call to of_address_to_resource() fails, we end up
looping forever in of_find_matching_node_by_address(). This can be
caused by a defective device tree, or calling with an incorrect
matches argument.
Fix by calling of_find_matching_node() unconditionally at the end of
the loop.
Signed-off-by: David Daney <david.daney@cavium.com> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
regulator_disable of pbias always writes '0' to the enable_reg.
However actual disable value of pbias regulator is not always '0'.
Fix it by populating the disable_val in pbias_reg_info for the
various platforms and assign it to the disable_val of
pbias regulator descriptor. This will be used by
regulator_disable_regmap while disabling pbias regulator.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
parport_find_base() will implicitly do parport_get_port() which
increases the refcount. Then parport_register_device() will again
increment the refcount. But while unloading the module we are only
doing parport_unregister_device() decrementing the refcount only once.
We add an parport_put_port() to neutralize the effect of
parport_get_port().
Once the module process a transfer in irq mode, the next poll transfer
will not work because the transmitter is left in inhibited state.
Fixes: 22417352f6b7f623 (Use polling mode on small transfers) Reported-by: Edward Kigwana <ekigwana@scires.com> Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
call12 can't be safely used as the first call in the inline function,
because the compiler does not extend the stack frame of the bounding
function accordingly, which may result in corruption of local variables.
If a call needs to be done, do call8 first followed by call12.
For pure assembly code in _switch_to increase stack frame size of the
bounding function.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Userspace return code may skip restoring THREADPTR register if there are
no registers that need to be zeroed. This leads to spurious failures in
libc NPTL tests.
Always restore THREADPTR on return to userspace.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When kvm_set_msr_common() handles a guest's write to
MSR_IA32_TSC_ADJUST, it will calcuate an adjustment based on the data
written by guest and then use it to adjust TSC offset by calling a
call-back adjust_tsc_offset(). The 3rd parameter of adjust_tsc_offset()
indicates whether the adjustment is in host TSC cycles or in guest TSC
cycles. If SVM TSC scaling is enabled, adjust_tsc_offset()
[i.e. svm_adjust_tsc_offset()] will first scale the adjustment;
otherwise, it will just use the unscaled one. As the MSR write here
comes from the guest, the adjustment is in guest TSC cycles. However,
the current kvm_set_msr_common() uses it as a value in host TSC
cycles (by using true as the 3rd parameter of adjust_tsc_offset()),
which can result in an incorrect adjustment of TSC offset if SVM TSC
scaling is enabled. This patch fixes this problem.
The reference (R) and change (C) bits in a HPT entry can be set by
hardware at any time up until the HPTE is invalidated and the TLB
invalidation sequence has completed. This means that when removing
a HPTE, we need to read the HPTE after the invalidation sequence has
completed in order to obtain reliable values of R and C. The code
in kvmppc_do_h_remove() used to do this. However, commit 6f22bd3265fb
("KVM: PPC: Book3S HV: Make HTAB code LE host aware") removed the
read after invalidation as a side effect of other changes. This
restores the read of the HPTE after invalidation.
The user-visible effect of this bug would be that when migrating a
guest, there is a small probability that a page modified by the guest
and then unmapped by the guest might not get re-transmitted and thus
the destination might end up with a stale copy of the page.
Fixes: 6f22bd3265fb Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The code that handles the case when we receive a H_DOORBELL interrupt
has a comment which says "Hypervisor doorbell - exit only if host IPI
flag set". However, the current code does not actually check if the
host IPI flag is set. This is due to a comparison instruction that
got missed.
As a result, the current code performs the exit to host only
if some sibling thread or a sibling sub-core is exiting to the
host. This implies that, an IPI sent to a sibling core in
(subcores-per-core != 1) mode will be missed by the host unless the
sibling core is on the exit path to the host.
This patch adds the missing comparison operation which will ensure
that when HOST_IPI flag is set, we unconditionally exit to the host.
Fixes: 66feed61cdf6 Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We got the bug that qemu complained with "KVM: unknown exit, hardware
reason 31" and KVM shown these info:
[84245.284948] EPT: Misconfiguration.
[84245.285056] EPT: GPA: 0xfeda848
[84245.285154] ept_misconfig_inspect_spte: spte 0x5eaef50107 level 4
[84245.285344] ept_misconfig_inspect_spte: spte 0x5f5fadc107 level 3
[84245.285532] ept_misconfig_inspect_spte: spte 0x5141d18107 level 2
[84245.285723] ept_misconfig_inspect_spte: spte 0x52e40dad77 level 1
This is because we got a mmio #PF and the handler see the mmio spte becomes
normal (points to the ram page)
However, this is valid after introducing fast mmio spte invalidation which
increases the generation-number instead of zapping mmio sptes, a example
is as follows:
1. QEMU drops mmio region by adding a new memslot
2. invalidate all mmio sptes
3.
VCPU 0 VCPU 1
access the invalid mmio spte
access the region originally was MMIO before
set the spte to the normal ram map
mmio #PF
check the spte and see it becomes normal ram mapping !!!
This patch fixes the bug just by dropping the check in mmio handler, it's
good for backport. Full check will be introduced in later patches
Reported-by: Pavel Shirshov <ru.pchel@gmail.com> Tested-by: Pavel Shirshov <ru.pchel@gmail.com> Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>