[WHAT & HOW]
Functions dp_enable_link_phy and dp_disable_link_phy can pass link_res
without initializing hpo_dp_link_enc and it is necessary to check for
null before dereferencing.
This fixes 1 FORWARD_NULL issue reported by Coverity.
Fixes: 0beca868cde8 ("drm/amd/display: Check link_res->hpo_dp_link_enc before using it") Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The fields in the hist_entry are filled on-demand which means they only
have meaningful values when relevant sort keys are used.
So if neither of 'dso' nor 'sym' sort keys are used, the map/symbols in
the hist entry can be garbage. So it shouldn't access it
unconditionally.
I got a segfault, when I wanted to see cgroup profiles.
$ sudo perf record -a --all-cgroups --synth=cgroup true
$ sudo perf report -s cgroup
Program received signal SIGSEGV, Segmentation fault.
0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48
48 return RC_CHK_ACCESS(map)->dso;
(gdb) bt
#0 0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48
#1 0x00005555557aa39b in map__load (map=0x0) at util/map.c:344
#2 0x00005555557aa592 in map__find_symbol (map=0x0, addr=140736115941088) at util/map.c:385
#3 0x00005555557ef000 in hists__findnew_entry (hists=0x555556039d60, entry=0x7fffffffa4c0, al=0x7fffffffa8c0, sample_self=true)
at util/hist.c:644
#4 0x00005555557ef61c in __hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0,
block_info=0x0, sample=0x7fffffffaa90, sample_self=true, ops=0x0) at util/hist.c:761
#5 0x00005555557ef71f in hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0,
sample=0x7fffffffaa90, sample_self=true) at util/hist.c:779
#6 0x00005555557f00fb in iter_add_single_normal_entry (iter=0x7fffffffa900, al=0x7fffffffa8c0) at util/hist.c:1015
#7 0x00005555557f09a7 in hist_entry_iter__add (iter=0x7fffffffa900, al=0x7fffffffa8c0, max_stack_depth=127, arg=0x7fffffffbce0)
at util/hist.c:1260
#8 0x00005555555ba7ce in process_sample_event (tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0,
machine=0x5555560388e8) at builtin-report.c:334
#9 0x00005555557b30c8 in evlist__deliver_sample (evlist=0x555556039010, tool=0x7fffffffbce0, event=0x7ffff7c14128,
sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at util/session.c:1232
#10 0x00005555557b32bc in machines__deliver_event (machines=0x5555560388e8, evlist=0x555556039010, event=0x7ffff7c14128,
sample=0x7fffffffaa90, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1271
#11 0x00005555557b3848 in perf_session__deliver_event (session=0x5555560386d0, event=0x7ffff7c14128, tool=0x7fffffffbce0,
file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1354
#12 0x00005555557affaf in ordered_events__deliver_event (oe=0x555556038e60, event=0x555556135aa0) at util/session.c:132
#13 0x00005555557bb605 in do_flush (oe=0x555556038e60, show_progress=false) at util/ordered-events.c:245
#14 0x00005555557bb95c in __ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND, timestamp=0) at util/ordered-events.c:324
#15 0x00005555557bba46 in ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND) at util/ordered-events.c:342
#16 0x00005555557b1b3b in perf_event__process_finished_round (tool=0x7fffffffbce0, event=0x7ffff7c15bb8, oe=0x555556038e60)
at util/session.c:780
#17 0x00005555557b3b27 in perf_session__process_user_event (session=0x5555560386d0, event=0x7ffff7c15bb8, file_offset=117688,
file_path=0x555556038ff0 "perf.data") at util/session.c:1406
As you can see the entry->ms.map was NULL even if he->ms.map has a
value. This is because 'sym' sort key is not given, so it cannot assume
whether he->ms.sym and entry->ms.sym is the same. I only checked the
'sym' sort key here as it implies 'dso' behavior (so maps are the same).
Fixes: ac01c8c4246546fd ("perf hist: Update hist symbol when updating maps") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Matt Fleming <matt@readmodwrite.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240826221045.1202305-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This reverts commit a15268787b79 ("drm/amd/display: Avoid overflow assignment in link_dp_cts")
Due to regression causing DPMS hang.
Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Gabe Teeger <Gabe.Teeger@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since commit 3f8ca2e115e5 ("vhost/scsi: Extract common handling code
from control queue handler") a null pointer dereference bug can be
triggered when guest sends an SCSI AN request.
In vhost_scsi_ctl_handle_vq(), `vc.target` is assigned with
`&v_req.tmf.lun[1]` within a switch-case block and is then passed to
vhost_scsi_get_req() which extracts `vc->req` and `tpg`. However, for
a `VIRTIO_SCSI_T_AN_*` request, tpg is not required, so `vc.target` is
set to NULL in this branch. Later, in vhost_scsi_get_req(),
`vc->target` is dereferenced without being checked, leading to a null
pointer dereference bug. This bug can be triggered from guest.
When this bug occurs, the vhost_worker process is killed while holding
`vq->mutex` and the corresponding tpg will remain occupied
indefinitely.
In rxrpc_open_socket(), it sets up the socket and then sets up the I/O
thread that will handle it. This is a problem, however, as there's a gap
between the two phases in which a packet may come into rxrpc_encap_rcv()
from the UDP packet but we oops when trying to wake the not-yet created I/O
thread.
As a quick fix, just make rxrpc_encap_rcv() discard the packet if there's
no I/O thread yet.
A better, but more intrusive fix would perhaps be to rearrange things such
that the socket creation is done by the I/O thread.
Fixes: a275da62e8c1 ("rxrpc: Create a per-local endpoint receive queue and I/O thread") Signed-off-by: David Howells <dhowells@redhat.com>
cc: yuxuanzhe@outlook.com
cc: Marc Dionne <marc.dionne@auristor.com>
cc: Simon Horman <horms@kernel.org>
cc: linux-afs@lists.infradead.org Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20241001132702.3122709-2-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
When a battery hook returns an error when adding a new battery, then
the battery hook is automatically unregistered.
However the battery hook provider cannot know that, so it will later
call battery_hook_unregister() on the already unregistered battery
hook, resulting in a crash.
Fix this by using the list head to mark already unregistered battery
hooks as already being unregistered so that they can be ignored by
battery_hook_unregister().
Fixes: fa93854f7a7e ("battery: Add the battery hooking API") Signed-off-by: Armin Wolf <W_Armin@gmx.de> Link: https://patch.msgid.link/20241001212835.341788-3-W_Armin@gmx.de Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Move the conditional locking from __battery_hook_unregister()
into battery_hook_unregister() and rename the low-level function
to simplify the locking during battery hook removal.
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Pali Rohár <pali@kernel.org> Signed-off-by: Armin Wolf <W_Armin@gmx.de> Link: https://patch.msgid.link/20241001212835.341788-2-W_Armin@gmx.de Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Stable-dep-of: 76959aff14a0 ("ACPI: battery: Fix possible crash when unregistering a battery hook") Signed-off-by: Sasha Levin <sashal@kernel.org>
RTL8125 added fields to the tally counter, what may result in the chip
dma'ing these new fields to unallocated memory. Therefore make sure
that the allocated memory area is big enough to hold all of the
tally counter values, even if we use only parts of it.
Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125") Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/741d26a9-2b2b-485d-91d9-ecb302e345b5@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
According to the datasheet, both pressure and temperature can go up to
oversampling x32. With this option, the maximum measurement time is not
80ms (this is for press x32 and temp x2), but it is 130ms nominal
(calculated from table 3.9.2) and since most of the maximum values
are around +15%, it is configured to 150ms.
Fixes: 8d329309184d ("iio: pressure: bmp280: Add support for BMP380 sensor family") Signed-off-by: Vasileios Amoiridis <vassilisamir@gmail.com> Link: https://patch.msgid.link/20240711211558.106327-3-vassilisamir@gmail.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Change the rest of the defines and function names that are
used specifically by the BME280 humidity sensor to BME280
as it is done for the rest of the BMP{0,1,3,5}80 sensors.
Few times, core1 was scheduled to boot first before core0, which leads
to error:
'k3_r5_rproc_start: can not start core 1 before core 0'.
This was happening due to some scheduling between prepare and start
callback. The probe function waits for event, which is getting
triggered by prepare callback. To avoid above condition move event
trigger to start instead of prepare callback.
Fixes: 61f6f68447ab ("remoteproc: k3-r5: Wait for core0 power-up before powering up core1") Signed-off-by: Udit Kumar <u-kumar1@ti.com>
[ Applied wakeup event trigger only for Split-Mode booted rprocs ] Signed-off-by: Beleswar Padhi <b-padhi@ti.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240820105004.2788327-1-b-padhi@ti.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Acquire the mailbox handle during device probe and do not release handle
in stop/detach routine or error paths. This removes the redundant
requests for mbox handle later during rproc start/attach. This also
allows to defer remoteproc driver's probe if mailbox is not probed yet.
MANA hardware uses 4k page size. When calculating the page table index,
it should use the hardware page size, not the system page size.
Cc: stable@vger.kernel.org Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter") Signed-off-by: Long Li <longli@microsoft.com> Link: https://patch.msgid.link/1725030993-16213-1-git-send-email-longli@linuxonhyperv.com Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
As defined by the MANA Hardware spec, the queue size for DMA is 4KB
minimal, and power of 2. And, the HWC queue size has to be exactly
4KB.
To support page sizes other than 4KB on ARM64, define the minimal
queue size as a macro separately from the PAGE_SIZE, which we always
assumed it to be 4KB before supporting ARM64.
Also, add MANA specific macros and update code related to size
alignment, DMA region calculations, etc.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/1718655446-6576-1-git-send-email-haiyangz@microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Stable-dep-of: 9e517a8e9d9a ("RDMA/mana_ib: use the correct page table index based on hardware page size") Signed-off-by: Sasha Levin <sashal@kernel.org>
Nothing appears to limit the number of concurrent async COPY
operations that clients can start. In addition, AFAICT each async
COPY can copy an unlimited number of 4MB chunks, so can run for a
long time. Thus IMO async COPY can become a DoS vector.
Add a restriction mechanism that bounds the number of concurrent
background COPY operations. Start simple and try to be fair -- this
patch implements a per-namespace limit.
An async COPY request that occurs while this limit is exceeded gets
NFS4ERR_DELAY. The requesting client can choose to send the request
again after a delay or fall back to a traditional read/write style
copy.
If there is need to make the mechanism more sophisticated, we can
visit that in future patches.
Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Currently, when NFSD handles an asynchronous COPY, it returns a
zero write verifier, relying on the subsequent CB_OFFLOAD callback
to pass the write verifier and a stable_how4 value to the client.
However, if the CB_OFFLOAD never arrives at the client (for example,
if a network partition occurs just as the server sends the
CB_OFFLOAD operation), the client will never receive this verifier.
Thus, if the client sends a follow-up COMMIT, there is no way for
the client to assess the COMMIT result.
The usual recovery for a missing CB_OFFLOAD is for the client to
send an OFFLOAD_STATUS operation, but that operation does not carry
a write verifier in its result. Neither does it carry a stable_how4
value, so the client /must/ send a COMMIT in this case -- which will
always fail because currently there's still no write verifier in the
COPY result.
Thus the server needs to return a normal write verifier in its COPY
result even if the COPY operation is to be performed asynchronously.
If the server recognizes the callback stateid in subsequent
OFFLOAD_STATUS operations, then obviously it has not restarted, and
the write verifier the client received in the COPY result is still
valid and can be used to assess a COMMIT of the copied data, if one
is needed.
Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Stable-dep-of: aadc3bbea163 ("NFSD: Limit the number of concurrent async COPY operations") Signed-off-by: Sasha Levin <sashal@kernel.org>
Brandon reports sporadic, non-sensical spikes in cumulative pressure
time (total=) when reading cpu.pressure at a high rate. This is due to
a race condition between reader aggregation and tasks changing states.
While it affects all states and all resources captured by PSI, in
practice it most likely triggers with CPU pressure, since scheduling
events are so frequent compared to other resource events.
The race context is the live snooping of ongoing stalls during a
pressure read. The read aggregates per-cpu records for stalls that
have concluded, but will also incorporate ad-hoc the duration of any
active state that hasn't been recorded yet. This is important to get
timely measurements of ongoing stalls. Those ad-hoc samples are
calculated on-the-fly up to the current time on that CPU; since the
stall hasn't concluded, it's expected that this is the minimum amount
of stall time that will enter the per-cpu records once it does.
The problem is that the path that concludes the state uses a CPU clock
read that is not synchronized against aggregators; the clock is read
outside of the seqlock protection. This allows aggregators to race and
snoop a stall with a longer duration than will actually be recorded.
With the recorded stall time being less than the last snapshot
remembered by the aggregator, a subsequent sample will underflow and
observe a bogus delta value, resulting in an erratic jump in pressure.
Fix this by moving the clock read of the state change into the seqlock
protection. This ensures no aggregation can snoop live stalls past the
time that's recorded when the state concludes.
We currently do stuff like queuing the final destruction step on a
random system wq, which will outlive the driver instance. With bad
timing we can teardown the driver with one or more work workqueue still
being alive leading to various UAF splats. Add a fini step to ensure
user queues are properly torn down. At this point GuC should already be
nuked so queue itself should no longer be referenced from hw pov.
v2 (Matt B)
- Looks much safer to use a waitqueue and then just wait for the
xa_array to become empty before triggering the drain.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2317 Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240923145647.77707-2-matthew.auld@intel.com
(cherry picked from commit 861108666cc0e999cffeab6aff17b662e68774e3) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
Harden build ID parsing logic, adding explicit READ_ONCE() where it's
important to have a consistent value read and validated just once.
Also, as pointed out by Andi Kleen, we need to make sure that entire ELF
note is within a page bounds, so move the overflow check up and add an
extra note_size boundaries validation.
Fixes tag below points to the code that moved this code into
lib/buildid.c, and then subsequently was used in perf subsystem, making
this code exposed to perf_event_open() users in v5.12+.
The z3fold compressed pages allocator is rarely used, most users use
zsmalloc. The only disadvantage of zsmalloc in comparison is the
dependency on MMU, and zbud is a more common option for !MMU as it was the
default zswap allocator for a long time.
Historically, zsmalloc had worse latency than zbud and z3fold but offered
better memory savings. This is no longer the case as shown by a simple
recent analysis [1]. That analysis showed that z3fold does not have any
advantage over zsmalloc or zbud considering both performance and memory
usage. In a kernel build test on tmpfs in a limited cgroup, z3fold took
3% more time and used 1.8% more memory. The latency of zswap_load() was
7% higher, and that of zswap_store() was 10% higher. Zsmalloc is better
in all metrics.
Moreover, z3fold apparently has latent bugs, which was made noticeable by
a recent soft lockup bug report with z3fold [2]. Switching to zsmalloc
not only fixed the problem, but also reduced the swap usage from 6~8G to
1~2G. Other users have also reported being bitten by mistakenly enabling
z3fold.
Other than hurting users, z3fold is repeatedly causing wasted engineering
effort. Apart from investigating the above bug, it came up in multiple
development discussions (e.g. [3]) as something we need to handle, when
there aren't any legit users (at least not intentionally).
The natural course of action is to deprecate z3fold, and remove in a few
cycles if no objections are raised from active users. Next on the list
should be zbud, as it offers marginal latency gains at the cost of huge
memory waste when compared to zsmalloc. That one will need to wait until
zsmalloc does not depend on MMU.
Rename the user-visible config option from CONFIG_Z3FOLD to
CONFIG_Z3FOLD_DEPRECATED so that users with CONFIG_Z3FOLD=y get a new
prompt with explanation during make oldconfig. Also, remove
CONFIG_Z3FOLD=y from defconfigs.
xol_add_vma() maps the uninitialized page allocated by __create_xol_area()
into userspace. On some architectures (x86) this memory is readable even
without VM_READ, VM_EXEC results in the same pgprot_t as VM_EXEC|VM_READ,
although this doesn't really matter, debugger can read this memory anyway.
A number of Arm Ltd CPUs suffer from errata whereby an MSR to the SSBS
special-purpose register does not affect subsequent speculative
instructions, permitting speculative store bypassing for a window of
time.
We worked around this for a number of CPUs in commits:
Since then, a (hopefully final) batch of updates have been published,
with two more affected CPUs. For the affected CPUs the existing
mitigation is sufficient, as described in their respective Software
Developer Errata Notice (SDEN) documents:
notify_hwp_interrupt() is called via sysvec_thermal() ->
smp_thermal_vector() -> intel_thermal_interrupt() in hard irq context.
For this reason it must not use a simple spin_lock that sleeps with
PREEMPT_RT enabled. So convert it to a raw spinlock.
Reported-by: xiao sheng wen <atzlinux@sina.com> Link: https://bugs.debian.org/1076483 Signed-off-by: Uwe Kleine-König <ukleinek@debian.org> Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Tested-by: xiao sheng wen <atzlinux@sina.com> Link: https://patch.msgid.link/20240919081121.10784-2-ukleinek@debian.org Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[ukleinek: Backport to v6.10.y] Signed-off-by: Uwe Kleine-König <ukleinek@debian.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
[Why]
Connected with a Thunderbolt monitor and do the suspend and the system
may hang while resume.
The TBT monitor HPD will be triggered during the resume procedure
and call the drm_client_modeset_probe() while
struct drm_connector connector->dev->master is NULL.
It will mess up the pipe topology after resume.
[How]
Skip the TBT monitor HPD during the resume procedure because we
currently will probe the connectors after resume by default.
Reviewed-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 453f86a26945207a16b8f66aaed5962dc2b95b85) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[WHY & HOW]
Mismatch in DCN35 DML2 cause bw validation failed to acquire unexpected DPP pipe to cause
grey screen and system hang. Remove EnhancedPrefetchScheduleAccelerationFinal value override
to match HW spec.
Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Yihan Zhu <Yihan.Zhu@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9dad21f910fcea2bdcff4af46159101d7f9cd8ba) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[WHY & HOW]
Some eDP panels suffer from flicking when HDR is enabled in KDE. This
quirk works around it by skipping VSC that is incompatible with eDP
panels.
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3151 Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 4d4257280d7957727998ef90ccc7b69c7cca8376) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Entities run queue can change during drm_sched_entity_push_job() so make
sure to update the score consistently.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Fixes: d41a39dda140 ("drm/scheduler: improve job distribution with multiple queues") Cc: Nirmoy Das <nirmoy.das@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Luben Tuikov <ltuikov89@gmail.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v5.9+ Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240924101914.2713-4-tursulin@igalia.com Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Since drm_sched_entity_modify_sched() can modify the entities run queue,
lets make sure to only dereference the pointer once so both adding and
waking up are guaranteed to be consistent.
Alternative of moving the spin_unlock to after the wake up would for now
be more problematic since the same lock is taken inside
drm_sched_rq_update_fifo().
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Fixes: b37aced31eb0 ("drm/scheduler: implement a function to modify sched list") Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Luben Tuikov <ltuikov89@gmail.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Philipp Stanner <pstanner@redhat.com> Cc: dri-devel@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v5.7+ Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240924101914.2713-3-tursulin@igalia.com Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Without the locking amdgpu currently can race between
amdgpu_ctx_set_entity_priority() (via drm_sched_entity_modify_sched()) and
drm_sched_job_arm(), leading to the latter accesing potentially
inconsitent entity->sched_list and entity->num_sched_list pair.
v2:
* Improve commit message. (Philipp)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Fixes: b37aced31eb0 ("drm/scheduler: implement a function to modify sched list") Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Luben Tuikov <ltuikov89@gmail.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: dri-devel@lists.freedesktop.org Cc: Philipp Stanner <pstanner@redhat.com> Cc: <stable@vger.kernel.org> # v5.7+ Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240913160559.49054-2-tursulin@igalia.com Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Fixes a race condition reported here: https://github.com/AsahiLinux/linux/issues/309#issuecomment-2238968609
The whole premise of lockless access to a single-producer-single-
consumer queue is that there is just a single producer and single
consumer. That means we can't call drm_sched_can_queue() (which is
about queueing more work to the hw, not to the spsc queue) from
anywhere other than the consumer (wq).
This call in the producer is just an optimization to avoid scheduling
the consuming worker if it cannot yet queue more work to the hw. It
is safe to drop this optimization to avoid the race condition.
If deferred operations are pending, we want to wait for those to
land before declaring the queue blocked on a SYNC_WAIT. We need
this to deal with the case where the sync object is signalled through
a deferred SYNC_{ADD,SET} from the same queue. If we don't do that
and the group gets scheduled out before the deferred SYNC_{SET,ADD}
is executed, we'll end up with a timeout, because no external
SYNC_{SET,ADD} will make the scheduler reconsider the group for
execution.
Fixes: de8548813824 ("drm/panthor: Add the scheduler logical block") Cc: <stable@vger.kernel.org> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240905071914.3278599-1-boris.brezillon@collabora.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The group variable can't be used to retrieve ptdev in our second loop,
because it points to the previously iterated list_head, not a valid
group. Get the ptdev object from the scheduler instead.
The only user (the mesa gallium driver) is already assuming explicit
synchronization and doing the export/import dance on shared BOs. The
only reason we were registering ourselves as writers on external BOs
is because Xe, which was the reference back when we developed Panthor,
was doing so. Turns out Xe was wrong, and we really want bookkeep on
all registered fences, so userspace can explicitly upgrade those to
read/write when needed.
Fixes: 4bdca1150792 ("drm/panthor: Add the driver frontend block") Cc: Matthew Brost <matthew.brost@intel.com> Cc: Simona Vetter <simona.vetter@ffwll.ch> Cc: <stable@vger.kernel.org> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240905070155.3254011-1-boris.brezillon@collabora.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CONFIG_DRM_I915_USERFAULT_AUTOSUSPEND is an int, defaulting to 250. When
the wakeref is non-zero, it's either -1 or a dynamically allocated
pointer, depending on CONFIG_DRM_I915_DEBUG_RUNTIME_PM. It's likely that
the code works by coincidence with the bitwise AND, but with
CONFIG_DRM_I915_DEBUG_RUNTIME_PM=y, there's the off chance that the
condition evaluates to false, and intel_wakeref_auto() doesn't get
called. Switch to the intended logical AND.
v2: Use != to avoid clang -Wconstant-logical-operand (Nathan)
Cloning a descriptor table picks the size that would cover all currently
opened files. That's fine for clone() and unshare(), but for close_range()
there's an additional twist - we clone before we close, and it would be
a shame to have
close_range(3, ~0U, CLOSE_RANGE_UNSHARE)
leave us with a huge descriptor table when we are not going to keep
anything past stderr, just because some large file descriptor used to
be open before our call has taken it out.
Unfortunately, it had been dealt with in an inherently racy way -
sane_fdtable_size() gets a "don't copy anything past that" argument
(passed via unshare_fd() and dup_fd()), close_range() decides how much
should be trimmed and passes that to unshare_fd().
The problem is, a range that used to extend to the end of descriptor
table back when close_range() had looked at it might very well have stuff
grown after it by the time dup_fd() has allocated a new files_struct
and started to figure out the capacity of fdtable to be attached to that.
That leads to interesting pathological cases; at the very least it's a
QoI issue, since unshare(CLONE_FILES) is atomic in a sense that it takes
a snapshot of descriptor table one might have observed at some point.
Since CLOSE_RANGE_UNSHARE close_range() is supposed to be a combination
of unshare(CLONE_FILES) with plain close_range(), ending up with a
weird state that would never occur with unshare(2) is confusing, to put
it mildly.
It's not hard to get rid of - all it takes is passing both ends of the
range down to sane_fdtable_size(). There we are under ->files_lock,
so the race is trivially avoided.
So we do the following:
* switch close_files() from calling unshare_fd() to calling
dup_fd().
* undo the calling convention change done to unshare_fd() in 60997c3d45d9 "close_range: add CLOSE_RANGE_UNSHARE"
* introduce struct fd_range, pass a pointer to that to dup_fd()
and sane_fdtable_size() instead of "trim everything past that point"
they are currently getting. NULL means "we are not going to be punching
any holes"; NR_OPEN_MAX is gone.
* make sane_fdtable_size() use find_last_bit() instead of
open-coding it; it's easier to follow that way.
* while we are at it, have dup_fd() report errors by returning
ERR_PTR(), no need to use a separate int *errorp argument.
The sysfb framebuffer handling only operates on graphics devices
that provide the system's firmware framebuffer. If that device is
not known, assume that any graphics device has been initialized by
firmware.
Fixes a problem on i915 where sysfb does not release the firmware
framebuffer after the native graphics driver loaded.
Reported-by: Borah, Chaitanya Kumar <chaitanya.kumar.borah@intel.com> Closes: https://lore.kernel.org/dri-devel/SJ1PR11MB6129EFB8CE63D1EF6D932F94B96F2@SJ1PR11MB6129.namprd11.prod.outlook.com/ Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12160 Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: b49420d6a1ae ("video/aperture: optionally match the device in sysfb_disable()") Cc: Javier Martinez Canillas <javierm@redhat.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Helge Deller <deller@gmx.de> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: dri-devel@lists.freedesktop.org Cc: Linux regression tracking (Thorsten Leemhuis) <regressions@leemhuis.info> Cc: <stable@vger.kernel.org> # v6.11+ Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240924084227.262271-1-tzimmermann@suse.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The help text in osnoise top and timerlat top had some minor errors
and omissions. The -d option was missing the 's' (second) abbreviation and
the error message for '-d' used '-D'.
Cc: stable@vger.kernel.org Fixes: 1eceb2fc2ca54 ("rtla/osnoise: Add osnoise top mode") Fixes: a828cd18bc4ad ("rtla: Add timerlat tool and timelart top mode") Link: https://lore.kernel.org/20240813155831.384446-1-ezulian@redhat.com Suggested-by: Tomas Glozar <tglozar@redhat.com> Reviewed-by: Tomas Glozar <tglozar@redhat.com> Signed-off-by: Eder Zulian <ezulian@redhat.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
osnoise_hotplug_workfn() is the asynchronous online callback for
"trace/osnoise:online". It may be congested when a CPU goes online and
offline repeatedly and is invoked for multiple times after a certain
online.
This will lead to kthread leak and timer corruption. Add a check
in start_kthread() to prevent this situation.
After tracing the scheduling event, it was discovered that the migration
of the "timerlat/1" thread was performed during thread creation. Further
analysis confirmed that it is because the CPU online processing for
osnoise is implemented through workers, which is asynchronous with the
offline processing. When the worker was scheduled to create a thread, the
CPU may has already been removed from the cpu_online_mask during the offline
process, resulting in the inability to select the right CPU:
stop_kthread() is the offline callback for "trace/osnoise:online", since
commit 5bfbcd1ee57b ("tracing/timerlat: Add interface_lock around clearing
of kthread in stop_kthread()"), the following ABBA deadlock scenario is
introduced:
As the interface_lock here in just for protecting the "kthread" field of
the osn_var, use xchg() instead to fix this issue. Also use
for_each_online_cpu() back in stop_per_cpu_kthreads() as it can take
cpu_read_lock() again.
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20240924094515.3561410-3-liwei391@huawei.com Fixes: 5bfbcd1ee57b ("tracing/timerlat: Add interface_lock around clearing of kthread in stop_kthread()") Signed-off-by: Wei Li <liwei391@huawei.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
<7>[ 5413.970692] ceph: get_cap_refs 00000000958c114b ret 1 got Fr
<7>[ 5413.970695] ceph: start_read 00000000958c114b, no cache cap
...
<7>[ 5473.934609] ceph: my wanted = Fr, used = Fr, dirty -
<7>[ 5473.934616] ceph: revocation: pAsLsXsFr -> pAsLsXs (revoking Fr)
<7>[ 5473.934632] ceph: __ceph_caps_issued 00000000958c114b cap 00000000f7784259 issued pAsLsXs
<7>[ 5473.934638] ceph: check_caps 10000000e68.fffffffffffffffe file_want - used Fr dirty - flushing - issued pAsLsXs revoking Fr retain pAsLsXsFsr AUTHONLY NOINVAL FLUSH_FORCE
The MDS subsequently complains that the kernel client is late releasing
caps.
Approximately, a series of changes to this code by commits 49870056005c
("ceph: convert ceph_readpages to ceph_readahead"), 2de160417315
("netfs: Change ->init_request() to return an error code") and a5c9dc445139 ("ceph: Make ceph_init_request() check caps on readahead")
resulted in subtle resource cleanup to be missed. The main culprit is
the change in error handling in 2de160417315 which meant that a failure
in init_request() would no longer cause cleanup to be called. That
would prevent the ceph_put_cap_refs() call which would cleanup the
leaked cap ref.
Cc: stable@vger.kernel.org Fixes: a5c9dc445139 ("ceph: Make ceph_init_request() check caps on readahead") Link: https://tracker.ceph.com/issues/67008 Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If the recv returns zero, or an error, then it doesn't matter if more
data has already been received for this buffer. A condition like that
should terminate the multishot receive. Rather than pass in the
collected return value, pass in whether to terminate or keep the recv
going separately.
Note that this isn't a bug right now, as the only way to get there is
via setting MSG_WAITALL with multishot receive. And if an application
does that, then -EINVAL is returned anyway. But it seems like an easy
bug to introduce, so let's make it a bit more explicit.
In the `mac802154_scan_worker` function, the `scan_req->type` field was
accessed after the RCU read-side critical section was unlocked. According
to RCU usage rules, this is illegal and can lead to unpredictable
behavior, such as accessing memory that has been updated or causing
use-after-free issues.
This possible bug was identified using a static analysis tool developed
by myself, specifically designed to detect RCU-related issues.
To address this, the `scan_req->type` value is now stored in a local
variable `scan_req_type` while still within the RCU read-side critical
section. The `scan_req_type` is then used after the RCU lock is released,
ensuring that the type value is safely accessed without violating RCU
rules.
Fixes: e2c3e6f53a7a ("mac802154: Handle active scanning") Cc: stable@vger.kernel.org Signed-off-by: Jiawei Ye <jiawei.ye@foxmail.com> Acked-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://lore.kernel.org/tencent_3B2F4F2B4DA30FAE2F51A9634A16B3AD4908@qq.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
This aligned BR/EDR JUST_WORKS method with LE which since 92516cd97fd4
("Bluetooth: Always request for user confirmation for Just Works")
always request user confirmation with confirm_hint set since the
likes of bluetoothd have dedicated policy around JUST_WORKS method
(e.g. main.conf:JustWorksRepairing).
CVE: CVE-2024-8805 Cc: stable@vger.kernel.org Fixes: ba15a58b179e ("Bluetooth: Fix SSP acceptor just-works confirmation without MITM") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Tested-by: Kiran K <kiran.k@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The value is read from the register TXGBE_RX_GEN_CTL3, and it should be
written back to TXGBE_RX_GEN_CTL3 when it changes some fields.
Cc: stable@vger.kernel.org Fixes: f629acc6f210 ("net: pcs: xpcs: support to switch mode for Wangxun NICs") Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reported-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20240924022857.865422-1-jiawenwu@trustnetic.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On a few platforms such as TI's AM69 device, disable_irq() fails to keep
track of the interrupts that happen between disable_irq() and
enable_irq() and those interrupts are missed. Use the ->irq_unmask() and
->irq_mask() methods instead of ->irq_enable() and ->irq_disable() to
correctly keep track of edges when disable_irq is called.
This solves the issue of disable_irq() not working as expected on such
platforms.
In the parse_perf_domain function, if the call to
of_parse_phandle_with_args returns an error, then the reference to the
CPU device node that was acquired at the start of the function would not
be properly decremented.
Address this by declaring the variable with the __free(device_node)
cleanup attribute.
Signed-off-by: Miquel Sabaté Solà <mikisabate@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://patch.msgid.link/20240917134246.584026-1-mikisabate@gmail.com Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
During unmount, at close_ctree(), we have the following steps in this order:
1) Park the cleaner kthread - this doesn't destroy the kthread, it basically
halts its execution (wake ups against it work but do nothing);
2) We stop the cleaner kthread - this results in freeing the respective
struct task_struct;
3) We call btrfs_stop_all_workers() which waits for any jobs running in all
the work queues and then free the work queues.
Syzbot reported a case where a fixup worker resulted in a crash when doing
a delayed iput on its inode while attempting to wake up the cleaner at
btrfs_add_delayed_iput(), because the task_struct of the cleaner kthread
was already freed. This can happen during unmount because we don't wait
for any fixup workers still running before we call kthread_stop() against
the cleaner kthread, which stops and free all its resources.
Fix this by waiting for any fixup workers at close_ctree() before we call
kthread_stop() against the cleaner and run pending delayed iputs.
The stack traces reported by syzbot were the following:
BUG: KASAN: slab-use-after-free in __lock_acquire+0x77/0x2050 kernel/locking/lockdep.c:5065
Read of size 8 at addr ffff8880272a8a18 by task kworker/u8:3/52
Last potentially related work creation:
kasan_save_stack+0x3f/0x60 mm/kasan/common.c:47
__kasan_record_aux_stack+0xac/0xc0 mm/kasan/generic.c:541
__call_rcu_common kernel/rcu/tree.c:3086 [inline]
call_rcu+0x167/0xa70 kernel/rcu/tree.c:3190
context_switch kernel/sched/core.c:5318 [inline]
__schedule+0x184b/0x4ae0 kernel/sched/core.c:6675
schedule_idle+0x56/0x90 kernel/sched/core.c:6793
do_idle+0x56a/0x5d0 kernel/sched/idle.c:354
cpu_startup_entry+0x42/0x60 kernel/sched/idle.c:424
start_secondary+0x102/0x110 arch/x86/kernel/smpboot.c:314
common_startup_64+0x13e/0x147
The buggy address belongs to the object at ffff8880272a8000
which belongs to the cache task_struct of size 7424
The buggy address is located 2584 bytes inside of
freed 7424-byte region [ffff8880272a8000, ffff8880272a9d00)
During an incremental send we may end up sending an invalid clone
operation, for the last extent of a file which ends at an unaligned offset
that matches the final i_size of the file in the send snapshot, in case
the file had its initial size (the size in the parent snapshot) decreased
in the send snapshot. In this case the destination will fail to apply the
clone operation because its end offset is not sector size aligned and it
ends before the current size of the file.
Sending the truncate operation always happens when we finish processing an
inode, after we process all its extents (and xattrs, names, etc). So fix
this by ensuring the file has a valid size before we send a clone
operation for an unaligned extent that ends at the final i_size of the
file. The size we truncate to matches the start offset of the clone range
but it could be any value between that start offset and the final size of
the file since the clone operation will expand the i_size if the current
size is smaller than the end offset. The start offset of the range was
chosen because it's always sector size aligned and avoids a truncation
into the middle of a page, which results in dirtying the page due to
filling part of it with zeroes and then making the clone operation at the
receiver trigger IO.
The following test reproduces the issue:
$ cat test.sh
#!/bin/bash
DEV=/dev/sdi
MNT=/mnt/sdi
mkfs.btrfs -f $DEV
mount $DEV $MNT
# Create a file with a size of 256K + 5 bytes, having two extents, one
# with a size of 128K and another one with a size of 128K + 5 bytes.
last_ext_size=$((128 * 1024 + 5))
xfs_io -f -d -c "pwrite -S 0xab -b 128K 0 128K" \
-c "pwrite -S 0xcd -b $last_ext_size 128K $last_ext_size" \
$MNT/foo
# Another file which we will later clone foo into, but initially with
# a larger size than foo.
xfs_io -f -c "pwrite -S 0xef 0 1M" $MNT/bar
btrfs subvolume snapshot -r $MNT/ $MNT/snap1
# Now resize bar and clone foo into it.
xfs_io -c "truncate 0" \
-c "reflink $MNT/foo" $MNT/bar
Since the inception of relocation we have maintained the backref cache
across transaction commits, updating the backref cache with the new
bytenr whenever we COWed blocks that were in the cache, and then
updating their bytenr once we detected a transaction id change.
This works as long as we're only ever modifying blocks, not changing the
structure of the tree.
However relocation does in fact change the structure of the tree. For
example, if we are relocating a data extent, we will look up all the
leaves that point to this data extent. We will then call
do_relocation() on each of these leaves, which will COW down to the leaf
and then update the file extent location.
But, a key feature of do_relocation() is the pending list. This is all
the pending nodes that we modified when we updated the file extent item.
We will then process all of these blocks via finish_pending_nodes, which
calls do_relocation() on all of the nodes that led up to that leaf.
The purpose of this is to make sure we don't break sharing unless we
absolutely have to. Consider the case that we have 3 snapshots that all
point to this leaf through the same nodes, the initial COW would have
created a whole new path. If we did this for all 3 snapshots we would
end up with 3x the number of nodes we had originally. To avoid this we
will cycle through each of the snapshots that point to each of these
nodes and update their pointers to point at the new nodes.
Once we update the pointer to the new node we will drop the node we
removed the link for and all of its children via btrfs_drop_subtree().
This is essentially just btrfs_drop_snapshot(), but for an arbitrary
point in the snapshot.
The problem with this is that we will never reflect this in the backref
cache. If we do this btrfs_drop_snapshot() for a node that is in the
backref tree, we will leave the node in the backref tree. This becomes
a problem when we change the transid, as now the backref cache has
entire subtrees that no longer exist, but exist as if they still are
pointed to by the same roots.
In the best case scenario you end up with "adding refs to an existing
tree ref" errors from insert_inline_extent_backref(), where we attempt
to link in nodes on roots that are no longer valid.
Worst case you will double free some random block and re-use it when
there's still references to the block.
This is extremely subtle, and the consequences are quite bad. There
isn't a way to make sure our backref cache is consistent between
transid's.
In order to fix this we need to simply evict the entire backref cache
anytime we cross transid's. This reduces performance in that we have to
rebuild this backref cache every time we change transid's, but fixes the
bug.
This has existed since relocation was added, and is a pretty critical
bug. There's a lot more cleanup that can be done now that this
functionality is going away, but this patch is as small as possible in
order to fix the problem and make it easy for us to backport it to all
the kernels it needs to be backported to.
Followup series will dismantle more of this code and simplify relocation
drastically to remove this functionality.
We have a reproducer that reproduced the corruption within a few minutes
of running. With this patch it survives several iterations/hours of
running the reproducer.
Fixes: 3fd0a5585eb9 ("Btrfs: Metadata ENOSPC handling for balance") CC: stable@vger.kernel.org Reviewed-by: Boris Burkov <boris@bur.io> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[CAUSE]
The allocation failure happens at the start_transaction() inside
prepare_to_relocate(), and during the error handling we call
unset_reloc_control(), which makes fs_info->balance_ctl to be NULL.
Then we continue the error path cleanup in btrfs_balance() by calling
reset_balance_state() which will call del_balance_item() to fully delete
the balance item in the root tree.
However during the small window between set_reloc_contrl() and
unset_reloc_control(), we can have a subvolume tree update and created a
reloc_root for that subvolume.
Then we go into the final btrfs_commit_transaction() of
del_balance_item(), and into btrfs_update_reloc_root() inside
commit_fs_roots().
That function checks if fs_info->reloc_ctl is in the merge_reloc_tree
stage, but since fs_info->reloc_ctl is NULL, it results a NULL pointer
dereference.
[FIX]
Just add extra check on fs_info->reloc_ctl inside
btrfs_update_reloc_root(), before checking
fs_info->reloc_ctl->merge_reloc_tree.
That DEAD_RELOC_TREE handling is to prevent further modification to the
reloc tree during merge stage, but since there is no reloc_ctl at all,
we do not need to bother that.
Reported-by: syzbot+283673dbc38527ef9f3d@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/66f6bfa7.050a0220.38ace9.0019.GAE@google.com/ CC: stable@vger.kernel.org # 4.19+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Like other Asus ExpertBook models the B2502CVA has its keybopard IRQ (1)
described as ActiveLow in the DSDT, which the kernel overrides to EdgeHigh
which breaks the keyboard.
Add the B2502CVA to the irq1_level_low_skip_override[] quirk table to fix
this.
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217760 Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patch.msgid.link/20240927141606.66826-4-hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Like other Asus Vivobook models the X1704VAP has its keybopard IRQ (1)
described as ActiveLow in the DSDT, which the kernel overrides to EdgeHigh
which breaks the keyboard.
Add the X1704VAP to the irq1_level_low_skip_override[] quirk table to fix
this.
Reported-by: Lamome Julien <julien.lamome@wanadoo.fr> Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1078696 Closes: https://lore.kernel.org/all/1226760b-4699-4529-bf57-6423938157a3@wanadoo.fr/ Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patch.msgid.link/20240927141606.66826-3-hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There already is an entry in the irq1_level_low_skip_override[] DMI match
table for the "E1404GAB", change this to match on "E1404GA" to cover
the E1404GA model as well (DMI_MATCH() does a substring match).
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219224 Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patch.msgid.link/20240927141606.66826-2-hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit d2aaf1996504 ("ACPI: resource: Add DMI quirks for ASUS Vivobook
E1504GA and E1504GAB") does exactly what the subject says, adding DMI
matches for both the E1504GA and E1504GAB.
But DMI_MATCH() does a substring match, so checking for E1504GA will also
match E1504GAB.
Drop the unnecessary E1504GAB entry since that is covered already by
the E1504GA entry.
Fixes: d2aaf1996504 ("ACPI: resource: Add DMI quirks for ASUS Vivobook E1504GA and E1504GAB") Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patch.msgid.link/20240927141606.66826-1-hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dell All In One (AIO) models released after 2017 may use a backlight
controller board connected to an UART.
In DSDT this uart port will be defined as:
Name (_HID, "DELL0501")
Name (_CID, EisaId ("PNP0501")
The Dell OptiPlex 5480 AIO has an ACPI device for one of its UARTs with
the above _HID + _CID. Loading the dell-uart-backlight driver fails with
the following errors:
[ 18.261353] dell_uart_backlight serial0-0: Timed out waiting for response.
[ 18.261356] dell_uart_backlight serial0-0: error -ETIMEDOUT: getting firmware version
[ 18.261359] dell_uart_backlight serial0-0: probe with driver dell_uart_backlight failed with error -110
Indicating that there is no backlight controller board attached to
the UART, while the GPU's native backlight control method does work.
Add a quirk to use the GPU's native backlight control method on this model.
Fixes: cd8e468efb4f ("ACPI: video: Add Dell UART backlight controller detection") Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://patch.msgid.link/20240918153849.37221-1-hdegoede@redhat.com
[ rjw: Changelog edit ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A dentry leak may be caused when a lookup cookie and a cull are concurrent:
P1 | P2
-----------------------------------------------------------
cachefiles_lookup_cookie
cachefiles_look_up_object
lookup_one_positive_unlocked
// get dentry
cachefiles_cull
inode->i_flags |= S_KERNEL_FILE;
cachefiles_open_file
cachefiles_mark_inode_in_use
__cachefiles_mark_inode_in_use
can_use = false
if (!(inode->i_flags & S_KERNEL_FILE))
can_use = true
return false
return false
// Returns an error but doesn't put dentry
After that the following WARNING will be triggered when the backend folder
is umounted:
==================================================================
BUG: Dentry 000000008ad87947{i=7a,n=Dx_1_1.img} still in use (1) [unmount of ext4 sda]
WARNING: CPU: 4 PID: 359261 at fs/dcache.c:1767 umount_check+0x5d/0x70
CPU: 4 PID: 359261 Comm: umount Not tainted 6.6.0-dirty #25
RIP: 0010:umount_check+0x5d/0x70
Call Trace:
<TASK>
d_walk+0xda/0x2b0
do_one_tree+0x20/0x40
shrink_dcache_for_umount+0x2c/0x90
generic_shutdown_super+0x20/0x160
kill_block_super+0x1a/0x40
ext4_kill_sb+0x22/0x40
deactivate_locked_super+0x35/0x80
cleanup_mnt+0x104/0x160
==================================================================
Whether cachefiles_open_file() returns true or false, the reference count
obtained by lookup_positive_unlocked() in cachefiles_look_up_object()
should be released.
Therefore release that reference count in cachefiles_look_up_object() to
fix the above issue and simplify the code.
Fixes: 1f08c925e7a3 ("cachefiles: Implement backing file wrangling") Cc: stable@kernel.org Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240829083409.3788142-1-libaokun@huaweicloud.com Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The adp5589 seems to have the same behavior as similar devices as
explained in commit 910a9f5636f5 ("Input: adp5588-keys - get value from
data out when dir is out").
Basically, when the gpio is set as output we need to get the value from
ADP5589_GPO_DATA_OUT_A register instead of ADP5589_GPI_STATUS_A.
Fixes: 9d2e173644bb ("Input: ADP5589 - new driver for I2C Keypad Decoder and I/O Expander") Signed-off-by: Nuno Sa <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20241001-b4-dev-adp5589-fw-conversion-v1-2-fca0149dfc47@analog.com Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
We register a devm action to call adp5589_clear_config() and then pass
the i2c client as argument so that we can call i2c_get_clientdata() in
order to get our device object. However, i2c_set_clientdata() is only
being set at the end of the probe function which means that we'll get a
NULL pointer dereference in case the probe function fails early.
Fixes: 30df385e35a4 ("Input: adp5589-keys - use devm_add_action_or_reset() for register clear") Signed-off-by: Nuno Sa <nuno.sa@analog.com> Link: https://lore.kernel.org/r/20241001-b4-dev-adp5589-fw-conversion-v1-1-fca0149dfc47@analog.com Cc: stable@vger.kernel.org Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The commit b8c43360f6e4 ("net: stmmac: No need to calculate speed divider
when offload is disabled") allows the "port_transmit_rate_kbps" to be
set to a value of 0, which is then passed to the "div_s64" function when
tc-cbs is disabled. This leads to a zero-division error.
When tc-cbs is disabled, the idleslope, sendslope, and credit values the
credit values are not required to be configured. Therefore, adding a return
statement after setting the txQ mode to DCB when tc-cbs is disabled would
prevent a zero-division error.
Fixes: b8c43360f6e4 ("net: stmmac: No need to calculate speed divider when offload is disabled") Cc: <stable@vger.kernel.org> Co-developed-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Signed-off-by: Choong Yong Liang <yong.liang.choong@linux.intel.com> Signed-off-by: KhaiWenTan <khai.wen.tan@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240918061422.1589662-1-khai.wen.tan@linux.intel.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alfred Agrell found that TOMOYO cannot handle execveat(AT_EMPTY_PATH)
inside chroot environment where /dev and /proc are not mounted, for
commit 51f39a1f0cea ("syscalls: implement execveat() system call") missed
that TOMOYO tries to canonicalize argv[0] when the filename fed to the
executed program as argv[0] is supplied using potentially nonexistent
pathname.
Since "/dev/fd/<fd>" already lost symlink information used for obtaining
that <fd>, it is too late to reconstruct symlink's pathname. Although
<filename> part of "/dev/fd/<fd>/<filename>" might not be canonicalized,
TOMOYO cannot use tomoyo_realpath_nofollow() when /dev or /proc is not
mounted. Therefore, fallback to tomoyo_realpath_from_path() when
tomoyo_realpath_nofollow() failed.
Detect gso fraglist skbs with corrupted geometry (see below) and
pass these to skb_segment instead of skb_segment_list, as the first
can segment them correctly.
Valid SKB_GSO_FRAGLIST skbs
- consist of two or more segments
- the head_skb holds the protocol headers plus first gso_size
- one or more frag_list skbs hold exactly one segment
- all but the last must be gso_size
Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
modify these skbs, breaking these invariants.
In extreme cases they pull all data into skb linear. For UDP, this
causes a NULL ptr deref in __udpv4_gso_segment_list_csum at
udp_hdr(seg->next)->dest.
Detect invalid geometry due to pull, by checking head_skb size.
Don't just drop, as this may blackhole a destination. Convert to be
able to pass to regular skb_segment.
Detect tcp gso fraglist skbs with corrupted geometry (see below) and
pass these to skb_segment instead of skb_segment_list, as the first
can segment them correctly.
Valid SKB_GSO_FRAGLIST skbs
- consist of two or more segments
- the head_skb holds the protocol headers plus first gso_size
- one or more frag_list skbs hold exactly one segment
- all but the last must be gso_size
Optional datapath hooks such as NAT and BPF (bpf_skb_pull_data) can
modify these skbs, breaking these invariants.
In extreme cases they pull all data into skb linear. For TCP, this
causes a NULL ptr deref in __tcpv4_gso_segment_list_csum at
tcp_hdr(seg->next).
Detect invalid geometry due to pull, by checking head_skb size.
Don't just drop, as this may blackhole a destination. Convert to be
able to pass to regular skb_segment.
Approach and description based on a patch by Willem de Bruijn.
Move ST2 reading with overflow handling after measurement data
reading.
ST2 register read have to be read after read measurment data,
because it means end of the reading and realease the lock on the data.
Remove ST2 read skip on interrupt based waiting because ST2 required to
be read out at and of the axis read.
Commands like "chmod 0444" mark a file readonly via the attribute flag
(when mapping of mode bits into the ACL are not set, or POSIX extensions
are not negotiated), but they were not reported correctly for stat of
directories (they were reported ok for files and for "ls"). See example
below:
Due to server permission control, the client does not have access to
the shared root directory, but can access subdirectories normally, so
users usually mount the shared subdirectories directly. In this case,
queryfs should use the actual path instead of the root directory to
avoid the call returning an error (EACCES).
Signed-off-by: wangrong <wangrong@uniontech.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In LUCID EVO PLL CAL_L_VAL and L_VAL bitfields are part of single
PLL_L_VAL register. Update for L_VAL bitfield values in PLL_L_VAL
register using regmap_write() API in __alpha_pll_trion_set_rate
callback will override LUCID EVO PLL initial configuration related
to PLL_CAL_L_VAL bit fields in PLL_L_VAL register.
Observed random PLL lock failures during PLL enable due to such
override in PLL calibration value. Use regmap_update_bits() with
L_VAL bitfield mask instead of regmap_write() API to update only
PLL_L_VAL bitfields in __alpha_pll_trion_set_rate callback.
pm_runtime_enable() should happen prior to vfe_get() since vfe_get() calls
pm_runtime_resume_and_get().
This is a basic race condition that doesn't show up for most users so is
not widely reported. If you blacklist qcom-camss in modules.d and then
subsequently modprobe the module post-boot it is possible to reliably show
this error up.
The kernel log for this error looks like this:
qcom-camss ac5a000.camss: Failed to power up pipeline: -13
Fixes: 02afa816dbbf ("media: camss: Add basic runtime PM support") Reported-by: Johan Hovold <johan+linaro@kernel.org> Closes: https://lore.kernel.org/lkml/ZoVNHOTI0PKMNt4_@hovoldconsulting.com/ Tested-by: Johan Hovold <johan+linaro@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Reviewed-by: Konrad Dybcio <konradybcio@kernel.org> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The use_count check was introduced so that multiple concurrent Raw Data
Interfaces RDIs could be driven by different virtual channels VCs on the
CSIPHY input driving the video pipeline.
This is an invalid use of use_count though as use_count pertains to the
number of times a video entity has been opened by user-space not the number
of active streams.
If use_count and stream-on count don't agree then stop_streaming() will
break as is currently the case and has become apparent when using CAMSS
with libcamera's released softisp 0.3.
The use of use_count like this is a bit hacky and right now breaks regular
usage of CAMSS for a single stream case. Stopping qcam results in the splat
below, and then it cannot be started again and any attempts to do so fails
with -EBUSY.
[ 1265.509831] WARNING: CPU: 5 PID: 919 at drivers/media/common/videobuf2/videobuf2-core.c:2183 __vb2_queue_cancel+0x230/0x2c8 [videobuf2_common]
...
[ 1265.510630] Call trace:
[ 1265.510636] __vb2_queue_cancel+0x230/0x2c8 [videobuf2_common]
[ 1265.510648] vb2_core_streamoff+0x24/0xcc [videobuf2_common]
[ 1265.510660] vb2_ioctl_streamoff+0x5c/0xa8 [videobuf2_v4l2]
[ 1265.510673] v4l_streamoff+0x24/0x30 [videodev]
[ 1265.510707] __video_do_ioctl+0x190/0x3f4 [videodev]
[ 1265.510732] video_usercopy+0x304/0x8c4 [videodev]
[ 1265.510757] video_ioctl2+0x18/0x34 [videodev]
[ 1265.510782] v4l2_ioctl+0x40/0x60 [videodev]
...
[ 1265.510944] videobuf2_common: driver bug: stop_streaming operation is leaving buffer 0 in active state
[ 1265.511175] videobuf2_common: driver bug: stop_streaming operation is leaving buffer 1 in active state
[ 1265.511398] videobuf2_common: driver bug: stop_streaming operation is leaving buffer 2 in active st
One CAMSS specific way to handle multiple VCs on the same RDI might be:
- Reference count each pipeline enable for CSIPHY, CSID, VFE and RDIx.
- The video buffers are already associated with msm_vfeN_rdiX so
release video buffers when told to do so by stop_streaming.
- Only release the power-domains for the CSIPHY, CSID and VFE when
their internal refcounts drop.
Either way refusing to release video buffers based on use_count is
erroneous and should be reverted. The silicon enabling code for selecting
VCs is perfectly fine. Its a "known missing feature" that concurrent VCs
won't work with CAMSS right now.
Initial testing with this code didn't show an error but, SoftISP and "real"
usage with Google Hangouts breaks the upstream code pretty quickly, we need
to do a partial revert and take another pass at VCs.
This commit partially reverts commit 89013969e232 ("media: camss: sm8250:
Pipeline starting and stopping for multiple virtual channels")
Fixes: 89013969e232 ("media: camss: sm8250: Pipeline starting and stopping for multiple virtual channels") Reported-by: Johan Hovold <johan+linaro@kernel.org> Closes: https://lore.kernel.org/lkml/ZoVNHOTI0PKMNt4_@hovoldconsulting.com/ Tested-by: Johan Hovold <johan+linaro@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org> Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
With PWRSTS_OFF_ON, PCIe GDSCs are turned off during gdsc_disable(). This
can happen during scenarios such as system suspend and breaks the resume
of PCIe controllers from suspend.
So use PWRSTS_RET_ON to indicate the GDSC driver to not turn off the GDSCs
during gdsc_disable() and allow the hardware to transition the GDSCs to
retention when the parent domain enters low power state during system
suspend.
in venus_probe, core->work is bound with venus_sys_error_handler, which is
used to handle error. The code use core->sys_err_done to make sync work.
The core->work is started in venus_event_notify.
If we call venus_remove, there might be an unfished work. The possible
sequence is as follows:
The branch clocks of gcc_cpuss_ahb_clk_src are marked critical
and hence these clocks vote on XO blocking the suspend.
De-register these clocks and its source as there is no rate
setting happening on them.
Valid frequencies may result in BCM votes that exceed the max HW value.
Set vote ceiling to BCM_TCS_CMD_VOTE_MASK to ensure the votes aren't
truncated, which can result in lower frequencies than desired.
The cec_msg_set_reply_to() helper function never zeroed the
struct cec_msg flags field, this can cause unexpected behavior
if flags was uninitialized to begin with.
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Fixes: 0dbacebede1e ("[media] cec: move the CEC framework out of staging and to media") Cc: <stable@vger.kernel.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
With PWRSTS_OFF_ON, PCIe GDSCs are turned off during gdsc_disable(). This
can happen during scenarios such as system suspend and breaks the resume
of PCIe controllers from suspend.
So use PWRSTS_RET_ON to indicate the GDSC driver to not turn off the GDSCs
during gdsc_disable() and allow the hardware to transition the GDSCs to
retention when the parent domain enters low power state during system
suspend.
The sun4i_csi driver doesn't implement link validation for the subdev it
registers, leaving the link between the subdev and its source
unvalidated. Fix it, using the v4l2_subdev_link_validate() helper.
When introducing the ability for drivers to indicate the minimum number
of buffers they require an application to allocate, commit 6662edcd32cc
("media: videobuf2: Add min_reqbufs_allocation field to vb2_queue
structure") also introduced a global minimum of 2 buffers. It turns out
this breaks the Renesas R-Car VSP test suite, where a test that
allocates a single buffer fails when two buffers are used.
One may consider debatable whether test suite failures without failures
in production use cases should be considered as a regression, but
operation with a single buffer is a valid use case. While full frame
rate can't be maintained, memory-to-memory devices can still be used
with a decent efficiency, and requiring applications to allocate
multiple buffers for single-shot use cases with capture devices would
just waste memory.
For those reasons, fix the regression by dropping the global minimum of
buffers. Individual drivers can still set their own minimum.
Fixes: 6662edcd32cc ("media: videobuf2: Add min_reqbufs_allocation field to vb2_queue structure") Cc: stable@vger.kernel.org Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Reviewed-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Acked-by: Tomasz Figa <tfiga@chromium.org> Link: https://lore.kernel.org/r/20240825232449.25905-1-laurent.pinchart+renesas@ideasonboard.com Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
By simply bailing out, the driver was violating its rule and internal
assumptions that either both or no rproc should be initialized. E.g.,
this could cause the first core to be available but not the second one,
leading to crashes on its shutdown later on while trying to dereference
that second instance.
Fixes: 61f6f68447ab ("remoteproc: k3-r5: Wait for core0 power-up before powering up core1") Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Beleswar Padhi <b-padhi@ti.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/9f481156-f220-4adf-b3d9-670871351e26@siemens.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>