Jes Sorensen [Wed, 17 Dec 2003 00:26:43 +0000 (16:26 -0800)]
[PATCH] qla1280 crash fix in error handling
This fixes a bug in the qla1280 driver where it would leave a pointer to
an on-stack completion event in a command structure if
qla1280_mailbox_command fails. The result is that the interrupt handler
later tries to complete() garbage on the stack. The mailbox command can
fail if a device on the bus decides to lock up etc.
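A minimal userspace sketch of the pattern described above, not the driver code itself. All names (`fake_completion`, `fake_cmd`, `fake_mailbox_command`, `issue_with_wait`) are hypothetical stand-ins; the only point is that the pointer to the stack object is cleared on the failure path before the frame goes away.

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these are the real qla1280 types. */
struct fake_completion { int done; };
struct fake_cmd { struct fake_completion *waiter; };

/* Pretend mailbox call: returns nonzero on failure. */
static int fake_mailbox_command(int should_fail)
{
    return should_fail ? -1 : 0;
}

/* Issue a command and wait on a completion that lives in this stack frame. */
static int issue_with_wait(struct fake_cmd *cmd, int should_fail)
{
    struct fake_completion wait = { 0 };
    int status;

    cmd->waiter = &wait;        /* an interrupt handler would complete this */
    status = fake_mailbox_command(should_fail);
    if (status) {
        cmd->waiter = NULL;     /* do not leave a pointer into a dead frame */
        return status;
    }

    wait.done = 1;              /* stands in for the handler's completion */
    cmd->waiter = NULL;         /* the frame is about to go away */
    return 0;
}
```

After a failed call the command structure no longer references the stack, so a late interrupt has nothing stale to complete.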
Jens Axboe [Wed, 17 Dec 2003 00:12:29 +0000 (16:12 -0800)]
[PATCH] CDROM_SEND_PACKET bug
I just found Yet Another Bug in scsi_ioctl - CDROM_SEND_PACKET puts a
kernel pointer in hdr->cmdp, where sg_io() expects to find user address.
This worked up until recently because of the memcpy bug, but now it
doesn't because we do the proper copy_from_user().
This fix undoes the user copy code from sg_io, and instead makes the
SG_IO ioctl copy it locally. This makes SG_IO and CDROM_SEND_PACKET
agree on the calling convention, and everybody is happy.
I've tested that both
cdrecord -dev=/dev/hdc -inq
and
cdrecord -dev=ATAPI:/dev/hdc -inq
work now. The former will use SG_IO, the latter CDROM_SEND_PACKET (and
incidentally would work in both 2.4 and 2.6, if it wasn't for
CDROM_SEND_PACKET sucking badly in 2.4).
Jens Axboe [Mon, 15 Dec 2003 23:51:55 +0000 (15:51 -0800)]
[PATCH] Fix IDE bus reset and DMA disable when reading blank DVD-R
From Jon Burgess:
There is a problem with blank DVD media using the ide-cd driver.
When we attempt to read the blank disk, the drive responds to the read
request by returning a "blank media" error. The kernel doesn't have
any special case handling for this sense value and retries the request
a couple of times, then gives up and does a bus reset and disables DMA
to the device.
Which obviously doesn't help the situation.
The sense key value of 8 isn't listed in ide-cd.h, but it is listed in
scsi.h as a "BLANK_CHECK" error.
This trivial patch treats this error condition as a reason to abort
the request. This behaviour is the same as what we do with a blank CD-R.
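The decision described above can be sketched roughly as follows. This is an illustration only, not the ide-cd code: the enum names and `decide_on_sense()` are made up here, and the "retry everything else" default is an assumption for the sake of the example. Only the sense key value 8 for a blank check comes from the text.

```c
/* Sense key 0x08 is the blank check value mentioned above. */
enum { SKETCH_SENSE_MEDIUM_ERROR = 0x03, SKETCH_SENSE_BLANK_CHECK = 0x08 };

enum req_action { REQ_RETRY, REQ_ABORT };

/* Decide what to do with a failed read based on the sense key. */
static enum req_action decide_on_sense(unsigned char sense_key)
{
    switch (sense_key & 0x0f) {
    case SKETCH_SENSE_BLANK_CHECK:
        /* Blank media: a retry cannot succeed, so give up immediately
         * instead of escalating to a bus reset. */
        return REQ_ABORT;
    default:
        /* Assumed default for this sketch: let the caller retry. */
        return REQ_RETRY;
    }
}
```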
It looks like the same fix might be desired for 2.4 as well, although it
is perhaps not so important since scsi-ide is normally used instead.
Neil Brown [Mon, 15 Dec 2003 08:35:56 +0000 (00:35 -0800)]
[PATCH] Fix possible bio corruption with RAID5
1/ Make sure raid5 doesn't try to handle multiple overlapping
requests at the same time as this would confuse things badly.
Currently it just BUGs if this is attempted.
2/ Fix a possible data-loss-on-write problem. If two or
more bio's that write to the same page are processed at the
same time, only the first was actually committed to storage.
3/ Fix a use-after-free bug. raid5 keeps the bio's it is given
in linked lists when more than one bio touches a single page.
In some cases the tail of this list can be freed, and
the current test for 'are we at the end' isn't reliable.
This patch strengthens the test to make it reliable.
Linus Torvalds [Sun, 14 Dec 2003 10:28:27 +0000 (02:28 -0800)]
Fix thread group leader zombie leak
Petr Vandrovec noticed a problem where the thread group leader
would not be properly reaped if the parent of the thread group
was ignoring SIGCHLD, and the thread group leader had exited
before the last sub-thread.
Linus Torvalds [Sat, 13 Dec 2003 13:36:30 +0000 (05:36 -0800)]
More subtle SMP bugs in prepare_to_wait()/finish_wait().
This time we have a SMP memory ordering issue in prepare_to_wait(),
where we really need to make sure that subsequent tests for the
event we are waiting for can not migrate up to before the wait
queue has been set up.
[PATCH] HPFS: missing lock_kernel() in hpfs_readdir()
In 2.5.x, the BKL was pushed from vfs_readdir() into the filesystem
specific functions. But only the unlock_kernel() made it into the HPFS
code, lock_kernel() got lost on the way. This rendered the filesystem
unusable.
This adds the missing lock_kernel(). It's been tested by Timo Maier who
also reported the problem earlier today.
Linus Torvalds [Fri, 12 Dec 2003 06:20:08 +0000 (22:20 -0800)]
Fix subtle bug in "finish_wait()", which can cause kernel stack
corruption on SMP because of another CPU still accessing a waitqueue
even after it was de-allocated.
Use a careful version of the list emptiness check to make sure we
don't de-allocate the stack frame before the waitqueue is all done.
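As a rough single-threaded illustration of what a "careful" emptiness check can look like: it tests both link pointers instead of only `next`. The types and function names below are simplified stand-ins, and the sketch says nothing about the memory ordering a real SMP implementation also needs.

```c
/* Simplified circular doubly linked list node. */
struct list_node { struct list_node *next, *prev; };

static void list_init(struct list_node *n)
{
    n->next = n;
    n->prev = n;
}

/* Plain check: looks at one pointer only. */
static int list_empty_plain(const struct list_node *n)
{
    return n->next == n;
}

/* Careful check: both pointers must already point back at the node,
 * so a removal that is only half written is not reported as done. */
static int list_empty_both(const struct list_node *n)
{
    const struct list_node *next = n->next;

    return next == n && next == n->prev;
}
```

If another CPU has rewritten `next` but not yet `prev`, the plain check already reports "empty" while the second store is still pending; the careful one does not.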
Herbert Xu [Tue, 9 Dec 2003 09:40:44 +0000 (01:40 -0800)]
[PATCH] USB: Fix connect/disconnect race
This patch was integrated by you in 2.4 six months ago. Unfortunately
it never got into 2.5. Without it you can end up with crashes such
as http://bugs.debian.org/218670
Tom Rini [Tue, 9 Dec 2003 01:42:34 +0000 (17:42 -0800)]
[PATCH] USB: mark the scanner driver as obsolete
On Mon, Dec 01, 2003 at 11:21:58AM -0800, Greg KH wrote:
> Can't you use xsane without the scanner kernel driver? I thought the
> latest versions used libusb/usbfs to talk directly to the hardware.
> Because of this, the USB scanner driver is marked to be removed from the
> kernel sometime in the near future.
After a bit of mucking around (and possibly finding a bug with Debian's
libusb/xsane/hotplug interaction: nothing seems to run
/etc/hotplug/usb/libusbscanner and thus only root can scan, anyone who's
got this working please let me know), the problem does not exist if I
only use libusb xsane.
Matthew Dharm [Tue, 9 Dec 2003 01:35:37 +0000 (17:35 -0800)]
[PATCH] USB storage: fix for jumpshot and datafab devices
This patch fixes some obvious errors in the jumpshot and datafab drivers.
This should close out Bugzilla bug #1408
> Date: Mon, 1 Dec 2003 12:14:53 -0500 (EST)
> From: Alan Stern <stern@rowland.harvard.edu>
> Subject: Patch from Eduard Hasenleithner
> To: Matthew Dharm <mdharm-usb@one-eyed-alien.net>
> cc: USB Storage List <usb-storage@one-eyed-alien.net>
>
> Matt:
>
> Did you see this patch? It was posted to the usb-development mailing list
> about a week ago, before I started making all my changes. It is clearly
> correct and necessary.
>
> Alan Stern
Jens Axboe [Tue, 9 Dec 2003 01:03:05 +0000 (17:03 -0800)]
[PATCH] scsi_ioctl memcpy'ing user address
James reported a bug in scsi_ioctl.c where it mem copies a user pointer
instead of using copy_from_user(). I inadvertently introduced this one
when getting rid of CDROM_SEND_PACKET. Here's a trivial patch to fix it.
David Brownell [Mon, 8 Dec 2003 05:28:46 +0000 (21:28 -0800)]
[PATCH] USB: fix remove device after set_configuration
If a device can't be configured, the current test9 code forgets
to clean it out of sysfs. This resolves that issue, so the retry
in usb_new_device() stands a chance of working.
The enumeration code still doesn't handle such errors well, but
at least this way that hub port can be used for another device.
James McMechan [Sun, 7 Dec 2003 13:57:40 +0000 (05:57 -0800)]
[PATCH] tmpfs oops fix
The problem was that the cursor was in the list being walked, and when
the pointer pointed to the cursor the list_del/list_add_tail pair would
oops trying to find the entry pointed to by the prev pointer of the
deleted cursor element.
The solution I found was to move the list_del earlier, before the
beginning of the list walk. Since it is not used during the list walk and
should not count in the list enumeration it can be deleted, then the
list pointer cannot point to it so it can be added safely with the
list_add_tail without oopsing, and everything works as expected.
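The ordering described above can be sketched in userspace like this. It is not the tmpfs code: `struct node`, the helpers, and `reposition_cursor()` are hypothetical and only mirror the idea of unlinking the cursor before the walk, so the walk can never stop on the cursor itself.

```c
/* Minimal circular doubly linked list, modelled loosely on kernel lists. */
struct node { struct node *next, *prev; };

static void list_init(struct node *n)
{
    n->next = n;
    n->prev = n;
}

/* Unlink n and leave it pointing at itself. */
static void list_del(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_init(n);
}

/* Insert n immediately before pos. */
static void list_add_tail(struct node *n, struct node *pos)
{
    n->prev = pos->prev;
    n->next = pos;
    pos->prev->next = n;
    pos->prev = n;
}

/* Move the cursor so that it sits just before the skip-th real entry. */
static void reposition_cursor(struct node *head, struct node *cursor, int skip)
{
    struct node *p;

    list_del(cursor);                   /* take the cursor out first */
    for (p = head->next; p != head && skip > 0; p = p->next)
        skip--;                         /* cursor is not counted */
    list_add_tail(cursor, p);           /* p can never be the cursor now */
}
```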
I am unable to oops this version with any of my test programs.
Jeff Garzik [Fri, 5 Dec 2003 15:34:00 +0000 (07:34 -0800)]
[PATCH] fix oops on unload in pcnet32
The driver was calling pci_unregister_driver for each _device_, and then
again at the end of the module unload routine. Remove the call that's
inside the loop, pci_unregister_driver should only be called once.
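The shape of the fix, as a toy sketch with made-up names (`fake_unregister_driver`, `fake_release_device`, `module_cleanup`); the counter only exists to make the "once per module" property visible.

```c
/* Counts calls so the "exactly once" property can be observed. */
static int unregister_calls;
static int released_devices;

/* Hypothetical stand-in for unregistering the driver. */
static void fake_unregister_driver(void)
{
    unregister_calls++;
}

/* Hypothetical per-device teardown. */
static void fake_release_device(int idx)
{
    (void)idx;
    released_devices++;
}

/* Module unload: per-device work in the loop, driver-wide work after it. */
static void module_cleanup(int ndevices)
{
    for (int i = 0; i < ndevices; i++)
        fake_release_device(i);

    fake_unregister_driver();   /* once, regardless of device count */
}
```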
Ulrich Drepper [Thu, 4 Dec 2003 14:26:06 +0000 (06:26 -0800)]
[PATCH] Fix 'noexec' behaviour
We should not allow mmap() with PROT_EXEC on mounts marked "noexec",
since otherwise there is no way for user-supplied executable loaders
(like ld.so and emulator environments) to properly honour the
"noexec"ness of the target.
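In rough terms the check amounts to something like the sketch below. `SKETCH_MNT_NOEXEC` and `check_mmap_prot()` are invented for illustration, and returning -EPERM is an assumption of this sketch rather than something stated above; only `PROT_EXEC` is the real mmap() flag.

```c
#include <errno.h>
#include <sys/mman.h>   /* PROT_READ, PROT_EXEC */

/* Hypothetical mount flag for this sketch. */
#define SKETCH_MNT_NOEXEC 0x1u

/* Refuse executable mappings of files that live on a noexec mount. */
static int check_mmap_prot(int prot, unsigned int mnt_flags)
{
    if ((prot & PROT_EXEC) && (mnt_flags & SKETCH_MNT_NOEXEC))
        return -EPERM;
    return 0;
}
```

A loader that maps its target with PROT_EXEC then fails on such a mount without needing any knowledge of mount options itself.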
Jean Delvare [Thu, 4 Dec 2003 06:14:33 +0000 (22:14 -0800)]
[PATCH] I2C: fix i2c_smbus_write_byte() for i2c-nforce2
This patch fixes i2c_smbus_write_byte() being broken for i2c-nforce2.
This causes trouble when that module is used together with eeprom (which
is also in 2.6). We have had three user reports about the problem.
Credits go to Mark D. Studebaker for finding and fixing the problem.
Jens Axboe [Wed, 3 Dec 2003 23:53:31 +0000 (15:53 -0800)]
[PATCH] fix broken x86_64 rdtscll
The scheduler is completely b0rked on x86_64, and I finally found out
why. sched_clock() always returned 0, because rdtscll() always returned
0. The 'a' in the macro doesn't agree with the 'a' in the function,
yippee :-)
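One way such a name clash can produce a constant 0 is shown below in portable C. This is a reconstruction of the general failure mode, not the actual x86_64 macro: `fake_read_halves()` stands in for the rdtsc instruction and both macros are hypothetical.

```c
#include <stdint.h>

/* Hypothetical counter source standing in for the rdtsc instruction. */
static void fake_read_halves(uint32_t *lo, uint32_t *hi)
{
    *lo = 0x89abcdefu;
    *hi = 0x01234567u;
}

/* Buggy shape: the macro's local 'a' hides a caller variable named 'a',
 * so the final assignment lands in the macro's own temporary. */
#define READ_COUNTER_BUGGY(val) do {                        \
        uint32_t a, d;                                      \
        fake_read_halves(&a, &d);                           \
        (val) = (uint64_t)a | ((uint64_t)d << 32);          \
    } while (0)

/* Fixed shape: temporaries with names a caller is unlikely to use. */
#define READ_COUNTER_FIXED(val) do {                        \
        uint32_t lo_, hi_;                                  \
        fake_read_halves(&lo_, &hi_);                       \
        (val) = (uint64_t)lo_ | ((uint64_t)hi_ << 32);      \
    } while (0)

static uint64_t clock_buggy(void)
{
    uint64_t a = 0;

    READ_COUNTER_BUGGY(a);      /* writes the inner 'a', not this one */
    return a;
}

static uint64_t clock_fixed(void)
{
    uint64_t a = 0;

    READ_COUNTER_FIXED(a);
    return a;
}
```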
Ingo Molnar [Wed, 3 Dec 2003 04:59:12 +0000 (20:59 -0800)]
[PATCH] Fix /proc access to dead thread group list oops
The pid_alive() check within the loop is incorrect. If we are within
the tasklist lock and the thread group leader is valid then the thread
chain will be fully intact.
Instead, the check should be _outside_ the loop, since if the group
leader no longer exists, the whole list is gone and we must not try
to access it.
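A toy model of moving the liveness check out of the loop. `struct fake_task` and `count_threads()` are hypothetical, and the locking that makes the check meaningful in the kernel is left out entirely.

```c
/* Hypothetical task with a circular thread-group chain. */
struct fake_task {
    int alive;
    struct fake_task *next_thread;
};

/* Walk the thread group, checking the leader once before the walk. */
static int count_threads(const struct fake_task *leader)
{
    const struct fake_task *t = leader;
    int count = 0;

    if (!leader->alive)
        return 0;               /* leader gone: do not touch the chain */

    do {
        count++;
        t = t->next_thread;
    } while (t != leader);

    return count;
}
```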
Hirofumi Ogawa [Mon, 1 Dec 2003 02:40:47 +0000 (18:40 -0800)]
[PATCH] Missing initialization of /proc/net/tcp seq_file
We need to initialize st->state in tcp_seq_start(). Otherwise
tcp_seq_stop() is run with previous st->state, and it calls the unneeded
unlock etc, causing a kernel crash.
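The failure mode can be modelled with a tiny iterator whose stop routine unlocks based on the recorded state. Everything here (`fake_iter`, the `ITER_*` states, the helpers) is invented for illustration; only the idea of resetting the state in the start routine comes from the description above.

```c
enum iter_state { ITER_IDLE = 0, ITER_LISTENING };

/* Hypothetical iterator: locks_held models a lock taken during the walk. */
struct fake_iter {
    enum iter_state state;
    int locks_held;
};

/* Start of a pass: reset the state so stop never acts on a stale value. */
static void iter_start(struct fake_iter *st)
{
    st->state = ITER_IDLE;
}

/* The walk enters a phase that takes a lock. */
static void iter_enter_listening(struct fake_iter *st)
{
    st->state = ITER_LISTENING;
    st->locks_held++;
}

/* End of a pass: unlock only if the recorded state says a lock is held.
 * Note that it does not reset the state itself. */
static void iter_stop(struct fake_iter *st)
{
    if (st->state == ITER_LISTENING)
        st->locks_held--;
}
```

Without the reset in `iter_start()`, a second pass that stops immediately would unlock again and drive `locks_held` negative.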
Ben Collins [Wed, 26 Nov 2003 04:15:46 +0000 (20:15 -0800)]
[PATCH] Lastminute IEEE-1394 fixes
I've got a lot more changes than what's included here. I've put this
down to the bare minimum to get things working sanely.
Mainly, I just want to give all the people hit by this a chance to use
2.6.0 without having to get our tree. Changes itemized:
- Fix deadlock possibility in csr.c:read_maps()
- Fix kmalloc to use ATOMIC in highlevel.c.
- s/in_interrupt/irqs_disabled/ in ieee1394_transactions.c to fix
warnings when transactions occurred.
- Introduce a release callback for the host driver and use it correctly.
- Reorganize the nodemgr probe so we do an initial scan to discover
devices, check IRM/CycleMaster, then do a final full probe when things
are kosher. Fixes a problem where device registration and hotplug
would cause some serious problems when a bus reset was forced in the
middle of the probe.
[PATCH] prevent oops from read of proc entry for tty drivers
There are /proc handles set up by proc_tty_register_driver, but there is
no module ownership association, so anything that reads them after module
unload will blow up.
The trivial fix is to propagate the owner of tty_driver to proc entry.
David S. Miller [Mon, 24 Nov 2003 11:44:51 +0000 (03:44 -0800)]
[NET]: In sock_queue_rcv_skb(), do not deref skb->len after it is queued to the socket.
In implementations that use no socket locking, such as RAW sockets,
once we queue the SKB to the socket another cpu can remove the SKB
from the socket queue and free up the SKB making the skb->len access
touch freed memory.
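The usual remedy is to read the length into a local before handing the buffer over. The sketch below uses invented types, and `fake_enqueue()` poisons the buffer to imitate another CPU consuming and freeing it right away; it is not the networking code.

```c
/* Hypothetical buffer and socket types for this sketch. */
struct fake_skb { unsigned int len; };
struct fake_sock { struct fake_skb *last_queued; unsigned int bytes_seen; };

/* Hand the buffer to the socket. The poisoning imitates a concurrent
 * consumer that dequeues and frees the buffer immediately. */
static void fake_enqueue(struct fake_sock *sk, struct fake_skb *skb)
{
    sk->last_queued = skb;
    skb->len = 0xdeadbeefu;
}

/* Queue a buffer and account for its length without touching it afterwards. */
static unsigned int queue_rcv(struct fake_sock *sk, struct fake_skb *skb)
{
    unsigned int len = skb->len;    /* read while we still own the buffer */

    fake_enqueue(sk, skb);          /* ownership is gone from here on */
    sk->bytes_seen += len;          /* use the cached copy only */
    return len;
}
```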
Based upon a report from Burton Windle, kernel bugzilla #937
James Bottomley [Mon, 24 Nov 2003 04:02:15 +0000 (22:02 -0600)]
Fix locking problems in scsi_report_bus_reset() causing aic7xxx to hang
All the users of this function in the SCSI tree call it with the host
lock held. With the new list traversal code, it was trying to take
the lock again to traverse the list.
Fix it to use the unlocked version of list traversal and modify the
header comments to make it clear that the lock is expected to be held
on calling it.
James Bottomley [Sat, 22 Nov 2003 12:21:22 +0000 (06:21 -0600)]
Updated state model for SCSI devices
I've been looking at enforcing lifetime phases for SCSI devices
(primarily to try to get the mid layer to offload as much of the device
creation and hotplug pieces as it can).
I've hijacked the sdev_state field of the struct scsi_device (formerly
this was a bitmap, now it becomes an enumerated state).
I've also begun adding references to sdev_gendev into the code to pin the
scsi_device---initially in the queue function, but possibly this should
also be done in the scsi_command_get/put, the idea being to prevent
scsi_device freeing while there's still device activity.
The object phases I identified are:
1. SDEV_CREATED - we've just allocated the device. It may respond to
internally generated commands, but not to user ones (the user should
actually have no way to access a device in this state, but just in
case).
2. SDEV_RUNNING - the device is fully operational
3. SDEV_CANCEL - The device is cleanly shutting down. It may respond to
internally generated commands (for cancellation/recovery) only; all user
commands are errored unless they have already been queued (QUEUE_FULL
handling and the like).
4. SDEV_DEL - The device is gone. *all* commands are errored out.
Ordinarily, the device should move through all four phases from creation
to destruction, but moving SDEV_RUNNING->SDEV_DEL because of surprise
ejection should work.
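The phases and transitions listed above could be modelled along these lines. Only the four SDEV_* names come from the description; the enum type, `transition_allowed()` and `command_accepted()` are hypothetical, and the transition table covers just the paths explicitly mentioned (the normal four-phase sequence plus SDEV_RUNNING->SDEV_DEL).

```c
enum sketch_sdev_state { SDEV_CREATED, SDEV_RUNNING, SDEV_CANCEL, SDEV_DEL };

/* Is moving from one phase to another one of the paths described above? */
static int transition_allowed(enum sketch_sdev_state from,
                              enum sketch_sdev_state to)
{
    switch (from) {
    case SDEV_CREATED: return to == SDEV_RUNNING;
    case SDEV_RUNNING: return to == SDEV_CANCEL || to == SDEV_DEL;
    case SDEV_CANCEL:  return to == SDEV_DEL;
    case SDEV_DEL:     return 0;
    }
    return 0;
}

/* Would a command be let through in this phase?
 * internal: generated by the mid layer rather than a user.
 * already_queued: a user command that was queued before the change. */
static int command_accepted(enum sketch_sdev_state state,
                            int internal, int already_queued)
{
    switch (state) {
    case SDEV_CREATED: return internal != 0;
    case SDEV_RUNNING: return 1;
    case SDEV_CANCEL:  return internal != 0 || already_queued != 0;
    case SDEV_DEL:     return 0;
    }
    return 0;
}
```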
It's starting to look like the online flag should be absorbed into this
(offlined devices move essentially to SDEV_CANCEL and could be
reactivated by moving to SDEV_RUNNING).
I haven't altered the similar bitmap model that scsi_host has, although
this too should probably move to an enumerated state model.
I've tested this by physically yanking a module out from underneath a
running filesystem with no ill effects (other than a slew of I/O
errors).
The obvious problem is that this kills possible user error handling, but
we don't do any of that yet.
Mike Anderson [Sat, 22 Nov 2003 03:13:02 +0000 (21:13 -0600)]
[PATCH] scsi device ref count (update)
This patch is against scsi-bugfixes-2.6. I updated it based on comments
received. It breaks up the reference count initialization for scsi_device
and restores calling slave_destroy for all scsi_devices configured or
not. I ran a small regression using the scsi_debug, aic7xxx, and qla2xxx
driver. I also had a debug patch for more verbose kobject cleanup and a
patch for a badness check on atomic_dec going negative (previously
provided by Linus).
The object cleanup appears to be functioning correctly. I only saw
previously reported badness output:
- Synchronizing SCSI cache fails on cleanup.
- scsi_debug.c missing release (I believe Doug posted a patch)
- aic7xxx warnings on rmmod due to ahc_platform_free calling
scsi_remove_host with ahc_list_lock held.
This patch splits the scsi device struct device register into init and
add. It also addresses memory leak issues of not calling slave_destroy
on scsi_devices that are not configured in.
Details:
* Make scsi_device_dev_release extern for scsi_scan to use in
alloc_sdev.
* Move scsi_free_sdev code to scsi_device_dev_release. Have
previous callers of scsi_free_sdev call slave_destroy plus put_device.
* Changed name of scsi_device_register to scsi_sysfs_add_sdev to
match host call and align with split struct device init.
* Move sdev_gendev device and class init to scsi_alloc_sdev.
Davide Libenzi [Sat, 22 Nov 2003 00:39:48 +0000 (16:39 -0800)]
[PATCH] More SiS interrupt routing
It turns out that the SiS irq routing logic doesn't go by chipset
after all - it's just that some pirq entries are "legacy" numbers,
while others are raw offsets into PCI config space (and the legacy
numbers are more commonly used with the older chipsets, which
explains the correlations).
Adam Belay [Fri, 21 Nov 2003 11:36:07 +0000 (03:36 -0800)]
[PATCH] reserve resources specified by the PnPBIOS properly
A bug prevents the PnP layer from reserving some of the resources
specified by the PnPBIOS. As a result some systems will have
unpredictable (random crashes etc.) problems because of resource
conflicts, especially when PCMCIA support is enabled. This patch
fixes the problem by ensuring that the proper resource data is
reserved.
David Stevens [Fri, 21 Nov 2003 08:49:32 +0000 (00:49 -0800)]
[IPV6]: Fix header length calculation in multicast input.
It did not account for extension headers properly. If we get
this length wrong, we do not determine if a multicast packet
is MLDv1 vs. MLDv2 correctly.
David Mosberger [Fri, 21 Nov 2003 06:10:39 +0000 (22:10 -0800)]
ia64: Drop printk from ia64_ni_syscall(). This is a temporary fix
for 2.6.0. The proper fix is to replace ia64_ni_syscall with
sys_ni_syscall, but that would make the patch quite large, so
we defer that till 2.6.1.
David Mosberger [Fri, 21 Nov 2003 05:25:04 +0000 (21:25 -0800)]
ia64: Fix off-by-1 error in imm60 patching. The bug hasn't been observed
in practice, but it's clearly wrong and just waiting there to
get triggered...
David Stevens [Thu, 20 Nov 2003 08:34:18 +0000 (00:34 -0800)]
[IPV6]: In igmp6_group_queried, fix address check to comply with RFC2710.
RFC2710 says:
1) MLD messages are never sent for multicast addresses whose scope is 0
(reserved) or 1 (node-local).
2) MLD messages ARE sent for multicast addresses whose scope is 2
(link-local), including Solicited-Node multicast addresses [ADDR-ARCH],
except for the link-scope, all-nodes address (FF02::1).
The current MLDv1 code does not send reports for link-scope addresses
and doesn't restrict scope 0. This may break switches that snoop reports for
determining which ports should receive particular addresses. Patch below.
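Separately from the patch itself, the two RFC2710 rules quoted above can be written down as a small standalone predicate. `mld_should_report()` is a hypothetical helper operating on a raw 16-byte address, not the igmp6_group_queried() code.

```c
#include <string.h>

/* Should an MLD report be sent for this multicast address?
 * addr is a 16-byte IPv6 address in network byte order. */
static int mld_should_report(const unsigned char addr[16])
{
    /* Link-scope all-nodes address FF02::1. */
    static const unsigned char all_nodes[16] = {
        0xff, 0x02, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x01
    };
    unsigned int scope;

    if (addr[0] != 0xff)
        return 0;                   /* not a multicast address */

    scope = addr[1] & 0x0f;         /* low nibble of the second byte */
    if (scope < 2)
        return 0;                   /* 0 = reserved, 1 = node-local */

    if (memcmp(addr, all_nodes, 16) == 0)
        return 0;                   /* FF02::1 is explicitly excluded */

    return 1;                       /* link-local and wider scopes */
}
```

A solicited-node address such as ff02::1:ff00:1234 passes this check, while ff02::1 and any scope 0 or 1 address do not.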