v2.5.0.10 -> v2.5.0.11
author Linus Torvalds <torvalds@athlon.transmeta.com>
Tue, 5 Feb 2002 07:59:01 +0000 (23:59 -0800)
committer Linus Torvalds <torvalds@athlon.transmeta.com>
Tue, 5 Feb 2002 07:59:01 +0000 (23:59 -0800)
- Jeff Garzik: no longer support old cards in tulip driver
(see separate driver for old tulip chips)
- Pat Mochel: driverfs/device model documentation
- Ballabio Dario: update eata driver to new IO locking
- Ingo Molnar: raid resync with new bio structures (much more efficient)
and mempool_resize()
- Jens Axboe: bio queue locking
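
As the diffs below show, blk_init_queue() now takes a third argument: the
driver supplies the spinlock protecting its queue, and q->queue_lock becomes a
pointer. A minimal sketch of the driver-side conversion, following the pattern
used in the floppy and nbd updates below (the mydev names are illustrative):

	static spinlock_t mydev_lock;		/* driver-private queue lock */

	static int __init mydev_init(void)
	{
		request_queue_t *q = BLK_DEFAULT_QUEUE(MAJOR_NR);

		/* was: blk_init_queue(q, do_mydev_request); */
		blk_init_queue(q, do_mydev_request, &mydev_lock);
		return 0;
	}

Lock sites change accordingly, e.g. spin_lock_irq(q->queue_lock) instead of
spin_lock_irq(&q->queue_lock).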

54 files changed:
Documentation/driver-model.txt [new file with mode: 0644]
Documentation/filesystems/driverfs.txt [new file with mode: 0644]
Makefile
arch/i386/lib/iodebug.c
drivers/block/cciss.c
drivers/block/cciss.h
drivers/block/cpqarray.c
drivers/block/cpqarray.h
drivers/block/floppy.c
drivers/block/ll_rw_blk.c
drivers/block/nbd.c
drivers/block/paride/pcd.c
drivers/block/paride/pf.c
drivers/block/ps2esdi.c
drivers/block/rd.c
drivers/ide/ide-probe.c
drivers/ide/ide.c
drivers/md/linear.c
drivers/md/md.c
drivers/md/raid0.c
drivers/md/raid1.c
drivers/net/tulip/ChangeLog
drivers/net/tulip/eeprom.c
drivers/net/tulip/media.c
drivers/net/tulip/timer.c
drivers/net/tulip/tulip_core.c
drivers/scsi/eata.c
drivers/scsi/eata.h
drivers/scsi/scsi.c
drivers/scsi/scsi_error.c
drivers/scsi/scsi_lib.c
drivers/scsi/scsi_merge.c
drivers/scsi/scsi_queue.c
drivers/scsi/u14-34f.c
drivers/scsi/u14-34f.h
fs/bio.c
fs/block_dev.c
fs/buffer.c
fs/ufs/inode.c
include/asm-i386/io.h
include/asm-s390/io.h
include/asm-s390x/io.h
include/linux/blkdev.h
include/linux/devfs_fs_kernel.h
include/linux/ide.h
include/linux/mempool.h
include/linux/nbd.h
include/linux/raid/md.h
include/linux/raid/md_compatible.h [deleted file]
include/linux/raid/md_k.h
include/linux/raid/raid1.h
init/do_mounts.c
mm/memory.c
mm/mempool.c

diff --git a/Documentation/driver-model.txt b/Documentation/driver-model.txt
new file mode 100644 (file)
index 0000000..f77e051
--- /dev/null
@@ -0,0 +1,598 @@
+The (New) Linux Kernel Driver Model
+
+Version 0.04
+
+Patrick Mochel <mochel@osdl.org>
+
+03 December 2001
+
+
+Overview
+~~~~~~~~
+
+This driver model is a unification of all the disparate driver models
+currently in the kernel. It is intended to augment the
+bus-specific drivers for bridges and devices by consolidating a set of data
+and operations into globally accessible data structures.
+
+Current driver models implement some sort of tree-like structure (sometimes
+just a list) for the devices they control. But, there is no linkage between
+the different bus types.
+
+A common data structure can provide this linkage with little overhead: when a
+bus driver discovers a particular device, it can insert it into the global
+tree as well as its local tree. In fact, the local tree becomes just a subset
+of the global tree.
+
+Common data fields can also be moved out of the local bus models into the
+global model. Some of the manipulation of these fields can also be
+consolidated. Most likely, manipulation functions will become a set
+of helper functions, which the bus drivers wrap around to include any
+bus-specific items.
+
+The common device and bridge interface currently reflects the goals of the
+modern PC: namely the ability to do seamless Plug and Play, power management,
+and hot plug. (The model dictated by Intel and Microsoft (read: ACPI) ensures
+us that any device in the system may fit any of these criteria.)
+
+In reality, not every bus will be able to support such operations. But, most
+buses will support a majority of those operations, and all future buses will.
+In other words, a bus that doesn't support an operation is the exception,
+instead of the other way around.
+
+
+Drivers
+~~~~~~~
+
+The callbacks for bridges and devices are intended to be singular for a
+particular type of bus. For each type of bus that has support compiled into
+the kernel, there should be one statically allocated structure with the
+appropriate callbacks that each device (or bridge) of that type shares.
+
+Each bus layer should implement the callbacks for these drivers. It then
+forwards the calls on to the device-specific callbacks. This means that
+device-specific drivers must still implement callbacks for each operation.
+But, they are not called from the top level driver layer.
+
+This does add another layer of indirection for calling one of these functions,
+but there are benefits that are believed to outweigh this slowdown.
+
+First, it prevents device-specific drivers from having to know about the
+global device layer. This speeds up integration time incredibly. It also
+allows drivers to be more portable across kernel versions. Note that the
+former was intentional, the latter is an added bonus.
+
+Second, this added indirection allows the bus to perform any additional logic
+necessary for its child devices. A bus layer may add additional information to
+the call, or translate it into something meaningful for its children.
+
+This could be done in the driver, but if it happens for every object of a
+particular type, it is best done at a higher level.
+
+Recap
+~~~~~
+
+Instances of devices and bridges are allocated dynamically as the system
+discovers their existence. Their fields describe the individual object.
+Drivers - in the global sense - are statically allocated and singular for a
+particular type of bus. They describe a set of operations that every type of
+bus could implement, the implementation following the bus's semantics.
+
+
+Downstream Access
+~~~~~~~~~~~~~~~~~
+
+Common data fields have been moved out of individual bus layers into a common
+data structure. But, these fields must still be accessed by the bus layers,
+and sometimes by the device-specific drivers.
+
+Other bus layers are encouraged to do what has been done for the PCI layer.
+struct pci_dev now looks like this:
+
+struct pci_dev {
+       ...
+
+       struct device device;
+};
+
+Note first that it is statically allocated. This means only one allocation on
+device discovery. Note also that it is at the _end_ of struct pci_dev. This is
+to make people think about what they're doing when switching between the bus
+driver and the global driver; and to prevent against mindless casts between
+the two.
+
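+As a consequence of the embedding, a bus layer handed a struct device can get
+back to its bus-specific structure with simple pointer arithmetic rather than
+a mindless cast. A sketch (the macro name is illustrative):
+
+#define to_pci_dev(d) \
+	((struct pci_dev *)((char *)(d) - offsetof(struct pci_dev, device)))
+
+static int pci_device_probe(struct device * dev)
+{
+	struct pci_dev * pci_dev = to_pci_dev(dev);
+	...
+}
+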
+The PCI bus layer freely accesses the fields of struct device. It knows about
+the structure of struct pci_dev, and it should know the structure of struct
+device. PCI devices that have been converted generally do not touch the fields
+of struct device. More precisely, device-specific drivers should not touch
+fields of struct device unless there is a strong compelling reason to do so.
+
+This abstraction exists to prevent unnecessary pain during transitional phases.
+If the name of the field changes or is removed, then every downstream driver
+will break. On the other hand, if only the bus layer (and not the device
+layer) accesses struct device, it is only those that need to change.
+
+
+User Interface
+~~~~~~~~~~~~~~
+
+By virtue of having a complete hierarchical view of all the devices in the
+system, exporting a complete hierarchical view to userspace becomes relatively
+easy.
+
+Whenever a device is inserted into the tree, a directory is created for it.
+This directory may be populated at each layer of discovery - the global layer,
+the bus layer, or the device layer.
+
+The global layer currently creates two files - 'status' and 'power'. The
+former only reports the name of the device and its bus ID. The latter reports
+the current power state of the device. It can also be used to set the current
+power state.
+
+The bus layer may also create files for the devices it finds while probing the
+bus. For example, the PCI layer currently creates 'wake' and 'resource' files
+for each PCI device.
+
+A device-specific driver may also export files in its directory to expose
+device-specific data or tunable interfaces.
+
+These features were initially implemented using procfs. However, after one
+conversation with Linus, a new filesystem - driverfs - was created to
+implement these features. It is an in-memory filesystem, based heavily on
+ramfs, though it uses procfs as inspiration for its callback functionality.
+
+Each struct device has a 'struct driver_dir_entry' which encapsulates the
+device's directory and the files within.
+
+Device Structures
+~~~~~~~~~~~~~~~~~
+
+struct device {
+       struct list_head        bus_list;
+       struct iobus            *parent;
+       struct iobus            *subordinate;
+
+       char                    name[DEVICE_NAME_SIZE];
+       char                    bus_id[BUS_ID_SIZE];
+
+       struct driver_dir_entry * dir;
+
+       spinlock_t              lock;
+       atomic_t                refcount;
+
+       struct device_driver    *driver;
+       void                    *driver_data;
+       void                    *platform_data;
+
+       u32                     current_state;
+       unsigned char           *saved_state;
+};
+
+bus_list:
+       List of all devices on a particular bus; i.e. the device's siblings
+
+parent:
+       The parent bridge for the device.
+
+subordinate:
+       If the device is a bridge itself, this points to the struct io_bus that is
+       created for it.
+
+name:
+       Human readable (descriptive) name of device. E.g. "Intel EEPro 100"
+
+bus_id:
+       Parsable (yet ASCII) bus id. E.g. "00:04.00" (PCI Bus 0, Device 4, Function
+       0). It is necessary to have a searchable bus id for each device; making it
+       ASCII allows us to use it for its directory name without translating it.
+
+dir:
+       Driver's driverfs directory.
+
+lock:
+       Driver specific lock.
+
+refcount:
+       Driver's usage count.
+       When this goes to 0, the device is assumed to be removed. It will be removed
+       from its parent's list of children. Its remove() callback will be called to
+       inform the driver to clean up after itself.
+
+driver:
+       Pointer to a struct device_driver, the common operations for each device. See
+       next section.
+
+driver_data:
+       Private data for the driver.
+       Much like the PCI implementation of this field, this allows device-specific
+       drivers to keep a pointer to device-specific data.
+
+platform_data:
+       Data that the platform (firmware) provides about the device.
+       For example, the ACPI BIOS or EFI may have additional information about the
+       device that is not directly mappable to any existing kernel data structure.
+       It also allows the platform driver (e.g. ACPI) to pass data to a driver
+       without the driver having to have explicit knowledge of (atrocities like) ACPI.
+
+
+current_state:
+       Current power state of the device. For PCI and other modern devices, this is
+       0-3, though it's not necessarily limited to those values.
+
+saved_state:
+       Pointer to driver-specific set of saved state.
+       Having it here allows modules to be unloaded on system suspend and reloaded
+       on resume and maintain state across transitions.
+       It also allows generic drivers to maintain state across system state
+       transitions.
+       (I've implemented a generic PCI driver for devices that don't have a
+       device-specific driver. Instead of managing some vector of saved state
+       for each device the generic driver supports, it can simply store it here.)
+
+
+
+struct device_driver {
+        int     (*probe)        (struct device *dev);
+        int     (*remove)       (struct device *dev);
+
+        int     (*suspend)      (struct device *dev, u32 state, u32 level);
+        int     (*resume)       (struct device *dev, u32 level);
+};
+
+probe:
+       Check for device existence and associate driver with it.
+
+remove:
+       Dissociate driver from device. Releases device so that it could be used by
+       another driver. Also, if it is a hotplug device (hotplug PCI, Cardbus), an
+       ejection event could take place here.
+
+suspend:
+       Perform one step of the device suspend process.
+
+resume:
+       Perform one step of the device resume process.
+
+The probe() and remove() callbacks are intended to be much simpler than the
+current PCI counterparts.
+
+probe() should do the following only:
+
+- Check if hardware is present
+- Register device interface
+- Disable DMA/interrupts, etc, just in case.
+
+Historically, some device initialisation was done in probe(). This should no
+longer be the case. All initialisation should take place in the open() call
+for the device.
+
+Breaking initialisation code out must also be done for the resume() callback,
+as most devices will have to be completely reinitialised when coming back from
+a suspend state.
+
+remove() should simply unregister the device interface.
+
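+A probe()/remove() pair following these rules might look like this (a sketch;
+the foo_* helpers are hypothetical):
+
+static int foo_probe(struct device * dev)
+{
+	if (!foo_hw_present(dev))	/* check if hardware is present */
+		return -ENODEV;
+	foo_register_interface(dev);	/* register device interface */
+	foo_quiesce(dev);		/* disable DMA/interrupts, just in case */
+	return 0;
+}
+
+static int foo_remove(struct device * dev)
+{
+	foo_unregister_interface(dev);	/* no device teardown beyond this */
+	return 0;
+}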
+
+Device power management can be quite complicated, depending on exactly what
+needs to be done. Four operations sum up most of it:
+
+- OS directed power management.
+  The OS takes care of notifying all drivers that a suspend is requested,
+  saving device state, and powering devices down.
+- Firmware controlled power management.
+  The OS only wants to notify devices that a suspend is requested.
+- Device power management.
+  A user wants to place only one device in a low power state, and maybe save
+  state.
+- System reboot.
+  The system wants to place devices in a quiescent state before the system is
+  reset.
+
+In an attempt to please all of these scenarios, the power management
+transition for any device is broken up into several stages - notify, save
+state, and power down. The disable stage, which would happen after notify and
+before save state, has been considered and may be implemented in the future.
+
+Depending on what the system-wide policy is (usually dictated by the power
+management scheme present), each driver's suspend callback may be called
+multiple times, each with a different stage.
+
+On all power management transitions, the stages should be called sequentially
+(notify before save state; save state before power down). However, drivers
+should not assume that any stage was called beforehand. (If a driver gets a
+power down call, it shouldn't assume notify or save state was called first.)
+This allows the framework to be used seamlessly by all power management
+actions. Hopefully.
+
+Resume transitions happen in a similar manner. They are broken up into two
+stages currently (power on and restore state), though a third stage (enable)
+may be added later.
+
+For suspend and resume transitions, the following values are defined to denote
+the stage:
+
+enum {
+       SUSPEND_NOTIFY,
+       SUSPEND_SAVE_STATE,
+       SUSPEND_POWER_DOWN,
+};
+
+enum {
+       RESUME_POWER_ON,
+       RESUME_RESTORE_STATE,
+};
+
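+A driver's suspend() callback dispatches on the stage, and must not assume any
+earlier stage ran. A sketch (the foo_* helpers are hypothetical):
+
+static int foo_suspend(struct device * dev, u32 state, u32 level)
+{
+	switch (level) {
+	case SUSPEND_NOTIFY:
+		/* refuse now if this state cannot be resumed from */
+		return foo_can_resume_from(state) ? 0 : -EBUSY;
+	case SUSPEND_SAVE_STATE:
+		return foo_save_state(dev);	/* allocate and stash state */
+	case SUSPEND_POWER_DOWN:
+		foo_power_down(dev, state);
+		return 0;
+	}
+	return 0;
+}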
+
+During a system power transition, the device tree must be walked in order,
+calling the suspend() or resume() callback for each node. This may happen
+several times.
+
+Initially, this was done in kernel space. However, it has occurred to me that
+doing recursion to a non-bounded depth is dangerous, and that there are a lot
+of inherent race conditions in such an operation.
+
+Non-recursive walking of the device tree is possible. However, this makes for
+convoluted code.
+
+No matter what, if the transition happens in kernel space, it is difficult to
+gracefully recover from errors or to implement a policy that prevents one from
+shutting down the device(s) you want to save state to.
+
+Instead, the walking of the device tree has been moved to userspace. When a
+user requests the system to suspend, it will walk the device tree, as exported
+via driverfs, and tell each device to go to sleep. It will do this multiple
+times based on what the system policy is.
+
+Device resume should happen in the same manner when the system awakens.
+
+Each suspend stage is described below:
+
+SUSPEND_NOTIFY:
+
+This level notifies the driver that it is going to sleep. If it knows that it
+cannot resume the hardware from the requested level, or it feels that it is
+too important to be put to sleep, it should return an error from this function.
+
+It does not have to stop I/O requests or actually save state at this point.
+
+SUSPEND_DISABLE:
+
+The driver should stop taking I/O requests at this stage. Because the save
+state stage happens afterwards, the driver may not want to physically disable
+the device; only mark itself unavailable if possible.
+
+SUSPEND_SAVE_STATE:
+
+The driver should allocate memory and save any device state that is relevant
+for the state it is going to enter.
+
+SUSPEND_POWER_DOWN:
+
+The driver should place the device in the power state requested.
+
+
+For resume, the stages are defined as follows:
+
+RESUME_POWER_ON:
+
+Devices should be powered on and reinitialised to some known working state.
+
+RESUME_RESTORE_STATE:
+
+The driver should restore device state to its pre-suspend state and free any
+memory allocated for its saved state.
+
+RESUME_ENABLE:
+
+The device should start taking I/O requests again.
+
+
+A driver does not have to implement each stage. But, if it does
+implement a stage, it should do what is described above. It should not assume
+that it performed any stage previously, or that it will perform any stage
+later.
+
+It is quite possible that a driver can fail during the suspend process, for
+whatever reason. In this event, the calling process must gracefully recover
+and restore everything to the state it was in before the suspend transition
+began.
+
+If a driver knows that it cannot suspend or resume properly, it should fail
+during the notify stage. Properly implemented power management schemes should
+make sure that this is the first stage that is called.
+
+If a driver gets a power down request, it should obey it, as it may very
+likely be during a reboot.
+
+
+Bus Structures
+~~~~~~~~~~~~~~
+
+struct iobus {
+       struct  list_head       node;
+       struct  iobus           *parent;
+       struct  list_head       children;
+       struct  list_head       devices;
+
+       struct  list_head       bus_list;
+
+       spinlock_t              lock;
+       atomic_t                refcount;
+
+       struct  device          *self;
+       struct  driver_dir_entry * dir;
+
+       char    name[DEVICE_NAME_SIZE];
+       char    bus_id[BUS_ID_SIZE];
+
+       struct  bus_driver      *driver;
+};
+
+node:
+       Bus's node in sibling list (its parent's list of child buses).
+
+parent:
+       Pointer to parent bridge.
+
+children:
+       List of subordinate buses.
+       In the children, this correlates to their 'node' field.
+
+devices:
+       List of devices on the bus this bridge controls.
+       This field corresponds to the 'bus_list' field in each child device.
+
+bus_list:
+       Each type of bus keeps a list of all bridges that it finds. This is the
+       bridge's entry in that list.
+
+self:
+       Pointer to the struct device for this bridge.
+
+lock:
+       Lock for the bus.
+
+refcount:
+       Usage count for the bus.
+
+dir:
+       Driverfs directory.
+
+name:
+       Human readable ASCII name of bus.
+
+bus_id:
+       Machine readable (though ASCII) description of position on parent bus.
+
+driver:
+       Pointer to operations for bus.
+
+
+struct iobus_driver {
+       char    name[16];
+       struct  list_head node;
+
+       int     (*scan)         (struct io_bus*);
+       int     (*add_device)   (struct io_bus*, char*);
+};
+
+name:
+       ASCII name of bus.
+
+node:
+       List of buses of this type in system.
+
+scan:
+       Search the bus for new devices. This may happen either at boot - where every
+       device discovered will be new - or later on - in which case there may only be
+       a few (or no) new devices.
+
+add_device:
+       Trigger a device insertion at a particular location.
+
+
+
+The API
+~~~~~~~
+
+There are several functions exported by the global device layer, including
+several optional helper functions, written solely to try and make your life
+easier.
+
+void device_init_dev(struct device * dev);
+
+Initialise a device structure. It first zeros the device, then initialises all
+of the lists. (Note that this would have been called device_init(), but that
+name was already taken. :/)
+
+
+struct device * device_alloc(void)
+
+Allocate memory for a device structure and initialise it.
+First, allocates memory, then calls device_init_dev() with the new pointer.
+
+
+int device_register(struct device * dev);
+
+Register a device with the global device layer.
+The bus layer should call this function upon device discovery, e.g. when
+probing the bus.
+dev should be fully initialised when this is called.
+If dev->parent is not set, it sets its parent to be the device root.
+It then does the following:
+       - inserts it into its parent's list of children
+       - creates a driverfs directory for it
+       - creates a set of default files for the device in its directory
+       - calls platform_notify() to notify the firmware driver of its existence.
+
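+For example, a bus layer might do the following on discovering a device (a
+sketch; error handling is omitted and the field values are illustrative, with
+'bus' being the struct iobus the device was found on):
+
+	struct device * dev = device_alloc();
+
+	dev->parent = bus;			/* the bridge it sits behind */
+	strcpy(dev->bus_id, "00:04.00");
+	strcpy(dev->name, "Intel EEPro 100");
+	device_register(dev);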
+
+void get_device(struct device * dev);
+
+Increment the refcount for a device.
+
+
+int valid_device(struct device * dev);
+
+Check if reference count is positive for a device (it's not waiting to be
+freed). If it is positive, it increments the reference count for the device.
+It returns whether or not the device is usable.
+
+
+void put_device(struct device * dev);
+
+Decrement the reference count for the device. If it hits 0, it removes the
+device from its parent's list of children and calls the remove() callback for
+the device.
+
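+Code that keeps a device pointer across a blocking operation should pin it
+with the reference count (a sketch):
+
+	if (valid_device(dev)) {	/* takes a reference if still alive */
+		do_something(dev);
+		put_device(dev);	/* drop it; may trigger removal */
+	}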
+
+void lock_device(struct device * dev);
+
+Take the spinlock for the device.
+
+
+void unlock_device(struct device * dev);
+
+Release the spinlock for the device.
+
+
+
+void   iobus_init(struct iobus * iobus);
+struct         iobus * iobus_alloc(void);
+int    iobus_register(struct iobus * iobus);
+void   get_iobus(struct iobus * iobus);
+int    valid_iobus(struct iobus * iobus);
+void   put_iobus(struct iobus * iobus);
+void   lock_iobus(struct iobus * iobus);
+void   unlock_iobus(struct iobus * iobus);
+
+These functions provide the same functionality as the device_*
+counterparts, only operating on a struct iobus. One important thing to note,
+though, is that iobus_register() and iobus_unregister() operate recursively. It
+is possible to add an entire tree in one call.
+
+
+
+int device_driver_init(void);
+
+Main initialisation routine.
+
+This makes sure driverfs is up and running and initialises the device tree.
+
+
+void device_driver_exit(void);
+
+This frees up the device tree.
+
+
+
+
+Credits
+~~~~~~~
+
+The following people have been extremely helpful in solidifying this document
+and the driver model.
+
+Randy Dunlap           rddunlap@osdl.org
+Jeff Garzik            jgarzik@mandrakesoft.com
+Ben Herrenschmidt      benh@kernel.crashing.org
+
+
diff --git a/Documentation/filesystems/driverfs.txt b/Documentation/filesystems/driverfs.txt
new file mode 100644 (file)
index 0000000..b1f2553
--- /dev/null
@@ -0,0 +1,211 @@
+
+driverfs - The Device Driver Filesystem
+
+Patrick Mochel <mochel@osdl.org>
+
+3 December 2001
+
+
+What it is:
+~~~~~~~~~~~
+driverfs is a unified means for device drivers to export interfaces to
+userspace.
+
+Some drivers have a need for exporting interfaces for things like
+setting device-specific parameters, or tuning the device performance.
+For example, wireless networking cards export a file in procfs to set
+their SSID.
+
+Other times, the bus on which a device resides may export other
+information about the device. For example, PCI and USB both export
+device information via procfs or usbdevfs.
+
+In these cases, the files or directories are in nearly random places
+in /proc. One benefit of driverfs is that it can consolidate all of
+these interfaces to one standard location.
+
+
+Why it's better than procfs:
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+This of course can't happen without changing every single driver that
+exports a procfs interface, and having some coordination between all
+of them as to what the proper place for their files is. Or can it?
+
+
+driverfs was developed in conjunction with the new driver model for
+the 2.5 kernel. In that model, the system has one unified tree of all
+the devices that are present in the system. It follows naturally that
+this tree can be exported to userspace in the same order.
+
+So, every bus and every device gets a directory in the filesystem.
+This directory is created when the device is registered in the tree;
+before the driver actually gets initialised. The dentry for this
+directory is stored in the struct device for this driver, so the
+driver has access to it.
+
+Now, every driver has one standard place to export its files.
+
+Granted, the location of the file is not as intuitive as it may have
+been under procfs. But, I argue that with the exception of
+/proc/bus/pci, none of the files had intuitive locations. I also argue
+that the development of userspace tools can help cope with these
+changes and inconsistencies in locations.
+
+
+Why we're not just using procfs:
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+When developing the new driver model, it was initially implemented
+with a procfs tree. In explaining the concept to Linus, he said "Don't
+use proc."
+
+I was a little shocked (especially considering I had already
+implemented it using procfs). "What do you mean 'don't use proc'?"
+
+His argument was that too many things use proc that shouldn't. And
+even more things misuse proc that shouldn't. On top of that, procfs
+was written before the VFS layer was written, so it doesn't use the
+dcache. It reimplements many of the same features that the dcache
+does, and is, in general, crufty.
+
+So, he told me to write my own. Soon after, he pointed me at ramfs,
+the simplest filesystem known to man.
+
+Consequently, we have a virtual filesystem based heavily on ramfs, and
+borrowing some conceptual functionality from procfs.
+
+It may suck, but it does what it was designed to. At least so far.
+
+
+How it works:
+~~~~~~~~~~~~~
+
+Directories are encapsulated like this:
+
+struct driver_dir_entry {
+       char                    * name;
+       struct dentry           * dentry;
+       mode_t                  mode;
+       struct list_head        files;
+};
+
+name:
+       Name of the directory.
+dentry:
+       Dentry for the directory.
+mode:
+       Permissions of the directory.
+files:
+       Linked list of driver_file_entry's that are in the directory.
+
+
+To create a directory, one first calls
+
+struct driver_dir_entry *
+driverfs_create_dir_entry(const char * name, mode_t mode);
+
+which allocates and initialises a struct driver_dir_entry. Then to actually
+create the directory:
+
+int driverfs_create_dir(struct driver_dir_entry *, struct driver_dir_entry *);
+
+To remove a directory:
+
+void driverfs_remove_dir(struct driver_dir_entry * entry);
+
+
+Files are encapsulated like this:
+
+struct driver_file_entry {
+       struct driver_dir_entry * parent;
+       struct list_head        node;
+       char                    * name;
+       mode_t                  mode;
+       struct dentry           * dentry;
+       void                    * data;
+       struct driverfs_operations      * ops;
+};
+
+struct driverfs_operations {
+       ssize_t (*read) (char *, size_t, loff_t, void *);
+       ssize_t (*write)(const char *, size_t, loff_t, void*);
+};
+
+parent:
+       The directory in which the file resides.
+
+node:
+       Node in its parent directory's list of files.
+
+name:
+       The name of the file.
+
+mode:
+       Permissions of the file.
+
+dentry:
+       The dentry for the file.
+
+data:
+       Caller specific data that is passed to the callbacks when they
+       are called.
+
+ops:
+       Operations for the file. Currently, this only contains read() and write()
+       callbacks for the file.
+
+To create a file, one first calls
+
+struct driver_file_entry *
+driverfs_create_entry (const char * name, mode_t mode,
+                       struct driverfs_operations * ops, void * data);
+
+That allocates and initialises a struct driver_file_entry. Then, to actually
+create a file, one calls
+
+int driverfs_create_file(struct driver_file_entry * entry,
+                       struct driver_dir_entry * parent);
+
+
+To remove a file, one calls
+
+void driverfs_remove_file(struct driver_dir_entry *, const char * name);
+
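+Putting it together, a driver exporting one read-only file might do the
+following (a sketch; the foo_* names are hypothetical, and the argument order
+of driverfs_create_dir() is assumed to be (new dir, parent)):
+
+static ssize_t foo_read(char * buf, size_t count, loff_t off, void * data)
+{
+	struct foo_device * foo = data;
+	/* driverfs reads are at most a page; offset handling elided */
+	return sprintf(buf, "%d\n", foo->param);
+}
+
+static struct driverfs_operations foo_ops = {
+	read:	foo_read,
+};
+
+	dir = driverfs_create_dir_entry("foo0", S_IFDIR | S_IRUGO | S_IXUGO);
+	driverfs_create_dir(dir, parent_dir);
+
+	entry = driverfs_create_entry("param", S_IFREG | S_IRUGO, &foo_ops, foo);
+	driverfs_create_file(entry, dir);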
+
+The callback functionality is similar to the way procfs works. When a
+user performs a read(2) or write(2) on the file, it first calls a
+driverfs function. This function then checks for a non-NULL pointer in
+the file->private_data field, which it assumes to be a pointer to a
+struct driver_file_entry.
+
+It then checks for the appropriate callback and calls it.
+
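+A sketch of roughly what that dispatch looks like for the read path
+(illustrative, not the exact driverfs code):
+
+static ssize_t driverfs_read_file(struct file * file, char * buf,
+				  size_t count, loff_t * ppos)
+{
+	struct driver_file_entry * entry = file->private_data;
+	char * page;
+	ssize_t ret;
+
+	if (!entry || !entry->ops || !entry->ops->read)
+		return -ENOSYS;
+
+	page = (char *)__get_free_page(GFP_KERNEL);
+	if (!page)
+		return -ENOMEM;
+
+	ret = entry->ops->read(page, count, *ppos, entry->data);
+	if (ret > 0) {
+		if (copy_to_user(buf, page, ret))
+			ret = -EFAULT;
+		else
+			*ppos += ret;
+	}
+	free_page((unsigned long)page);
+	return ret;
+}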
+
+What driverfs is not:
+~~~~~~~~~~~~~~~~~~~~~
+It is not a replacement for either devfs or procfs.
+
+It does not handle device nodes, like devfs is intended to do. I think
+this functionality is possible, and I do think that integration of
+the device nodes and control files should happen eventually. Whether driverfs,
+devfs, or something else is the place to do it, I don't know.
+
+It is not intended to be a replacement for all of the procfs
+functionality. I think that many of the driver files should be moved
+out of /proc (and maybe a few other things as well ;).
+
+
+
+Limitations:
+~~~~~~~~~~~~
+The driverfs functions assume that at most a page is being either read
+or written each time.
+
+
+Possible bugs:
+~~~~~~~~~~~~~~
+It may not deal with offsets and/or seeks very well, especially if
+they cross a page boundary.
+
+There may be locking issues when dynamically adding/removing files and
+directories rapidly (like if you have a hot plug device).
+
+There are some people that believe that filesystems which add
+files/directories dynamically based on the presence of devices are
+inherently flawed. Though not as technically versed in this area as
+some of those people, I like to believe that they can be made to work,
+with the right guidance.
+
index 02dbfd5dd4477403b0bb3d21c9226c0ac206d8b9..a62e69d0f39c975d7db7f60014d1b74f76ced0f3 100644 (file)
--- a/Makefile
+++ b/Makefile
@@ -1,7 +1,7 @@
 VERSION = 2
 PATCHLEVEL = 5
 SUBLEVEL = 1
-EXTRAVERSION =-pre10
+EXTRAVERSION =-pre11
 
 KERNELRELEASE=$(VERSION).$(PATCHLEVEL).$(SUBLEVEL)$(EXTRAVERSION)
 
index 701a07fe72292d734c800e3c71bec8d316c332b1..3f74de6a05fa707323668748f35d094c7a07621b 100644 (file)
@@ -9,11 +9,3 @@ void * __io_virt_debug(unsigned long x, const char *file, int line)
        return (void *)x;
 }
 
-unsigned long __io_phys_debug(unsigned long x, const char *file, int line)
-{
-       if (x < PAGE_OFFSET) {
-               printk("io mapaddr 0x%05lx not valid at %s:%d!\n", x, file, line);
-               return x;
-       }
-       return __pa(x);
-}
index 371755761a53d64fa6675ebed03092c5e5be1a70..74aca53678130f736e5d65ab4fa114bdff59311d 100644 (file)
@@ -1237,7 +1237,7 @@ queue:
 
        blkdev_dequeue_request(creq);
 
-       spin_unlock_irq(&q->queue_lock);
+       spin_unlock_irq(q->queue_lock);
 
        c->cmd_type = CMD_RWREQ;
        c->rq = creq;
@@ -1298,7 +1298,7 @@ queue:
        c->Request.CDB[8]= creq->nr_sectors & 0xff; 
        c->Request.CDB[9] = c->Request.CDB[11] = c->Request.CDB[12] = 0;
 
-       spin_lock_irq(&q->queue_lock);
+       spin_lock_irq(q->queue_lock);
 
        addQ(&(h->reqQ),c);
        h->Qdepth++;
@@ -1866,7 +1866,7 @@ static int __init cciss_init_one(struct pci_dev *pdev,
 
        q = BLK_DEFAULT_QUEUE(MAJOR_NR + i);
         q->queuedata = hba[i];
-        blk_init_queue(q, do_cciss_request);
+        blk_init_queue(q, do_cciss_request, &hba[i]->lock);
        blk_queue_bounce_limit(q, hba[i]->pdev->dma_mask);
        blk_queue_max_segments(q, MAXSGENTRIES);
        blk_queue_max_sectors(q, 512);
index 357088d21918ea045ea0c8cf9106c4b9f7f80905..03afe43dacf9924ec6ff614d8675b26523386f7a 100644 (file)
@@ -66,6 +66,7 @@ struct ctlr_info
        unsigned int Qdepth;
        unsigned int maxQsinceinit;
        unsigned int maxSG;
+       spinlock_t lock;
 
        //* pointers to command and error info pool */ 
        CommandList_struct      *cmd_pool;
@@ -242,7 +243,7 @@ struct board_type {
        struct access_method *access;
 };
 
-#define CCISS_LOCK(i)  (&((BLK_DEFAULT_QUEUE(MAJOR_NR + i))->queue_lock))
+#define CCISS_LOCK(i)  ((BLK_DEFAULT_QUEUE(MAJOR_NR + i))->queue_lock)
 
 #endif /* CCISS_H */
 
index 4ff77277d51905c38b9ef69d64e85ecf821d5fb8..5f85cb0b5b6b9c8db8b0022e74b437170bfe8a89 100644 (file)
@@ -467,7 +467,7 @@ int __init cpqarray_init(void)
 
                q = BLK_DEFAULT_QUEUE(MAJOR_NR + i);
                q->queuedata = hba[i];
-               blk_init_queue(q, do_ida_request);
+               blk_init_queue(q, do_ida_request, &hba[i]->lock);
                blk_queue_bounce_limit(q, hba[i]->pci_dev->dma_mask);
                blk_queue_max_segments(q, SG_MAX);
                blksize_size[MAJOR_NR+i] = ida_blocksizes + (i*256);
@@ -882,7 +882,7 @@ queue_next:
 
        blkdev_dequeue_request(creq);
 
-       spin_unlock_irq(&q->queue_lock);
+       spin_unlock_irq(q->queue_lock);
 
        c->ctlr = h->ctlr;
        c->hdr.unit = MINOR(creq->rq_dev) >> NWD_SHIFT;
@@ -915,7 +915,7 @@ DBGPX(      printk("Submitting %d sectors in %d segments\n", creq->nr_sectors, seg);
        c->req.hdr.cmd = (rq_data_dir(creq) == READ) ? IDA_READ : IDA_WRITE;
        c->type = CMD_RWREQ;
 
-       spin_lock_irq(&q->queue_lock);
+       spin_lock_irq(q->queue_lock);
 
        /* Put the request on the tail of the request queue */
        addQ(&h->reqQ, c);
index bdb8e4108f9cfe836b051faa1340e68e35864e42..80b4dba8b83e4a98f68527bddbc95cc6d10c6ed6 100644 (file)
@@ -106,6 +106,7 @@ struct ctlr_info {
        cmdlist_t *cmd_pool;
        dma_addr_t cmd_pool_dhandle;
        __u32   *cmd_pool_bits;
+       spinlock_t lock;
 
        unsigned int Qdepth;
        unsigned int maxQsinceinit;
@@ -117,7 +118,7 @@ struct ctlr_info {
        unsigned int misc_tflags;
 };
 
-#define IDA_LOCK(i)    (&((BLK_DEFAULT_QUEUE(MAJOR_NR + i))->queue_lock))
+#define IDA_LOCK(i)    ((BLK_DEFAULT_QUEUE(MAJOR_NR + i))->queue_lock)
 
 #endif
 
index 897f3c886b4505d2f5065af70a36bf41f6d6a701..2417023debafe3694a56655c42ed7c0f7d1c826c 100644 (file)
@@ -204,6 +204,8 @@ static int use_virtual_dma;
  * record each buffers capabilities
  */
 
+static spinlock_t floppy_lock;
+
 static unsigned short virtual_dma_port=0x3f0;
 void floppy_interrupt(int irq, void *dev_id, struct pt_regs * regs);
 static int set_dor(int fdc, char mask, char data);
@@ -2296,7 +2298,7 @@ static void request_done(int uptodate)
                        DRS->maxtrack = 1;
 
                /* unlock chained buffers */
-               spin_lock_irqsave(&QUEUE->queue_lock, flags);
+               spin_lock_irqsave(QUEUE->queue_lock, flags);
                while (current_count_sectors && !QUEUE_EMPTY &&
                       current_count_sectors >= CURRENT->current_nr_sectors){
                        current_count_sectors -= CURRENT->current_nr_sectors;
@@ -2304,7 +2306,7 @@ static void request_done(int uptodate)
                        CURRENT->sector += CURRENT->current_nr_sectors;
                        end_request(1);
                }
-               spin_unlock_irqrestore(&QUEUE->queue_lock, flags);
+               spin_unlock_irqrestore(QUEUE->queue_lock, flags);
 
                if (current_count_sectors && !QUEUE_EMPTY){
                        /* "unlock" last subsector */
@@ -2329,9 +2331,9 @@ static void request_done(int uptodate)
                        DRWE->last_error_sector = CURRENT->sector;
                        DRWE->last_error_generation = DRS->generation;
                }
-               spin_lock_irqsave(&QUEUE->queue_lock, flags);
+               spin_lock_irqsave(QUEUE->queue_lock, flags);
                end_request(0);
-               spin_unlock_irqrestore(&QUEUE->queue_lock, flags);
+               spin_unlock_irqrestore(QUEUE->queue_lock, flags);
        }
 }
 
@@ -2433,17 +2435,20 @@ static void rw_interrupt(void)
 static int buffer_chain_size(void)
 {
        struct bio *bio;
-       int size;
+       struct bio_vec *bv;
+       int size, i;
        char *base;
 
-       base = CURRENT->buffer;
+       base = bio_data(CURRENT->bio);
        size = 0;
 
        rq_for_each_bio(bio, CURRENT) {
-               if (bio_data(bio) != base + size)
-                       break;
+               bio_for_each_segment(bv, bio, i) {
+                       if (page_address(bv->bv_page) + bv->bv_offset != base + size)
+                               break;
 
-               size += bio->bi_size;
+                       size += bv->bv_len;
+               }
        }
 
        return size >> 9;
@@ -2469,9 +2474,10 @@ static int transfer_size(int ssize, int max_sector, int max_size)
 static void copy_buffer(int ssize, int max_sector, int max_sector_2)
 {
        int remaining; /* number of transferred 512-byte sectors */
+       struct bio_vec *bv;
        struct bio *bio;
        char *buffer, *dma_buffer;
-       int size;
+       int size, i;
 
        max_sector = transfer_size(ssize,
                                   minimum(max_sector, max_sector_2),
@@ -2501,12 +2507,17 @@ static void copy_buffer(int ssize, int max_sector, int max_sector_2)
 
        dma_buffer = floppy_track_buffer + ((fsector_t - buffer_min) << 9);
 
-       bio = CURRENT->bio;
        size = CURRENT->current_nr_sectors << 9;
-       buffer = CURRENT->buffer;
 
-       while (remaining > 0){
-               SUPBOUND(size, remaining);
+       rq_for_each_bio(bio, CURRENT) {
+               bio_for_each_segment(bv, bio, i) {
+                       if (!remaining)
+                               break;
+
+                       size = bv->bv_len;
+                       SUPBOUND(size, remaining);
+
+                       buffer = page_address(bv->bv_page) + bv->bv_offset;
 #ifdef FLOPPY_SANITY_CHECK
                if (dma_buffer + size >
                    floppy_track_buffer + (max_buffer_sectors << 10) ||
@@ -2526,24 +2537,14 @@ static void copy_buffer(int ssize, int max_sector, int max_sector_2)
                if (((unsigned long)buffer) % 512)
                        DPRINT("%p buffer not aligned\n", buffer);
 #endif
-               if (CT(COMMAND) == FD_READ)
-                       memcpy(buffer, dma_buffer, size);
-               else
-                       memcpy(dma_buffer, buffer, size);
-               remaining -= size;
-               if (!remaining)
-                       break;
+                       if (CT(COMMAND) == FD_READ)
+                               memcpy(buffer, dma_buffer, size);
+                       else
+                               memcpy(dma_buffer, buffer, size);
 
-               dma_buffer += size;
-               bio = bio->bi_next;
-#ifdef FLOPPY_SANITY_CHECK
-               if (!bio){
-                       DPRINT("bh=null in copy buffer after copy\n");
-                       break;
+                       remaining -= size;
+                       dma_buffer += size;
                }
-#endif
-               size = bio->bi_size;
-               buffer = bio_data(bio);
        }
 #ifdef FLOPPY_SANITY_CHECK
        if (remaining){
@@ -4169,7 +4170,7 @@ int __init floppy_init(void)
 
        blk_size[MAJOR_NR] = floppy_sizes;
        blksize_size[MAJOR_NR] = floppy_blocksizes;
-       blk_init_queue(BLK_DEFAULT_QUEUE(MAJOR_NR), DEVICE_REQUEST);
+       blk_init_queue(BLK_DEFAULT_QUEUE(MAJOR_NR), DEVICE_REQUEST, &floppy_lock);
        reschedule_timeout(MAXTIMEOUT, "floppy init", MAXTIMEOUT);
        config_types();
 
@@ -4477,6 +4478,7 @@ MODULE_LICENSE("GPL");
 #else
 
 __setup ("floppy=", floppy_setup);
+module_init(floppy_init)
 
 /* eject the boot floppy (if we need the drive for a different root floppy) */
 /* This should only be called at boot time when we're sure that there's no
index 048dcbdef1ca1b722550befdb9d58e2b9817a802..9849061f045aac78b269f1e930c16a5d983c191f 100644 (file)
@@ -254,6 +254,12 @@ void blk_queue_segment_boundary(request_queue_t *q, unsigned long mask)
        q->seg_boundary_mask = mask;
 }
 
+void blk_queue_assign_lock(request_queue_t *q, spinlock_t *lock)
+{
+       spin_lock_init(lock);
+       q->queue_lock = lock;
+}
+
 static char *rq_flags[] = { "REQ_RW", "REQ_RW_AHEAD", "REQ_BARRIER",
                           "REQ_CMD", "REQ_NOMERGE", "REQ_STARTED",
                           "REQ_DONTPREP", "REQ_DRIVE_CMD", "REQ_DRIVE_TASK",
@@ -536,9 +542,9 @@ void generic_unplug_device(void *data)
        request_queue_t *q = (request_queue_t *) data;
        unsigned long flags;
 
-       spin_lock_irqsave(&q->queue_lock, flags);
+       spin_lock_irqsave(q->queue_lock, flags);
        __generic_unplug_device(q);
-       spin_unlock_irqrestore(&q->queue_lock, flags);
+       spin_unlock_irqrestore(q->queue_lock, flags);
 }
 
 static int __blk_cleanup_queue(struct request_list *list)
@@ -624,7 +630,6 @@ static int blk_init_free_list(request_queue_t *q)
 
        init_waitqueue_head(&q->rq[READ].wait);
        init_waitqueue_head(&q->rq[WRITE].wait);
-       spin_lock_init(&q->queue_lock);
        return 0;
 nomem:
        blk_cleanup_queue(q);
@@ -661,7 +666,7 @@ static int __make_request(request_queue_t *, struct bio *);
  *    blk_init_queue() must be paired with a blk_cleanup_queue() call
  *    when the block device is deactivated (such as at module unload).
  **/
-int blk_init_queue(request_queue_t *q, request_fn_proc *rfn)
+int blk_init_queue(request_queue_t *q, request_fn_proc *rfn, spinlock_t *lock)
 {
        int ret;
 
@@ -682,6 +687,7 @@ int blk_init_queue(request_queue_t *q, request_fn_proc *rfn)
        q->plug_tq.routine      = &generic_unplug_device;
        q->plug_tq.data         = q;
        q->queue_flags          = (1 << QUEUE_FLAG_CLUSTER);
+       q->queue_lock           = lock;
        
        /*
         * by default assume old behaviour and bounce for any highmem page
@@ -728,7 +734,7 @@ static struct request *get_request_wait(request_queue_t *q, int rw)
        struct request_list *rl = &q->rq[rw];
        struct request *rq;
 
-       spin_lock_prefetch(&q->queue_lock);
+       spin_lock_prefetch(q->queue_lock);
 
        generic_unplug_device(q);
        add_wait_queue(&rl->wait, &wait);
@@ -736,9 +742,9 @@ static struct request *get_request_wait(request_queue_t *q, int rw)
                set_current_state(TASK_UNINTERRUPTIBLE);
                if (rl->count < batch_requests)
                        schedule();
-               spin_lock_irq(&q->queue_lock);
+               spin_lock_irq(q->queue_lock);
                rq = get_request(q, rw);
-               spin_unlock_irq(&q->queue_lock);
+               spin_unlock_irq(q->queue_lock);
        } while (rq == NULL);
        remove_wait_queue(&rl->wait, &wait);
        current->state = TASK_RUNNING;
@@ -949,9 +955,9 @@ void blk_attempt_remerge(request_queue_t *q, struct request *rq)
 {
        unsigned long flags;
 
-       spin_lock_irqsave(&q->queue_lock, flags);
+       spin_lock_irqsave(q->queue_lock, flags);
        __blk_attempt_remerge(q, rq);
-       spin_unlock_irqrestore(&q->queue_lock, flags);
+       spin_unlock_irqrestore(q->queue_lock, flags);
 }
 
 static int __make_request(request_queue_t *q, struct bio *bio)
@@ -974,7 +980,7 @@ static int __make_request(request_queue_t *q, struct bio *bio)
         */
        blk_queue_bounce(q, &bio);
 
-       spin_lock_prefetch(&q->queue_lock);
+       spin_lock_prefetch(q->queue_lock);
 
        latency = elevator_request_latency(elevator, rw);
        barrier = test_bit(BIO_RW_BARRIER, &bio->bi_rw);
@@ -983,7 +989,7 @@ again:
        req = NULL;
        head = &q->queue_head;
 
-       spin_lock_irq(&q->queue_lock);
+       spin_lock_irq(q->queue_lock);
 
        insert_here = head->prev;
        if (blk_queue_empty(q) || barrier) {
@@ -1066,7 +1072,7 @@ get_rq:
                freereq = NULL;
        } else if ((req = get_request(q, rw)) == NULL) {
 
-               spin_unlock_irq(&q->queue_lock);
+               spin_unlock_irq(q->queue_lock);
 
                /*
                 * READA bit set
@@ -1111,7 +1117,7 @@ get_rq:
 out:
        if (freereq)
                blkdev_release_request(freereq);
-       spin_unlock_irq(&q->queue_lock);
+       spin_unlock_irq(q->queue_lock);
        return 0;
 
 end_io:
@@ -1608,3 +1614,4 @@ EXPORT_SYMBOL(blk_nohighio);
 EXPORT_SYMBOL(blk_dump_rq_flags);
 EXPORT_SYMBOL(submit_bio);
 EXPORT_SYMBOL(blk_contig_segment);
+EXPORT_SYMBOL(blk_queue_assign_lock);
index 22e5b4a60718fc992d18127fedaf5b4f094a4525..c16b6163af895ba98957511ed66fea112d01ba5a 100644 (file)
@@ -62,6 +62,8 @@ static u64 nbd_bytesizes[MAX_NBD];
 static struct nbd_device nbd_dev[MAX_NBD];
 static devfs_handle_t devfs_handle;
 
+static spinlock_t nbd_lock;
+
 #define DEBUG( s )
 /* #define DEBUG( s ) printk( s ) 
  */
@@ -347,22 +349,22 @@ static void do_nbd_request(request_queue_t * q)
 #endif
                req->errors = 0;
                blkdev_dequeue_request(req);
-               spin_unlock_irq(&q->queue_lock);
+               spin_unlock_irq(q->queue_lock);
 
                down (&lo->queue_lock);
                list_add(&req->queuelist, &lo->queue_head);
                nbd_send_req(lo->sock, req);    /* Why does this block?         */
                up (&lo->queue_lock);
 
-               spin_lock_irq(&q->queue_lock);
+               spin_lock_irq(q->queue_lock);
                continue;
 
              error_out:
                req->errors++;
                blkdev_dequeue_request(req);
-               spin_unlock(&q->queue_lock);
+               spin_unlock(q->queue_lock);
                nbd_end_request(req);
-               spin_lock(&q->queue_lock);
+               spin_lock(q->queue_lock);
        }
        return;
 }
@@ -515,7 +517,7 @@ static int __init nbd_init(void)
 #endif
        blksize_size[MAJOR_NR] = nbd_blksizes;
        blk_size[MAJOR_NR] = nbd_sizes;
-       blk_init_queue(BLK_DEFAULT_QUEUE(MAJOR_NR), do_nbd_request);
+       blk_init_queue(BLK_DEFAULT_QUEUE(MAJOR_NR), do_nbd_request, &nbd_lock);
        for (i = 0; i < MAX_NBD; i++) {
                nbd_dev[i].refcnt = 0;
                nbd_dev[i].file = NULL;
index 61e50fec569ceca40ab4575e470f7d19f2557538..9604464e325f89facb9883c708f0af91a47f3ac4 100644 (file)
@@ -146,6 +146,8 @@ static int pcd_drive_count;
 
 #include <asm/uaccess.h>
 
+static spinlock_t pcd_lock;
+
 #ifndef MODULE
 
 #include "setup.h"
@@ -355,7 +357,7 @@ int pcd_init (void) /* preliminary initialisation */
                }
        }
 
-       blk_init_queue(BLK_DEFAULT_QUEUE(MAJOR_NR), DEVICE_REQUEST);
+       blk_init_queue(BLK_DEFAULT_QUEUE(MAJOR_NR), DEVICE_REQUEST, &pcd_lock);
        read_ahead[MAJOR_NR] = 8;       /* 8 sector (4kB) read ahead */
 
        for (i=0;i<PCD_UNITS;i++) pcd_blocksizes[i] = 1024;
@@ -821,11 +823,11 @@ static void pcd_start( void )
 
        if (pcd_command(unit,rd_cmd,2048,"read block")) {
                pcd_bufblk = -1; 
-               spin_lock_irqsave(&QUEUE->queue_lock,saved_flags);
+               spin_lock_irqsave(&pcd_lock,saved_flags);
                pcd_busy = 0;
                end_request(0);
                do_pcd_request(NULL);
-               spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+               spin_unlock_irqrestore(&pcd_lock,saved_flags);
                return;
        }
 
@@ -845,11 +847,11 @@ static void do_pcd_read( void )
        pcd_retries = 0;
        pcd_transfer();
        if (!pcd_count) {
-               spin_lock_irqsave(&QUEUE->queue_lock,saved_flags);
+               spin_lock_irqsave(&pcd_lock,saved_flags);
                end_request(1);
                pcd_busy = 0;
                do_pcd_request(NULL);
-               spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+               spin_unlock_irqrestore(&pcd_lock,saved_flags);
                return;
        }
 
@@ -868,19 +870,19 @@ static void do_pcd_read_drq( void )
                        pi_do_claimed(PI,pcd_start);
                         return;
                         }
-               spin_lock_irqsave(&QUEUE->queue_lock,saved_flags);
+               spin_lock_irqsave(&pcd_lock,saved_flags);
                pcd_busy = 0;
                pcd_bufblk = -1;
                end_request(0);
                do_pcd_request(NULL);
-               spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+               spin_unlock_irqrestore(&pcd_lock,saved_flags);
                return;
        }
 
        do_pcd_read();
-       spin_lock_irqsave(&QUEUE->queue_lock,saved_flags);
+       spin_lock_irqsave(&pcd_lock,saved_flags);
        do_pcd_request(NULL);
-       spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags); 
+       spin_unlock_irqrestore(&pcd_lock,saved_flags); 
 }
 
 /* the audio_ioctl stuff is adapted from sr_ioctl.c */
index d659bbe4408abd595482b7f473969fa8c0c7c40d..e49565417eda42eb43ea0078c8659f5f85de8c0e 100644 (file)
@@ -164,6 +164,8 @@ static int pf_drive_count;
 
 #include <asm/uaccess.h>
 
+static spinlock_t pf_spin_lock;
+
 #ifndef MODULE
 
 #include "setup.h"
@@ -358,7 +360,7 @@ int pf_init (void)      /* preliminary initialisation */
                 return -1;
         }
        q = BLK_DEFAULT_QUEUE(MAJOR_NR);
-       blk_init_queue(q, DEVICE_REQUEST);
+       blk_init_queue(q, DEVICE_REQUEST, &pf_spin_lock);
        blk_queue_max_segments(q, cluster);
         read_ahead[MAJOR_NR] = 8;       /* 8 sector (4kB) read ahead */
         
@@ -876,9 +878,9 @@ static void pf_next_buf( int unit )
 
 {      long    saved_flags;
 
-       spin_lock_irqsave(&QUEUE->queue_lock,saved_flags);
+       spin_lock_irqsave(&pf_spin_lock,saved_flags);
        end_request(1);
-       if (!pf_run) { spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+       if (!pf_run) { spin_unlock_irqrestore(&pf_spin_lock,saved_flags);
                       return; 
        }
        
@@ -894,7 +896,7 @@ static void pf_next_buf( int unit )
 
        pf_count = CURRENT->current_nr_sectors;
        pf_buf = CURRENT->buffer;
-       spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+       spin_unlock_irqrestore(&pf_spin_lock,saved_flags);
 }
 
 static void do_pf_read( void )
@@ -918,11 +920,11 @@ static void do_pf_read_start( void )
                         pi_do_claimed(PI,do_pf_read_start);
                        return;
                 }
-               spin_lock_irqsave(&QUEUE->queue_lock,saved_flags);
+               spin_lock_irqsave(&pf_spin_lock,saved_flags);
                 end_request(0);
                 pf_busy = 0;
                do_pf_request(NULL);
-               spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+               spin_unlock_irqrestore(&pf_spin_lock,saved_flags);
                 return;
         }
        pf_mask = STAT_DRQ;
@@ -944,11 +946,11 @@ static void do_pf_read_drq( void )
                         pi_do_claimed(PI,do_pf_read_start);
                         return;
                 }
-               spin_lock_irqsave(&QUEUE->queue_lock,saved_flags);
+               spin_lock_irqsave(&pf_spin_lock,saved_flags);
                 end_request(0);
                 pf_busy = 0;
                do_pf_request(NULL);
-               spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+               spin_unlock_irqrestore(&pf_spin_lock,saved_flags);
                 return;
             }
             pi_read_block(PI,pf_buf,512);
@@ -959,11 +961,11 @@ static void do_pf_read_drq( void )
            if (!pf_count) pf_next_buf(unit);
         }
         pi_disconnect(PI);
-       spin_lock_irqsave(&QUEUE->queue_lock,saved_flags); 
+       spin_lock_irqsave(&pf_spin_lock,saved_flags); 
         end_request(1);
         pf_busy = 0;
        do_pf_request(NULL);
-       spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+       spin_unlock_irqrestore(&pf_spin_lock,saved_flags);
 }
 
 static void do_pf_write( void )
@@ -985,11 +987,11 @@ static void do_pf_write_start( void )
                         pi_do_claimed(PI,do_pf_write_start);
                        return;
                 }
-               spin_lock_irqsave(&QUEUE->queue_lock,saved_flags);
+               spin_lock_irqsave(&pf_spin_lock,saved_flags);
                 end_request(0);
                 pf_busy = 0;
                do_pf_request(NULL);
-               spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+               spin_unlock_irqrestore(&pf_spin_lock,saved_flags);
                 return;
         }
 
@@ -1002,11 +1004,11 @@ static void do_pf_write_start( void )
                         pi_do_claimed(PI,do_pf_write_start);
                         return;
                 }
-               spin_lock_irqsave(&QUEUE->queue_lock,saved_flags);
+               spin_lock_irqsave(&pf_spin_lock,saved_flags);
                 end_request(0);
                 pf_busy = 0;
                do_pf_request(NULL);
-               spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+               spin_unlock_irqrestore(&pf_spin_lock,saved_flags);
                 return;
             }
             pi_write_block(PI,pf_buf,512);
@@ -1032,19 +1034,19 @@ static void do_pf_write_done( void )
                        pi_do_claimed(PI,do_pf_write_start);
                         return;
                 }
-               spin_lock_irqsave(&QUEUE->queue_lock,saved_flags);
+               spin_lock_irqsave(&pf_spin_lock,saved_flags);
                 end_request(0);
                 pf_busy = 0;
                do_pf_request(NULL);
-               spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+               spin_unlock_irqrestore(&pf_spin_lock,saved_flags);
                 return;
         }
         pi_disconnect(PI);
-       spin_lock_irqsave(&QUEUE->queue_lock,saved_flags);
+       spin_lock_irqsave(&pf_spin_lock,saved_flags);
         end_request(1);
         pf_busy = 0;
        do_pf_request(NULL);
-       spin_unlock_irqrestore(&QUEUE->queue_lock,saved_flags);
+       spin_unlock_irqrestore(&pf_spin_lock,saved_flags);
 }
 
 /* end of pf.c */
index 01c8805b83fc058387dc241572db302321502f8e..b248b437bf7481043a91c2d694625a6a9afd561f 100644 (file)
@@ -189,6 +189,8 @@ int __init ps2esdi_init(void)
        return 0;
 }                              /* ps2esdi_init */
 
+module_init(ps2esdi_init);
+
 #ifdef MODULE
 
 static int cyl[MAX_HD] = {-1,-1};
index b2135fc5b31954db6a37853188f8b7bc4208258e..c6fb55df16826358cb27418838f37cd3ba216744 100644 (file)
@@ -44,9 +44,6 @@
 
 #include <linux/config.h>
 #include <linux/sched.h>
-#include <linux/minix_fs.h>
-#include <linux/ext2_fs.h>
-#include <linux/romfs_fs.h>
 #include <linux/fs.h>
 #include <linux/kernel.h>
 #include <linux/hdreg.h>
@@ -79,19 +76,10 @@ extern void wait_for_keypress(void);
 /* The RAM disk size is now a parameter */
 #define NUM_RAMDISKS 16                /* This cannot be overridden (yet) */ 
 
-#ifndef MODULE
-/* We don't have to load RAM disks or gunzip them in a module. */
-#define RD_LOADER
-#define BUILD_CRAMDISK
-
-void rd_load(void);
-static int crd_load(struct file *fp, struct file *outfp);
-
 #ifdef CONFIG_BLK_DEV_INITRD
 static int initrd_users;
 static spinlock_t initrd_users_lock = SPIN_LOCK_UNLOCKED;
 #endif
-#endif
 
 /* Various static variables go here.  Most are used only in the RAM disk code.
  */
@@ -542,6 +530,8 @@ int __init rd_init (void)
 #ifdef CONFIG_BLK_DEV_INITRD
        /* We ought to separate initrd operations here */
        register_disk(NULL, MKDEV(MAJOR_NR,INITRD_MINOR), 1, &rd_bd_op, rd_size<<1);
+       devfs_register(devfs_handle, "initrd", DEVFS_FL_DEFAULT, MAJOR_NR,
+                       INITRD_MINOR, S_IFBLK | S_IRUSR, &rd_bd_op, NULL);
 #endif
 
        blksize_size[MAJOR_NR] = rd_blocksizes;         /* Avoid set_blocksize() check */
@@ -565,462 +555,3 @@ MODULE_PARM     (rd_blocksize, "i");
 MODULE_PARM_DESC(rd_blocksize, "Blocksize of each RAM disk in bytes.");
 
 MODULE_LICENSE("GPL");
-
-/* End of non-loading portions of the RAM disk driver */
-
-#ifdef RD_LOADER 
-/*
- * This routine tries to find a RAM disk image to load, and returns the
- * number of blocks to read for a non-compressed image, 0 if the image
- * is a compressed image, and -1 if an image with the right magic
- * numbers could not be found.
- *
- * We currently check for the following magic numbers:
- *     minix
- *     ext2
- *     romfs
- *     gzip
- */
-static int __init 
-identify_ramdisk_image(kdev_t device, struct file *fp, int start_block)
-{
-       const int size = 512;
-       struct minix_super_block *minixsb;
-       struct ext2_super_block *ext2sb;
-       struct romfs_super_block *romfsb;
-       int nblocks = -1;
-       unsigned char *buf;
-
-       buf = kmalloc(size, GFP_KERNEL);
-       if (buf == 0)
-               return -1;
-
-       minixsb = (struct minix_super_block *) buf;
-       ext2sb = (struct ext2_super_block *) buf;
-       romfsb = (struct romfs_super_block *) buf;
-       memset(buf, 0xe5, size);
-
-       /*
-        * Read block 0 to test for gzipped kernel
-        */
-       if (fp->f_op->llseek)
-               fp->f_op->llseek(fp, start_block * BLOCK_SIZE, 0);
-       fp->f_pos = start_block * BLOCK_SIZE;
-       
-       fp->f_op->read(fp, buf, size, &fp->f_pos);
-
-       /*
-        * If it matches the gzip magic numbers, return -1
-        */
-       if (buf[0] == 037 && ((buf[1] == 0213) || (buf[1] == 0236))) {
-               printk(KERN_NOTICE
-                      "RAMDISK: Compressed image found at block %d\n",
-                      start_block);
-               nblocks = 0;
-               goto done;
-       }
-
-       /* romfs is at block zero too */
-       if (romfsb->word0 == ROMSB_WORD0 &&
-           romfsb->word1 == ROMSB_WORD1) {
-               printk(KERN_NOTICE
-                      "RAMDISK: romfs filesystem found at block %d\n",
-                      start_block);
-               nblocks = (ntohl(romfsb->size)+BLOCK_SIZE-1)>>BLOCK_SIZE_BITS;
-               goto done;
-       }
-
-       /*
-        * Read block 1 to test for minix and ext2 superblock
-        */
-       if (fp->f_op->llseek)
-               fp->f_op->llseek(fp, (start_block+1) * BLOCK_SIZE, 0);
-       fp->f_pos = (start_block+1) * BLOCK_SIZE;
-
-       fp->f_op->read(fp, buf, size, &fp->f_pos);
-               
-       /* Try minix */
-       if (minixsb->s_magic == MINIX_SUPER_MAGIC ||
-           minixsb->s_magic == MINIX_SUPER_MAGIC2) {
-               printk(KERN_NOTICE
-                      "RAMDISK: Minix filesystem found at block %d\n",
-                      start_block);
-               nblocks = minixsb->s_nzones << minixsb->s_log_zone_size;
-               goto done;
-       }
-
-       /* Try ext2 */
-       if (ext2sb->s_magic == cpu_to_le16(EXT2_SUPER_MAGIC)) {
-               printk(KERN_NOTICE
-                      "RAMDISK: ext2 filesystem found at block %d\n",
-                      start_block);
-               nblocks = le32_to_cpu(ext2sb->s_blocks_count);
-               goto done;
-       }
-
-       printk(KERN_NOTICE
-              "RAMDISK: Couldn't find valid RAM disk image starting at %d.\n",
-              start_block);
-       
-done:
-       if (fp->f_op->llseek)
-               fp->f_op->llseek(fp, start_block * BLOCK_SIZE, 0);
-       fp->f_pos = start_block * BLOCK_SIZE;   
-
-       kfree(buf);
-       return nblocks;
-}
-
-/*
- * This routine loads in the RAM disk image.
- */
-static void __init rd_load_image(kdev_t device, int offset, int unit)
-{
-       struct inode *inode, *out_inode;
-       struct file infile, outfile;
-       struct dentry in_dentry, out_dentry;
-       mm_segment_t fs;
-       kdev_t ram_device;
-       int nblocks, i;
-       char *buf;
-       unsigned short rotate = 0;
-       unsigned short devblocks = 0;
-#if !defined(CONFIG_ARCH_S390) && !defined(CONFIG_PPC_ISERIES)
-       char rotator[4] = { '|' , '/' , '-' , '\\' };
-#endif
-       ram_device = MKDEV(MAJOR_NR, unit);
-
-       if ((inode = get_empty_inode()) == NULL)
-               return;
-       memset(&infile, 0, sizeof(infile));
-       memset(&in_dentry, 0, sizeof(in_dentry));
-       infile.f_mode = 1; /* read only */
-       infile.f_dentry = &in_dentry;
-       in_dentry.d_inode = inode;
-       infile.f_op = &def_blk_fops;
-       init_special_inode(inode, S_IFBLK | S_IRUSR, kdev_t_to_nr(device));
-
-       if ((out_inode = get_empty_inode()) == NULL)
-               goto free_inode;
-       memset(&outfile, 0, sizeof(outfile));
-       memset(&out_dentry, 0, sizeof(out_dentry));
-       outfile.f_mode = 3; /* read/write */
-       outfile.f_dentry = &out_dentry;
-       out_dentry.d_inode = out_inode;
-       outfile.f_op = &def_blk_fops;
-       init_special_inode(out_inode, S_IFBLK | S_IRUSR | S_IWUSR, kdev_t_to_nr(ram_device));
-
-       if (blkdev_open(inode, &infile) != 0) {
-               iput(out_inode);
-               goto free_inode;
-       }
-       if (blkdev_open(out_inode, &outfile) != 0)
-               goto free_inodes;
-
-       fs = get_fs();
-       set_fs(KERNEL_DS);
-       
-       nblocks = identify_ramdisk_image(device, &infile, offset);
-       if (nblocks < 0)
-               goto done;
-
-       if (nblocks == 0) {
-#ifdef BUILD_CRAMDISK
-               if (crd_load(&infile, &outfile) == 0)
-                       goto successful_load;
-#else
-               printk(KERN_NOTICE
-                      "RAMDISK: Kernel does not support compressed "
-                      "RAM disk images\n");
-#endif
-               goto done;
-       }
-
-       /*
-        * NOTE NOTE: nblocks suppose that the blocksize is BLOCK_SIZE, so
-        * rd_load_image will work only with filesystem BLOCK_SIZE wide!
-        * So make sure to use 1k blocksize while generating ext2fs
-        * ramdisk-images.
-        */
-       if (nblocks > (rd_length[unit] >> BLOCK_SIZE_BITS)) {
-               printk("RAMDISK: image too big! (%d/%ld blocks)\n",
-                      nblocks, rd_length[unit] >> BLOCK_SIZE_BITS);
-               goto done;
-       }
-               
-       /*
-        * OK, time to copy in the data
-        */
-       buf = kmalloc(BLOCK_SIZE, GFP_KERNEL);
-       if (buf == 0) {
-               printk(KERN_ERR "RAMDISK: could not allocate buffer\n");
-               goto done;
-       }
-
-       if (blk_size[MAJOR(device)])
-               devblocks = blk_size[MAJOR(device)][MINOR(device)];
-
-#ifdef CONFIG_BLK_DEV_INITRD
-       if (MAJOR(device) == MAJOR_NR && MINOR(device) == INITRD_MINOR)
-               devblocks = nblocks;
-#endif
-
-       if (devblocks == 0) {
-               printk(KERN_ERR "RAMDISK: could not determine device size\n");
-               goto done;
-       }
-
-       printk(KERN_NOTICE "RAMDISK: Loading %d blocks [%d disk%s] into ram disk... ", 
-               nblocks, ((nblocks-1)/devblocks)+1, nblocks>devblocks ? "s" : "");
-       for (i=0; i < nblocks; i++) {
-               if (i && (i % devblocks == 0)) {
-                       printk("done disk #%d.\n", i/devblocks);
-                       rotate = 0;
-                       if (infile.f_op->release(inode, &infile) != 0) {
-                               printk("Error closing the disk.\n");
-                               goto noclose_input;
-                       }
-                       printk("Please insert disk #%d and press ENTER\n", i/devblocks+1);
-                       wait_for_keypress();
-                       if (blkdev_open(inode, &infile) != 0)  {
-                               printk("Error opening disk.\n");
-                               goto noclose_input;
-                       }
-                       infile.f_pos = 0;
-                       printk("Loading disk #%d... ", i/devblocks+1);
-               }
-               infile.f_op->read(&infile, buf, BLOCK_SIZE, &infile.f_pos);
-               outfile.f_op->write(&outfile, buf, BLOCK_SIZE, &outfile.f_pos);
-#if !defined(CONFIG_ARCH_S390) && !defined(CONFIG_PPC_ISERIES)
-               if (!(i % 16)) {
-                       printk("%c\b", rotator[rotate & 0x3]);
-                       rotate++;
-               }
-#endif
-       }
-       printk("done.\n");
-       kfree(buf);
-
-successful_load:
-       ROOT_DEV = MKDEV(MAJOR_NR, unit);
-       if (ROOT_DEVICE_NAME != NULL) strcpy (ROOT_DEVICE_NAME, "rd/0");
-
-done:
-       infile.f_op->release(inode, &infile);
-noclose_input:
-       blkdev_close(out_inode, &outfile);
-       iput(inode);
-       iput(out_inode);
-       set_fs(fs);
-       return;
-free_inodes: /* free inodes on error */ 
-       iput(out_inode);
-       infile.f_op->release(inode, &infile);
-free_inode:
-       iput(inode);
-}
-
-#ifdef CONFIG_MAC_FLOPPY
-int swim3_fd_eject(int devnum);
-#endif
-
-static void __init rd_load_disk(int n)
-{
-
-       if (rd_doload == 0)
-               return;
-
-       if (MAJOR(ROOT_DEV) != FLOPPY_MAJOR
-#ifdef CONFIG_BLK_DEV_INITRD
-               && MAJOR(real_root_dev) != FLOPPY_MAJOR
-#endif
-       )
-               return;
-
-       if (rd_prompt) {
-#ifdef CONFIG_BLK_DEV_FD
-               floppy_eject();
-#endif
-#ifdef CONFIG_MAC_FLOPPY
-               if(MAJOR(ROOT_DEV) == FLOPPY_MAJOR)
-                       swim3_fd_eject(MINOR(ROOT_DEV));
-               else if(MAJOR(real_root_dev) == FLOPPY_MAJOR)
-                       swim3_fd_eject(MINOR(real_root_dev));
-#endif
-               printk(KERN_NOTICE
-                      "VFS: Insert root floppy disk to be loaded into RAM disk and press ENTER\n");
-               wait_for_keypress();
-       }
-
-       rd_load_image(ROOT_DEV,rd_image_start, n);
-
-}
-
-void __init rd_load(void)
-{
-       rd_load_disk(0);
-}
-
-void __init rd_load_secondary(void)
-{
-       rd_load_disk(1);
-}
-
-#ifdef CONFIG_BLK_DEV_INITRD
-void __init initrd_load(void)
-{
-       rd_load_image(MKDEV(MAJOR_NR, INITRD_MINOR),rd_image_start,0);
-}
-#endif
-
-#endif /* RD_LOADER */
-
-#ifdef BUILD_CRAMDISK
-
-/*
- * gzip declarations
- */
-
-#define OF(args)  args
-
-#ifndef memzero
-#define memzero(s, n)     memset ((s), 0, (n))
-#endif
-
-typedef unsigned char  uch;
-typedef unsigned short ush;
-typedef unsigned long  ulg;
-
-#define INBUFSIZ 4096
-#define WSIZE 0x8000    /* window size--must be a power of two, and */
-                       /*  at least 32K for zip's deflate method */
-
-static uch *inbuf;
-static uch *window;
-
-static unsigned insize;  /* valid bytes in inbuf */
-static unsigned inptr;   /* index of next byte to be processed in inbuf */
-static unsigned outcnt;  /* bytes in output buffer */
-static int exit_code;
-static long bytes_out;
-static struct file *crd_infp, *crd_outfp;
-
-#define get_byte()  (inptr < insize ? inbuf[inptr++] : fill_inbuf())
-               
-/* Diagnostic functions (stubbed out) */
-#define Assert(cond,msg)
-#define Trace(x)
-#define Tracev(x)
-#define Tracevv(x)
-#define Tracec(c,x)
-#define Tracecv(c,x)
-
-#define STATIC static
-
-static int  fill_inbuf(void);
-static void flush_window(void);
-static void *malloc(int size);
-static void free(void *where);
-static void error(char *m);
-static void gzip_mark(void **);
-static void gzip_release(void **);
-
-#include "../../lib/inflate.c"
-
-static void __init *malloc(int size)
-{
-       return kmalloc(size, GFP_KERNEL);
-}
-
-static void __init free(void *where)
-{
-       kfree(where);
-}
-
-static void __init gzip_mark(void **ptr)
-{
-}
-
-static void __init gzip_release(void **ptr)
-{
-}
-
-
-/* ===========================================================================
- * Fill the input buffer. This is called only when the buffer is empty
- * and at least one byte is really needed.
- */
-static int __init fill_inbuf(void)
-{
-       if (exit_code) return -1;
-       
-       insize = crd_infp->f_op->read(crd_infp, inbuf, INBUFSIZ,
-                                     &crd_infp->f_pos);
-       if (insize == 0) return -1;
-
-       inptr = 1;
-
-       return inbuf[0];
-}
-
-/* ===========================================================================
- * Write the output window window[0..outcnt-1] and update crc and bytes_out.
- * (Used for the decompressed data only.)
- */
-static void __init flush_window(void)
-{
-    ulg c = crc;         /* temporary variable */
-    unsigned n;
-    uch *in, ch;
-    
-    crd_outfp->f_op->write(crd_outfp, window, outcnt, &crd_outfp->f_pos);
-    in = window;
-    for (n = 0; n < outcnt; n++) {
-           ch = *in++;
-           c = crc_32_tab[((int)c ^ ch) & 0xff] ^ (c >> 8);
-    }
-    crc = c;
-    bytes_out += (ulg)outcnt;
-    outcnt = 0;
-}
-
-static void __init error(char *x)
-{
-       printk(KERN_ERR "%s", x);
-       exit_code = 1;
-}
-
-static int __init 
-crd_load(struct file * fp, struct file *outfp)
-{
-       int result;
-
-       insize = 0;             /* valid bytes in inbuf */
-       inptr = 0;              /* index of next byte to be processed in inbuf */
-       outcnt = 0;             /* bytes in output buffer */
-       exit_code = 0;
-       bytes_out = 0;
-       crc = (ulg)0xffffffffL; /* shift register contents */
-
-       crd_infp = fp;
-       crd_outfp = outfp;
-       inbuf = kmalloc(INBUFSIZ, GFP_KERNEL);
-       if (inbuf == 0) {
-               printk(KERN_ERR "RAMDISK: Couldn't allocate gzip buffer\n");
-               return -1;
-       }
-       window = kmalloc(WSIZE, GFP_KERNEL);
-       if (window == 0) {
-               printk(KERN_ERR "RAMDISK: Couldn't allocate gzip window\n");
-               kfree(inbuf);
-               return -1;
-       }
-       makecrc();
-       result = gunzip();
-       kfree(inbuf);
-       kfree(window);
-       return result;
-}
-
-#endif  /* BUILD_CRAMDISK */
-
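The loader removed above identified an image by magic number before deciding how many blocks to copy. The gzip test is written in octal: 037 0213 is the deflate member header 0x1f 0x8b, and 037 0236 an older gzip variant. The same check in hex, as a self-contained sketch:

	/* gzip detection as in the removed identify_ramdisk_image():
	 * 0x1f 0x8b = gzip (deflate) member, 0x1f 0x9e = old gzip format. */
	static int is_gzip_magic(const unsigned char *buf)
	{
		return buf[0] == 0x1f && (buf[1] == 0x8b || buf[1] == 0x9e);
	}
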
diff --git a/drivers/ide/ide-probe.c b/drivers/ide/ide-probe.c
index 82123b2e05731e23fc30ce32e81e53439aa170ed..6201c2d1600d05dc826e0368c541f576b0be659a 100644 (file)
@@ -597,7 +597,7 @@ static void ide_init_queue(ide_drive_t *drive)
        int max_sectors;
 
        q->queuedata = HWGROUP(drive);
-       blk_init_queue(q, do_ide_request);
+       blk_init_queue(q, do_ide_request, &ide_lock);
        blk_queue_segment_boundary(q, 0xffff);
 
        /* IDE can do up to 128K per request, pdc4030 needs smaller limit */
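blk_init_queue() gains a third argument here: the spinlock the block layer will hold while calling the request function. Every IDE drive queue registers the one global ide_lock, which is what lets the ide.c hunks below drop their own lock juggling. The assumed shape of the new interface (a sketch, not quoted from the block layer headers):

	/* Assumed prototype under the new per-queue locking scheme: */
	void blk_init_queue(request_queue_t *q, request_fn_proc *rfn,
			    spinlock_t *lock);
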
diff --git a/drivers/ide/ide.c b/drivers/ide/ide.c
index 8a941308ab1447146e38a527f9957b5e079915cd..c1b19e1d925559cbe5b01f0dd0524955232587ed 100644 (file)
@@ -177,8 +177,6 @@ static int  initializing;     /* set while initializing built-in drivers */
 /*
  * protects global structures etc, we want to split this into per-hwgroup
  * instead.
- *
- * anti-deadlock ordering: ide_lock -> DRIVE_LOCK
  */
 spinlock_t ide_lock __cacheline_aligned = SPIN_LOCK_UNLOCKED;
 
@@ -583,11 +581,9 @@ inline int __ide_end_request(ide_hwgroup_t *hwgroup, int uptodate, int nr_secs)
 
        if (!end_that_request_first(rq, uptodate, nr_secs)) {
                add_blkdev_randomness(MAJOR(rq->rq_dev));
-               spin_lock(DRIVE_LOCK(drive));
                blkdev_dequeue_request(rq);
                hwgroup->rq = NULL;
                end_that_request_last(rq);
-               spin_unlock(DRIVE_LOCK(drive));
                ret = 0;
        }
 
@@ -900,11 +896,9 @@ void ide_end_drive_cmd (ide_drive_t *drive, byte stat, byte err)
                }
        }
 
-       spin_lock(DRIVE_LOCK(drive));
        blkdev_dequeue_request(rq);
        HWGROUP(drive)->rq = NULL;
        end_that_request_last(rq);
-       spin_unlock(DRIVE_LOCK(drive));
 
        spin_unlock_irqrestore(&ide_lock, flags);
 }
@@ -1368,7 +1362,7 @@ repeat:
 
 /*
  * Issue a new request to a drive from hwgroup
- * Caller must have already done spin_lock_irqsave(DRIVE_LOCK(drive), ...)
+ * Caller must have already done spin_lock_irqsave(&ide_lock, ...)
  *
  * A hwgroup is a serialized group of IDE interfaces.  Usually there is
  * exactly one hwif (interface) per hwgroup, but buggy controllers (eg. CMD640)
@@ -1456,9 +1450,7 @@ static void ide_do_request(ide_hwgroup_t *hwgroup, int masked_irq)
                /*
                 * just continuing an interrupted request maybe
                 */
-               spin_lock(DRIVE_LOCK(drive));
                rq = hwgroup->rq = elv_next_request(&drive->queue);
-               spin_unlock(DRIVE_LOCK(drive));
 
                /*
                 * Some systems have trouble with IDE IRQs arriving while
@@ -1496,19 +1488,7 @@ request_queue_t *ide_get_queue (kdev_t dev)
  */
 void do_ide_request(request_queue_t *q)
 {
-       unsigned long flags;
-
-       /*
-        * release queue lock, grab IDE global lock and restore when
-        * we leave...
-        */
-       spin_unlock(&q->queue_lock);
-
-       spin_lock_irqsave(&ide_lock, flags);
        ide_do_request(q->queuedata, 0);
-       spin_unlock_irqrestore(&ide_lock, flags);
-
-       spin_lock(&q->queue_lock);
 }
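With ide_lock registered as the queue lock, the unlock/relock dance deleted above moves into the block layer: the request function is entered with the driver's lock already held. Conceptually the caller behaves like this sketch (illustrative, not verbatim block-layer code):

	unsigned long flags;

	spin_lock_irqsave(&ide_lock, flags);
	do_ide_request(q);		/* runs with ide_lock held */
	spin_unlock_irqrestore(&ide_lock, flags);
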
 
 /*
@@ -1875,7 +1855,6 @@ int ide_do_drive_cmd (ide_drive_t *drive, struct request *rq, ide_action_t actio
        if (action == ide_wait)
                rq->waiting = &wait;
        spin_lock_irqsave(&ide_lock, flags);
-       spin_lock(DRIVE_LOCK(drive));
        if (blk_queue_empty(&drive->queue) || action == ide_preempt) {
                if (action == ide_preempt)
                        hwgroup->rq = NULL;
@@ -1886,7 +1865,6 @@ int ide_do_drive_cmd (ide_drive_t *drive, struct request *rq, ide_action_t actio
                        queue_head = queue_head->next;
        }
        q->elevator.elevator_add_req_fn(q, rq, queue_head);
-       spin_unlock(DRIVE_LOCK(drive));
        ide_do_request(hwgroup, 0);
        spin_unlock_irqrestore(&ide_lock, flags);
        if (action == ide_wait) {
diff --git a/drivers/md/linear.c b/drivers/md/linear.c
index c40dd3a1b58da32f6d2c8c5bb368967f3410a124..b65a67357e1d4d266e6f5e0ba33cc4cb31652295 100644 (file)
@@ -189,7 +189,7 @@ static mdk_personality_t linear_personality=
        status:         linear_status,
 };
 
-static int md__init linear_init (void)
+static int __init linear_init (void)
 {
        return register_md_personality (LINEAR, &linear_personality);
 }
diff --git a/drivers/md/md.c b/drivers/md/md.c
index d474faf734c3e7fe8197ba76208e39bace71ca56..ae96e8648a572318b1a553df6fb0ced13cce3cb4 100644 (file)
@@ -130,7 +130,7 @@ static struct gendisk md_gendisk=
 /*
  * Enables to iterate over all existing md arrays
  */
-static MD_LIST_HEAD(all_mddevs);
+static LIST_HEAD(all_mddevs);
 
 /*
  * The mapping between kdev and mddev is not necessary a simple
@@ -201,8 +201,8 @@ static mddev_t * alloc_mddev(kdev_t dev)
        init_MUTEX(&mddev->reconfig_sem);
        init_MUTEX(&mddev->recovery_sem);
        init_MUTEX(&mddev->resync_sem);
-       MD_INIT_LIST_HEAD(&mddev->disks);
-       MD_INIT_LIST_HEAD(&mddev->all_mddevs);
+       INIT_LIST_HEAD(&mddev->disks);
+       INIT_LIST_HEAD(&mddev->all_mddevs);
        atomic_set(&mddev->active, 0);
 
        /*
@@ -211,7 +211,7 @@ static mddev_t * alloc_mddev(kdev_t dev)
         * if necessary.
         */
        add_mddev_mapping(mddev, dev, 0);
-       md_list_add(&mddev->all_mddevs, &all_mddevs);
+       list_add(&mddev->all_mddevs, &all_mddevs);
 
        MOD_INC_USE_COUNT;
 
@@ -221,7 +221,7 @@ static mddev_t * alloc_mddev(kdev_t dev)
 mdk_rdev_t * find_rdev_nr(mddev_t *mddev, int nr)
 {
        mdk_rdev_t * rdev;
-       struct md_list_head *tmp;
+       struct list_head *tmp;
 
        ITERATE_RDEV(mddev,rdev,tmp) {
                if (rdev->desc_nr == nr)
@@ -232,7 +232,7 @@ mdk_rdev_t * find_rdev_nr(mddev_t *mddev, int nr)
 
 mdk_rdev_t * find_rdev(mddev_t * mddev, kdev_t dev)
 {
-       struct md_list_head *tmp;
+       struct list_head *tmp;
        mdk_rdev_t *rdev;
 
        ITERATE_RDEV(mddev,rdev,tmp) {
@@ -242,17 +242,17 @@ mdk_rdev_t * find_rdev(mddev_t * mddev, kdev_t dev)
        return NULL;
 }
 
-static MD_LIST_HEAD(device_names);
+static LIST_HEAD(device_names);
 
 char * partition_name(kdev_t dev)
 {
        struct gendisk *hd;
        static char nomem [] = "<nomem>";
        dev_name_t *dname;
-       struct md_list_head *tmp = device_names.next;
+       struct list_head *tmp = device_names.next;
 
        while (tmp != &device_names) {
-               dname = md_list_entry(tmp, dev_name_t, list);
+               dname = list_entry(tmp, dev_name_t, list);
                if (dname->dev == dev)
                        return dname->name;
                tmp = tmp->next;
@@ -275,8 +275,8 @@ char * partition_name(kdev_t dev)
        }
 
        dname->dev = dev;
-       MD_INIT_LIST_HEAD(&dname->list);
-       md_list_add(&dname->list, &device_names);
+       INIT_LIST_HEAD(&dname->list);
+       list_add(&dname->list, &device_names);
 
        return dname->name;
 }
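This hunk belongs to the file-wide retirement of the md_list_* compatibility wrappers in favor of the stock <linux/list.h> helpers. The open-coded walk kept above could equally use list_for_each(); a sketch of the equivalent loop:

	struct list_head *tmp;
	dev_name_t *dname;

	list_for_each(tmp, &device_names) {
		dname = list_entry(tmp, dev_name_t, list);
		if (dname->dev == dev)
			return dname->name;
	}
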
@@ -311,7 +311,7 @@ static unsigned int zoned_raid_size(mddev_t *mddev)
 {
        unsigned int mask;
        mdk_rdev_t * rdev;
-       struct md_list_head *tmp;
+       struct list_head *tmp;
 
        if (!mddev->sb) {
                MD_BUG();
@@ -341,7 +341,7 @@ int md_check_ordering(mddev_t *mddev)
 {
        int i, c;
        mdk_rdev_t *rdev;
-       struct md_list_head *tmp;
+       struct list_head *tmp;
 
        /*
         * First, all devices must be fully functional
@@ -435,7 +435,7 @@ static int alloc_array_sb(mddev_t * mddev)
        mddev->sb = (mdp_super_t *) __get_free_page (GFP_KERNEL);
        if (!mddev->sb)
                return -ENOMEM;
-       md_clear_page(mddev->sb);
+       clear_page(mddev->sb);
        return 0;
 }
 
@@ -449,7 +449,7 @@ static int alloc_disk_sb(mdk_rdev_t * rdev)
                printk(OUT_OF_MEM);
                return -EINVAL;
        }
-       md_clear_page(rdev->sb);
+       clear_page(rdev->sb);
 
        return 0;
 }
@@ -564,7 +564,7 @@ static kdev_t dev_unit(kdev_t dev)
 
 static mdk_rdev_t * match_dev_unit(mddev_t *mddev, kdev_t dev)
 {
-       struct md_list_head *tmp;
+       struct list_head *tmp;
        mdk_rdev_t *rdev;
 
        ITERATE_RDEV(mddev,rdev,tmp)
@@ -576,7 +576,7 @@ static mdk_rdev_t * match_dev_unit(mddev_t *mddev, kdev_t dev)
 
 static int match_mddev_units(mddev_t *mddev1, mddev_t *mddev2)
 {
-       struct md_list_head *tmp;
+       struct list_head *tmp;
        mdk_rdev_t *rdev;
 
        ITERATE_RDEV(mddev1,rdev,tmp)
@@ -586,8 +586,8 @@ static int match_mddev_units(mddev_t *mddev1, mddev_t *mddev2)
        return 0;
 }
 
-static MD_LIST_HEAD(all_raid_disks);
-static MD_LIST_HEAD(pending_raid_disks);
+static LIST_HEAD(all_raid_disks);
+static LIST_HEAD(pending_raid_disks);
 
 static void bind_rdev_to_array(mdk_rdev_t * rdev, mddev_t * mddev)
 {
@@ -605,7 +605,7 @@ static void bind_rdev_to_array(mdk_rdev_t * rdev, mddev_t * mddev)
                        mdidx(mddev), partition_name(rdev->dev),
                                partition_name(same_pdev->dev));
 
-       md_list_add(&rdev->same_set, &mddev->disks);
+       list_add(&rdev->same_set, &mddev->disks);
        rdev->mddev = mddev;
        mddev->nb_dev++;
        printk(KERN_INFO "md: bind<%s,%d>\n", partition_name(rdev->dev), mddev->nb_dev);
@@ -617,8 +617,8 @@ static void unbind_rdev_from_array(mdk_rdev_t * rdev)
                MD_BUG();
                return;
        }
-       md_list_del(&rdev->same_set);
-       MD_INIT_LIST_HEAD(&rdev->same_set);
+       list_del(&rdev->same_set);
+       INIT_LIST_HEAD(&rdev->same_set);
        rdev->mddev->nb_dev--;
        printk(KERN_INFO "md: unbind<%s,%d>\n", partition_name(rdev->dev),
                                                 rdev->mddev->nb_dev);
@@ -664,13 +664,13 @@ static void export_rdev(mdk_rdev_t * rdev)
                MD_BUG();
        unlock_rdev(rdev);
        free_disk_sb(rdev);
-       md_list_del(&rdev->all);
-       MD_INIT_LIST_HEAD(&rdev->all);
+       list_del(&rdev->all);
+       INIT_LIST_HEAD(&rdev->all);
        if (rdev->pending.next != &rdev->pending) {
                printk(KERN_INFO "md: (%s was pending)\n",
                        partition_name(rdev->dev));
-               md_list_del(&rdev->pending);
-               MD_INIT_LIST_HEAD(&rdev->pending);
+               list_del(&rdev->pending);
+               INIT_LIST_HEAD(&rdev->pending);
        }
 #ifndef MODULE
        md_autodetect_dev(rdev->dev);
@@ -688,7 +688,7 @@ static void kick_rdev_from_array(mdk_rdev_t * rdev)
 
 static void export_array(mddev_t *mddev)
 {
-       struct md_list_head *tmp;
+       struct list_head *tmp;
        mdk_rdev_t *rdev;
        mdp_super_t *sb = mddev->sb;
 
@@ -723,14 +723,14 @@ static void free_mddev(mddev_t *mddev)
         * Make sure nobody else is using this mddev
         * (careful, we rely on the global kernel lock here)
         */
-       while (md_atomic_read(&mddev->resync_sem.count) != 1)
+       while (atomic_read(&mddev->resync_sem.count) != 1)
                schedule();
-       while (md_atomic_read(&mddev->recovery_sem.count) != 1)
+       while (atomic_read(&mddev->recovery_sem.count) != 1)
                schedule();
 
        del_mddev_mapping(mddev, MKDEV(MD_MAJOR, mdidx(mddev)));
-       md_list_del(&mddev->all_mddevs);
-       MD_INIT_LIST_HEAD(&mddev->all_mddevs);
+       list_del(&mddev->all_mddevs);
+       INIT_LIST_HEAD(&mddev->all_mddevs);
        kfree(mddev);
        MOD_DEC_USE_COUNT;
 }
@@ -793,7 +793,7 @@ static void print_rdev(mdk_rdev_t *rdev)
 
 void md_print_devices(void)
 {
-       struct md_list_head *tmp, *tmp2;
+       struct list_head *tmp, *tmp2;
        mdk_rdev_t *rdev;
        mddev_t *mddev;
 
@@ -871,12 +871,12 @@ static int uuid_equal(mdk_rdev_t *rdev1, mdk_rdev_t *rdev2)
 
 static mdk_rdev_t * find_rdev_all(kdev_t dev)
 {
-       struct md_list_head *tmp;
+       struct list_head *tmp;
        mdk_rdev_t *rdev;
 
        tmp = all_raid_disks.next;
        while (tmp != &all_raid_disks) {
-               rdev = md_list_entry(tmp, mdk_rdev_t, all);
+               rdev = list_entry(tmp, mdk_rdev_t, all);
                if (rdev->dev == dev)
                        return rdev;
                tmp = tmp->next;
@@ -980,7 +980,7 @@ static int sync_sbs(mddev_t * mddev)
 {
        mdk_rdev_t *rdev;
        mdp_super_t *sb;
-       struct md_list_head *tmp;
+       struct list_head *tmp;
 
        ITERATE_RDEV(mddev,rdev,tmp) {
                if (rdev->faulty || rdev->alias_device)
@@ -996,15 +996,15 @@ static int sync_sbs(mddev_t * mddev)
 int md_update_sb(mddev_t * mddev)
 {
        int err, count = 100;
-       struct md_list_head *tmp;
+       struct list_head *tmp;
        mdk_rdev_t *rdev;
 
 repeat:
        mddev->sb->utime = CURRENT_TIME;
-       if ((++mddev->sb->events_lo)==0)
+       if (!(++mddev->sb->events_lo))
                ++mddev->sb->events_hi;
 
-       if ((mddev->sb->events_lo|mddev->sb->events_hi)==0) {
+       if (!(mddev->sb->events_lo | mddev->sb->events_hi)) {
                /*
                 * oops, this 64-bit counter should never wrap.
                 * Either we are in around ~1 trillion A.C., assuming
@@ -1128,8 +1128,8 @@ static int md_import_device(kdev_t newdev, int on_disk)
                        rdev->desc_nr = -1;
                }
        }
-       md_list_add(&rdev->all, &all_raid_disks);
-       MD_INIT_LIST_HEAD(&rdev->pending);
+       list_add(&rdev->all, &all_raid_disks);
+       INIT_LIST_HEAD(&rdev->pending);
 
        if (rdev->faulty && rdev->sb)
                free_disk_sb(rdev);
@@ -1167,7 +1167,7 @@ abort_free:
 static int analyze_sbs(mddev_t * mddev)
 {
        int out_of_date = 0, i, first;
-       struct md_list_head *tmp, *tmp2;
+       struct list_head *tmp, *tmp2;
        mdk_rdev_t *rdev, *rdev2, *freshest;
        mdp_super_t *sb;
 
@@ -1225,7 +1225,7 @@ static int analyze_sbs(mddev_t * mddev)
                 */
                if (calc_sb_csum(rdev->sb) != rdev->sb->sb_csum) {
                        if (rdev->sb->events_lo || rdev->sb->events_hi)
-                               if ((rdev->sb->events_lo--)==0)
+                               if (!(rdev->sb->events_lo--))
                                        rdev->sb->events_hi--;
                }
 
@@ -1513,7 +1513,7 @@ static int device_size_calculation(mddev_t * mddev)
        int data_disks = 0, persistent;
        unsigned int readahead;
        mdp_super_t *sb = mddev->sb;
-       struct md_list_head *tmp;
+       struct list_head *tmp;
        mdk_rdev_t *rdev;
 
        /*
@@ -1572,7 +1572,7 @@ static int device_size_calculation(mddev_t * mddev)
                md_size[mdidx(mddev)] = sb->size * data_disks;
 
        readahead = MD_READAHEAD;
-       if ((sb->level == 0) || (sb->level == 4) || (sb->level == 5)) {
+       if (!sb->level || (sb->level == 4) || (sb->level == 5)) {
                readahead = (mddev->sb->chunk_size>>PAGE_SHIFT) * 4 * data_disks;
                if (readahead < data_disks * (MAX_SECTORS>>(PAGE_SHIFT-9))*2)
                        readahead = data_disks * (MAX_SECTORS>>(PAGE_SHIFT-9))*2;
@@ -1608,7 +1608,7 @@ static int do_md_run(mddev_t * mddev)
 {
        int pnum, err;
        int chunk_size;
-       struct md_list_head *tmp;
+       struct list_head *tmp;
        mdk_rdev_t *rdev;
 
 
@@ -1873,7 +1873,7 @@ int detect_old_array(mdp_super_t *sb)
 static void autorun_array(mddev_t *mddev)
 {
        mdk_rdev_t *rdev;
-       struct md_list_head *tmp;
+       struct list_head *tmp;
        int err;
 
        if (mddev->disks.prev == &mddev->disks) {
@@ -1913,8 +1913,8 @@ static void autorun_array(mddev_t *mddev)
  */
 static void autorun_devices(kdev_t countdev)
 {
-       struct md_list_head candidates;
-       struct md_list_head *tmp;
+       struct list_head candidates;
+       struct list_head *tmp;
        mdk_rdev_t *rdev0, *rdev;
        mddev_t *mddev;
        kdev_t md_kdev;
@@ -1922,11 +1922,11 @@ static void autorun_devices(kdev_t countdev)
 
        printk(KERN_INFO "md: autorun ...\n");
        while (pending_raid_disks.next != &pending_raid_disks) {
-               rdev0 = md_list_entry(pending_raid_disks.next,
+               rdev0 = list_entry(pending_raid_disks.next,
                                         mdk_rdev_t, pending);
 
                printk(KERN_INFO "md: considering %s ...\n", partition_name(rdev0->dev));
-               MD_INIT_LIST_HEAD(&candidates);
+               INIT_LIST_HEAD(&candidates);
                ITERATE_RDEV_PENDING(rdev,tmp) {
                        if (uuid_equal(rdev0, rdev)) {
                                if (!sb_equal(rdev0->sb, rdev->sb)) {
@@ -1936,8 +1936,8 @@ static void autorun_devices(kdev_t countdev)
                                        continue;
                                }
                                printk(KERN_INFO "md:  adding %s ...\n", partition_name(rdev->dev));
-                               md_list_del(&rdev->pending);
-                               md_list_add(&rdev->pending, &candidates);
+                               list_del(&rdev->pending);
+                               list_add(&rdev->pending, &candidates);
                        }
                }
                /*
@@ -1964,8 +1964,8 @@ static void autorun_devices(kdev_t countdev)
                printk(KERN_INFO "md: created md%d\n", mdidx(mddev));
                ITERATE_RDEV_GENERIC(candidates,pending,rdev,tmp) {
                        bind_rdev_to_array(rdev, mddev);
-                       md_list_del(&rdev->pending);
-                       MD_INIT_LIST_HEAD(&rdev->pending);
+                       list_del(&rdev->pending);
+                       INIT_LIST_HEAD(&rdev->pending);
                }
                autorun_array(mddev);
        }
@@ -2025,7 +2025,7 @@ static int autostart_array(kdev_t startdev, kdev_t countdev)
                                                partition_name(startdev));
                goto abort;
        }
-       md_list_add(&start_rdev->pending, &pending_raid_disks);
+       list_add(&start_rdev->pending, &pending_raid_disks);
 
        sb = start_rdev->sb;
 
@@ -2058,7 +2058,7 @@ static int autostart_array(kdev_t startdev, kdev_t countdev)
                        MD_BUG();
                        goto abort;
                }
-               md_list_add(&rdev->pending, &pending_raid_disks);
+               list_add(&rdev->pending, &pending_raid_disks);
        }
 
        /*
@@ -2091,7 +2091,7 @@ static int get_version(void * arg)
        ver.minor = MD_MINOR_VERSION;
        ver.patchlevel = MD_PATCHLEVEL_VERSION;
 
-       if (md_copy_to_user(arg, &ver, sizeof(ver)))
+       if (copy_to_user(arg, &ver, sizeof(ver)))
                return -EFAULT;
 
        return 0;
@@ -2128,7 +2128,7 @@ static int get_array_info(mddev_t * mddev, void * arg)
        SET_FROM_SB(layout);
        SET_FROM_SB(chunk_size);
 
-       if (md_copy_to_user(arg, &info, sizeof(info)))
+       if (copy_to_user(arg, &info, sizeof(info)))
                return -EFAULT;
 
        return 0;
@@ -2144,7 +2144,7 @@ static int get_disk_info(mddev_t * mddev, void * arg)
        if (!mddev->sb)
                return -EINVAL;
 
-       if (md_copy_from_user(&info, arg, sizeof(info)))
+       if (copy_from_user(&info, arg, sizeof(info)))
                return -EFAULT;
 
        nr = info.number;
@@ -2156,7 +2156,7 @@ static int get_disk_info(mddev_t * mddev, void * arg)
        SET_FROM_SB(raid_disk);
        SET_FROM_SB(state);
 
-       if (md_copy_to_user(arg, &info, sizeof(info)))
+       if (copy_to_user(arg, &info, sizeof(info)))
                return -EFAULT;
 
        return 0;
@@ -2191,7 +2191,7 @@ static int add_new_disk(mddev_t * mddev, mdu_disk_info_t *info)
                        return -EINVAL;
                }
                if (mddev->nb_dev) {
-                       mdk_rdev_t *rdev0 = md_list_entry(mddev->disks.next,
+                       mdk_rdev_t *rdev0 = list_entry(mddev->disks.next,
                                                        mdk_rdev_t, same_set);
                        if (!uuid_equal(rdev0, rdev)) {
                                printk(KERN_WARNING "md: %s has different UUID to %s\n",
@@ -2223,7 +2223,7 @@ static int add_new_disk(mddev_t * mddev, mdu_disk_info_t *info)
        SET_SB(raid_disk);
        SET_SB(state);
 
-       if ((info->state & (1<<MD_DISK_FAULTY))==0) {
+       if (!(info->state & (1<<MD_DISK_FAULTY))) {
                err = md_import_device (dev, 0);
                if (err) {
                        printk(KERN_WARNING "md: error, md_import_device() returned %d\n", err);
@@ -2566,7 +2566,7 @@ static int md_ioctl(struct inode *inode, struct file *file,
        mddev_t *mddev = NULL;
        kdev_t dev;
 
-       if (!md_capable_admin())
+       if (!capable(CAP_SYS_ADMIN))
                return -EACCES;
 
        dev = inode->i_rdev;
@@ -2604,12 +2604,12 @@ static int md_ioctl(struct inode *inode, struct file *file,
                                MD_BUG();
                                goto abort;
                        }
-                       err = md_put_user(md_hd_struct[minor].nr_sects,
+                       err = put_user(md_hd_struct[minor].nr_sects,
                                                (unsigned long *) arg);
                        goto done;
 
                case BLKGETSIZE64:      /* Return device size */
-                       err = md_put_user((u64)md_hd_struct[minor].nr_sects << 9,
+                       err = put_user((u64)md_hd_struct[minor].nr_sects << 9,
                                                (u64 *) arg);
                        goto done;
 
@@ -2618,7 +2618,7 @@ static int md_ioctl(struct inode *inode, struct file *file,
                case BLKFLSBUF:
                case BLKBSZGET:
                case BLKBSZSET:
-                       err = blk_ioctl (dev, cmd, arg);
+                       err = blk_ioctl(dev, cmd, arg);
                        goto abort;
 
                default:;
@@ -2670,7 +2670,7 @@ static int md_ioctl(struct inode *inode, struct file *file,
                        }
                        if (arg) {
                                mdu_array_info_t info;
-                               if (md_copy_from_user(&info, (void*)arg, sizeof(info))) {
+                               if (copy_from_user(&info, (void*)arg, sizeof(info))) {
                                        err = -EFAULT;
                                        goto abort_unlock;
                                }
@@ -2753,17 +2753,17 @@ static int md_ioctl(struct inode *inode, struct file *file,
                                err = -EINVAL;
                                goto abort_unlock;
                        }
-                       err = md_put_user (2, (char *) &loc->heads);
+                       err = put_user (2, (char *) &loc->heads);
                        if (err)
                                goto abort_unlock;
-                       err = md_put_user (4, (char *) &loc->sectors);
+                       err = put_user (4, (char *) &loc->sectors);
                        if (err)
                                goto abort_unlock;
-                       err = md_put_user (md_hd_struct[mdidx(mddev)].nr_sects/8,
+                       err = put_user (md_hd_struct[mdidx(mddev)].nr_sects/8,
                                                (short *) &loc->cylinders);
                        if (err)
                                goto abort_unlock;
-                       err = md_put_user (get_start_sect(dev),
+                       err = put_user (get_start_sect(dev),
                                                (long *) &loc->start);
                        goto done_unlock;
        }
@@ -2787,7 +2787,7 @@ static int md_ioctl(struct inode *inode, struct file *file,
                case ADD_NEW_DISK:
                {
                        mdu_disk_info_t info;
-                       if (md_copy_from_user(&info, (void*)arg, sizeof(info)))
+                       if (copy_from_user(&info, (void*)arg, sizeof(info)))
                                err = -EFAULT;
                        else
                                err = add_new_disk(mddev, &info);
@@ -2828,7 +2828,7 @@ static int md_ioctl(struct inode *inode, struct file *file,
                {
 /* The data is never used....
                        mdu_param_t param;
-                       err = md_copy_from_user(&param, (mdu_param_t *)arg,
+                       err = copy_from_user(&param, (mdu_param_t *)arg,
                                                         sizeof(param));
                        if (err)
                                goto abort_unlock;
@@ -2887,7 +2887,7 @@ static int md_release(struct inode *inode, struct file * file)
        return 0;
 }
 
-static struct block_device_operations md_fops=
+static struct block_device_operations md_fops =
 {
        owner:          THIS_MODULE,
        open:           md_open,
@@ -2896,11 +2896,18 @@ static struct block_device_operations md_fops=
 };
 
 
+static inline void flush_curr_signals(void)
+{
+       spin_lock(&current->sigmask_lock);
+       flush_signals(current);
+       spin_unlock(&current->sigmask_lock);
+}
+
 int md_thread(void * arg)
 {
        mdk_thread_t *thread = arg;
 
-       md_lock_kernel();
+       lock_kernel();
 
        /*
         * Detach thread
@@ -2909,8 +2916,9 @@ int md_thread(void * arg)
        daemonize();
 
        sprintf(current->comm, thread->name);
-       md_init_signals();
-       md_flush_signals();
+       current->exit_signal = SIGCHLD;
+       siginitsetinv(&current->blocked, sigmask(SIGKILL));
+       flush_curr_signals();
        thread->tsk = current;
 
        /*
@@ -2926,7 +2934,7 @@ int md_thread(void * arg)
         */
        current->policy = SCHED_OTHER;
        current->nice = -20;
-       md_unlock_kernel();
+       unlock_kernel();
 
        complete(thread->event);
        while (thread->run) {
@@ -2949,8 +2957,8 @@ int md_thread(void * arg)
                        run(thread->data);
                        run_task_queue(&tq_disk);
                }
-               if (md_signal_pending(current))
-                       md_flush_signals();
+               if (signal_pending(current))
+                       flush_curr_signals();
        }
        complete(thread->event);
        return 0;
@@ -2976,7 +2984,7 @@ mdk_thread_t *md_register_thread(void (*run) (void *),
                return NULL;
 
        memset(thread, 0, sizeof(mdk_thread_t));
-       md_init_waitqueue_head(&thread->wqueue);
+       init_waitqueue_head(&thread->wqueue);
 
        init_completion(&event);
        thread->event = &event;
@@ -3064,7 +3072,7 @@ static int status_unused(char * page)
 {
        int sz = 0, i = 0;
        mdk_rdev_t *rdev;
-       struct md_list_head *tmp;
+       struct list_head *tmp;
 
        sz += sprintf(page + sz, "unused devices: ");
 
@@ -3150,7 +3158,7 @@ static int md_status_read_proc(char *page, char **start, off_t off,
                        int count, int *eof, void *data)
 {
        int sz = 0, j, size;
-       struct md_list_head *tmp, *tmp2;
+       struct list_head *tmp, *tmp2;
        mdk_rdev_t *rdev;
        mddev_t *mddev;
 
@@ -3207,7 +3215,7 @@ static int md_status_read_proc(char *page, char **start, off_t off,
                if (mddev->curr_resync) {
                        sz += status_resync (page+sz, mddev);
                } else {
-                       if (md_atomic_read(&mddev->resync_sem.count) != 1)
+                       if (atomic_read(&mddev->resync_sem.count) != 1)
                                sz += sprintf(page + sz, "      resync=DELAYED");
                }
                sz += sprintf(page + sz, "\n");
@@ -3251,7 +3259,7 @@ mdp_disk_t *get_spare(mddev_t *mddev)
        mdp_super_t *sb = mddev->sb;
        mdp_disk_t *disk;
        mdk_rdev_t *rdev;
-       struct md_list_head *tmp;
+       struct list_head *tmp;
 
        ITERATE_RDEV(mddev,rdev,tmp) {
                if (rdev->faulty)
@@ -3288,7 +3296,7 @@ void md_sync_acct(kdev_t dev, unsigned long nr_sectors)
 static int is_mddev_idle(mddev_t *mddev)
 {
        mdk_rdev_t * rdev;
-       struct md_list_head *tmp;
+       struct list_head *tmp;
        int idle;
        unsigned long curr_events;
 
@@ -3311,7 +3319,7 @@ static int is_mddev_idle(mddev_t *mddev)
        return idle;
 }
 
-MD_DECLARE_WAIT_QUEUE_HEAD(resync_wait);
+DECLARE_WAIT_QUEUE_HEAD(resync_wait);
 
 void md_done_sync(mddev_t *mddev, int blocks, int ok)
 {
@@ -3333,7 +3341,7 @@ int md_do_sync(mddev_t *mddev, mdp_disk_t *spare)
        unsigned long mark[SYNC_MARKS];
        unsigned long mark_cnt[SYNC_MARKS];
        int last_mark,m;
-       struct md_list_head *tmp;
+       struct list_head *tmp;
        unsigned long last_check;
 
 
@@ -3356,8 +3364,8 @@ recheck:
        }
        if (serialize) {
                interruptible_sleep_on(&resync_wait);
-               if (md_signal_pending(current)) {
-                       md_flush_signals();
+               if (signal_pending(current)) {
+                       flush_curr_signals();
                        err = -EINTR;
                        goto out;
                }
@@ -3365,8 +3373,7 @@ recheck:
        }
 
        mddev->curr_resync = 1;
-
-       max_sectors = mddev->sb->size<<1;
+       max_sectors = mddev->sb->size << 1;
 
        printk(KERN_INFO "md: syncing RAID array md%d\n", mdidx(mddev));
        printk(KERN_INFO "md: minimum _guaranteed_ reconstruction speed: %d KB/sec/disc.\n",
@@ -3403,7 +3410,6 @@ recheck:
                int sectors;
 
                sectors = mddev->pers->sync_request(mddev, j);
-
                if (sectors < 0) {
                        err = sectors;
                        goto out;
@@ -3432,13 +3438,13 @@ recheck:
                }
 
 
-               if (md_signal_pending(current)) {
+               if (signal_pending(current)) {
                        /*
                         * got a signal, exit.
                         */
                        mddev->curr_resync = 0;
                        printk(KERN_INFO "md: md_do_sync() got signal ... exiting\n");
-                       md_flush_signals();
+                       flush_curr_signals();
                        err = -EINTR;
                        goto out;
                }
@@ -3451,7 +3457,7 @@ recheck:
                 * about not overloading the IO subsystem. (things like an
                 * e2fsck being done on the RAID array should execute fast)
                 */
-               if (md_need_resched(current))
+               if (current->need_resched)
                        schedule();
 
                currspeed = (j-mddev->resync_mark_cnt)/2/((jiffies-mddev->resync_mark)/HZ +1) +1;
@@ -3462,7 +3468,7 @@ recheck:
                        if ((currspeed > sysctl_speed_limit_max) ||
                                        !is_mddev_idle(mddev)) {
                                current->state = TASK_INTERRUPTIBLE;
-                               md_schedule_timeout(HZ/4);
+                               schedule_timeout(HZ/4);
                                goto repeat;
                        }
                } else
@@ -3474,7 +3480,7 @@ recheck:
         * this also signals 'finished resyncing' to md_stop
         */
 out:
-       wait_event(mddev->recovery_wait, atomic_read(&mddev->recovery_active)==0);
+       wait_event(mddev->recovery_wait, !atomic_read(&mddev->recovery_active));
        up(&mddev->resync_sem);
 out_nolock:
        mddev->curr_resync = 0;
@@ -3497,7 +3503,7 @@ void md_do_recovery(void *data)
        mddev_t *mddev;
        mdp_super_t *sb;
        mdp_disk_t *spare;
-       struct md_list_head *tmp;
+       struct list_head *tmp;
 
        printk(KERN_INFO "md: recovery thread got woken up ...\n");
 restart:
@@ -3581,13 +3587,13 @@ restart:
 int md_notify_reboot(struct notifier_block *this,
                                        unsigned long code, void *x)
 {
-       struct md_list_head *tmp;
+       struct list_head *tmp;
        mddev_t *mddev;
 
-       if ((code == MD_SYS_DOWN) || (code == MD_SYS_HALT)
-                                       || (code == MD_SYS_POWER_OFF)) {
+       if ((code == SYS_DOWN) || (code == SYS_HALT) || (code == SYS_POWER_OFF)) {
 
                printk(KERN_INFO "md: stopping all md devices.\n");
+               return NOTIFY_DONE;
 
                ITERATE_MDDEV(mddev,tmp)
                        do_md_stop (mddev, 1);
@@ -3597,7 +3603,7 @@ int md_notify_reboot(struct notifier_block *this,
                 * right place to handle this issue is the given
                 * driver, we do want to have a safe RAID driver ...
                 */
-               md_mdelay(1000*1);
+               mdelay(1000*1);
        }
        return NOTIFY_DONE;
 }
@@ -3628,7 +3634,7 @@ static void md_geninit(void)
 #endif
 }
 
-int md__init md_init(void)
+int __init md_init(void)
 {
        static char * name = "mdrecoveryd";
        int minor;
@@ -3665,7 +3671,7 @@ int md__init md_init(void)
                printk(KERN_ALERT
                       "md: bug: couldn't allocate md_recovery_thread\n");
 
-       md_register_reboot_notifier(&md_notifier);
+       register_reboot_notifier(&md_notifier);
        raid_table_header = register_sysctl_table(raid_root_table, 1);
 
        md_geninit();
@@ -3687,7 +3693,7 @@ int md__init md_init(void)
 struct {
        int set;
        int noautodetect;
-} raid_setup_args md__initdata;
+} raid_setup_args __initdata;
 
 /*
  * Searches all registered partitions for autorun RAID arrays
@@ -3730,7 +3736,7 @@ static void autostart_arrays(void)
                        MD_BUG();
                        continue;
                }
-               md_list_add(&rdev->pending, &pending_raid_disks);
+               list_add(&rdev->pending, &pending_raid_disks);
        }
        dev_cnt = 0;
 
@@ -3742,7 +3748,7 @@ static struct {
        int pers[MAX_MD_DEVS];
        int chunk[MAX_MD_DEVS];
        char *device_names[MAX_MD_DEVS];
-} md_setup_args md__initdata;
+} md_setup_args __initdata;
 
 /*
  * Parse the command-line parameters given our kernel, but do not
@@ -3764,7 +3770,7 @@ static struct {
  *             Shifted name_to_kdev_t() and related operations to md_set_drive()
  *             for later execution. Rewrote section to make devfs compatible.
  */
-static int md__init md_setup(char *str)
+static int __init md_setup(char *str)
 {
        int minor, level, factor, fault;
        char *pername = "";
@@ -3783,7 +3789,7 @@ static int md__init md_setup(char *str)
        }
        switch (get_option(&str, &level)) {     /* RAID Personality */
        case 2: /* could be 0 or -1.. */
-               if (level == 0 || level == -1) {
+               if (!level || level == -1) {
                        if (get_option(&str, &factor) != 2 ||   /* Chunk Size */
                                        get_option(&str, &fault) != 2) {
                                printk(KERN_WARNING "md: Too few arguments supplied to md=.\n");
@@ -3825,8 +3831,8 @@ static int md__init md_setup(char *str)
        return 1;
 }
 
-extern kdev_t name_to_kdev_t(char *line) md__init;
-void md__init md_setup_drive(void)
+extern kdev_t name_to_kdev_t(char *line) __init;
+void __init md_setup_drive(void)
 {
        int minor, i;
        kdev_t dev;
@@ -3838,7 +3844,8 @@ void md__init md_setup_drive(void)
                char *devname;
                mdu_disk_info_t dinfo;
 
-               if ((devname = md_setup_args.device_names[minor]) == 0) continue;
+               if (!(devname = md_setup_args.device_names[minor]))
+                       continue;
 
                for (i = 0; i < MD_SB_DISKS && devname != 0; i++) {
 
@@ -3857,7 +3864,7 @@ void md__init md_setup_drive(void)
                                devfs_get_maj_min(handle, &major, &minor);
                                dev = MKDEV(major, minor);
                        }
-                       if (dev == 0) {
+                       if (!dev) {
                                printk(KERN_WARNING "md: Unknown device name: %s\n", devname);
                                break;
                        }
@@ -3869,7 +3876,7 @@ void md__init md_setup_drive(void)
                }
                devices[i] = 0;
 
-               if (md_setup_args.device_set[minor] == 0)
+               if (!md_setup_args.device_set[minor])
                        continue;
 
                if (mddev_map[minor].mddev) {
@@ -3933,7 +3940,7 @@ void md__init md_setup_drive(void)
        }
 }
 
-static int md__init raid_setup(char *str)
+static int __init raid_setup(char *str)
 {
        int len, pos;
 
@@ -3947,7 +3954,7 @@ static int md__init raid_setup(char *str)
                        wlen = (comma-str)-pos;
                else    wlen = (len-1)-pos;
 
-               if (strncmp(str, "noautodetect", wlen) == 0)
+               if (!strncmp(str, "noautodetect", wlen))
                        raid_setup_args.noautodetect = 1;
                pos += wlen+1;
        }
@@ -3955,7 +3962,7 @@ static int md__init raid_setup(char *str)
        return 1;
 }
 
-int md__init md_run_setup(void)
+int __init md_run_setup(void)
 {
        if (raid_setup_args.noautodetect)
                printk(KERN_INFO "md: Skipping autodetection of RAID arrays. (raid=noautodetect)\n");
@@ -4008,23 +4015,23 @@ void cleanup_module(void)
 }
 #endif
 
-MD_EXPORT_SYMBOL(md_size);
-MD_EXPORT_SYMBOL(register_md_personality);
-MD_EXPORT_SYMBOL(unregister_md_personality);
-MD_EXPORT_SYMBOL(partition_name);
-MD_EXPORT_SYMBOL(md_error);
-MD_EXPORT_SYMBOL(md_do_sync);
-MD_EXPORT_SYMBOL(md_sync_acct);
-MD_EXPORT_SYMBOL(md_done_sync);
-MD_EXPORT_SYMBOL(md_recover_arrays);
-MD_EXPORT_SYMBOL(md_register_thread);
-MD_EXPORT_SYMBOL(md_unregister_thread);
-MD_EXPORT_SYMBOL(md_update_sb);
-MD_EXPORT_SYMBOL(md_wakeup_thread);
-MD_EXPORT_SYMBOL(md_print_devices);
-MD_EXPORT_SYMBOL(find_rdev_nr);
-MD_EXPORT_SYMBOL(md_interrupt_thread);
-MD_EXPORT_SYMBOL(mddev_map);
-MD_EXPORT_SYMBOL(md_check_ordering);
-MD_EXPORT_SYMBOL(get_spare);
+EXPORT_SYMBOL(md_size);
+EXPORT_SYMBOL(register_md_personality);
+EXPORT_SYMBOL(unregister_md_personality);
+EXPORT_SYMBOL(partition_name);
+EXPORT_SYMBOL(md_error);
+EXPORT_SYMBOL(md_do_sync);
+EXPORT_SYMBOL(md_sync_acct);
+EXPORT_SYMBOL(md_done_sync);
+EXPORT_SYMBOL(md_recover_arrays);
+EXPORT_SYMBOL(md_register_thread);
+EXPORT_SYMBOL(md_unregister_thread);
+EXPORT_SYMBOL(md_update_sb);
+EXPORT_SYMBOL(md_wakeup_thread);
+EXPORT_SYMBOL(md_print_devices);
+EXPORT_SYMBOL(find_rdev_nr);
+EXPORT_SYMBOL(md_interrupt_thread);
+EXPORT_SYMBOL(mddev_map);
+EXPORT_SYMBOL(md_check_ordering);
+EXPORT_SYMBOL(get_spare);
 
diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index 8b21f612e0a509c8370703aa9c626cb1f7dabe63..7203d97a27fd3575711e7027aa1d4fa44c282bcd 100644 (file)
@@ -334,7 +334,7 @@ static mdk_personality_t raid0_personality=
        status:         raid0_status,
 };
 
-static int md__init raid0_init (void)
+static int __init raid0_init (void)
 {
        return register_md_personality (RAID0, &raid0_personality);
 }
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 6c8a5bf21112f13c313f99274ffdca44dae349a8..57829582b60cfad01afc81ec0676549f7ae25ab1 100644 (file)
@@ -1,7 +1,7 @@
 /*
  * raid1.c : Multiple Devices driver for Linux
  *
- * Copyright (C) 1999, 2000 Ingo Molnar, Red Hat
+ * Copyright (C) 1999, 2000, 2001 Ingo Molnar, Red Hat
  *
  * Copyright (C) 1996, 1997, 1998 Ingo Molnar, Miguel de Icaza, Gadi Oxman
  *
  * Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
  */
 
-#include <linux/module.h>
-#include <linux/slab.h>
 #include <linux/raid/raid1.h>
-#include <asm/atomic.h>
 
 #define MAJOR_NR MD_MAJOR
 #define MD_DRIVER
 #define MD_PERSONALITY
 
 #define MAX_WORK_PER_DISK 128
-
-#define        NR_RESERVED_BUFS        32
-
-
 /*
- * The following can be used to debug the driver
+ * Number of guaranteed r1bios in case of extreme VM load:
  */
-#define RAID1_DEBUG    0
-
-#if RAID1_DEBUG
-#define PRINTK(x...)   printk(x)
-#define inline
-#define __inline__
-#else
-#define PRINTK(x...)  do { } while (0)
-#endif
-
+#define        NR_RAID1_BIOS 256
 
 static mdk_personality_t raid1_personality;
-static md_spinlock_t retry_list_lock = MD_SPIN_LOCK_UNLOCKED;
-struct raid1_bh *raid1_retry_list = NULL, **raid1_retry_tail;
+static spinlock_t retry_list_lock = SPIN_LOCK_UNLOCKED;
+static LIST_HEAD(retry_list_head);
 
-static struct buffer_head *raid1_alloc_bh(raid1_conf_t *conf, int cnt)
+static inline void check_all_w_bios_empty(r1bio_t *r1_bio)
 {
-       /* return a linked list of "cnt" struct buffer_heads.
-        * don't take any off the free list unless we know we can
-        * get all we need, otherwise we could deadlock
-        */
-       struct buffer_head *bh=NULL;
-
-       while(cnt) {
-               struct buffer_head *t;
-               md_spin_lock_irq(&conf->device_lock);
-               if (!conf->freebh_blocked && conf->freebh_cnt >= cnt)
-                       while (cnt) {
-                               t = conf->freebh;
-                               conf->freebh = t->b_next;
-                               t->b_next = bh;
-                               bh = t;
-                               t->b_state = 0;
-                               conf->freebh_cnt--;
-                               cnt--;
-                       }
-               md_spin_unlock_irq(&conf->device_lock);
-               if (cnt == 0)
-                       break;
-               t = kmem_cache_alloc(bh_cachep, SLAB_NOIO);
-               if (t) {
-                       t->b_next = bh;
-                       bh = t;
-                       cnt--;
-               } else {
-                       PRINTK("raid1: waiting for %d bh\n", cnt);
-                       conf->freebh_blocked = 1;
-                       wait_disk_event(conf->wait_buffer,
-                                       !conf->freebh_blocked ||
-                                       conf->freebh_cnt > conf->raid_disks * NR_RESERVED_BUFS/2);
-                       conf->freebh_blocked = 0;
-               }
-       }
-       return bh;
-}
+       int i;
 
-static inline void raid1_free_bh(raid1_conf_t *conf, struct buffer_head *bh)
-{
-       unsigned long flags;
-       spin_lock_irqsave(&conf->device_lock, flags);
-       while (bh) {
-               struct buffer_head *t = bh;
-               bh=bh->b_next;
-               if (t->b_pprev == NULL)
-                       kmem_cache_free(bh_cachep, t);
-               else {
-                       t->b_next= conf->freebh;
-                       conf->freebh = t;
-                       conf->freebh_cnt++;
-               }
-       }
-       spin_unlock_irqrestore(&conf->device_lock, flags);
-       wake_up(&conf->wait_buffer);
+       return;
+       for (i = 0; i < MD_SB_DISKS; i++)
+               if (r1_bio->write_bios[i])
+                       BUG();
 }
 
-static int raid1_grow_bh(raid1_conf_t *conf, int cnt)
+static inline void check_all_bios_empty(r1bio_t *r1_bio)
 {
-       /* allocate cnt buffer_heads, possibly less if kmalloc fails */
-       int i = 0;
-
-       while (i < cnt) {
-               struct buffer_head *bh;
-               bh = kmem_cache_alloc(bh_cachep, SLAB_KERNEL);
-               if (!bh) break;
-
-               md_spin_lock_irq(&conf->device_lock);
-               bh->b_pprev = &conf->freebh;
-               bh->b_next = conf->freebh;
-               conf->freebh = bh;
-               conf->freebh_cnt++;
-               md_spin_unlock_irq(&conf->device_lock);
-
-               i++;
-       }
-       return i;
+       return;
+       if (r1_bio->read_bio)
+               BUG();
+       check_all_w_bios_empty(r1_bio);
 }
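Both helpers open with an unconditional return, so the BUG() assertions behind them are compiled-out debug scaffolding; deleting the early return re-arms them. For example, the re-armed write-bio check would read:

	static inline void check_all_w_bios_empty(r1bio_t *r1_bio)
	{
		int i;

		for (i = 0; i < MD_SB_DISKS; i++)
			if (r1_bio->write_bios[i])
				BUG();
	}
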
 
-static void raid1_shrink_bh(raid1_conf_t *conf)
+static void * r1bio_pool_alloc(int gfp_flags, void *data)
 {
-       /* discard all buffer_heads */
-
-       md_spin_lock_irq(&conf->device_lock);
-       while (conf->freebh) {
-               struct buffer_head *bh = conf->freebh;
-               conf->freebh = bh->b_next;
-               kmem_cache_free(bh_cachep, bh);
-               conf->freebh_cnt--;
-       }
-       md_spin_unlock_irq(&conf->device_lock);
-}
-               
+       r1bio_t *r1_bio;
 
-static struct raid1_bh *raid1_alloc_r1bh(raid1_conf_t *conf)
-{
-       struct raid1_bh *r1_bh = NULL;
+       r1_bio = kmalloc(sizeof(r1bio_t), gfp_flags);
+       if (r1_bio)
+               memset(r1_bio, 0, sizeof(*r1_bio));
 
-       do {
-               md_spin_lock_irq(&conf->device_lock);
-               if (!conf->freer1_blocked && conf->freer1) {
-                       r1_bh = conf->freer1;
-                       conf->freer1 = r1_bh->next_r1;
-                       conf->freer1_cnt--;
-                       r1_bh->next_r1 = NULL;
-                       r1_bh->state = (1 << R1BH_PreAlloc);
-                       r1_bh->bh_req.b_state = 0;
-               }
-               md_spin_unlock_irq(&conf->device_lock);
-               if (r1_bh)
-                       return r1_bh;
-               r1_bh = (struct raid1_bh *) kmalloc(sizeof(struct raid1_bh), GFP_NOIO);
-               if (r1_bh) {
-                       memset(r1_bh, 0, sizeof(*r1_bh));
-                       return r1_bh;
-               }
-               conf->freer1_blocked = 1;
-               wait_disk_event(conf->wait_buffer,
-                               !conf->freer1_blocked ||
-                               conf->freer1_cnt > NR_RESERVED_BUFS/2
-                       );
-               conf->freer1_blocked = 0;
-       } while (1);
+       return r1_bio;
 }
 
-static inline void raid1_free_r1bh(struct raid1_bh *r1_bh)
+static void r1bio_pool_free(void *r1_bio, void *data)
 {
-       struct buffer_head *bh = r1_bh->mirror_bh_list;
-       raid1_conf_t *conf = mddev_to_conf(r1_bh->mddev);
-
-       r1_bh->mirror_bh_list = NULL;
-
-       if (test_bit(R1BH_PreAlloc, &r1_bh->state)) {
-               unsigned long flags;
-               spin_lock_irqsave(&conf->device_lock, flags);
-               r1_bh->next_r1 = conf->freer1;
-               conf->freer1 = r1_bh;
-               conf->freer1_cnt++;
-               spin_unlock_irqrestore(&conf->device_lock, flags);
-               /* don't need to wakeup wait_buffer because
-                *  raid1_free_bh below will do that
-                */
-       } else {
-               kfree(r1_bh);
-       }
-       raid1_free_bh(conf, bh);
+       check_all_bios_empty(r1_bio);
+       kfree(r1_bio);
 }
 
-static int raid1_grow_r1bh (raid1_conf_t *conf, int cnt)
-{
-       int i = 0;
-
-       while (i < cnt) {
-               struct raid1_bh *r1_bh;
-               r1_bh = (struct raid1_bh*)kmalloc(sizeof(*r1_bh), GFP_KERNEL);
-               if (!r1_bh)
-                       break;
-               memset(r1_bh, 0, sizeof(*r1_bh));
-               set_bit(R1BH_PreAlloc, &r1_bh->state);
-               r1_bh->mddev = conf->mddev;
-
-               raid1_free_r1bh(r1_bh);
-               i++;
-       }
-       return i;
-}
+#define RESYNC_BLOCK_SIZE (64*1024)
+#define RESYNC_PAGES ((RESYNC_BLOCK_SIZE + PAGE_SIZE-1) / PAGE_SIZE)
+#define RESYNC_WINDOW (2048*1024)
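
With 4 KB pages these constants work out to RESYNC_PAGES = 16 pages per
resync bio, and init_resync() below sizes the r1buf_pool at
RESYNC_WINDOW / RESYNC_BLOCK_SIZE = 2048/64 = 32 such buffers: 2 MB in
total, twice the 2048-sector (1 MB) window that wait_sync_pending() slides
forward.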
 
-static void raid1_shrink_r1bh(raid1_conf_t *conf)
+static void * r1buf_pool_alloc(int gfp_flags, void *data)
 {
-       md_spin_lock_irq(&conf->device_lock);
-       while (conf->freer1) {
-               struct raid1_bh *r1_bh = conf->freer1;
-               conf->freer1 = r1_bh->next_r1;
-               conf->freer1_cnt--;
-               kfree(r1_bh);
+       conf_t *conf = data;
+       struct page *page;
+       r1bio_t *r1_bio;
+       struct bio *bio;
+       int i, j;
+
+       r1_bio = mempool_alloc(conf->r1bio_pool, gfp_flags);
+       check_all_bios_empty(r1_bio);
+
+       bio = bio_alloc(gfp_flags, RESYNC_PAGES);
+       if (!bio)
+               goto out_free_r1_bio;
+
+       for (i = 0; i < RESYNC_PAGES; i++) {
+               page = alloc_page(gfp_flags);
+               if (unlikely(!page))
+                       goto out_free_pages;
+
+               bio->bi_io_vec[i].bv_page = page;
+               bio->bi_io_vec[i].bv_len = PAGE_SIZE;
+               bio->bi_io_vec[i].bv_offset = 0;
        }
-       md_spin_unlock_irq(&conf->device_lock);
-}
-
 
-
-static inline void raid1_free_buf(struct raid1_bh *r1_bh)
-{
-       unsigned long flags;
-       struct buffer_head *bh = r1_bh->mirror_bh_list;
-       raid1_conf_t *conf = mddev_to_conf(r1_bh->mddev);
-       r1_bh->mirror_bh_list = NULL;
-       
-       spin_lock_irqsave(&conf->device_lock, flags);
-       r1_bh->next_r1 = conf->freebuf;
-       conf->freebuf = r1_bh;
-       spin_unlock_irqrestore(&conf->device_lock, flags);
-       raid1_free_bh(conf, bh);
+       /*
+        * Set up the bio to span the RESYNC_PAGES pages just allocated:
+        */
+       bio->bi_vcnt = RESYNC_PAGES;
+       bio->bi_idx = 0;
+       bio->bi_size = RESYNC_BLOCK_SIZE;
+       bio->bi_end_io = NULL;
+       atomic_set(&bio->bi_cnt, 1);
+
+       r1_bio->master_bio = bio;
+
+       return r1_bio;
+
+out_free_pages:
+       for (j = 0; j < i; j++)
+               __free_page(bio->bi_io_vec[j].bv_page);
+       bio_put(bio);
+out_free_r1_bio:
+       mempool_free(r1_bio, conf->r1bio_pool);
+       return NULL;
 }
 
-static struct raid1_bh *raid1_alloc_buf(raid1_conf_t *conf)
+static void r1buf_pool_free(void *__r1_bio, void *data)
 {
-       struct raid1_bh *r1_bh;
-
-       md_spin_lock_irq(&conf->device_lock);
-       wait_event_lock_irq(conf->wait_buffer, conf->freebuf, conf->device_lock);
-       r1_bh = conf->freebuf;
-       conf->freebuf = r1_bh->next_r1;
-       r1_bh->next_r1= NULL;
-       md_spin_unlock_irq(&conf->device_lock);
+       int i;
+       conf_t *conf = data;
+       r1bio_t *r1bio = __r1_bio;
+       struct bio *bio = r1bio->master_bio;
 
-       return r1_bh;
+       check_all_bios_empty(r1bio);
+       if (atomic_read(&bio->bi_cnt) != 1)
+               BUG();
+       for (i = 0; i < RESYNC_PAGES; i++) {
+               __free_page(bio->bi_io_vec[i].bv_page);
+               bio->bi_io_vec[i].bv_page = NULL;
+       }
+       if (atomic_read(&bio->bi_cnt) != 1)
+               BUG();
+       bio_put(bio);
+       mempool_free(r1bio, conf->r1bio_pool);
 }
 
-static int raid1_grow_buffers (raid1_conf_t *conf, int cnt)
+static void put_all_bios(conf_t *conf, r1bio_t *r1_bio)
 {
-       int i = 0;
-
-       md_spin_lock_irq(&conf->device_lock);
-       while (i < cnt) {
-               struct raid1_bh *r1_bh;
-               struct page *page;
-
-               page = alloc_page(GFP_KERNEL);
-               if (!page)
-                       break;
+       int i;
 
-               r1_bh = (struct raid1_bh *) kmalloc(sizeof(*r1_bh), GFP_KERNEL);
-               if (!r1_bh) {
-                       __free_page(page);
-                       break;
+       if (r1_bio->read_bio) {
+               if (atomic_read(&r1_bio->read_bio->bi_cnt) != 1)
+                       BUG();
+               bio_put(r1_bio->read_bio);
+               r1_bio->read_bio = NULL;
+       }
+       for (i = 0; i < MD_SB_DISKS; i++) {
+               struct bio **bio = r1_bio->write_bios + i;
+               if (*bio) {
+                       if (atomic_read(&(*bio)->bi_cnt) != 1)
+                               BUG();
+                       bio_put(*bio);
                }
-               memset(r1_bh, 0, sizeof(*r1_bh));
-               r1_bh->bh_req.b_page = page;
-               r1_bh->bh_req.b_data = page_address(page);
-               r1_bh->next_r1 = conf->freebuf;
-               conf->freebuf = r1_bh;
-               i++;
+               *bio = NULL;
        }
-       md_spin_unlock_irq(&conf->device_lock);
-       return i;
+       check_all_bios_empty(r1_bio);
 }
 
-static void raid1_shrink_buffers (raid1_conf_t *conf)
+static inline void free_r1bio(r1bio_t *r1_bio)
 {
-       md_spin_lock_irq(&conf->device_lock);
-       while (conf->freebuf) {
-               struct raid1_bh *r1_bh = conf->freebuf;
-               conf->freebuf = r1_bh->next_r1;
-               __free_page(r1_bh->bh_req.b_page);
-               kfree(r1_bh);
-       }
-       md_spin_unlock_irq(&conf->device_lock);
+       conf_t *conf = mddev_to_conf(r1_bio->mddev);
+
+       put_all_bios(conf, r1_bio);
+       mempool_free(r1_bio, conf->r1bio_pool);
+}
+
+static inline void put_buf(r1bio_t *r1_bio)
+{
+       conf_t *conf = mddev_to_conf(r1_bio->mddev);
+       struct bio *bio = r1_bio->master_bio;
+
+       /*
+        * undo any partial request fixup magic: the sync code may have
+        * trimmed the final bvec, so restore its full page length:
+        */
+       if (bio->bi_size != RESYNC_BLOCK_SIZE)
+               bio->bi_io_vec[bio->bi_vcnt-1].bv_len = PAGE_SIZE;
+       put_all_bios(conf, r1_bio);
+       mempool_free(r1_bio, conf->r1buf_pool);
 }
 
-static int raid1_map (mddev_t *mddev, kdev_t *rdev)
+static int map(mddev_t *mddev, kdev_t *rdev)
 {
-       raid1_conf_t *conf = mddev_to_conf(mddev);
+       conf_t *conf = mddev_to_conf(mddev);
        int i, disks = MD_SB_DISKS;
 
        /*
-        * Later we do read balancing on the read side 
+        * Later we do read balancing on the read side
         * now we use the first available disk.
         */
 
        for (i = 0; i < disks; i++) {
                if (conf->mirrors[i].operational) {
                        *rdev = conf->mirrors[i].dev;
-                       return (0);
+                       return 0;
                }
        }
 
        printk (KERN_ERR "raid1_map(): huh, no more operational devices?\n");
-       return (-1);
+       return -1;
 }
 
-static void raid1_reschedule_retry (struct raid1_bh *r1_bh)
+static void reschedule_retry(r1bio_t *r1_bio)
 {
        unsigned long flags;
-       mddev_t *mddev = r1_bh->mddev;
-       raid1_conf_t *conf = mddev_to_conf(mddev);
-
-       md_spin_lock_irqsave(&retry_list_lock, flags);
-       if (raid1_retry_list == NULL)
-               raid1_retry_tail = &raid1_retry_list;
-       *raid1_retry_tail = r1_bh;
-       raid1_retry_tail = &r1_bh->next_r1;
-       r1_bh->next_r1 = NULL;
-       md_spin_unlock_irqrestore(&retry_list_lock, flags);
+       mddev_t *mddev = r1_bio->mddev;
+       conf_t *conf = mddev_to_conf(mddev);
+
+       spin_lock_irqsave(&retry_list_lock, flags);
+       list_add(&r1_bio->retry_list, &retry_list_head);
+       spin_unlock_irqrestore(&retry_list_lock, flags);
+
        md_wakeup_thread(conf->thread);
 }
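
The old hand-rolled tail pointer is gone: list_add() pushes new entries at
the head of retry_list_head while raid1d() below takes head->prev, the
oldest entry, so retries are still serviced in FIFO order. In sketch form
(using the names from this file):

	/* enqueue at the head of the global retry list ... */
	list_add(&r1_bio->retry_list, &retry_list_head);

	/* ... and dequeue from the tail, i.e. the oldest entry: */
	r1_bio = list_entry(retry_list_head.prev, r1bio_t, retry_list);
	list_del(retry_list_head.prev);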
 
 
-static void inline io_request_done(unsigned long sector, raid1_conf_t *conf, int phase)
+static inline void raid_request_done(unsigned long sector, conf_t *conf, int phase)
 {
        unsigned long flags;
        spin_lock_irqsave(&conf->segment_lock, flags);
@@ -359,9 +237,10 @@ static void inline io_request_done(unsigned long sector, raid1_conf_t *conf, int
        spin_unlock_irqrestore(&conf->segment_lock, flags);
 }
 
-static void inline sync_request_done (unsigned long sector, raid1_conf_t *conf)
+static inline void sync_request_done(sector_t sector, conf_t *conf)
 {
        unsigned long flags;
+
        spin_lock_irqsave(&conf->segment_lock, flags);
        if (sector >= conf->start_ready)
                --conf->cnt_ready;
@@ -375,73 +254,80 @@ static void inline sync_request_done (unsigned long sector, raid1_conf_t *conf)
 }
 
 /*
- * raid1_end_bh_io() is called when we have finished servicing a mirrored
+ * raid_end_bio_io() is called when we have finished servicing a mirrored
  * operation and are ready to return a success/failure code to the
  * higher layers.
  */
-static void raid1_end_bh_io (struct raid1_bh *r1_bh, int uptodate)
+static int raid_end_bio_io(r1bio_t *r1_bio, int uptodate, int nr_sectors)
 {
-       struct buffer_head *bh = r1_bh->master_bh;
+       struct bio *bio = r1_bio->master_bio;
 
-       io_request_done(bh->b_rsector, mddev_to_conf(r1_bh->mddev),
-                       test_bit(R1BH_SyncPhase, &r1_bh->state));
+       raid_request_done(bio->bi_sector, mddev_to_conf(r1_bio->mddev),
+                       test_bit(R1BIO_SyncPhase, &r1_bio->state));
 
-       bh->b_end_io(bh, uptodate);
-       raid1_free_r1bh(r1_bh);
+       bio_endio(bio, uptodate, nr_sectors);
+       free_r1bio(r1_bio);
+
+       return 0;
 }
-void raid1_end_request (struct buffer_head *bh, int uptodate)
+
+static int end_request(struct bio *bio, int nr_sectors)
 {
-       struct raid1_bh * r1_bh = (struct raid1_bh *)(bh->b_private);
+       int uptodate = test_bit(BIO_UPTODATE, &bio->bi_flags);
+       r1bio_t * r1_bio = (r1bio_t *)(bio->bi_private);
 
        /*
         * this branch is our 'one mirror IO has finished' event handler:
         */
        if (!uptodate)
-               md_error (r1_bh->mddev, bh->b_dev);
+               md_error(r1_bio->mddev, bio->bi_dev);
        else
                /*
-                * Set R1BH_Uptodate in our master buffer_head, so that
+                * Set R1BIO_Uptodate in our master bio, so that
                 * we will return a good error code to the higher
                 * levels even if IO on some other mirrored bio fails.
                 *
-                * The 'master' represents the complex operation to 
+                * The 'master' represents the complex operation to
                 * user-side. So if something waits for IO, then it will
-                * wait for the 'master' buffer_head.
+                * wait for the 'master' bio.
                 */
-               set_bit (R1BH_Uptodate, &r1_bh->state);
+               set_bit(R1BIO_Uptodate, &r1_bio->state);
 
        /*
-        * We split up the read and write side, imho they are 
+        * We split up the read and write side, imho they are
         * conceptually different.
         */
 
-       if ( (r1_bh->cmd == READ) || (r1_bh->cmd == READA) ) {
+       if ((r1_bio->cmd == READ) || (r1_bio->cmd == READA)) {
+               if (!r1_bio->read_bio)
+                       BUG();
                /*
-                * we have only one buffer_head on the read side
+                * we have only one bio on the read side
                 */
-               
                if (uptodate) {
-                       raid1_end_bh_io(r1_bh, uptodate);
-                       return;
+                       raid_end_bio_io(r1_bio, uptodate, nr_sectors);
+                       return 0;
                }
                /*
                 * oops, read error:
                 */
-               printk(KERN_ERR "raid1: %s: rescheduling block %lu\n", 
-                        partition_name(bh->b_dev), bh->b_blocknr);
-               raid1_reschedule_retry(r1_bh);
-               return;
+               printk(KERN_ERR "raid1: %s: rescheduling sector %lu\n",
+                       partition_name(bio->bi_dev), r1_bio->sector);
+               reschedule_retry(r1_bio);
+               return 0;
        }
 
+       if (r1_bio->read_bio)
+               BUG();
        /*
         * WRITE:
         *
-        * Let's see if all mirrored write operations have finished 
+        * Let's see if all mirrored write operations have finished
         * already.
         */
-
-       if (atomic_dec_and_test(&r1_bh->remaining))
-               raid1_end_bh_io(r1_bh, test_bit(R1BH_Uptodate, &r1_bh->state));
+       if (atomic_dec_and_test(&r1_bio->remaining))
+               raid_end_bio_io(r1_bio, uptodate, nr_sectors);
+       return 0;
 }
 
 /*
@@ -456,22 +342,20 @@ void raid1_end_request (struct buffer_head *bh, int uptodate)
  * reads should be somehow balanced.
  */
 
-static int raid1_read_balance (raid1_conf_t *conf, struct buffer_head *bh)
+static int read_balance(conf_t *conf, struct bio *bio, r1bio_t *r1_bio)
 {
-       int new_disk = conf->last_used;
-       const int sectors = bh->b_size >> 9;
-       const unsigned long this_sector = bh->b_rsector;
-       int disk = new_disk;
-       unsigned long new_distance;
-       unsigned long current_distance;
-       
+       const int sectors = bio->bi_size >> 9;
+       const unsigned long this_sector = r1_bio->sector;
+       unsigned long new_distance, current_distance;
+       int new_disk = conf->last_used, disk = new_disk;
+
        /*
         * Check if it is sane at all to balance
         */
-       
+
        if (conf->resync_mirrors)
                goto rb_out;
-       
+
 
        /* make sure that disk is operational */
        while( !conf->mirrors[new_disk].operational) {
@@ -483,7 +367,7 @@ static int raid1_read_balance (raid1_conf_t *conf, struct buffer_head *bh)
                        * Nothing much to do, let's not change anything
                         * and hope for the best...
                         */
-                       
+
                        new_disk = conf->last_used;
 
                        goto rb_out;
@@ -491,53 +375,51 @@ static int raid1_read_balance (raid1_conf_t *conf, struct buffer_head *bh)
        }
        disk = new_disk;
        /* now disk == new_disk == starting point for search */
-       
+
        /*
         * Don't touch anything for sequential reads.
         */
-
        if (this_sector == conf->mirrors[new_disk].head_position)
                goto rb_out;
-       
+
        /*
         * If reads have been done only on a single disk
         * for a time, let's give another disk a chance.
         * This is for kicking those idling disks so that
         * they would find work near some hotspot.
         */
-       
        if (conf->sect_count >= conf->mirrors[new_disk].sect_limit) {
                conf->sect_count = 0;
 
                do {
-                       if (new_disk<=0)
+                       if (new_disk <= 0)
                                new_disk = conf->raid_disks;
                        new_disk--;
                        if (new_disk == disk)
                                break;
                } while ((conf->mirrors[new_disk].write_only) ||
-                        (!conf->mirrors[new_disk].operational));
+                       (!conf->mirrors[new_disk].operational));
 
                goto rb_out;
        }
-       
+
        current_distance = abs(this_sector -
                                conf->mirrors[disk].head_position);
-       
+
        /* Find the disk which is closest */
-       
+
        do {
                if (disk <= 0)
                        disk = conf->raid_disks;
                disk--;
-               
+
                if ((conf->mirrors[disk].write_only) ||
                                (!conf->mirrors[disk].operational))
                        continue;
-               
+
                new_distance = abs(this_sector -
                                        conf->mirrors[disk].head_position);
-               
+
                if (new_distance < current_distance) {
                        conf->sect_count = 0;
                        current_distance = new_distance;
@@ -554,69 +436,73 @@ rb_out:
        return new_disk;
 }
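
As a worked example: with two mirrors whose heads sit at sectors 1000 and
5000, a read at sector 4900 sees current_distance = |4900 - 1000| = 3900 on
the last-used disk but only |4900 - 5000| = 100 on the other, so the search
switches disks; a follow-up read at sector 5000 then matches that mirror's
head_position exactly and takes the sequential-read shortcut at the top.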
 
-static int raid1_make_request (mddev_t *mddev, int rw,
-                              struct buffer_head * bh)
-{
-       raid1_conf_t *conf = mddev_to_conf(mddev);
-       struct buffer_head *bh_req, *bhl;
-       struct raid1_bh * r1_bh;
-       int disks = MD_SB_DISKS;
-       int i, sum_bhs = 0;
-       struct mirror_info *mirror;
-
-       if (!buffer_locked(bh))
-               BUG();
-       
 /*
- * make_request() can abort the operation when READA is being
- * used and no empty request is available.
- *
- * Currently, just replace the command with READ/WRITE.
+ * Wait if the reconstruction state machine has put up a barrier
+ * against new requests in this sector range:
  */
-       if (rw == READA)
-               rw = READ;
-
-       r1_bh = raid1_alloc_r1bh (conf);
-
+static inline void new_request(conf_t *conf, r1bio_t *r1_bio)
+{
        spin_lock_irq(&conf->segment_lock);
        wait_event_lock_irq(conf->wait_done,
-                       bh->b_rsector < conf->start_active ||
-                       bh->b_rsector >= conf->start_future,
+                       r1_bio->sector < conf->start_active ||
+                       r1_bio->sector >= conf->start_future,
                        conf->segment_lock);
-       if (bh->b_rsector < conf->start_active) 
+       if (r1_bio->sector < conf->start_active)
                conf->cnt_done++;
        else {
                conf->cnt_future++;
                if (conf->phase)
-                       set_bit(R1BH_SyncPhase, &r1_bh->state);
+                       set_bit(R1BIO_SyncPhase, &r1_bio->state);
        }
        spin_unlock_irq(&conf->segment_lock);
-       
+}
+
+static int make_request(mddev_t *mddev, int rw, struct bio * bio)
+{
+       conf_t *conf = mddev_to_conf(mddev);
+       mirror_info_t *mirror;
+       r1bio_t *r1_bio;
+       struct bio *read_bio;
+       int i, sum_bios = 0, disks = MD_SB_DISKS;
+
        /*
-        * i think the read and write branch should be separated completely,
-        * since we want to do read balancing on the read side for example.
-        * Alternative implementations? :) --mingo
+        * make_request() can abort the operation when READA is being
+        * used and no empty request is available.
+        *
+        * Currently, just replace the command with READ.
         */
+       if (rw == READA)
+               rw = READ;
+
+       r1_bio = mempool_alloc(conf->r1bio_pool, GFP_NOIO);
+       check_all_bios_empty(r1_bio);
+
+       r1_bio->master_bio = bio;
+
+       r1_bio->mddev = mddev;
+       r1_bio->sector = bio->bi_sector;
+       r1_bio->cmd = rw;
 
-       r1_bh->master_bh = bh;
-       r1_bh->mddev = mddev;
-       r1_bh->cmd = rw;
+       new_request(conf, r1_bio);
 
        if (rw == READ) {
                /*
                 * read balancing logic:
                 */
-               mirror = conf->mirrors + raid1_read_balance(conf, bh);
-
-               bh_req = &r1_bh->bh_req;
-               memcpy(bh_req, bh, sizeof(*bh));
-               bh_req->b_blocknr = bh->b_rsector;
-               bh_req->b_dev = mirror->dev;
-               bh_req->b_rdev = mirror->dev;
-       /*      bh_req->b_rsector = bh->n_rsector; */
-               bh_req->b_end_io = raid1_end_request;
-               bh_req->b_private = r1_bh;
-               generic_make_request (rw, bh_req);
+               mirror = conf->mirrors + read_balance(conf, bio, r1_bio);
+
+               read_bio = bio_clone(bio, GFP_NOIO);
+               if (r1_bio->read_bio)
+                       BUG();
+               r1_bio->read_bio = read_bio;
+
+               read_bio->bi_sector = r1_bio->sector;
+               read_bio->bi_dev = mirror->dev;
+               read_bio->bi_end_io = end_request;
+               read_bio->bi_rw = rw;
+               read_bio->bi_private = r1_bio;
+
+               generic_make_request(read_bio);
                return 0;
        }
 
@@ -624,62 +510,35 @@ static int raid1_make_request (mddev_t *mddev, int rw,
         * WRITE:
         */
 
-       bhl = raid1_alloc_bh(conf, conf->raid_disks);
+       check_all_w_bios_empty(r1_bio);
+
        for (i = 0; i < disks; i++) {
-               struct buffer_head *mbh;
-               if (!conf->mirrors[i].operational) 
+               struct bio *mbio;
+               if (!conf->mirrors[i].operational)
                        continue;
-       /*
-        * We should use a private pool (size depending on NR_REQUEST),
-        * to avoid writes filling up the memory with bhs
-        *
-        * Such pools are much faster than kmalloc anyways (so we waste
-        * almost nothing by not using the master bh when writing and
-        * win alot of cleanness) but for now we are cool enough. --mingo
-        *
-        * It's safe to sleep here, buffer heads cannot be used in a shared
-        * manner in the write branch. Look how we lock the buffer at the
-        * beginning of this function to grok the difference ;)
-        */
-               mbh = bhl;
-               if (mbh == NULL) {
-                       MD_BUG();
-                       break;
-               }
-               bhl = mbh->b_next;
-               mbh->b_next = NULL;
-               mbh->b_this_page = (struct buffer_head *)1;
-               
-       /*
-        * prepare mirrored mbh (fields ordered for max mem throughput):
-        */
-               mbh->b_blocknr    = bh->b_rsector;
-               mbh->b_dev        = conf->mirrors[i].dev;
-               mbh->b_rdev       = conf->mirrors[i].dev;
-               mbh->b_rsector    = bh->b_rsector;
-               mbh->b_state      = (1<<BH_Req) | (1<<BH_Dirty) |
-                                               (1<<BH_Mapped) | (1<<BH_Lock);
-
-               atomic_set(&mbh->b_count, 1);
-               mbh->b_size       = bh->b_size;
-               mbh->b_page       = bh->b_page;
-               mbh->b_data       = bh->b_data;
-               mbh->b_list       = BUF_LOCKED;
-               mbh->b_end_io     = raid1_end_request;
-               mbh->b_private    = r1_bh;
-
-               mbh->b_next = r1_bh->mirror_bh_list;
-               r1_bh->mirror_bh_list = mbh;
-               sum_bhs++;
+
+               mbio = bio_clone(bio, GFP_NOIO);
+               if (r1_bio->write_bios[i])
+                       BUG();
+               r1_bio->write_bios[i] = mbio;
+
+               mbio->bi_sector = r1_bio->sector;
+               mbio->bi_dev = conf->mirrors[i].dev;
+               mbio->bi_end_io = end_request;
+               mbio->bi_rw = rw;
+               mbio->bi_private = r1_bio;
+
+               sum_bios++;
        }
-       if (bhl) raid1_free_bh(conf,bhl);
-       if (!sum_bhs) {
-               /* Gag - all mirrors non-operational.. */
-               raid1_end_bh_io(r1_bh, 0);
+       if (!sum_bios) {
+               /*
+                * If all mirrors are non-operational
+                * then return an IO error:
+                */
+               raid_end_bio_io(r1_bio, 0, 0);
                return 0;
        }
-       md_atomic_set(&r1_bh->remaining, sum_bhs);
+       atomic_set(&r1_bio->remaining, sum_bios);
 
        /*
         * We have to be a bit careful about the semaphore above, that's
@@ -688,28 +547,30 @@ static int raid1_make_request (mddev_t *mddev, int rw,
         * safer solution. Imagine, end_request decreasing the semaphore
         * before we could have set it up ... We could play tricks with
         * the semaphore (presetting it and correcting at the end if
-        * sum_bhs is not 'n' but we have to do end_request by hand if
+        * sum_bios is not 'n' but we have to do end_request by hand if
         * all requests finish before we have had a chance to set up the
         * semaphore correctly ... lots of races).
         */
-       bh = r1_bh->mirror_bh_list;
-       while(bh) {
-               struct buffer_head *bh2 = bh;
-               bh = bh->b_next;
-               generic_make_request(rw, bh2);
+       for (i = 0; i < disks; i++) {
+               struct bio *mbio;
+               mbio = r1_bio->write_bios[i];
+               if (!mbio)
+                       continue;
+
+               generic_make_request(mbio);
        }
-       return (0);
+       return 0;
 }
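
Note the ordering in the write path above: every mirror bio is cloned and
counted first, r1_bio->remaining is set, and only then does the final loop
call generic_make_request(). Submitting each clone as it was created could
let an early end_request() run atomic_dec_and_test() against a counter that
had not been initialised yet, which is exactly the race the comment above
warns about.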
 
-static int raid1_status (char *page, mddev_t *mddev)
+static int status(char *page, mddev_t *mddev)
 {
-       raid1_conf_t *conf = mddev_to_conf(mddev);
+       conf_t *conf = mddev_to_conf(mddev);
        int sz = 0, i;
-       
-       sz += sprintf (page+sz, " [%d/%d] [", conf->raid_disks,
-                                                conf->working_disks);
+
+       sz += sprintf(page+sz, " [%d/%d] [", conf->raid_disks,
+                                               conf->working_disks);
        for (i = 0; i < conf->raid_disks; i++)
-               sz += sprintf (page+sz, "%s",
+               sz += sprintf(page+sz, "%s",
                        conf->mirrors[i].operational ? "U" : "_");
        sz += sprintf (page+sz, "]");
        return sz;
@@ -731,10 +592,10 @@ static int raid1_status (char *page, mddev_t *mddev)
 #define ALREADY_SYNCING KERN_INFO \
 "raid1: syncing already in progress.\n"
 
-static void mark_disk_bad (mddev_t *mddev, int failed)
+static void mark_disk_bad(mddev_t *mddev, int failed)
 {
-       raid1_conf_t *conf = mddev_to_conf(mddev);
-       struct mirror_info *mirror = conf->mirrors+failed;
+       conf_t *conf = mddev_to_conf(mddev);
+       mirror_info_t *mirror = conf->mirrors+failed;
        mdp_super_t *sb = mddev->sb;
 
        mirror->operational = 0;
@@ -749,37 +610,36 @@ static void mark_disk_bad (mddev_t *mddev, int failed)
        md_wakeup_thread(conf->thread);
        if (!mirror->write_only)
                conf->working_disks--;
-       printk (DISK_FAILED, partition_name (mirror->dev),
-                                conf->working_disks);
+       printk(DISK_FAILED, partition_name(mirror->dev),
+                               conf->working_disks);
 }
 
-static int raid1_error (mddev_t *mddev, kdev_t dev)
+static int error(mddev_t *mddev, kdev_t dev)
 {
-       raid1_conf_t *conf = mddev_to_conf(mddev);
-       struct mirror_info * mirrors = conf->mirrors;
+       conf_t *conf = mddev_to_conf(mddev);
+       mirror_info_t * mirrors = conf->mirrors;
        int disks = MD_SB_DISKS;
        int i;
 
-       /* Find the drive.
+       /*
+        * Find the drive.
         * If it is not operational, then we have already marked it as dead
         * else if it is the last working disk, ignore the error, let the
         * next level up know.
         * else mark the drive as failed
         */
-
        for (i = 0; i < disks; i++)
-               if (mirrors[i].dev==dev && mirrors[i].operational)
+               if (mirrors[i].dev == dev && mirrors[i].operational)
                        break;
        if (i == disks)
                return 0;
 
-       if (i < conf->raid_disks && conf->working_disks == 1) {
-               /* Don't fail the drive, act as though we were just a
+       if (i < conf->raid_disks && conf->working_disks == 1)
+               /*
+                * Don't fail the drive, act as though we were just a
                 * normal single drive
                 */
-
                return 1;
-       }
        mark_disk_bad(mddev, i);
        return 0;
 }
@@ -790,41 +650,42 @@ static int raid1_error (mddev_t *mddev, kdev_t dev)
 #undef START_SYNCING
 
 
-static void print_raid1_conf (raid1_conf_t *conf)
+static void print_conf(conf_t *conf)
 {
        int i;
-       struct mirror_info *tmp;
+       mirror_info_t *tmp;
 
        printk("RAID1 conf printout:\n");
        if (!conf) {
-               printk("(conf==NULL)\n");
+               printk("(!conf)\n");
                return;
        }
        printk(" --- wd:%d rd:%d nd:%d\n", conf->working_disks,
-                        conf->raid_disks, conf->nr_disks);
+                       conf->raid_disks, conf->nr_disks);
 
        for (i = 0; i < MD_SB_DISKS; i++) {
                tmp = conf->mirrors + i;
                printk(" disk %d, s:%d, o:%d, n:%d rd:%d us:%d dev:%s\n",
-                       i, tmp->spare,tmp->operational,
-                       tmp->number,tmp->raid_disk,tmp->used_slot,
+                       i, tmp->spare, tmp->operational,
+                       tmp->number, tmp->raid_disk, tmp->used_slot,
                        partition_name(tmp->dev));
        }
 }
 
-static void close_sync(raid1_conf_t *conf)
+static void close_sync(conf_t *conf)
 {
        mddev_t *mddev = conf->mddev;
-       /* If reconstruction was interrupted, we need to close the "active" and "pending"
-        * holes.
-        * we know that there are no active rebuild requests, os cnt_active == cnt_ready ==0
+       /*
+        * If reconstruction was interrupted, we need to close the "active"
+        * and "pending" holes.
+        * we know that there are no active rebuild requests,
+        * so cnt_active == cnt_ready == 0
         */
-       /* this is really needed when recovery stops too... */
        spin_lock_irq(&conf->segment_lock);
        conf->start_active = conf->start_pending;
        conf->start_ready = conf->start_pending;
        wait_event_lock_irq(conf->wait_ready, !conf->cnt_pending, conf->segment_lock);
-       conf->start_active =conf->start_ready = conf->start_pending = conf->start_future;
+       conf->start_active = conf->start_ready = conf->start_pending = conf->start_future;
        conf->start_future = mddev->sb->size+1;
        conf->cnt_pending = conf->cnt_future;
        conf->cnt_future = 0;
@@ -838,18 +699,18 @@ static void close_sync(raid1_conf_t *conf)
        wake_up(&conf->wait_done);
 }
 
-static int raid1_diskop(mddev_t *mddev, mdp_disk_t **d, int state)
+static int diskop(mddev_t *mddev, mdp_disk_t **d, int state)
 {
        int err = 0;
-       int i, failed_disk=-1, spare_disk=-1, removed_disk=-1, added_disk=-1;
-       raid1_conf_t *conf = mddev->private;
-       struct mirror_info *tmp, *sdisk, *fdisk, *rdisk, *adisk;
+       int i, failed_disk = -1, spare_disk = -1, removed_disk = -1, added_disk = -1;
+       conf_t *conf = mddev->private;
+       mirror_info_t *tmp, *sdisk, *fdisk, *rdisk, *adisk;
        mdp_super_t *sb = mddev->sb;
        mdp_disk_t *failed_desc, *spare_desc, *added_desc;
        mdk_rdev_t *spare_rdev, *failed_rdev;
 
-       print_raid1_conf(conf);
-       md_spin_lock_irq(&conf->device_lock);
+       print_conf(conf);
+       spin_lock_irq(&conf->device_lock);
        /*
         * find the disk ...
         */
@@ -871,7 +732,7 @@ static int raid1_diskop(mddev_t *mddev, mdp_disk_t **d, int state)
                }
                /*
                 * When we activate a spare disk we _must_ have a disk in
-                * the lower (active) part of the array to replace. 
+                * the lower (active) part of the array to replace.
                 */
                if ((failed_disk == -1) || (failed_disk >= conf->raid_disks)) {
                        MD_BUG();
@@ -982,7 +843,7 @@ static int raid1_diskop(mddev_t *mddev, mdp_disk_t **d, int state)
                        err = 1;
                        goto abort;
                }
-                       
+
                if (sdisk->raid_disk != spare_disk) {
                        MD_BUG();
                        err = 1;
@@ -1007,13 +868,14 @@ static int raid1_diskop(mddev_t *mddev, mdp_disk_t **d, int state)
                spare_rdev = find_rdev_nr(mddev, spare_desc->number);
                failed_rdev = find_rdev_nr(mddev, failed_desc->number);
 
-               /* There must be a spare_rdev, but there may not be a
-                * failed_rdev.  That slot might be empty...
+               /*
+                * There must be a spare_rdev, but there may not be a
+                * failed_rdev. That slot might be empty...
                 */
                spare_rdev->desc_nr = failed_desc->number;
                if (failed_rdev)
                        failed_rdev->desc_nr = spare_desc->number;
-               
+
                xchg_values(*spare_desc, *failed_desc);
                xchg_values(*fdisk, *sdisk);
 
@@ -1024,7 +886,6 @@ static int raid1_diskop(mddev_t *mddev, mdp_disk_t **d, int state)
                 * give the proper raid_disk number to the now activated
                 * disk. (this means we switch back these values)
                 */
-       
                xchg_values(spare_desc->raid_disk, failed_desc->raid_disk);
                xchg_values(sdisk->raid_disk, fdisk->raid_disk);
                xchg_values(spare_desc->number, failed_desc->number);
@@ -1054,7 +915,7 @@ static int raid1_diskop(mddev_t *mddev, mdp_disk_t **d, int state)
                rdisk = conf->mirrors + removed_disk;
 
                if (rdisk->spare && (removed_disk < conf->raid_disks)) {
-                       MD_BUG();       
+                       MD_BUG();
                        err = 1;
                        goto abort;
                }
@@ -1068,14 +929,14 @@ static int raid1_diskop(mddev_t *mddev, mdp_disk_t **d, int state)
                added_desc = *d;
 
                if (added_disk != added_desc->number) {
-                       MD_BUG();       
+                       MD_BUG();
                        err = 1;
                        goto abort;
                }
 
                adisk->number = added_desc->number;
                adisk->raid_disk = added_desc->raid_disk;
-               adisk->dev = MKDEV(added_desc->major,added_desc->minor);
+               adisk->dev = MKDEV(added_desc->major, added_desc->minor);
 
                adisk->operational = 0;
                adisk->write_only = 0;
@@ -1087,17 +948,18 @@ static int raid1_diskop(mddev_t *mddev, mdp_disk_t **d, int state)
                break;
 
        default:
-               MD_BUG();       
+               MD_BUG();
                err = 1;
                goto abort;
        }
 abort:
-       md_spin_unlock_irq(&conf->device_lock);
-       if (state == DISKOP_SPARE_ACTIVE || state == DISKOP_SPARE_INACTIVE)
-               /* should move to "END_REBUILD" when such exists */
-               raid1_shrink_buffers(conf);
+       spin_unlock_irq(&conf->device_lock);
+       if (state == DISKOP_SPARE_ACTIVE || state == DISKOP_SPARE_INACTIVE) {
+               mempool_destroy(conf->r1buf_pool);
+               conf->r1buf_pool = NULL;
+       }
 
-       print_raid1_conf(conf);
+       print_conf(conf);
        return err;
 }
 
@@ -1108,6 +970,122 @@ abort:
 #define REDIRECT_SECTOR KERN_ERR \
 "raid1: %s: redirecting sector %lu to another mirror\n"
 
+static int end_sync_read(struct bio *bio, int nr_sectors)
+{
+       int uptodate = test_bit(BIO_UPTODATE, &bio->bi_flags);
+       r1bio_t * r1_bio = (r1bio_t *)(bio->bi_private);
+
+       check_all_w_bios_empty(r1_bio);
+       if (r1_bio->read_bio != bio)
+               BUG();
+       /*
+        * we have read a block, now it needs to be re-written,
+        * or re-read if the read failed.
+        * We don't do much here, just schedule handling by raid1d
+        */
+       if (!uptodate)
+               md_error (r1_bio->mddev, bio->bi_dev);
+       else
+               set_bit(R1BIO_Uptodate, &r1_bio->state);
+       reschedule_retry(r1_bio);
+
+       return 0;
+}
+
+static int end_sync_write(struct bio *bio, int nr_sectors)
+{
+       int uptodate = test_bit(BIO_UPTODATE, &bio->bi_flags);
+       r1bio_t * r1_bio = (r1bio_t *)(bio->bi_private);
+       mddev_t *mddev = r1_bio->mddev;
+
+       if (!uptodate)
+               md_error(mddev, bio->bi_dev);
+
+       if (atomic_dec_and_test(&r1_bio->remaining)) {
+               sync_request_done(r1_bio->sector, mddev_to_conf(mddev));
+               md_done_sync(mddev, r1_bio->master_bio->bi_size >> 9, uptodate);
+               put_buf(r1_bio);
+       }
+       return 0;
+}
+
+static void sync_request_write(mddev_t *mddev, r1bio_t *r1_bio)
+{
+       conf_t *conf = mddev_to_conf(mddev);
+       int i, sum_bios = 0;
+       int disks = MD_SB_DISKS;
+       struct bio *bio, *mbio;
+
+       bio = r1_bio->master_bio;
+
+       /*
+        * have to allocate lots of bio structures and
+        * schedule writes
+        */
+       if (!test_bit(R1BIO_Uptodate, &r1_bio->state)) {
+               /*
+                * There is no point trying a read-for-reconstruct as
+                * reconstruct is about to be aborted
+                */
+               printk(IO_ERROR, partition_name(bio->bi_dev), r1_bio->sector);
+               md_done_sync(mddev, r1_bio->master_bio->bi_size >> 9, 0);
+               return;
+       }
+
+       check_all_w_bios_empty(r1_bio);
+
+       for (i = 0; i < disks ; i++) {
+               if (!conf->mirrors[i].operational)
+                       continue;
+               if (i == conf->last_used)
+                       /*
+                        * we read from here, no need to write
+                        */
+                       continue;
+               if (i < conf->raid_disks && !conf->resync_mirrors)
+                       /*
+                        * don't need to write this we are just rebuilding
+                        */
+                       continue;
+
+               mbio = bio_clone(bio, GFP_NOIO);
+               if (r1_bio->write_bios[i])
+                       BUG();
+               r1_bio->write_bios[i] = mbio;
+               mbio->bi_dev = conf->mirrors[i].dev;
+               mbio->bi_sector = r1_bio->sector;
+               mbio->bi_end_io = end_sync_write;
+               mbio->bi_rw = WRITE;
+               mbio->bi_private = r1_bio;
+
+               sum_bios++;
+       }
+       if (i != disks)
+               BUG();
+       atomic_set(&r1_bio->remaining, sum_bios);
+
+       if (!sum_bios) {
+               /*
+                * Nowhere to write this to... I guess we
+                * must be done
+                */
+               printk(IO_ERROR, partition_name(bio->bi_dev), r1_bio->sector);
+               sync_request_done(r1_bio->sector, conf);
+               md_done_sync(mddev, r1_bio->master_bio->bi_size >> 9, 0);
+               put_buf(r1_bio);
+               return;
+       }
+       for (i = 0; i < disks ; i++) {
+               mbio = r1_bio->write_bios[i];
+               if (!mbio)
+                       continue;
+
+               md_sync_acct(mbio->bi_dev, mbio->bi_size >> 9);
+               generic_make_request(mbio);
+       }
+}
+
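
The two skip conditions select the write targets: during a full resync
(conf->resync_mirrors set) every operational mirror except the read source
conf->last_used is rewritten, while during a plain spare rebuild only the
slots at or beyond conf->raid_disks (the write-only spare being filled in)
receive the data, since the in-sync mirrors already hold it.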
 /*
  * This is a kernel thread which:
  *
@@ -1115,134 +1093,56 @@ abort:
  *     2.      Updates the raid superblock when problems are encountered.
  *     3.      Performs writes following reads for array synchronising.
  */
-static void end_sync_write(struct buffer_head *bh, int uptodate);
-static void end_sync_read(struct buffer_head *bh, int uptodate);
 
-static void raid1d (void *data)
+static void raid1d(void *data)
 {
-       struct raid1_bh *r1_bh;
-       struct buffer_head *bh;
+       struct list_head *head = &retry_list_head;
+       r1bio_t *r1_bio;
+       struct bio *bio;
        unsigned long flags;
        mddev_t *mddev;
        kdev_t dev;
 
 
        for (;;) {
-               md_spin_lock_irqsave(&retry_list_lock, flags);
-               r1_bh = raid1_retry_list;
-               if (!r1_bh)
+               spin_lock_irqsave(&retry_list_lock, flags);
+               if (list_empty(head))
                        break;
-               raid1_retry_list = r1_bh->next_r1;
-               md_spin_unlock_irqrestore(&retry_list_lock, flags);
+               r1_bio = list_entry(head->prev, r1bio_t, retry_list);
+               list_del(head->prev);
+               spin_unlock_irqrestore(&retry_list_lock, flags);
+               check_all_w_bios_empty(r1_bio);
 
-               mddev = r1_bh->mddev;
+               mddev = r1_bio->mddev;
                if (mddev->sb_dirty) {
                        printk(KERN_INFO "raid1: dirty sb detected, updating.\n");
                        mddev->sb_dirty = 0;
                        md_update_sb(mddev);
                }
-               bh = &r1_bh->bh_req;
-               switch(r1_bh->cmd) {
+               bio = r1_bio->master_bio;
+               switch(r1_bio->cmd) {
                case SPECIAL:
-                       /* have to allocate lots of bh structures and
-                        * schedule writes
-                        */
-                       if (test_bit(R1BH_Uptodate, &r1_bh->state)) {
-                               int i, sum_bhs = 0;
-                               int disks = MD_SB_DISKS;
-                               struct buffer_head *bhl, *mbh;
-                               raid1_conf_t *conf;
-                               
-                               conf = mddev_to_conf(mddev);
-                               bhl = raid1_alloc_bh(conf, conf->raid_disks); /* don't really need this many */
-                               for (i = 0; i < disks ; i++) {
-                                       if (!conf->mirrors[i].operational)
-                                               continue;
-                                       if (i==conf->last_used)
-                                               /* we read from here, no need to write */
-                                               continue;
-                                       if (i < conf->raid_disks
-                                           && !conf->resync_mirrors)
-                                               /* don't need to write this,
-                                                * we are just rebuilding */
-                                               continue;
-                                       mbh = bhl;
-                                       if (!mbh) {
-                                               MD_BUG();
-                                               break;
-                                       }
-                                       bhl = mbh->b_next;
-                                       mbh->b_this_page = (struct buffer_head *)1;
-
-                                               
-                               /*
-                                * prepare mirrored bh (fields ordered for max mem throughput):
-                                */
-                                       mbh->b_blocknr    = bh->b_blocknr;
-                                       mbh->b_dev        = conf->mirrors[i].dev;
-                                       mbh->b_rdev       = conf->mirrors[i].dev;
-                                       mbh->b_rsector    = bh->b_blocknr;
-                                       mbh->b_state      = (1<<BH_Req) | (1<<BH_Dirty) |
-                                               (1<<BH_Mapped) | (1<<BH_Lock);
-                                       atomic_set(&mbh->b_count, 1);
-                                       mbh->b_size       = bh->b_size;
-                                       mbh->b_page       = bh->b_page;
-                                       mbh->b_data       = bh->b_data;
-                                       mbh->b_list       = BUF_LOCKED;
-                                       mbh->b_end_io     = end_sync_write;
-                                       mbh->b_private    = r1_bh;
-
-                                       mbh->b_next = r1_bh->mirror_bh_list;
-                                       r1_bh->mirror_bh_list = mbh;
-
-                                       sum_bhs++;
-                               }
-                               md_atomic_set(&r1_bh->remaining, sum_bhs);
-                               if (bhl) raid1_free_bh(conf, bhl);
-                               mbh = r1_bh->mirror_bh_list;
-
-                               if (!sum_bhs) {
-                                       /* nowhere to write this too... I guess we
-                                        * must be done
-                                        */
-                                       sync_request_done(bh->b_blocknr, conf);
-                                       md_done_sync(mddev, bh->b_size>>9, 0);
-                                       raid1_free_buf(r1_bh);
-                               } else
-                               while (mbh) {
-                                       struct buffer_head *bh1 = mbh;
-                                       mbh = mbh->b_next;
-                                       generic_make_request(WRITE, bh1);
-                                       md_sync_acct(bh1->b_dev, bh1->b_size/512);
-                               }
-                       } else {
-                               /* There is no point trying a read-for-reconstruct
-                                * as reconstruct is about to be aborted
-                                */
-
-                               printk (IO_ERROR, partition_name(bh->b_dev), bh->b_blocknr);
-                               md_done_sync(mddev, bh->b_size>>9, 0);
-                       }
-
+                       sync_request_write(mddev, r1_bio);
                        break;
                case READ:
                case READA:
-                       dev = bh->b_dev;
-                       raid1_map (mddev, &bh->b_dev);
-                       if (bh->b_dev == dev) {
-                               printk (IO_ERROR, partition_name(bh->b_dev), bh->b_blocknr);
-                               raid1_end_bh_io(r1_bh, 0);
-                       } else {
-                               printk (REDIRECT_SECTOR,
-                                       partition_name(bh->b_dev), bh->b_blocknr);
-                               bh->b_rdev = bh->b_dev;
-                               bh->b_rsector = bh->b_blocknr;
-                               generic_make_request (r1_bh->cmd, bh);
+                       dev = bio->bi_dev;
+                       map(mddev, &bio->bi_dev);
+                       if (bio->bi_dev == dev) {
+                               printk(IO_ERROR, partition_name(bio->bi_dev), r1_bio->sector);
+                               raid_end_bio_io(r1_bio, 0, 0);
+                               break;
                        }
+                       printk(REDIRECT_SECTOR,
+                               partition_name(bio->bi_dev), r1_bio->sector);
+                       bio->bi_sector = r1_bio->sector;
+                       bio->bi_rw = r1_bio->cmd;
+
+                       generic_make_request(bio);
                        break;
                }
        }
-       md_spin_unlock_irqrestore(&retry_list_lock, flags);
+       spin_unlock_irqrestore(&retry_list_lock, flags);
 }
 #undef IO_ERROR
 #undef REDIRECT_SECTOR
@@ -1251,9 +1151,9 @@ static void raid1d (void *data)
  * Private kernel thread to reconstruct mirrors after an unclean
  * shutdown.
  */
-static void raid1syncd (void *data)
+static void raid1syncd(void *data)
 {
-       raid1_conf_t *conf = data;
+       conf_t *conf = data;
        mddev_t *mddev = conf->mddev;
 
        if (!conf->resync_mirrors)
@@ -1271,7 +1171,56 @@ static void raid1syncd (void *data)
        close_sync(conf);
 
        up(&mddev->recovery_sem);
-       raid1_shrink_buffers(conf);
+}
+
+static int init_resync(conf_t *conf)
+{
+       int buffs;
+
+       conf->start_active = 0;
+       conf->start_ready = 0;
+       conf->start_pending = 0;
+       conf->start_future = 0;
+       conf->phase = 0;
+
+       buffs = RESYNC_WINDOW / RESYNC_BLOCK_SIZE;
+       if (conf->r1buf_pool)
+               BUG();
+       conf->r1buf_pool = mempool_create(buffs, r1buf_pool_alloc, r1buf_pool_free, conf);
+       if (!conf->r1buf_pool)
+               return -ENOMEM;
+       conf->window = 2048;
+       conf->cnt_future += conf->cnt_done+conf->cnt_pending;
+       conf->cnt_done = conf->cnt_pending = 0;
+       if (conf->cnt_ready || conf->cnt_active)
+               MD_BUG();
+       return 0;
+}
+
+static void wait_sync_pending(conf_t *conf, sector_t sector_nr)
+{
+       spin_lock_irq(&conf->segment_lock);
+       while (sector_nr >= conf->start_pending) {
+//             printk("wait .. sect=%lu start_active=%d ready=%d pending=%d future=%d, cnt_done=%d active=%d ready=%d pending=%d future=%d\n", sector_nr, conf->start_active, conf->start_ready, conf->start_pending, conf->start_future, conf->cnt_done, conf->cnt_active, conf->cnt_ready, conf->cnt_pending, conf->cnt_future);
+               wait_event_lock_irq(conf->wait_done, !conf->cnt_active,
+                                       conf->segment_lock);
+               wait_event_lock_irq(conf->wait_ready, !conf->cnt_pending,
+                                       conf->segment_lock);
+               conf->start_active = conf->start_ready;
+               conf->start_ready = conf->start_pending;
+               conf->start_pending = conf->start_future;
+               conf->start_future = conf->start_future+conf->window;
+
+               // Note: falling off the end is not a problem
+               conf->phase = conf->phase ^1;
+               conf->cnt_active = conf->cnt_ready;
+               conf->cnt_ready = 0;
+               conf->cnt_pending = conf->cnt_future;
+               conf->cnt_future = 0;
+               wake_up(&conf->wait_done);
+       }
+       conf->cnt_ready++;
+       spin_unlock_irq(&conf->segment_lock);
 }
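
A worked rotation, with conf->window = 2048 sectors and all four boundaries
starting at 0: the first call (sector_nr = 0) loops twice, leaving
start_pending = 2048 and start_future = 4096, so resync of sectors 0..2047
then falls through with only the cnt_ready++ accounting; when sector_nr
reaches 2048 the window rotates once more, the phase bit flips, and each
count shifts down one stage (ready becomes active, future becomes pending).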
 
 /*
@@ -1279,7 +1228,7 @@ static void raid1syncd (void *data)
  *
  * We need to make sure that no normal I/O request - particularly write
  * requests - conflict with active sync requests.
- * This is achieved by conceptually dividing the device space into a
+ * This is achieved by conceptually dividing the block space into a
  * number of sections:
  *  DONE: 0 .. a-1     These blocks are in-sync
  *  ACTIVE: a.. b-1    These blocks may have active sync requests, but
@@ -1322,149 +1271,81 @@ static void raid1syncd (void *data)
  * issue suitable write requests
  */
 
-static int raid1_sync_request (mddev_t *mddev, unsigned long sector_nr)
+static int sync_request(mddev_t *mddev, sector_t sector_nr)
 {
-       raid1_conf_t *conf = mddev_to_conf(mddev);
-       struct mirror_info *mirror;
-       struct raid1_bh *r1_bh;
-       struct buffer_head *bh;
-       int bsize;
-       int disk;
-       int block_nr;
+       conf_t *conf = mddev_to_conf(mddev);
+       mirror_info_t *mirror;
+       r1bio_t *r1_bio;
+       struct bio *read_bio, *bio;
+       sector_t max_sector, nr_sectors;
+       int disk, partial;
 
-       spin_lock_irq(&conf->segment_lock);
-       if (!sector_nr) {
-               /* initialize ...*/
-               int buffs;
-               conf->start_active = 0;
-               conf->start_ready = 0;
-               conf->start_pending = 0;
-               conf->start_future = 0;
-               conf->phase = 0;
-               /* we want enough buffers to hold twice the window of 128*/
-               buffs = 128 *2 / (PAGE_SIZE>>9);
-               buffs = raid1_grow_buffers(conf, buffs);
-               if (buffs < 2)
-                       goto nomem;
-               
-               conf->window = buffs*(PAGE_SIZE>>9)/2;
-               conf->cnt_future += conf->cnt_done+conf->cnt_pending;
-               conf->cnt_done = conf->cnt_pending = 0;
-               if (conf->cnt_ready || conf->cnt_active)
-                       MD_BUG();
-       }
-       while (sector_nr >= conf->start_pending) {
-               PRINTK("wait .. sect=%lu start_active=%d ready=%d pending=%d future=%d, cnt_done=%d active=%d ready=%d pending=%d future=%d\n",
-                       sector_nr, conf->start_active, conf->start_ready, conf->start_pending, conf->start_future,
-                       conf->cnt_done, conf->cnt_active, conf->cnt_ready, conf->cnt_pending, conf->cnt_future);
-               wait_event_lock_irq(conf->wait_done,
-                                       !conf->cnt_active,
-                                       conf->segment_lock);
-               wait_event_lock_irq(conf->wait_ready,
-                                       !conf->cnt_pending,
-                                       conf->segment_lock);
-               conf->start_active = conf->start_ready;
-               conf->start_ready = conf->start_pending;
-               conf->start_pending = conf->start_future;
-               conf->start_future = conf->start_future+conf->window;
-               // Note: falling off the end is not a problem
-               conf->phase = conf->phase ^1;
-               conf->cnt_active = conf->cnt_ready;
-               conf->cnt_ready = 0;
-               conf->cnt_pending = conf->cnt_future;
-               conf->cnt_future = 0;
-               wake_up(&conf->wait_done);
-       }
-       conf->cnt_ready++;
-       spin_unlock_irq(&conf->segment_lock);
-               
+       if (!sector_nr)
+               if (init_resync(conf))
+                       return -ENOMEM;
 
-       /* If reconstructing, and >1 working disc,
+       wait_sync_pending(conf, sector_nr);
+
+       /*
+        * If reconstructing, and >1 working disk,
         * could dedicate one to rebuild and others to
         * service read requests ..
         */
        disk = conf->last_used;
        /* make sure disk is operational */
        while (!conf->mirrors[disk].operational) {
-               if (disk <= 0) disk = conf->raid_disks;
+               if (disk <= 0)
+                       disk = conf->raid_disks;
                disk--;
                if (disk == conf->last_used)
                        break;
        }
        conf->last_used = disk;
-       
+
        mirror = conf->mirrors+conf->last_used;
-       
-       r1_bh = raid1_alloc_buf (conf);
-       r1_bh->master_bh = NULL;
-       r1_bh->mddev = mddev;
-       r1_bh->cmd = SPECIAL;
-       bh = &r1_bh->bh_req;
-
-       block_nr = sector_nr;
-       bsize = 512;
-       while (!(block_nr & 1) && bsize < PAGE_SIZE
-                       && (block_nr+2)*(bsize>>9) < (mddev->sb->size *2)) {
-               block_nr >>= 1;
-               bsize <<= 1;
-       }
-       bh->b_size = bsize;
-       bh->b_list = BUF_LOCKED;
-       bh->b_dev = mirror->dev;
-       bh->b_rdev = mirror->dev;
-       bh->b_state = (1<<BH_Req) | (1<<BH_Mapped) | (1<<BH_Lock);
-       if (!bh->b_page)
-               BUG();
-       if (!bh->b_data)
-               BUG();
-       if (bh->b_data != page_address(bh->b_page))
+
+       r1_bio = mempool_alloc(conf->r1buf_pool, GFP_NOIO);
+       check_all_bios_empty(r1_bio);
+
+       r1_bio->mddev = mddev;
+       r1_bio->sector = sector_nr;
+       r1_bio->cmd = SPECIAL;
+
+       max_sector = mddev->sb->size << 1;
+       if (sector_nr >= max_sector)
                BUG();
-       bh->b_end_io = end_sync_read;
-       bh->b_private = r1_bh;
-       bh->b_blocknr = sector_nr;
-       bh->b_rsector = sector_nr;
-       init_waitqueue_head(&bh->b_wait);
 
-       generic_make_request(READ, bh);
-       md_sync_acct(bh->b_dev, bh->b_size/512);
+       bio = r1_bio->master_bio;
+       nr_sectors = RESYNC_BLOCK_SIZE >> 9;
+       if (max_sector - sector_nr < nr_sectors)
+               nr_sectors = max_sector - sector_nr;
+       bio->bi_size = nr_sectors << 9;
+       bio->bi_vcnt = (bio->bi_size + PAGE_SIZE-1) / PAGE_SIZE;
+       /*
+        * Is there a partial page at the end of the request?
+        */
+       partial = bio->bi_size % PAGE_SIZE;
+       if (partial)
+               bio->bi_io_vec[bio->bi_vcnt-1].bv_len = partial;
 
-       return (bsize >> 9);
 
-nomem:
-       raid1_shrink_buffers(conf);
-       spin_unlock_irq(&conf->segment_lock);
-       return -ENOMEM;
-}
+       read_bio = bio_clone(r1_bio->master_bio, GFP_NOIO);
 
-static void end_sync_read(struct buffer_head *bh, int uptodate)
-{
-       struct raid1_bh * r1_bh = (struct raid1_bh *)(bh->b_private);
+       read_bio->bi_sector = sector_nr;
+       read_bio->bi_dev = mirror->dev;
+       read_bio->bi_end_io = end_sync_read;
+       read_bio->bi_rw = READ;
+       read_bio->bi_private = r1_bio;
 
-       /* we have read a block, now it needs to be re-written,
-        * or re-read if the read failed.
-        * We don't do much here, just schedule handling by raid1d
-        */
-       if (!uptodate)
-               md_error (r1_bh->mddev, bh->b_dev);
-       else
-               set_bit(R1BH_Uptodate, &r1_bh->state);
-       raid1_reschedule_retry(r1_bh);
-}
+       if (r1_bio->read_bio)
+               BUG();
+       r1_bio->read_bio = read_bio;
 
-static void end_sync_write(struct buffer_head *bh, int uptodate)
-{
-       struct raid1_bh * r1_bh = (struct raid1_bh *)(bh->b_private);
-       
-       if (!uptodate)
-               md_error (r1_bh->mddev, bh->b_dev);
-       if (atomic_dec_and_test(&r1_bh->remaining)) {
-               mddev_t *mddev = r1_bh->mddev;
-               unsigned long sect = bh->b_blocknr;
-               int size = bh->b_size;
-               raid1_free_buf(r1_bh);
-               sync_request_done(sect, mddev_to_conf(mddev));
-               md_done_sync(mddev,size>>9, uptodate);
-       }
+       md_sync_acct(read_bio->bi_dev, nr_sectors);
+
+       generic_make_request(read_bio);
+
+       return nr_sectors;
 }
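
A minimal sketch of the clone-and-submit pattern the rewritten sync_request()
now follows (r1_sync_read() is a hypothetical helper; end_sync_read() and the
r1bio_t type are assumed to come from raid1.c/raid1.h in this tree):

    static void r1_sync_read(r1bio_t *r1_bio, kdev_t dev, sector_t sector)
    {
            /* Duplicate the preallocated master bio instead of hand-building
             * a buffer_head; GFP_NOIO may sleep but never re-enters the I/O
             * path being serviced. */
            struct bio *read_bio = bio_clone(r1_bio->master_bio, GFP_NOIO);

            read_bio->bi_sector = sector;        /* start sector on the mirror */
            read_bio->bi_dev = dev;              /* which mirror to read */
            read_bio->bi_end_io = end_sync_read; /* completion callback */
            read_bio->bi_rw = READ;
            read_bio->bi_private = r1_bio;       /* context for the callback */

            generic_make_request(read_bio);      /* hand off to the block layer */
    }

Compared with the deleted buffer_head path, all the sizing lives in the master
bio, so the clone only needs a target device and start sector.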
 
 #define INVALID_LEVEL KERN_WARNING \
@@ -1506,15 +1387,15 @@ static void end_sync_write(struct buffer_head *bh, int uptodate)
 #define START_RESYNC KERN_WARNING \
 "raid1: raid set md%d not clean; reconstructing mirrors\n"
 
-static int raid1_run (mddev_t *mddev)
+static int run(mddev_t *mddev)
 {
-       raid1_conf_t *conf;
+       conf_t *conf;
        int i, j, disk_idx;
-       struct mirror_info *disk;
+       mirror_info_t *disk;
        mdp_super_t *sb = mddev->sb;
        mdp_disk_t *descriptor;
        mdk_rdev_t *rdev;
-       struct md_list_head *tmp;
+       struct list_head *tmp;
        int start_recovery = 0;
 
        MOD_INC_USE_COUNT;
@@ -1525,11 +1406,10 @@ static int raid1_run (mddev_t *mddev)
        }
        /*
         * copy the already verified devices into our private RAID1
-        * bookkeeping area. [whatever we allocate in raid1_run(),
-        * should be freed in raid1_stop()]
+        * bookkeeping area. [whatever we allocate in run(),
+        * should be freed in stop()]
         */
-
-       conf = kmalloc(sizeof(raid1_conf_t), GFP_KERNEL);
+       conf = kmalloc(sizeof(conf_t), GFP_KERNEL);
        mddev->private = conf;
        if (!conf) {
                printk(MEM_ERROR, mdidx(mddev));
@@ -1537,7 +1417,16 @@ static int raid1_run (mddev_t *mddev)
        }
        memset(conf, 0, sizeof(*conf));
 
-       ITERATE_RDEV(mddev,rdev,tmp) {
+       conf->r1bio_pool = mempool_create(NR_RAID1_BIOS, r1bio_pool_alloc,
+                                               r1bio_pool_free, NULL);
+       if (!conf->r1bio_pool) {
+               printk(MEM_ERROR, mdidx(mddev));
+               goto out;
+       }
+
+       ITERATE_RDEV(mddev, rdev, tmp) {
                if (rdev->faulty) {
                        printk(ERRORS, partition_name(rdev->dev));
                } else {
@@ -1573,7 +1462,7 @@ static int raid1_run (mddev_t *mddev)
                                continue;
                        }
                        if ((descriptor->number > MD_SB_DISKS) ||
-                                        (disk_idx > sb->raid_disks)) {
+                                       (disk_idx > sb->raid_disks)) {
 
                                printk(INCONSISTENT,
                                        partition_name(rdev->dev));
@@ -1586,7 +1475,7 @@ static int raid1_run (mddev_t *mddev)
                                continue;
                        }
                        printk(OPERATIONAL, partition_name(rdev->dev),
-                                       disk_idx);
+                                       disk_idx);
                        disk->number = descriptor->number;
                        disk->raid_disk = disk_idx;
                        disk->dev = rdev->dev;
@@ -1616,10 +1505,9 @@ static int raid1_run (mddev_t *mddev)
        conf->raid_disks = sb->raid_disks;
        conf->nr_disks = sb->nr_disks;
        conf->mddev = mddev;
-       conf->device_lock = MD_SPIN_LOCK_UNLOCKED;
+       conf->device_lock = SPIN_LOCK_UNLOCKED;
 
-       conf->segment_lock = MD_SPIN_LOCK_UNLOCKED;
-       init_waitqueue_head(&conf->wait_buffer);
+       conf->segment_lock = SPIN_LOCK_UNLOCKED;
        init_waitqueue_head(&conf->wait_done);
        init_waitqueue_head(&conf->wait_ready);
 
@@ -1628,25 +1516,8 @@ static int raid1_run (mddev_t *mddev)
                goto out_free_conf;
        }
 
-
-       /* pre-allocate some buffer_head structures.
-        * As a minimum, 1 r1bh and raid_disks buffer_heads
-        * would probably get us by in tight memory situations,
-        * but a few more is probably a good idea.
-        * For now, try NR_RESERVED_BUFS r1bh and
-        * NR_RESERVED_BUFS*raid_disks bufferheads
-        * This will allow at least NR_RESERVED_BUFS concurrent
-        * reads or writes even if kmalloc starts failing
-        */
-       if (raid1_grow_r1bh(conf, NR_RESERVED_BUFS) < NR_RESERVED_BUFS ||
-           raid1_grow_bh(conf, NR_RESERVED_BUFS*conf->raid_disks)
-                             < NR_RESERVED_BUFS*conf->raid_disks) {
-               printk(MEM_ERROR, mdidx(mddev));
-               goto out_free_conf;
-       }
-
        for (i = 0; i < MD_SB_DISKS; i++) {
-               
+
                descriptor = sb->disks+i;
                disk_idx = descriptor->raid_disk;
                disk = conf->mirrors + disk_idx;
@@ -1691,10 +1562,10 @@ static int raid1_run (mddev_t *mddev)
        }
 
        if (!start_recovery && !(sb->state & (1 << MD_SB_CLEAN)) &&
-           (conf->working_disks > 1)) {
+                                               (conf->working_disks > 1)) {
                const char * name = "raid1syncd";
 
-               conf->resync_thread = md_register_thread(raid1syncd, conf,name);
+               conf->resync_thread = md_register_thread(raid1syncd, conf, name);
                if (!conf->resync_thread) {
                        printk(THREAD_ERROR, mdidx(mddev));
                        goto out_free_conf;
@@ -1731,9 +1602,8 @@ static int raid1_run (mddev_t *mddev)
        return 0;
 
 out_free_conf:
-       raid1_shrink_r1bh(conf);
-       raid1_shrink_bh(conf);
-       raid1_shrink_buffers(conf);
+       if (conf->r1bio_pool)
+               mempool_destroy(conf->r1bio_pool);
        kfree(conf);
        mddev->private = NULL;
 out:
@@ -1752,9 +1622,9 @@ out:
 #undef NONE_OPERATIONAL
 #undef ARRAY_IS_ACTIVE
 
-static int raid1_stop_resync (mddev_t *mddev)
+static int stop_resync(mddev_t *mddev)
 {
-       raid1_conf_t *conf = mddev_to_conf(mddev);
+       conf_t *conf = mddev_to_conf(mddev);
 
        if (conf->resync_thread) {
                if (conf->resync_mirrors) {
@@ -1769,9 +1639,9 @@ static int raid1_stop_resync (mddev_t *mddev)
        return 0;
 }
 
-static int raid1_restart_resync (mddev_t *mddev)
+static int restart_resync(mddev_t *mddev)
 {
-       raid1_conf_t *conf = mddev_to_conf(mddev);
+       conf_t *conf = mddev_to_conf(mddev);
 
        if (conf->resync_mirrors) {
                if (!conf->resync_thread) {
@@ -1785,46 +1655,45 @@ static int raid1_restart_resync (mddev_t *mddev)
        return 0;
 }
 
-static int raid1_stop (mddev_t *mddev)
+static int stop(mddev_t *mddev)
 {
-       raid1_conf_t *conf = mddev_to_conf(mddev);
+       conf_t *conf = mddev_to_conf(mddev);
 
        md_unregister_thread(conf->thread);
        if (conf->resync_thread)
                md_unregister_thread(conf->resync_thread);
-       raid1_shrink_r1bh(conf);
-       raid1_shrink_bh(conf);
-       raid1_shrink_buffers(conf);
+       if (conf->r1bio_pool)
+               mempool_destroy(conf->r1bio_pool);
        kfree(conf);
        mddev->private = NULL;
        MOD_DEC_USE_COUNT;
        return 0;
 }
 
-static mdk_personality_t raid1_personality=
+static mdk_personality_t raid1_personality =
 {
        name:           "raid1",
-       make_request:   raid1_make_request,
-       run:            raid1_run,
-       stop:           raid1_stop,
-       status:         raid1_status,
-       error_handler:  raid1_error,
-       diskop:         raid1_diskop,
-       stop_resync:    raid1_stop_resync,
-       restart_resync: raid1_restart_resync,
-       sync_request:   raid1_sync_request
+       make_request:   make_request,
+       run:            run,
+       stop:           stop,
+       status:         status,
+       error_handler:  error,
+       diskop:         diskop,
+       stop_resync:    stop_resync,
+       restart_resync: restart_resync,
+       sync_request:   sync_request
 };
 
-static int md__init raid1_init (void)
+static int __init raid_init(void)
 {
-       return register_md_personality (RAID1, &raid1_personality);
+       return register_md_personality(RAID1, &raid1_personality);
 }
 
-static void raid1_exit (void)
+static void raid_exit(void)
 {
-       unregister_md_personality (RAID1);
+       unregister_md_personality(RAID1);
 }
 
-module_init(raid1_init);
-module_exit(raid1_exit);
+module_init(raid_init);
+module_exit(raid_exit);
 MODULE_LICENSE("GPL");
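
The hand-rolled r1bh/buffer_head reservation that run() used to set up is
replaced by a mempool, which guarantees forward progress under memory
pressure. A minimal sketch of the lifecycle, assuming the pooled objects come
from a slab cache (my_setup(), my_alloc() and my_free() are hypothetical
names; the callback signatures match slab_pool_alloc() in fs/bio.c below):

    static void *my_alloc(int gfp_mask, void *data)
    {
            return kmem_cache_alloc(data, gfp_mask);  /* data = slab cache */
    }

    static void my_free(void *element, void *data)
    {
            kmem_cache_free(data, element);
    }

    static int my_setup(kmem_cache_t *cachep)
    {
            /* keep 16 objects in reserve so I/O can always make progress */
            mempool_t *pool = mempool_create(16, my_alloc, my_free, cachep);
            void *obj;

            if (!pool)
                    return -ENOMEM;

            obj = mempool_alloc(pool, GFP_NOIO);  /* may sleep, does not fail */
            mempool_free(obj, pool);              /* refills the reserve */

            mempool_destroy(pool);                /* cf. stop() above */
            return 0;
    }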
index a515efcfd338f203ff733eede0e42c47b74d5d9d..8a1caaa28d2f24f27ed85dc635537d7c9cc14af6 100644 (file)
@@ -1,3 +1,8 @@
+2001-12-11  Jeff Garzik  <jgarzik@mandrakesoft.com>
+
+       * eeprom.c, timer.c, media.c, tulip_core.c:
+       Remove 21040 and 21041 chip support.
+
 2001-11-13  David S. Miller  <davem@redhat.com>
 
        * tulip_core.c (tulip_mwi_config): Kill unused label early_out.
index beb1430cc4318bad1e71b6a2d881551b98874931..8777cc1f30658f2a9f4e9aa852686793851f34e5 100644 (file)
@@ -136,23 +136,6 @@ void __devinit tulip_parse_eeprom(struct net_device *dev)
 subsequent_board:
 
        if (ee_data[27] == 0) {         /* No valid media table. */
-       } else if (tp->chip_id == DC21041) {
-               unsigned char *p = (void *)ee_data + ee_data[27 + controller_index*3];
-               int media = get_u16(p);
-               int count = p[2];
-               p += 3;
-
-               printk(KERN_INFO "%s: 21041 Media table, default media %4.4x (%s).\n",
-                          dev->name, media,
-                          media & 0x0800 ? "Autosense" : medianame[media & MEDIA_MASK]);
-               for (i = 0; i < count; i++) {
-                       unsigned char media_block = *p++;
-                       int media_code = media_block & MEDIA_MASK;
-                       if (media_block & 0x40)
-                               p += 6;
-                       printk(KERN_INFO "%s:  21041 media #%d, %s.\n",
-                                  dev->name, media_code, medianame[media_code]);
-               }
        } else {
                unsigned char *p = (void *)ee_data + ee_data[27];
                unsigned char csr12dir = 0;
index 5d1329776d012338ae186d3ebf266e9bfd6eff90..e7160fca0e349933c0bfe42b0724ddb87a93b46e 100644 (file)
 #include "tulip.h"
 
 
-/* This is a mysterious value that can be written to CSR11 in the 21040 (only)
-   to support a pre-NWay full-duplex signaling mechanism using short frames.
-   No one knows what it should be, but if left at its default value some
-   10base2(!) packets trigger a full-duplex-request interrupt. */
-#define FULL_DUPLEX_MAGIC      0x6969
-
 /* The maximum data clock rate is 2.5 Mhz.  The minimum timing is usually
    met by back-to-back PCI I/O cycles, but we insert a delay to avoid
    "overclocking" issues or future 66Mhz PCI. */
@@ -326,17 +320,6 @@ void tulip_select_media(struct net_device *dev, int startup)
                        printk(KERN_DEBUG "%s: Using media type %s, CSR12 is %2.2x.\n",
                                   dev->name, medianame[dev->if_port],
                                   inl(ioaddr + CSR12) & 0xff);
-       } else if (tp->chip_id == DC21041) {
-               int port = dev->if_port <= 4 ? dev->if_port : 0;
-               if (tulip_debug > 1)
-                       printk(KERN_DEBUG "%s: 21041 using media %s, CSR12 is %4.4x.\n",
-                                  dev->name, medianame[port == 3 ? 12: port],
-                                  inl(ioaddr + CSR12));
-               outl(0x00000000, ioaddr + CSR13); /* Reset the serial interface */
-               outl(t21041_csr14[port], ioaddr + CSR14);
-               outl(t21041_csr15[port], ioaddr + CSR15);
-               outl(t21041_csr13[port], ioaddr + CSR13);
-               new_csr6 = 0x80020000;
        } else if (tp->chip_id == LC82C168) {
                if (startup && ! tp->medialock)
                        dev->if_port = tp->mii_cnt ? 11 : 0;
@@ -363,26 +346,6 @@ void tulip_select_media(struct net_device *dev, int startup)
                        new_csr6 = 0x00420000;
                        outl(0x1F078, ioaddr + 0xB8);
                }
-       } else if (tp->chip_id == DC21040) {                                    /* 21040 */
-               /* Turn on the xcvr interface. */
-               int csr12 = inl(ioaddr + CSR12);
-               if (tulip_debug > 1)
-                       printk(KERN_DEBUG "%s: 21040 media type is %s, CSR12 is %2.2x.\n",
-                                  dev->name, medianame[dev->if_port], csr12);
-               if (tulip_media_cap[dev->if_port] & MediaAlwaysFD)
-                       tp->full_duplex = 1;
-               new_csr6 = 0x20000;
-               /* Set the full duplux match frame. */
-               outl(FULL_DUPLEX_MAGIC, ioaddr + CSR11);
-               outl(0x00000000, ioaddr + CSR13); /* Reset the serial interface */
-               if (t21040_csr13[dev->if_port] & 8) {
-                       outl(0x0705, ioaddr + CSR14);
-                       outl(0x0006, ioaddr + CSR15);
-               } else {
-                       outl(0xffff, ioaddr + CSR14);
-                       outl(0x0000, ioaddr + CSR15);
-               }
-               outl(0x8f01 | t21040_csr13[dev->if_port], ioaddr + CSR13);
        } else {                                        /* Unknown chip type with no media table. */
                if (tp->default_port == 0)
                        dev->if_port = tp->mii_cnt ? 11 : 3;
index 4079772ae9fee7f901bab92230bf46f8b0c32076..53c43912bad71b4728424b202e8f19ea3764e337 100644 (file)
@@ -33,60 +33,6 @@ void tulip_timer(unsigned long data)
                           inl(ioaddr + CSR14), inl(ioaddr + CSR15));
        }
        switch (tp->chip_id) {
-       case DC21040:
-               if (!tp->medialock  &&  csr12 & 0x0002) { /* Network error */
-                       printk(KERN_INFO "%s: No link beat found.\n",
-                                  dev->name);
-                       dev->if_port = (dev->if_port == 2 ? 0 : 2);
-                       tulip_select_media(dev, 0);
-                       dev->trans_start = jiffies;
-               }
-               break;
-       case DC21041:
-               if (tulip_debug > 2)
-                       printk(KERN_DEBUG "%s: 21041 media tick  CSR12 %8.8x.\n",
-                                  dev->name, csr12);
-               if (tp->medialock) break;
-               switch (dev->if_port) {
-               case 0: case 3: case 4:
-                 if (csr12 & 0x0004) { /*LnkFail */
-                       /* 10baseT is dead.  Check for activity on alternate port. */
-                       tp->mediasense = 1;
-                       if (csr12 & 0x0200)
-                               dev->if_port = 2;
-                       else
-                               dev->if_port = 1;
-                       printk(KERN_INFO "%s: No 21041 10baseT link beat, Media switched to %s.\n",
-                                  dev->name, medianame[dev->if_port]);
-                       outl(0, ioaddr + CSR13); /* Reset */
-                       outl(t21041_csr14[dev->if_port], ioaddr + CSR14);
-                       outl(t21041_csr15[dev->if_port], ioaddr + CSR15);
-                       outl(t21041_csr13[dev->if_port], ioaddr + CSR13);
-                       next_tick = 10*HZ;                      /* 2.4 sec. */
-                 } else
-                       next_tick = 30*HZ;
-                 break;
-               case 1:                                 /* 10base2 */
-               case 2:                                 /* AUI */
-                       if (csr12 & 0x0100) {
-                               next_tick = (30*HZ);                    /* 30 sec. */
-                               tp->mediasense = 0;
-                       } else if ((csr12 & 0x0004) == 0) {
-                               printk(KERN_INFO "%s: 21041 media switched to 10baseT.\n",
-                                          dev->name);
-                               dev->if_port = 0;
-                               tulip_select_media(dev, 0);
-                               next_tick = (24*HZ)/10;                         /* 2.4 sec. */
-                       } else if (tp->mediasense || (csr12 & 0x0002)) {
-                               dev->if_port = 3 - dev->if_port; /* Swap ports. */
-                               tulip_select_media(dev, 0);
-                               next_tick = 20*HZ;
-                       } else {
-                               next_tick = 20*HZ;
-                       }
-                       break;
-               }
-               break;
        case DC21140:
        case DC21142:
        case MX98713:
index f67ff13732cbabf4b194cf57a346bce723b78fe3..917f1a9be8cf8aa37c0657858221d75512d48120 100644 (file)
@@ -15,8 +15,8 @@
 */
 
 #define DRV_NAME       "tulip"
-#define DRV_VERSION    "0.9.15-pre9"
-#define DRV_RELDATE    "Nov 6, 2001"
+#define DRV_VERSION    "1.1.0"
+#define DRV_RELDATE    "Dec 11, 2001"
 
 #include <linux/config.h>
 #include <linux/module.h>
@@ -130,12 +130,8 @@ int tulip_debug = 1;
  */
 
 struct tulip_chip_table tulip_tbl[] = {
-  /* DC21040 */
-  { "Digital DC21040 Tulip", 128, 0x0001ebef, 0, tulip_timer },
-
-  /* DC21041 */
-  { "Digital DC21041 Tulip", 128, 0x0001ebef,
-       HAS_MEDIA_TABLE | HAS_NWAY, tulip_timer },
+  { }, /* placeholder: DC21040 slot unused, keeps chip_id indexing intact */
+  { }, /* placeholder: DC21041 slot unused, keeps chip_id indexing intact */
 
   /* DC21140 */
   { "Digital DS21140 Tulip", 128, 0x0001ebef,
@@ -192,8 +188,6 @@ struct tulip_chip_table tulip_tbl[] = {
 
 
 static struct pci_device_id tulip_pci_tbl[] __devinitdata = {
-       { 0x1011, 0x0002, PCI_ANY_ID, PCI_ANY_ID, 0, 0, DC21040 },
-       { 0x1011, 0x0014, PCI_ANY_ID, PCI_ANY_ID, 0, 0, DC21041 },
        { 0x1011, 0x0009, PCI_ANY_ID, PCI_ANY_ID, 0, 0, DC21140 },
        { 0x1011, 0x0019, PCI_ANY_ID, PCI_ANY_ID, 0, 0, DC21143 },
        { 0x11AD, 0x0002, PCI_ANY_ID, PCI_ANY_ID, 0, 0, LC82C168 },
@@ -224,19 +218,6 @@ MODULE_DEVICE_TABLE(pci, tulip_pci_tbl);
 /* A full-duplex map for media types. */
 const char tulip_media_cap[32] =
 {0,0,0,16,  3,19,16,24,  27,4,7,5, 0,20,23,20,  28,31,0,0, };
-u8 t21040_csr13[] = {2,0x0C,8,4,  4,0,0,0, 0,0,0,0, 4,0,0,0};
-
-/* 21041 transceiver register settings: 10-T, 10-2, AUI, 10-T, 10T-FD*/
-u16 t21041_csr13[] = {
-       csr13_mask_10bt,                /* 10-T */
-       csr13_mask_auibnc,              /* 10-2 */
-       csr13_mask_auibnc,              /* AUI */
-       csr13_mask_10bt,                /* 10-T */
-       csr13_mask_10bt,                /* 10T-FD */
-};
-u16 t21041_csr14[] = { 0xFFFF, 0xF7FD, 0xF7FD, 0x7F3F, 0x7F3D, };
-u16 t21041_csr15[] = { 0x0008, 0x0006, 0x000E, 0x0008, 0x0008, };
-
 
 static void tulip_tx_timeout(struct net_device *dev);
 static void tulip_init_ring(struct net_device *dev);
@@ -388,19 +369,6 @@ media_picked:
                        outl(0x0008, ioaddr + CSR15);
                }
                tulip_select_media(dev, 1);
-       } else if (tp->chip_id == DC21041) {
-               dev->if_port = 0;
-               tp->nway = tp->mediasense = 1;
-               tp->nwayset = tp->lpar = 0;
-               outl(0x00000000, ioaddr + CSR13);
-               outl(0xFFFFFFFF, ioaddr + CSR14);
-               outl(0x00000008, ioaddr + CSR15); /* Listen on AUI also. */
-               tp->csr6 = 0x80020000;
-               if (tp->sym_advertise & 0x0040)
-                       tp->csr6 |= FullDuplex;
-               outl(tp->csr6, ioaddr + CSR6);
-               outl(0x0000EF01, ioaddr + CSR13);
-
        } else if (tp->chip_id == DC21142) {
                if (tp->mii_cnt) {
                        tulip_select_media(dev, 1);
@@ -538,33 +506,6 @@ static void tulip_tx_timeout(struct net_device *dev)
                if (tulip_debug > 1)
                        printk(KERN_WARNING "%s: Transmit timeout using MII device.\n",
                                   dev->name);
-       } else if (tp->chip_id == DC21040) {
-               if ( !tp->medialock  &&  inl(ioaddr + CSR12) & 0x0002) {
-                       dev->if_port = (dev->if_port == 2 ? 0 : 2);
-                       printk(KERN_INFO "%s: 21040 transmit timed out, switching to "
-                                  "%s.\n",
-                                  dev->name, medianame[dev->if_port]);
-                       tulip_select_media(dev, 0);
-               }
-               goto out;
-       } else if (tp->chip_id == DC21041) {
-               int csr12 = inl(ioaddr + CSR12);
-
-               printk(KERN_WARNING "%s: 21041 transmit timed out, status %8.8x, "
-                          "CSR12 %8.8x, CSR13 %8.8x, CSR14 %8.8x, resetting...\n",
-                          dev->name, inl(ioaddr + CSR5), csr12,
-                          inl(ioaddr + CSR13), inl(ioaddr + CSR14));
-               tp->mediasense = 1;
-               if ( ! tp->medialock) {
-                       if (dev->if_port == 1 || dev->if_port == 2)
-                               if (csr12 & 0x0004) {
-                                       dev->if_port = 2 - dev->if_port;
-                               } else
-                                       dev->if_port = 0;
-                       else
-                               dev->if_port = 1;
-                       tulip_select_media(dev, 0);
-               }
        } else if (tp->chip_id == DC21140 || tp->chip_id == DC21142
                           || tp->chip_id == MX98713 || tp->chip_id == COMPEX9881
                           || tp->chip_id == DM910X) {
@@ -636,7 +577,6 @@ static void tulip_tx_timeout(struct net_device *dev)
 
        tp->stats.tx_errors++;
 
-out:
        spin_unlock_irqrestore (&tp->lock, flags);
        dev->trans_start = jiffies;
        netif_wake_queue (dev);
@@ -802,10 +742,6 @@ static void tulip_down (struct net_device *dev)
        /* release any unconsumed transmit buffers */
        tulip_clean_tx_ring(tp);
 
-       /* 21040 -- Leave the card in 10baseT state. */
-       if (tp->chip_id == DC21040)
-               outl (0x00000004, ioaddr + CSR13);
-
        if (inl (ioaddr + CSR6) != 0xffffffff)
                tp->stats.rx_missed_errors += inl (ioaddr + CSR8) & 0xffff;
 
@@ -966,16 +902,14 @@ static int private_ioctl (struct net_device *dev, struct ifreq *rq, int cmd)
                                        0x1848 +
                                        ((csr12&0x7000) == 0x5000 ? 0x20 : 0) +
                                        ((csr12&0x06) == 6 ? 0 : 4);
-                                if (tp->chip_id != DC21041)
-                                        data->val_out |= 0x6048;
+                                data->val_out |= 0x6048;
                                break;
                        case 4:
                                 /* Advertised value, bogus 10baseTx-FD value from CSR6. */
                                 data->val_out =
                                        ((inl(ioaddr + CSR6) >> 3) & 0x0040) +
                                        ((csr14 >> 1) & 0x20) + 1;
-                                if (tp->chip_id != DC21041)
-                                         data->val_out |= ((csr14 >> 9) & 0x03C0);
+                                data->val_out |= ((csr14 >> 9) & 0x03C0);
                                break;
                        case 5: data->val_out = tp->lpar; break;
                        default: data->val_out = 0; break;
@@ -1358,7 +1292,6 @@ static int __devinit tulip_init_one (struct pci_dev *pdev,
        long ioaddr;
        static int board_idx = -1;
        int chip_idx = ent->driver_data;
-       unsigned int t2104x_mode = 0;
        unsigned int eeprom_missing = 0;
        unsigned int force_csr0 = 0;
 
@@ -1527,31 +1460,12 @@ static int __devinit tulip_init_one (struct pci_dev *pdev,
        /* Clear the missed-packet counter. */
        inl(ioaddr + CSR8);
 
-       if (chip_idx == DC21041) {
-               if (inl(ioaddr + CSR9) & 0x8000) {
-                       chip_idx = DC21040;
-                       t2104x_mode = 1;
-               } else {
-                       t2104x_mode = 2;
-               }
-       }
-
        /* The station address ROM is read byte serially.  The register must
           be polled, waiting for the value to be read bit serially from the
           EEPROM.
           */
        sum = 0;
-       if (chip_idx == DC21040) {
-               outl(0, ioaddr + CSR9);         /* Reset the pointer with a dummy write. */
-               for (i = 0; i < 6; i++) {
-                       int value, boguscnt = 100000;
-                       do
-                               value = inl(ioaddr + CSR9);
-                       while (value < 0  && --boguscnt > 0);
-                       dev->dev_addr[i] = value;
-                       sum += value & 0xff;
-               }
-       } else if (chip_idx == LC82C168) {
+       if (chip_idx == LC82C168) {
                for (i = 0; i < 3; i++) {
                        int value, boguscnt = 100000;
                        outl(0x600 | i, ioaddr + 0x98);
@@ -1719,10 +1633,6 @@ static int __devinit tulip_init_one (struct pci_dev *pdev,
               dev->name, tulip_tbl[chip_idx].chip_name, chip_rev, ioaddr);
        pci_set_drvdata(pdev, dev);
 
-       if (t2104x_mode == 1)
-               printk(" 21040 compatible mode,");
-       else if (t2104x_mode == 2)
-               printk(" 21041 mode,");
        if (eeprom_missing)
                printk(" EEPROM not present,");
        for (i = 0; i < 6; i++)
@@ -1731,26 +1641,13 @@ static int __devinit tulip_init_one (struct pci_dev *pdev,
 
         if (tp->chip_id == PNIC2)
                tp->link_change = pnic2_lnk_change;
-       else if ((tp->flags & HAS_NWAY)  || tp->chip_id == DC21041)
+       else if (tp->flags & HAS_NWAY)
                tp->link_change = t21142_lnk_change;
        else if (tp->flags & HAS_PNICNWAY)
                tp->link_change = pnic_lnk_change;
 
        /* Reset the xcvr interface and turn on heartbeat. */
        switch (chip_idx) {
-       case DC21041:
-               if (tp->sym_advertise == 0)
-                       tp->sym_advertise = 0x0061;
-               outl(0x00000000, ioaddr + CSR13);
-               outl(0xFFFFFFFF, ioaddr + CSR14);
-               outl(0x00000008, ioaddr + CSR15); /* Listen on AUI also. */
-               outl(inl(ioaddr + CSR6) | csr6_fd, ioaddr + CSR6);
-               outl(0x0000EF01, ioaddr + CSR13);
-               break;
-       case DC21040:
-               outl(0x00000000, ioaddr + CSR13);
-               outl(0x00000004, ioaddr + CSR13);
-               break;
        case DC21140:
        case DM910X:
        default:
index 1ce0fa803975d8728308f8fc4e0f04f9fce73fdc..fa97dfb7bdde0633798d6af71dcb021f8352552b 100644 (file)
@@ -1,6 +1,9 @@
 /*
  *      eata.c - Low-level driver for EATA/DMA SCSI host adapters.
  *
+ *      11 Dec 2001 Rev. 7.00 for linux 2.5.1
+ *        + Use host->host_lock instead of io_request_lock.
+ *
  *       1 May 2001 Rev. 6.05 for linux 2.4.4
  *        + Clean up all pci related routines.
  *        + Fix data transfer direction for opcode SEND_CUE_SHEET (0x5d)
@@ -438,13 +441,6 @@ MODULE_AUTHOR("Dario Ballabio");
 #include <linux/ctype.h>
 #include <linux/spinlock.h>
 
-#define SPIN_FLAGS unsigned long spin_flags;
-#define SPIN_LOCK spin_lock_irq(&io_request_lock);
-#define SPIN_LOCK_SAVE spin_lock_irqsave(&io_request_lock, spin_flags);
-#define SPIN_UNLOCK spin_unlock_irq(&io_request_lock);
-#define SPIN_UNLOCK_RESTORE \
-                  spin_unlock_irqrestore(&io_request_lock, spin_flags);
-
 /* Subversion values */
 #define ISA  0
 #define ESA 1
@@ -1589,10 +1585,12 @@ static inline int do_reset(Scsi_Cmnd *SCarg) {
 #endif
 
    HD(j)->in_reset = TRUE;
-   SPIN_UNLOCK
+
+   spin_unlock_irq(&sh[j]->host_lock);
    time = jiffies;
    while ((jiffies - time) < (10 * HZ) && limit++ < 200000) udelay(100L);
-   SPIN_LOCK
+   spin_lock_irq(&sh[j]->host_lock);
+
    printk("%s: reset, interrupts disabled, loops %d.\n", BN(j), limit);
 
    for (i = 0; i < sh[j]->can_queue; i++) {
@@ -2036,14 +2034,14 @@ static inline void ihdlr(int irq, unsigned int j) {
 
 static void do_interrupt_handler(int irq, void *shap, struct pt_regs *regs) {
    unsigned int j;
-   SPIN_FLAGS
+   unsigned long spin_flags;
 
    /* Check if the interrupt must be processed by this handler */
    if ((j = (unsigned int)((char *)shap - sha)) >= num_boards) return;
 
-   SPIN_LOCK_SAVE
+   spin_lock_irqsave(&sh[j]->host_lock, spin_flags);
    ihdlr(irq, j);
-   SPIN_UNLOCK_RESTORE
+   spin_unlock_irqrestore(&sh[j]->host_lock, spin_flags);
 }
 
 int eata2x_release(struct Scsi_Host *shpnt) {
@@ -2077,4 +2075,4 @@ static Scsi_Host_Template driver_template = EATA;
 #ifndef MODULE
 __setup("eata=", option_setup);
 #endif /* end MODULE */
-MODULE_LICENSE("Dual BSD/GPL");
+MODULE_LICENSE("GPL");
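
eata.c (and u14-34f.c further down) drop the io_request_lock SPIN_* macros in
favor of the per-host lock. The interrupt-handler pattern after the
conversion, sketched with a hypothetical my_intr() (the signature matches
do_interrupt_handler() above):

    static void my_intr(int irq, void *dev_id, struct pt_regs *regs)
    {
            struct Scsi_Host *host = dev_id;
            unsigned long flags;

            /* serialize against the request function via the host's own
             * lock, never the old global io_request_lock */
            spin_lock_irqsave(&host->host_lock, flags);
            /* ... service the board ... */
            spin_unlock_irqrestore(&host->host_lock, flags);
    }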
index afa5e27870f98c36f7c37a4b06ed139ac030d618..de0bad6efaab8948ae7d7c544c8d042dff618575 100644 (file)
@@ -13,7 +13,7 @@ int eata2x_abort(Scsi_Cmnd *);
 int eata2x_reset(Scsi_Cmnd *);
 int eata2x_biosparam(Disk *, kdev_t, int *);
 
-#define EATA_VERSION "6.05.00"
+#define EATA_VERSION "7.00.00"
 
 #define EATA {                                                               \
                 name:              "EATA/DMA 2.0x rev. " EATA_VERSION " ",   \
index 3713c328424381043a5ee367c50064cef4947e86..656766c09f2d3340f34280d4cdc4843f9aa77b46 100644 (file)
@@ -183,7 +183,7 @@ void  scsi_initialize_queue(Scsi_Device * SDpnt, struct Scsi_Host * SHpnt)
        request_queue_t *q = &SDpnt->request_queue;
        int max_segments = SHpnt->sg_tablesize;
 
-       blk_init_queue(q, scsi_request_fn);
+       blk_init_queue(q, scsi_request_fn, &SHpnt->host_lock);
        q->queuedata = (void *) SDpnt;
 
 #ifdef DMA_CHUNK_SIZE
index af0bb409c9d62fa7792bef38f2ec2c42a4dbb5bd..b6894649e12f668b13ed64af68de7c26b11d9304 100644 (file)
@@ -1254,9 +1254,7 @@ STATIC void scsi_restart_operations(struct Scsi_Host *host)
                        break;
                }
 
-               spin_lock(&q->queue_lock);
                q->request_fn(q);
-               spin_unlock(&q->queue_lock);
        }
        spin_unlock_irqrestore(&host->host_lock, flags);
 }
index a723b3404227a95555e082799a2ce7b11ceb755a..d7cc000bcdd2acf4e5d0100043d602fd40aa3e3f 100644 (file)
@@ -70,7 +70,7 @@ static void __scsi_insert_special(request_queue_t *q, struct request *rq,
 {
        unsigned long flags;
 
-       ASSERT_LOCK(&q->queue_lock, 0);
+       ASSERT_LOCK(q->queue_lock, 0);
 
        /*
         * tell I/O scheduler that this isn't a regular read/write (ie it
@@ -91,10 +91,10 @@ static void __scsi_insert_special(request_queue_t *q, struct request *rq,
         * head of the queue for things like a QUEUE_FULL message from a
         * device, or a host that is unable to accept a particular command.
         */
-       spin_lock_irqsave(&q->queue_lock, flags);
+       spin_lock_irqsave(q->queue_lock, flags);
        __elv_add_request(q, rq, !at_head, 0);
        q->request_fn(q);
-       spin_unlock_irqrestore(&q->queue_lock, flags);
+       spin_unlock_irqrestore(q->queue_lock, flags);
 }
 
 
@@ -250,9 +250,9 @@ void scsi_queue_next_request(request_queue_t * q, Scsi_Cmnd * SCpnt)
        Scsi_Device *SDpnt;
        struct Scsi_Host *SHpnt;
 
-       ASSERT_LOCK(&q->queue_lock, 0);
+       ASSERT_LOCK(q->queue_lock, 0);
 
-       spin_lock_irqsave(&q->queue_lock, flags);
+       spin_lock_irqsave(q->queue_lock, flags);
        if (SCpnt != NULL) {
 
                /*
@@ -325,7 +325,7 @@ void scsi_queue_next_request(request_queue_t * q, Scsi_Cmnd * SCpnt)
                        SHpnt->some_device_starved = 0;
                }
        }
-       spin_unlock_irqrestore(&q->queue_lock, flags);
+       spin_unlock_irqrestore(q->queue_lock, flags);
 }
 
 /*
@@ -360,7 +360,7 @@ static Scsi_Cmnd *__scsi_end_request(Scsi_Cmnd * SCpnt,
        request_queue_t *q = &SCpnt->device->request_queue;
        struct request *req = &SCpnt->request;
 
-       ASSERT_LOCK(&q->queue_lock, 0);
+       ASSERT_LOCK(q->queue_lock, 0);
 
        /*
         * If there are blocks left over at the end, set up the command
@@ -445,7 +445,7 @@ static void scsi_release_buffers(Scsi_Cmnd * SCpnt)
 {
        struct request *req = &SCpnt->request;
 
-       ASSERT_LOCK(&SCpnt->device->request_queue.queue_lock, 0);
+       ASSERT_LOCK(&SCpnt->host->host_lock, 0);
 
        /*
         * Free up any indirection buffers we allocated for DMA purposes. 
@@ -518,7 +518,7 @@ void scsi_io_completion(Scsi_Cmnd * SCpnt, int good_sectors,
         *      would be used if we just wanted to retry, for example.
         *
         */
-       ASSERT_LOCK(&q->queue_lock, 0);
+       ASSERT_LOCK(q->queue_lock, 0);
 
        /*
         * Free up any indirection buffers we allocated for DMA purposes. 
@@ -746,8 +746,6 @@ struct Scsi_Device_Template *scsi_get_request_dev(struct request *req)
        kdev_t dev = req->rq_dev;
        int major = MAJOR(dev);
 
-       ASSERT_LOCK(&req->q->queue_lock, 1);
-
        for (spnt = scsi_devicelist; spnt; spnt = spnt->next) {
                /*
                 * Search for a block device driver that supports this
@@ -804,7 +802,7 @@ void scsi_request_fn(request_queue_t * q)
        struct Scsi_Host *SHpnt;
        struct Scsi_Device_Template *STpnt;
 
-       ASSERT_LOCK(&q->queue_lock, 1);
+       ASSERT_LOCK(q->queue_lock, 1);
 
        SDpnt = (Scsi_Device *) q->queuedata;
        if (!SDpnt) {
@@ -871,9 +869,9 @@ void scsi_request_fn(request_queue_t * q)
                         */
                        SDpnt->was_reset = 0;
                        if (SDpnt->removable && !in_interrupt()) {
-                               spin_unlock_irq(&q->queue_lock);
+                               spin_unlock_irq(q->queue_lock);
                                scsi_ioctl(SDpnt, SCSI_IOCTL_DOORLOCK, 0);
-                               spin_lock_irq(&q->queue_lock);
+                               spin_lock_irq(q->queue_lock);
                                continue;
                        }
                }
@@ -982,7 +980,7 @@ void scsi_request_fn(request_queue_t * q)
                 * another.  
                 */
                req = NULL;
-               spin_unlock_irq(&q->queue_lock);
+               spin_unlock_irq(q->queue_lock);
 
                if (SCpnt->request.flags & REQ_CMD) {
                        /*
@@ -1012,7 +1010,7 @@ void scsi_request_fn(request_queue_t * q)
                                {
                                        panic("Should not have leftover blocks\n");
                                }
-                               spin_lock_irq(&q->queue_lock);
+                               spin_lock_irq(q->queue_lock);
                                SHpnt->host_busy--;
                                SDpnt->device_busy--;
                                continue;
@@ -1028,7 +1026,7 @@ void scsi_request_fn(request_queue_t * q)
                                {
                                        panic("Should not have leftover blocks\n");
                                }
-                               spin_lock_irq(&q->queue_lock);
+                               spin_lock_irq(q->queue_lock);
                                SHpnt->host_busy--;
                                SDpnt->device_busy--;
                                continue;
@@ -1049,7 +1047,7 @@ void scsi_request_fn(request_queue_t * q)
                 * Now we need to grab the lock again.  We are about to mess
                 * with the request queue and try to find another command.
                 */
-               spin_lock_irq(&q->queue_lock);
+               spin_lock_irq(q->queue_lock);
        }
 }
 
index 9d455e89574ac02c961f45122df018513dae79e0..89def7c84d79a9b4057d5ce045052adde3a61147 100644 (file)
@@ -307,7 +307,7 @@ __inline static int __scsi_back_merge_fn(request_queue_t * q,
        }
 
 #ifdef DMA_CHUNK_SIZE
-       if (MERGEABLE_BUFFERS(bio, req->bio))
+       if (MERGEABLE_BUFFERS(req->biotail, bio))
                return scsi_new_mergeable(q, req, bio);
 #endif
 
@@ -461,9 +461,7 @@ inline static int scsi_merge_requests_fn(request_queue_t * q,
  *              (mainly because we don't need queue management functions
  *              which keep the tally uptodate.
  */
-__inline static int __init_io(Scsi_Cmnd * SCpnt,
-                             int sg_count_valid,
-                             int dma_host)
+__inline static int __init_io(Scsi_Cmnd * SCpnt, int dma_host)
 {
        struct bio         * bio;
        char               * buff;
@@ -480,11 +478,7 @@ __inline static int __init_io(Scsi_Cmnd * SCpnt,
        /*
         * First we need to know how many scatter gather segments are needed.
         */
-       if (!sg_count_valid) {
-               count = __count_segments(req, dma_host, NULL);
-       } else {
-               count = req->nr_segments;
-       }
+       count = req->nr_segments;
 
        /*
         * If the dma pool is nearly empty, then queue a minimal request
@@ -721,20 +715,14 @@ __inline static int __init_io(Scsi_Cmnd * SCpnt,
        return 1;
 }
 
-#define INITIO(_FUNCTION, _VALID, _DMA)                \
+#define INITIO(_FUNCTION, _DMA)                        \
 static int _FUNCTION(Scsi_Cmnd * SCpnt)                \
 {                                              \
-    return __init_io(SCpnt, _VALID, _DMA);     \
+    return __init_io(SCpnt, _DMA);             \
 }
 
-/*
- * ll_rw_blk.c now keeps track of the number of segments in
- * a request.  Thus we don't have to do it any more here.
- * We always force "_VALID" to 1.  Eventually clean this up
- * and get rid of the extra argument.
- */
-INITIO(scsi_init_io_v, 1, 0)
-INITIO(scsi_init_io_vd, 1, 1)
+INITIO(scsi_init_io_v, 0)
+INITIO(scsi_init_io_vd, 1)
 
 /*
  * Function:    initialize_merge_fn()
index b864fc04507ffc6da34fb11641d49720b3e723ba..1d9a90bbdd56c5c33d323f5542d727df6d1feb93 100644 (file)
@@ -80,7 +80,6 @@ int scsi_mlqueue_insert(Scsi_Cmnd * cmd, int reason)
 {
        struct Scsi_Host *host;
        unsigned long flags;
-       request_queue_t *q = &cmd->device->request_queue;
 
        SCSI_LOG_MLQUEUE(1, printk("Inserting command %p into mlqueue\n", cmd));
 
@@ -138,10 +137,10 @@ int scsi_mlqueue_insert(Scsi_Cmnd * cmd, int reason)
         * Decrement the counters, since these commands are no longer
         * active on the host/device.
         */
-       spin_lock_irqsave(&q->queue_lock, flags);
+       spin_lock_irqsave(&cmd->host->host_lock, flags);
        cmd->host->host_busy--;
        cmd->device->device_busy--;
-       spin_unlock_irqrestore(&q->queue_lock, flags);
+       spin_unlock_irqrestore(&cmd->host->host_lock, flags);
 
        /*
         * Insert this command at the head of the queue for it's device.
index 41cff9e57108cdf10d5702bf92d9a0514536ae27..adacf2fd49a07904037b632a7884c702967488ab 100644 (file)
@@ -1,6 +1,9 @@
 /*
  *      u14-34f.c - Low-level driver for UltraStor 14F/34F SCSI host adapters.
  *
+ *      11 Dec 2001 Rev. 7.00 for linux 2.5.1
+ *        + Use host->host_lock instead of io_request_lock.
+ *
  *       1 May 2001 Rev. 6.05 for linux 2.4.4
  *        + Fix data transfer direction for opcode SEND_CUE_SHEET (0x5d)
  *
  *  the driver sets host->wish_block = TRUE for all ISA boards.
  */
 
-#include <linux/module.h>
 #include <linux/version.h>
 
 #ifndef LinuxVersionCode
 
 #define MAX_INT_PARAM 10
 
+#if defined(MODULE)
+#include <linux/module.h>
+
 MODULE_PARM(boot_options, "s");
 MODULE_PARM(io_port, "1-" __MODULE_STRING(MAX_INT_PARAM) "i");
 MODULE_PARM(linked_comm, "i");
@@ -352,6 +357,8 @@ MODULE_PARM(max_queue_depth, "i");
 MODULE_PARM(ext_tran, "i");
 MODULE_AUTHOR("Dario Ballabio");
 
+#endif
+
 #include <linux/string.h>
 #include <linux/sched.h>
 #include <linux/kernel.h>
@@ -374,13 +381,6 @@ MODULE_AUTHOR("Dario Ballabio");
 #include <linux/ctype.h>
 #include <linux/spinlock.h>
 
-#define SPIN_FLAGS unsigned long spin_flags;
-#define SPIN_LOCK spin_lock_irq(&io_request_lock);
-#define SPIN_LOCK_SAVE spin_lock_irqsave(&io_request_lock, spin_flags);
-#define SPIN_UNLOCK spin_unlock_irq(&io_request_lock);
-#define SPIN_UNLOCK_RESTORE \
-                  spin_unlock_irqrestore(&io_request_lock, spin_flags);
-
 /* Values for the PRODUCT_ID ports for the 14/34F */
 #define PRODUCT_ID1  0x56
 #define PRODUCT_ID2  0x40        /* NOTE: Only upper nibble is used */
@@ -672,10 +672,8 @@ static int board_inquiry(unsigned int j) {
    /* Issue OGM interrupt */
    outb(CMD_OGM_INTR, sh[j]->io_port + REG_LCL_INTR);
 
-   SPIN_UNLOCK
    time = jiffies;
    while ((jiffies - time) < HZ && limit++ < 20000) udelay(100L);
-   SPIN_LOCK
 
    if (cpp->adapter_status || HD(j)->cp_stat[0] != FREE) {
       HD(j)->cp_stat[0] = FREE;
@@ -1274,10 +1272,12 @@ static inline int do_reset(Scsi_Cmnd *SCarg) {
 #endif
 
    HD(j)->in_reset = TRUE;
-   SPIN_UNLOCK
+
+   spin_unlock_irq(&sh[j]->host_lock);
    time = jiffies;
    while ((jiffies - time) < (10 * HZ) && limit++ < 200000) udelay(100L);
-   SPIN_LOCK
+   spin_lock_irq(&sh[j]->host_lock);
+
    printk("%s: reset, interrupts disabled, loops %d.\n", BN(j), limit);
 
    for (i = 0; i < sh[j]->can_queue; i++) {
@@ -1718,14 +1718,14 @@ static inline void ihdlr(int irq, unsigned int j) {
 
 static void do_interrupt_handler(int irq, void *shap, struct pt_regs *regs) {
    unsigned int j;
-   SPIN_FLAGS
+   unsigned long spin_flags;
 
    /* Check if the interrupt must be processed by this handler */
    if ((j = (unsigned int)((char *)shap - sha)) >= num_boards) return;
 
-   SPIN_LOCK_SAVE
+   spin_lock_irqsave(&sh[j]->host_lock, spin_flags);
    ihdlr(irq, j);
-   SPIN_UNLOCK_RESTORE
+   spin_unlock_irqrestore(&sh[j]->host_lock, spin_flags);
 }
 
 int u14_34f_release(struct Scsi_Host *shpnt) {
@@ -1752,7 +1752,6 @@ int u14_34f_release(struct Scsi_Host *shpnt) {
    return FALSE;
 }
 
-MODULE_LICENSE("BSD without advertisement clause");
 static Scsi_Host_Template driver_template = ULTRASTOR_14_34F;
 
 #include "scsi_module.c"
@@ -1760,3 +1759,4 @@ static Scsi_Host_Template driver_template = ULTRASTOR_14_34F;
 #ifndef MODULE
 __setup("u14-34f=", option_setup);
 #endif /* end MODULE */
+MODULE_LICENSE("GPL");
index 1d2988d739b52b4872beba5ab9ab43c4c0bc8328..d8d1d400fdd90af33fc453ce838e94bf0f8b3191 100644 (file)
@@ -13,7 +13,7 @@ int u14_34f_abort(Scsi_Cmnd *);
 int u14_34f_reset(Scsi_Cmnd *);
 int u14_34f_biosparam(Disk *, kdev_t, int *);
 
-#define U14_34F_VERSION "6.05.00"
+#define U14_34F_VERSION "7.00.00"
 
 #define ULTRASTOR_14_34F {                                                   \
                 name:         "UltraStor 14F/34F rev. " U14_34F_VERSION " ", \
index d04cbca7ab1bf38d9b687509302ffe4eadb5e266..36fe91f4a636a092d4e1fe2b5287e65925d1a6bc 100644 (file)
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -48,7 +48,7 @@ static const int bvec_pool_sizes[BIOVEC_NR_POOLS] = { 1, 4, 16, 64, 128, 256 };
 
 #define BIO_MAX_PAGES  (bvec_pool_sizes[BIOVEC_NR_POOLS - 1])
 
-static void * slab_pool_alloc(int gfp_mask, void *data)
+static void *slab_pool_alloc(int gfp_mask, void *data)
 {
        return kmem_cache_alloc(data, gfp_mask);
 }
index de4cb8afade6487cce11ab60235212e0bf69b1c1..301a62ef5777dbeaaa6404a286288e9f98008fe9 100644 (file)
@@ -324,6 +324,7 @@ struct block_device *bdget(dev_t dev)
                        new_bdev->bd_dev = dev;
                        new_bdev->bd_op = NULL;
                        new_bdev->bd_inode = inode;
+                       inode->i_mode = S_IFBLK;
                        inode->i_rdev = kdev;
                        inode->i_dev = kdev;
                        inode->i_bdev = new_bdev;
index 405e81410c888c333df99f6fb0e29067917ab543..e724f5ade10556bfdeabc2edbe67cafbf1a4bda2 100644 (file)
@@ -2005,12 +2005,12 @@ int generic_direct_IO(int rw, struct inode * inode, struct kiobuf * iobuf, unsig
 {
        int i, nr_blocks, retval;
        sector_t *blocks = iobuf->blocks;
-       struct buffer_head bh;
 
-       bh.b_dev = inode->i_dev;
        nr_blocks = iobuf->length / blocksize;
        /* build the blocklist */
        for (i = 0; i < nr_blocks; i++, blocknr++) {
+               struct buffer_head bh;
+
                bh.b_state = 0;
                bh.b_dev = inode->i_dev;
                bh.b_size = blocksize;
@@ -2037,7 +2037,7 @@ int generic_direct_IO(int rw, struct inode * inode, struct kiobuf * iobuf, unsig
        }
 
        /* This does not understand multi-device filesystems currently */
-       retval = brw_kiovec(rw, 1, &iobuf, bh.b_dev, blocks, blocksize);
+       retval = brw_kiovec(rw, 1, &iobuf, inode->i_dev, blocks, blocksize);
 
  out:
        return retval;
index 6ad90b306b0c9012cc4040a8f08bb5492535042f..cff561ab9b5fb88512bfee081fc1fbd3f40ea774 100644 (file)
@@ -311,7 +311,7 @@ out:
        return result;
 }
 
-static int ufs_getfrag_block (struct inode *inode, long fragment, struct buffer_head *bh_result, int create)
+static int ufs_getfrag_block (struct inode *inode, sector_t fragment, struct buffer_head *bh_result, int create)
 {
        struct super_block * sb;
        struct ufs_sb_private_info * uspi;
index 0c5e61d14eef424ff65d6d17f158d97c9a241791..d8d68e8c296d3cb819f33b743b87be6e2ff872d8 100644 (file)
  */
 #if CONFIG_DEBUG_IOVIRT
   extern void *__io_virt_debug(unsigned long x, const char *file, int line);
-  extern unsigned long __io_phys_debug(unsigned long x, const char *file, int line);
   #define __io_virt(x) __io_virt_debug((unsigned long)(x), __FILE__, __LINE__)
-//#define __io_phys(x) __io_phys_debug((unsigned long)(x), __FILE__, __LINE__)
 #else
   #define __io_virt(x) ((void *)(x))
-//#define __io_phys(x) __pa(x)
 #endif
 
 /*
index a9c1a917a8fc58fff291f28bcf995f3e82978471..e044135ef7797f1a22d379e68e81beee51e5cb33 100644 (file)
@@ -19,7 +19,7 @@
 #define IO_SPACE_LIMIT 0xffffffff
 
 #define __io_virt(x)            ((void *)(PAGE_OFFSET | (unsigned long)(x)))
-#define __io_phys(x)            ((unsigned long)(x) & ~PAGE_OFFSET)
+
 /*
  * Change virtual addresses to physical addresses and vv.
  * These are pretty trivial
index 2d0d2e79a274e758b32b7d681cb6e0ed4724b34c..088e26498d6825920dc0e267b097684a3732ced2 100644 (file)
@@ -19,7 +19,7 @@
 #define IO_SPACE_LIMIT 0xffffffff
 
 #define __io_virt(x)            ((void *)(PAGE_OFFSET | (unsigned long)(x)))
-#define __io_phys(x)            ((unsigned long)(x) & ~PAGE_OFFSET)
+
 /*
  * Change virtual addresses to physical addresses and vv.
  * These are pretty trivial
index 204ab976551419f246290b375403c9a23b29d088..fad87a308171c3ea2350eb50fe0ab584a5ef1033 100644 (file)
@@ -160,7 +160,7 @@ struct request_queue
        /*
         * protects queue structures from reentrancy
         */
-       spinlock_t              queue_lock;
+       spinlock_t              *queue_lock;
 
        /*
         * queue settings
@@ -258,13 +258,14 @@ extern void blk_put_request(struct request *);
 extern void blk_plug_device(request_queue_t *);
 extern void blk_recount_segments(request_queue_t *, struct bio *);
 extern inline int blk_contig_segment(request_queue_t *q, struct bio *, struct bio *);
+extern void blk_queue_assign_lock(request_queue_t *q, spinlock_t *);
 
 extern int block_ioctl(kdev_t, unsigned int, unsigned long);
 
 /*
  * Access functions for manipulating queue properties
  */
-extern int blk_init_queue(request_queue_t *, request_fn_proc *);
+extern int blk_init_queue(request_queue_t *, request_fn_proc *, spinlock_t *);
 extern void blk_cleanup_queue(request_queue_t *);
 extern void blk_queue_make_request(request_queue_t *, make_request_fn *);
 extern void blk_queue_bounce_limit(request_queue_t *, u64);
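
blk_init_queue() now takes a pointer to the spinlock that protects the queue,
so several queues can share one lock (SCSI passes &SHpnt->host_lock above). A
minimal sketch for a hypothetical driver:

    static spinlock_t my_lock = SPIN_LOCK_UNLOCKED;
    static request_queue_t my_queue;

    static void my_request_fn(request_queue_t *q)
    {
            /* called with *q->queue_lock (== my_lock) held */
    }

    static int my_driver_init(void)
    {
            blk_init_queue(&my_queue, my_request_fn, &my_lock);
            return 0;
    }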
index 7ca978981e2c0db8f215f1866e9a65319025f6e4..0a241a076158a63eea258695b45c4fb952de6c3f 100644 (file)
 
 typedef struct devfs_entry * devfs_handle_t;
 
-
-#ifdef CONFIG_BLK_DEV_INITRD
-#  define ROOT_DEVICE_NAME ((real_root_dev ==ROOT_DEV) ? root_device_name:NULL)
-#else
-#  define ROOT_DEVICE_NAME root_device_name
-#endif
-
-
 #ifdef CONFIG_DEVFS_FS
 
 struct unique_numspace
index 38a17222c225a25cc5913b7015d7a0d76ef43365..5bcdab80f3f7ae31e4c68e4c9a1b4c02234803e4 100644 (file)
@@ -1001,7 +1001,6 @@ unsigned long ide_get_or_set_dma_base (ide_hwif_t *hwif, int extra, const char *
 
 void hwif_unregister (ide_hwif_t *hwif);
 
-#define DRIVE_LOCK(drive)      (&(drive)->queue.queue_lock)
 extern spinlock_t ide_lock;
 
 #endif /* _IDE_H */
index 07e97d109ac839e6c4d085cf9d65480ab3c00720..bd3745152632965d0ea83ca462153643cefb08a1 100644 (file)
@@ -25,6 +25,7 @@ struct mempool_s {
 };
 extern mempool_t * mempool_create(int min_nr, mempool_alloc_t *alloc_fn,
                                 mempool_free_t *free_fn, void *pool_data);
+extern void mempool_resize(mempool_t *pool, int new_min_nr, int gfp_mask);
 extern void mempool_destroy(mempool_t *pool);
 extern void * mempool_alloc(mempool_t *pool, int gfp_mask);
 extern void mempool_free(void *element, mempool_t *pool);
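
mempool_resize() is new in this commit; the log pairs it with the raid resync
work, which wants a larger reserve only while a resync is in flight. A sketch
with illustrative sizes (my_resync_start()/my_resync_done() are hypothetical):

    #include <linux/mempool.h>

    static void my_resync_start(mempool_t *pool)
    {
            /* grow the reserve before heavy resync I/O; GFP_KERNEL lets
             * the resize sleep while allocating the extra elements */
            mempool_resize(pool, 128, GFP_KERNEL);
    }

    static void my_resync_done(mempool_t *pool)
    {
            mempool_resize(pool, 16, GFP_KERNEL);  /* give the memory back */
    }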
index 0dbf87851169e8956dea554975d733302aeebf41..6c8bc1e4438e0b238b53356e58f3a8b54d859d62 100644 (file)
@@ -46,7 +46,7 @@ nbd_end_request(struct request *req)
 #ifdef PARANOIA
        requests_out++;
 #endif
-       spin_lock_irqsave(&q->queue_lock, flags);
+       spin_lock_irqsave(q->queue_lock, flags);
        while((bio = req->bio) != NULL) {
                nsect = bio_sectors(bio);
                blk_finished_io(nsect);
@@ -55,7 +55,7 @@ nbd_end_request(struct request *req)
                bio_endio(bio, uptodate, nsect);
        }
        blkdev_release_request(req);
-       spin_unlock_irqrestore(&q->queue_lock, flags);
+       spin_unlock_irqrestore(q->queue_lock, flags);
 }
 
 #define MAX_NBD 128
index a7e18913ec09e81fb2288f9fdd046856b22402aa..233163eb2872878be8f8fae5c521aaf2259a51fd 100644 (file)
 #include <linux/kernel_stat.h>
 #include <asm/io.h>
 #include <linux/completion.h>
+#include <linux/mempool.h>
+#include <linux/list.h>
+#include <linux/reboot.h>
+#include <linux/vmalloc.h>
+#include <linux/blkpg.h>
 
-#include <linux/raid/md_compatible.h>
 /*
  * 'md_p.h' holds the 'physical' layout of RAID devices
  * 'md_u.h' holds the user <=> kernel API
diff --git a/include/linux/raid/md_compatible.h b/include/linux/raid/md_compatible.h
deleted file mode 100644 (file)
index 74dadd4..0000000
+++ /dev/null
@@ -1,158 +0,0 @@
-
-/*
-   md.h : Multiple Devices driver compatibility layer for Linux 2.0/2.2
-          Copyright (C) 1998 Ingo Molnar
-         
-   This program is free software; you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 2, or (at your option)
-   any later version.
-   
-   You should have received a copy of the GNU General Public License
-   (for example /usr/src/linux/COPYING); if not, write to the Free
-   Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.  
-*/
-
-#include <linux/version.h>
-
-#ifndef _MD_COMPATIBLE_H
-#define _MD_COMPATIBLE_H
-
-/** 2.3/2.4 stuff: **/
-
-#include <linux/reboot.h>
-#include <linux/vmalloc.h>
-#include <linux/blkpg.h>
-
-/* 000 */
-#define md__get_free_pages(x,y) __get_free_pages(x,y)
-
-#if defined(__i386__) || defined(__x86_64__)
-/* 001 */
-static __inline__ int md_cpu_has_mmx(void)
-{
-       return test_bit(X86_FEATURE_MMX,  &boot_cpu_data.x86_capability);
-}
-#else
-#define md_cpu_has_mmx(x)      (0)
-#endif
-
-/* 002 */
-#define md_clear_page(page)        clear_page(page)
-
-/* 003 */
-#define MD_EXPORT_SYMBOL(x) EXPORT_SYMBOL(x)
-
-/* 004 */
-#define md_copy_to_user(x,y,z) copy_to_user(x,y,z)
-
-/* 005 */
-#define md_copy_from_user(x,y,z) copy_from_user(x,y,z)
-
-/* 006 */
-#define md_put_user put_user
-
-/* 007 */
-static inline int md_capable_admin(void)
-{
-       return capable(CAP_SYS_ADMIN);
-}
-
-/* 008 */
-#define MD_FILE_TO_INODE(file) ((file)->f_dentry->d_inode)
-
-/* 009 */
-static inline void md_flush_signals (void)
-{
-       spin_lock(&current->sigmask_lock);
-       flush_signals(current);
-       spin_unlock(&current->sigmask_lock);
-}
-/* 010 */
-static inline void md_init_signals (void)
-{
-        current->exit_signal = SIGCHLD;
-        siginitsetinv(&current->blocked, sigmask(SIGKILL));
-}
-
-/* 011 */
-#define md_signal_pending signal_pending
-
-/* 012 - md_set_global_readahead - nowhere used */
-
-/* 013 */
-#define md_mdelay(x) mdelay(x)
-
-/* 014 */
-#define MD_SYS_DOWN SYS_DOWN
-#define MD_SYS_HALT SYS_HALT
-#define MD_SYS_POWER_OFF SYS_POWER_OFF
-
-/* 015 */
-#define md_register_reboot_notifier register_reboot_notifier
-
-/* 016 */
-#define md_test_and_set_bit test_and_set_bit
-
-/* 017 */
-#define md_test_and_clear_bit test_and_clear_bit
-
-/* 018 */
-#define md_atomic_read atomic_read
-#define md_atomic_set atomic_set
-
-/* 019 */
-#define md_lock_kernel lock_kernel
-#define md_unlock_kernel unlock_kernel
-
-/* 020 */
-
-#include <linux/init.h>
-
-#define md__init __init
-#define md__initdata __initdata
-#define md__initfunc(__arginit) __initfunc(__arginit)
-
-/* 021 */
-
-
-/* 022 */
-
-#define md_list_head list_head
-#define MD_LIST_HEAD(name) LIST_HEAD(name)
-#define MD_INIT_LIST_HEAD(ptr) INIT_LIST_HEAD(ptr)
-#define md_list_add list_add
-#define md_list_del list_del
-#define md_list_empty list_empty
-
-#define md_list_entry(ptr, type, member) list_entry(ptr, type, member)
-
-/* 023 */
-
-#define md_schedule_timeout schedule_timeout
-
-/* 024 */
-#define md_need_resched(tsk) ((tsk)->need_resched)
-
-/* 025 */
-#define md_spinlock_t spinlock_t
-#define MD_SPIN_LOCK_UNLOCKED SPIN_LOCK_UNLOCKED
-
-#define md_spin_lock spin_lock
-#define md_spin_unlock spin_unlock
-#define md_spin_lock_irq spin_lock_irq
-#define md_spin_unlock_irq spin_unlock_irq
-#define md_spin_unlock_irqrestore spin_unlock_irqrestore
-#define md_spin_lock_irqsave spin_lock_irqsave
-
-/* 026 */
-typedef wait_queue_head_t md_wait_queue_head_t;
-#define MD_DECLARE_WAITQUEUE(w,t) DECLARE_WAITQUEUE((w),(t))
-#define MD_DECLARE_WAIT_QUEUE_HEAD(x) DECLARE_WAIT_QUEUE_HEAD(x)
-#define md_init_waitqueue_head init_waitqueue_head
-
-/* END */
-
-#endif 
-
index 5382bc072c3d57f5e9710ae0ac85c46904545b37..6bf45496c507d4374cfdf4cd4f9508af02b7410d 100644 (file)
@@ -158,9 +158,9 @@ static inline void mark_disk_nonsync(mdp_disk_t * d)
  */
 struct mdk_rdev_s
 {
-       struct md_list_head same_set;   /* RAID devices within the same set */
-       struct md_list_head all;        /* all RAID devices */
-       struct md_list_head pending;    /* undetected RAID devices */
+       struct list_head same_set;      /* RAID devices within the same set */
+       struct list_head all;           /* all RAID devices */
+       struct list_head pending;       /* undetected RAID devices */
 
        kdev_t dev;                     /* Device number */
        kdev_t old_dev;                 /*  "" when it was last imported */
@@ -197,7 +197,7 @@ struct mddev_s
        int                             __minor;
        mdp_super_t                     *sb;
        int                             nb_dev;
-       struct md_list_head             disks;
+       struct list_head                disks;
        int                             sb_dirty;
        mdu_param_t                     param;
        int                             ro;
@@ -212,9 +212,9 @@ struct mddev_s
        atomic_t                        active;
 
        atomic_t                        recovery_active; /* blocks scheduled, but not written */
-       md_wait_queue_head_t            recovery_wait;
+       wait_queue_head_t               recovery_wait;
 
-       struct md_list_head             all_mddevs;
+       struct list_head                all_mddevs;
 };
 
 struct mdk_personality_s
@@ -240,7 +240,7 @@ struct mdk_personality_s
 
        int (*stop_resync)(mddev_t *mddev);
        int (*restart_resync)(mddev_t *mddev);
-       int (*sync_request)(mddev_t *mddev, unsigned long block_nr);
+       int (*sync_request)(mddev_t *mddev, sector_t sector_nr);
 };
 
 
@@ -269,9 +269,9 @@ extern mdp_disk_t *get_spare(mddev_t *mddev);
  */
 #define ITERATE_RDEV_GENERIC(head,field,rdev,tmp)                      \
                                                                        \
-       for (tmp = head.next;                                           \
-               rdev = md_list_entry(tmp, mdk_rdev_t, field),           \
-                       tmp = tmp->next, tmp->prev != &head             \
+       for ((tmp) = (head).next;                                       \
+               (rdev) = (list_entry((tmp), mdk_rdev_t, field)),        \
+                       (tmp) = (tmp)->next, (tmp)->prev != &(head)     \
                ; )
 /*
  * iterates through the 'same array disks' ringlist
@@ -305,7 +305,7 @@ extern mdp_disk_t *get_spare(mddev_t *mddev);
 #define ITERATE_MDDEV(mddev,tmp)                                       \
                                                                        \
        for (tmp = all_mddevs.next;                                     \
-               mddev = md_list_entry(tmp, mddev_t, all_mddevs),        \
+               mddev = list_entry(tmp, mddev_t, all_mddevs),   \
                        tmp = tmp->next, tmp->prev != &all_mddevs       \
                ; )
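
The ITERATE_* macros are open-coded list walks; the extra parentheses added to ITERATE_RDEV_GENERIC above guard its arguments against operator-precedence surprises when callers pass expressions. A rough usage sketch (hypothetical caller; the printk is illustrative):

	struct list_head *tmp;
	mddev_t *mddev;

	ITERATE_MDDEV(mddev, tmp) {
		if (mddev->sb_dirty)
			printk("md%d has a dirty superblock\n", mddev->__minor);
	}
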
 
@@ -325,7 +325,7 @@ static inline void unlock_mddev (mddev_t * mddev)
 typedef struct mdk_thread_s {
        void                    (*run) (void *data);
        void                    *data;
-       md_wait_queue_head_t    wqueue;
+       wait_queue_head_t       wqueue;
        unsigned long           flags;
        struct completion       *event;
        struct task_struct      *tsk;
@@ -337,7 +337,7 @@ typedef struct mdk_thread_s {
 #define MAX_DISKNAME_LEN 64
 
 typedef struct dev_name_s {
-       struct md_list_head list;
+       struct list_head list;
        kdev_t dev;
        char namebuf [MAX_DISKNAME_LEN];
        char *name;
index 40675b40ca0fb88ed3e9159c8ff4d88b47f39a56..c03eabf2e55c0630028c931541c5d19d08badfb2 100644 (file)
@@ -3,6 +3,8 @@
 
 #include <linux/raid/md.h>
 
+typedef struct mirror_info mirror_info_t;
+
 struct mirror_info {
        int             number;
        int             raid_disk;
@@ -20,34 +22,21 @@ struct mirror_info {
        int             used_slot;
 };
 
-struct raid1_private_data {
+typedef struct r1bio_s r1bio_t;
+
+struct r1_private_data_s {
        mddev_t                 *mddev;
-       struct mirror_info      mirrors[MD_SB_DISKS];
+       mirror_info_t           mirrors[MD_SB_DISKS];
        int                     nr_disks;
        int                     raid_disks;
        int                     working_disks;
        int                     last_used;
-       unsigned long           next_sect;
+       sector_t                next_sect;
        int                     sect_count;
        mdk_thread_t            *thread, *resync_thread;
        int                     resync_mirrors;
-       struct mirror_info      *spare;
-       md_spinlock_t           device_lock;
-
-       /* buffer pool */
-       /* buffer_heads that we have pre-allocated have b_pprev -> &freebh
-        * and are linked into a stack using b_next
-        * raid1_bh that are pre-allocated have R1BH_PreAlloc set.
-        * All these variable are protected by device_lock
-        */
-       struct buffer_head      *freebh;
-       int                     freebh_cnt;     /* how many are on the list */
-       int                     freebh_blocked;
-       struct raid1_bh         *freer1;
-       int                     freer1_blocked;
-       int                     freer1_cnt;
-       struct raid1_bh         *freebuf;       /* each bh_req has a page allocated */
-       md_wait_queue_head_t    wait_buffer;
+       mirror_info_t           *spare;
+       spinlock_t              device_lock;
 
        /* for use when syncing mirrors: */
        unsigned long   start_active, start_ready,
@@ -56,18 +45,21 @@ struct raid1_private_data {
                cnt_pending, cnt_future;
        int     phase;
        int     window;
-       md_wait_queue_head_t    wait_done;
-       md_wait_queue_head_t    wait_ready;
-       md_spinlock_t           segment_lock;
+       wait_queue_head_t       wait_done;
+       wait_queue_head_t       wait_ready;
+       spinlock_t              segment_lock;
+
+       mempool_t *r1bio_pool;
+       mempool_t *r1buf_pool;
 };
 
-typedef struct raid1_private_data raid1_conf_t;
+typedef struct r1_private_data_s conf_t;
 
 /*
  * this is the only point in the RAID code where we violate
  * C type safety. mddev->private is an 'opaque' pointer.
  */
-#define mddev_to_conf(mddev) ((raid1_conf_t *) mddev->private)
+#define mddev_to_conf(mddev) ((conf_t *) mddev->private)
 
 /*
  * this is our 'private' 'collective' RAID1 buffer head.
@@ -75,20 +67,32 @@ typedef struct raid1_private_data raid1_conf_t;
  * for this RAID1 operation, and about their status:
  */
 
-struct raid1_bh {
+struct r1bio_s {
        atomic_t                remaining; /* 'have we finished' count,
                                            * used from IRQ handlers
                                            */
        int                     cmd;
+       sector_t                sector;
        unsigned long           state;
        mddev_t                 *mddev;
-       struct buffer_head      *master_bh;
-       struct buffer_head      *mirror_bh_list;
-       struct buffer_head      bh_req;
-       struct raid1_bh         *next_r1;       /* next for retry or in free list */
+       /*
+        * original bio going to /dev/mdx
+        */
+       struct bio              *master_bio;
+       /*
+        * if the IO is in READ direction, then this bio is used:
+        */
+       struct bio              *read_bio;
+       /*
+        * if the IO is in WRITE direction, then multiple bios are used:
+        */
+       struct bio              *write_bios[MD_SB_DISKS];
+
+       r1bio_t                 *next_r1; /* next for retry or in free list */
+       struct list_head        retry_list;
 };
-/* bits for raid1_bh.state */
-#define        R1BH_Uptodate   1
-#define        R1BH_SyncPhase  2
-#define        R1BH_PreAlloc   3       /* this was pre-allocated, add to free list */
+
+/* bits for r1bio.state */
+#define        R1BIO_Uptodate  1
+#define        R1BIO_SyncPhase 2
 #endif
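
The reworked r1bio keeps the original request in master_bio and fans writes out through write_bios[], one slot per mirror, with 'remaining' counting the outstanding clones. A rough sketch of the write path under the new scheme (helper and field names such as bio_clone, operational and raid1_end_request are assumptions, not taken from this patch):

	/* inside a hypothetical make_request path: clone the
	 * master bio once per operational mirror */
	int i;

	for (i = 0; i < conf->raid_disks; i++) {
		struct bio *mbio;

		if (!conf->mirrors[i].operational)
			continue;
		mbio = bio_clone(r1_bio->master_bio, GFP_NOIO);
		mbio->bi_dev = conf->mirrors[i].dev;
		mbio->bi_end_io = raid1_end_request;
		mbio->bi_private = r1_bio;
		r1_bio->write_bios[i] = mbio;
		atomic_inc(&r1_bio->remaining);
	}
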
index d34fdd7ae7f8f0ea0c05daf4a1b6e6cfa4d09e1b..e6a94292c2b41cc18c386c0e5e6df5d7cc6859cd 100644 (file)
 #include <linux/nfs_fs.h>
 #include <linux/nfs_fs_sb.h>
 #include <linux/nfs_mount.h>
+#include <linux/minix_fs.h>
+#include <linux/ext2_fs.h>
+#include <linux/romfs_fs.h>
 
 #include <asm/uaccess.h>
 
-/* syscalls missing from unistd.h */
-static inline _syscall2(int,mkdir,char *,name,int,mode);
-static inline _syscall1(int,chdir,char *,name);
-static inline _syscall1(int,chroot,char *,name);
-static inline _syscall1(int,unlink,char *,name);
-static inline _syscall3(int,mknod,char *,name,int,mode,dev_t,dev);
-static inline _syscall5(int,mount,char *,dev,char *,dir,char *,type,
-                       unsigned long,flags,void *,data);
-static inline _syscall2(int,umount,char *,name,int,flags);
-
-extern void rd_load(void);
-extern void initrd_load(void);
+#define BUILD_CRAMDISK
+
 extern int get_filesystem_list(char * buf);
 extern void wait_for_keypress(void);
 
-asmlinkage long sys_mount(char * dev_name, char * dir_name, char * type,
-        unsigned long flags, void * data);
+asmlinkage long sys_mount(char *dev_name, char *dir_name, char *type,
+        unsigned long flags, void *data);
+asmlinkage long sys_mkdir(char *name, int mode);
+asmlinkage long sys_chdir(char *name);
+asmlinkage long sys_chroot(char *name);
+asmlinkage long sys_unlink(char *name);
+asmlinkage long sys_symlink(char *old, char *new);
+asmlinkage long sys_mknod(char *name, int mode, dev_t dev);
+asmlinkage long sys_umount(char *name, int flags);
+asmlinkage long sys_ioctl(int fd, int cmd, unsigned long arg);
 
 #ifdef CONFIG_BLK_DEV_INITRD
 unsigned int real_root_dev;    /* do_proc_dointvec cannot handle kdev_t */
 #endif
-int root_mountflags = MS_RDONLY;
-char root_device_name[64];
+#ifdef CONFIG_BLK_DEV_RAM
+extern int rd_doload;
+#else
+static int rd_doload = 0;
+#endif
+int root_mountflags = MS_RDONLY | MS_VERBOSE;
+static char root_device_name[64];
 
 /* this is initialized in init/main.c */
 kdev_t ROOT_DEV;
 
+static int do_devfs = 0;
+
 static int __init readonly(char *str)
 {
        if (*str)
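
With the _syscallN stubs gone, early init declares the in-kernel entry points and calls them directly; unlike the stubs, the sys_*() functions return the error code itself rather than setting errno. An illustrative fragment in the pattern the rest of the file follows (the warning text is made up):

	if (sys_mkdir("/dev", 0700) < 0)
		printk(KERN_WARNING "could not create /dev\n");
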
@@ -275,91 +282,20 @@ static void __init get_fs_names(char *page)
        }
        *s = '\0';
 }
-
-static void __init mount_root(void)
+static void __init mount_block_root(char *name, int flags)
 {
-       void *handle;
-       char path[64];
-       char *name = "/dev/root";
-       char *fs_names, *p;
-       int do_devfs = 0;
+       char *fs_names = __getname();
+       char *p;
 
-       root_mountflags |= MS_VERBOSE;
-
-       fs_names = __getname();
        get_fs_names(fs_names);
-
-#ifdef CONFIG_ROOT_NFS
-       if (MAJOR(ROOT_DEV) == UNNAMED_MAJOR) {
-               void *data;
-               data = nfs_root_data();
-               if (data) {
-                       int err = mount("/dev/root", "/root", "nfs", root_mountflags, data);
-                       if (!err)
-                               goto done;
-               }
-               printk(KERN_ERR "VFS: Unable to mount root fs via NFS, trying floppy.\n");
-               ROOT_DEV = MKDEV(FLOPPY_MAJOR, 0);
-       }
-#endif
-
-#ifdef CONFIG_BLK_DEV_FD
-       if (MAJOR(ROOT_DEV) == FLOPPY_MAJOR) {
-#ifdef CONFIG_BLK_DEV_RAM
-               extern int rd_doload;
-               extern void rd_load_secondary(void);
-#endif
-               floppy_eject();
-#ifndef CONFIG_BLK_DEV_RAM
-               printk(KERN_NOTICE "(Warning, this kernel has no ramdisk support)\n");
-#else
-               /* rd_doload is 2 for a dual initrd/ramload setup */
-               if(rd_doload==2)
-                       rd_load_secondary();
-               else
-#endif
-               {
-                       printk(KERN_NOTICE "VFS: Insert root floppy and press ENTER\n");
-                       wait_for_keypress();
-               }
-       }
-#endif
-
-       devfs_make_root (root_device_name);
-       handle = devfs_find_handle (NULL, ROOT_DEVICE_NAME,
-                                   MAJOR (ROOT_DEV), MINOR (ROOT_DEV),
-                                   DEVFS_SPECIAL_BLK, 1);
-       if (handle) {
-               int n;
-               unsigned major, minor;
-
-               devfs_get_maj_min (handle, &major, &minor);
-               ROOT_DEV = MKDEV (major, minor);
-               if (!ROOT_DEV)
-                       panic("I have no root and I want to scream");
-               n = devfs_generate_path (handle, path + 5, sizeof (path) - 5);
-               if (n >= 0) {
-                       name = path + n;
-                       devfs_mk_symlink (NULL, "root", DEVFS_FL_DEFAULT,
-                                         name + 5, NULL, NULL);
-                       memcpy (name, "/dev/", 5);
-                       do_devfs = 1;
-               }
-       }
-       chdir("/dev");
-       unlink("root");
-       mknod("root", S_IFBLK|0600, kdev_t_to_nr(ROOT_DEV));
-       if (do_devfs)
-               mount("devfs", ".", "devfs", 0, NULL);
 retry:
        for (p = fs_names; *p; p += strlen(p)+1) {
-               int err;
-               err = sys_mount(name,"/root",p,root_mountflags,root_mount_data);
+               int err = sys_mount(name, "/root", p, flags, root_mount_data);
                switch (err) {
                        case 0:
-                               goto done;
+                               goto out;
                        case -EACCES:
-                               root_mountflags |= MS_RDONLY;
+                               flags |= MS_RDONLY;
                                goto retry;
                        case -EINVAL:
                                continue;
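
In the retry loop above, the two error codes drive different fallbacks: -EINVAL means the superblock did not match this filesystem type, so the loop moves on to the next name, while -EACCES means the device refused a read-write mount, so the whole list is retried read-only. Annotated restatement (comments added; code as in the patch):

	case -EACCES:           /* device refused a read-write mount */
		flags |= MS_RDONLY;
		goto retry;     /* restart the whole list read-only */
	case -EINVAL:           /* not this filesystem type */
		continue;       /* try the next name */
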
@@ -375,94 +311,324 @@ retry:
                        kdevname(ROOT_DEV));
        }
        panic("VFS: Unable to mount root fs on %s", kdevname(ROOT_DEV));
-
-done:
+out:
        putname(fs_names);
-       if (do_devfs)
-               umount(".", 0);
+       sys_chdir("/root");
+       ROOT_DEV = current->fs->pwdmnt->mnt_sb->s_dev;
+       printk("VFS: Mounted root (%s filesystem)%s.\n",
+               current->fs->pwdmnt->mnt_sb->s_type->name,
+               (current->fs->pwdmnt->mnt_sb->s_flags & MS_RDONLY) ? " readonly" : "");
 }
+#ifdef CONFIG_ROOT_NFS
+static int __init mount_nfs_root(void)
+{
+       void *data = nfs_root_data();
 
-#ifdef CONFIG_BLK_DEV_INITRD
+       if (data && sys_mount("/dev/root","/root","nfs",root_mountflags,data) == 0)
+               return 1;
+       return 0;
+}
+#endif
 
-static int __init change_root(kdev_t new_root_dev,const char *put_old)
+static int __init create_dev(char *name, kdev_t dev, char *devfs_name)
+{
+       void *handle;
+       char path[64];
+       int n;
+
+       sys_unlink(name);
+       if (!do_devfs)
+               return sys_mknod(name, S_IFBLK|0600, kdev_t_to_nr(dev));
+
+       handle = devfs_find_handle(NULL, dev ? NULL : devfs_name,
+                               MAJOR(dev), MINOR(dev), DEVFS_SPECIAL_BLK, 1);
+       if (!handle)
+               return -1;
+       n = devfs_generate_path(handle, path + 5, sizeof (path) - 5);
+       if (n < 0)
+               return -1;
+       return sys_symlink(path + n + 5, name);
+}
+
+#ifdef CONFIG_MAC_FLOPPY
+int swim3_fd_eject(int devnum);
+#endif
+static void __init change_floppy(char *fmt, ...)
 {
-       struct vfsmount *old_rootmnt;
-       struct nameidata devfs_nd;
-       char *new_devname = kmalloc(strlen("/dev/root.old")+1, GFP_KERNEL);
-       int error = 0;
-
-       if (new_devname)
-               strcpy(new_devname, "/dev/root.old");
-
-       /* .. here is directory mounted over root */
-       mount("..", ".", NULL, MS_MOVE, NULL);
-       chdir("/old");
-
-       read_lock(&current->fs->lock);
-       old_rootmnt = mntget(current->fs->pwdmnt);
-       read_unlock(&current->fs->lock);
-
-       /*  First unmount devfs if mounted  */
-       if (path_init("/old/dev", LOOKUP_FOLLOW|LOOKUP_POSITIVE, &devfs_nd))
-               error = path_walk("/old/dev", &devfs_nd);
-       if (!error) {
-               if (devfs_nd.mnt->mnt_sb->s_magic == DEVFS_SUPER_MAGIC &&
-                   devfs_nd.dentry == devfs_nd.mnt->mnt_root)
-                       umount("/old/dev", 0);
-               path_release(&devfs_nd);
+       extern void wait_for_keypress(void);
+       char buf[80];
+       va_list args;
+       va_start(args, fmt);
+       vsprintf(buf, fmt, args);
+       va_end(args);
+#ifdef CONFIG_BLK_DEV_FD
+       floppy_eject();
+#endif
+#ifdef CONFIG_MAC_FLOPPY
+       swim3_fd_eject(MINOR(ROOT_DEV));
+#endif
+       printk(KERN_NOTICE "VFS: Insert %s and press ENTER\n", buf);
+       wait_for_keypress();
+}
+
+#ifdef CONFIG_BLK_DEV_RAM
+
+static int __init crd_load(int in_fd, int out_fd);
+
+/*
+ * This routine tries to find a RAM disk image to load, and returns the
+ * number of blocks to read for a non-compressed image, 0 if the image
+ * is a compressed image, and -1 if an image with the right magic
+ * numbers could not be found.
+ *
+ * We currently check for the following magic numbers:
+ *     minix
+ *     ext2
+ *     romfs
+ *     gzip
+ */
+static int __init 
+identify_ramdisk_image(int fd, int start_block)
+{
+       const int size = 512;
+       struct minix_super_block *minixsb;
+       struct ext2_super_block *ext2sb;
+       struct romfs_super_block *romfsb;
+       int nblocks = -1;
+       unsigned char *buf;
+
+       buf = kmalloc(size, GFP_KERNEL);
+       if (buf == 0)
+               return -1;
+
+       minixsb = (struct minix_super_block *) buf;
+       ext2sb = (struct ext2_super_block *) buf;
+       romfsb = (struct romfs_super_block *) buf;
+       memset(buf, 0xe5, size);
+
+       /*
+        * Read block 0 to test for gzipped kernel
+        */
+       lseek(fd, start_block * BLOCK_SIZE, 0);
+       read(fd, buf, size);
+
+       /*
+        * If it matches the gzip magic numbers, return -1
+        */
+       if (buf[0] == 037 && ((buf[1] == 0213) || (buf[1] == 0236))) {
+               printk(KERN_NOTICE
+                      "RAMDISK: Compressed image found at block %d\n",
+                      start_block);
+               nblocks = 0;
+               goto done;
        }
 
-       ROOT_DEV = new_root_dev;
-       mount_root();
+       /* romfs is at block zero too */
+       if (romfsb->word0 == ROMSB_WORD0 &&
+           romfsb->word1 == ROMSB_WORD1) {
+               printk(KERN_NOTICE
+                      "RAMDISK: romfs filesystem found at block %d\n",
+                      start_block);
+               nblocks = (ntohl(romfsb->size)+BLOCK_SIZE-1)>>BLOCK_SIZE_BITS;
+               goto done;
+       }
 
-       chdir("/root");
-       ROOT_DEV = current->fs->pwdmnt->mnt_sb->s_dev;
-       printk("VFS: Mounted root (%s filesystem)%s.\n",
-               current->fs->pwdmnt->mnt_sb->s_type->name,
-               (current->fs->pwdmnt->mnt_sb->s_flags & MS_RDONLY) ? " readonly" : "");
+       /*
+        * Read block 1 to test for minix and ext2 superblock
+        */
+       lseek(fd, (start_block+1) * BLOCK_SIZE, 0);
+       read(fd, buf, size);
+
+       /* Try minix */
+       if (minixsb->s_magic == MINIX_SUPER_MAGIC ||
+           minixsb->s_magic == MINIX_SUPER_MAGIC2) {
+               printk(KERN_NOTICE
+                      "RAMDISK: Minix filesystem found at block %d\n",
+                      start_block);
+               nblocks = minixsb->s_nzones << minixsb->s_log_zone_size;
+               goto done;
+       }
+
+       /* Try ext2 */
+       if (ext2sb->s_magic == cpu_to_le16(EXT2_SUPER_MAGIC)) {
+               printk(KERN_NOTICE
+                      "RAMDISK: ext2 filesystem found at block %d\n",
+                      start_block);
+               nblocks = le32_to_cpu(ext2sb->s_blocks_count);
+               goto done;
+       }
 
-#if 1
-       shrink_dcache();
-       printk("change_root: old root has d_count=%d\n", 
-              atomic_read(&old_rootmnt->mnt_root->d_count));
+       printk(KERN_NOTICE
+              "RAMDISK: Couldn't find valid RAM disk image starting at %d.\n",
+              start_block);
+       
+done:
+       lseek(fd, start_block * BLOCK_SIZE, 0);
+       kfree(buf);
+       return nblocks;
+}
 #endif
 
-       error = mount("/old", "/root/initrd", NULL, MS_MOVE, NULL);
-       if (error) {
-               int blivet;
-               struct block_device *ramdisk = old_rootmnt->mnt_sb->s_bdev;
-
-               atomic_inc(&ramdisk->bd_count);
-               blivet = blkdev_get(ramdisk, FMODE_READ, 0, BDEV_FS);
-               printk(KERN_NOTICE "Trying to unmount old root ... ");
-               umount("/old", MNT_DETACH);
-               if (!blivet) {
-                       blivet = ioctl_by_bdev(ramdisk, BLKFLSBUF, 0);
-                       blkdev_put(ramdisk, BDEV_FS);
-               }
-               if (blivet) {
-                       printk(KERN_ERR "error %d\n", blivet);
-               } else {
-                       printk("okay\n");
-                       error = 0;
+static int __init rd_load_image(char *from)
+{
+       int res = 0;
+
+#ifdef CONFIG_BLK_DEV_RAM
+       int in_fd, out_fd;
+       int nblocks, rd_blocks, devblocks, i;
+       char *buf;
+       unsigned short rotate = 0;
+#if !defined(CONFIG_ARCH_S390) && !defined(CONFIG_PPC_ISERIES)
+       char rotator[4] = { '|' , '/' , '-' , '\\' };
+#endif
+
+       out_fd = open("/dev/ram", O_RDWR, 0);
+       if (out_fd < 0)
+               goto out;
+
+       in_fd = open(from, O_RDONLY, 0);
+       if (in_fd < 0)
+               goto noclose_input;
+
+       nblocks = identify_ramdisk_image(in_fd, rd_image_start);
+       if (nblocks < 0)
+               goto done;
+
+       if (nblocks == 0) {
+#ifdef BUILD_CRAMDISK
+               if (crd_load(in_fd, out_fd) == 0)
+                       goto successful_load;
+#else
+               printk(KERN_NOTICE
+                      "RAMDISK: Kernel does not support compressed "
+                      "RAM disk images\n");
+#endif
+               goto done;
+       }
+
+       /*
+        * NOTE NOTE: nblocks assumes that the blocksize is BLOCK_SIZE, so
+        * rd_load_image will only work with filesystems whose block size
+        * is BLOCK_SIZE! So make sure to use a 1k blocksize while
+        * generating ext2fs ramdisk-images.
+        */
+       if (sys_ioctl(out_fd, BLKGETSIZE, (unsigned long)&rd_blocks) < 0)
+               rd_blocks = 0;
+       else
+               rd_blocks >>= 1;
+
+       if (nblocks > rd_blocks) {
+               printk("RAMDISK: image too big! (%d/%d blocks)\n",
+                      nblocks, rd_blocks);
+               goto done;
+       }
+               
+       /*
+        * OK, time to copy in the data
+        */
+       buf = kmalloc(BLOCK_SIZE, GFP_KERNEL);
+       if (buf == 0) {
+               printk(KERN_ERR "RAMDISK: could not allocate buffer\n");
+               goto done;
+       }
+
+       if (sys_ioctl(in_fd, BLKGETSIZE, (unsigned long)&devblocks) < 0)
+               devblocks = 0;
+       else
+               devblocks >>= 1;
+
+       if (strcmp(from, "/dev/initrd") == 0)
+               devblocks = nblocks;
+
+       if (devblocks == 0) {
+               printk(KERN_ERR "RAMDISK: could not determine device size\n");
+               goto done;
+       }
+
+       printk(KERN_NOTICE "RAMDISK: Loading %d blocks [%d disk%s] into ram disk... ", 
+               nblocks, ((nblocks-1)/devblocks)+1, nblocks>devblocks ? "s" : "");
+       for (i=0; i < nblocks; i++) {
+               if (i && (i % devblocks == 0)) {
+                       printk("done disk #%d.\n", i/devblocks);
+                       rotate = 0;
+                       if (close(in_fd)) {
+                               printk("Error closing the disk.\n");
+                               goto noclose_input;
+                       }
+                       change_floppy("disk #%d", i/devblocks+1);
+                       in_fd = open(from, O_RDONLY, 0);
+                       if (in_fd < 0)  {
+                               printk("Error opening disk.\n");
+                               goto noclose_input;
+                       }
+                       printk("Loading disk #%d... ", i/devblocks+1);
                }
-       } else {
-               spin_lock(&dcache_lock);
-               if (new_devname) {
-                       void *p = old_rootmnt->mnt_devname;
-                       old_rootmnt->mnt_devname = new_devname;
-                       new_devname = p;
+               read(in_fd, buf, BLOCK_SIZE);
+               write(out_fd, buf, BLOCK_SIZE);
+#if !defined(CONFIG_ARCH_S390) && !defined(CONFIG_PPC_ISERIES)
+               if (!(i % 16)) {
+                       printk("%c\b", rotator[rotate & 0x3]);
+                       rotate++;
                }
-               spin_unlock(&dcache_lock);
+#endif
        }
+       printk("done.\n");
+       kfree(buf);
 
-       /* put the old stuff */
-       mntput(old_rootmnt);
-       kfree(new_devname);
-       return error;
+successful_load:
+       res = 1;
+done:
+       close(in_fd);
+noclose_input:
+       close(out_fd);
+out:
+       sys_unlink("/dev/ram");
+#endif
+       return res;
+}
+
+static int __init rd_load_disk(int n)
+{
+#ifdef CONFIG_BLK_DEV_RAM
+       extern int rd_prompt;
+       if (rd_prompt)
+               change_floppy("root floppy disk to be loaded into RAM disk");
+       create_dev("/dev/ram", MKDEV(RAMDISK_MAJOR, n), NULL);
+#endif
+       return rd_load_image("/dev/root");
 }
 
+static void __init mount_root(void)
+{
+#ifdef CONFIG_ROOT_NFS
+       if (MAJOR(ROOT_DEV) == UNNAMED_MAJOR) {
+               if (mount_nfs_root()) {
+                       sys_chdir("/root");
+                       ROOT_DEV = current->fs->pwdmnt->mnt_sb->s_dev;
+                       printk("VFS: Mounted root (nfs filesystem).\n");
+                       return;
+               }
+               printk(KERN_ERR "VFS: Unable to mount root fs via NFS, trying floppy.\n");
+               ROOT_DEV = MKDEV(FLOPPY_MAJOR, 0);
+       }
 #endif
+       devfs_make_root(root_device_name);
+       create_dev("/dev/root", ROOT_DEV, root_device_name);
+#ifdef CONFIG_BLK_DEV_FD
+       if (MAJOR(ROOT_DEV) == FLOPPY_MAJOR) {
+               /* rd_doload is 2 for a dual initrd/ramload setup */
+               if (rd_doload==2) {
+                       if (rd_load_disk(1)) {
+                               ROOT_DEV = MKDEV(RAMDISK_MAJOR, 1);
+                               create_dev("/dev/root", ROOT_DEV, NULL);
+                       }
+               } else
+                       change_floppy("root floppy");
+       }
+#endif
+       mount_block_root("/dev/root", root_mountflags);
+}
 
 #ifdef CONFIG_BLK_DEV_INITRD
 static int do_linuxrc(void * shell)
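
identify_ramdisk_image() above probes by magic number alone, reading 512 bytes at the image's first block (gzip, romfs) and at its second (minix, ext2). The octal constants in the gzip test decode as follows (annotation only):

	/* 037 == 0x1f, 0213 == 0x8b : standard gzip member header
	 * 037 == 0x1f, 0236 == 0x9e : the old gzip format */
	if (buf[0] == 037 && ((buf[1] == 0213) || (buf[1] == 0236)))
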
@@ -470,9 +636,9 @@ static int do_linuxrc(void * shell)
        static char *argv[] = { "linuxrc", NULL, };
        extern char * envp_init[];
 
-       chdir("/root");
-       mount(".", "/", NULL, MS_MOVE, NULL);
-       chroot(".");
+       sys_chdir("/root");
+       sys_mount(".", "/", NULL, MS_MOVE, NULL);
+       sys_chroot(".");
 
        mount_devfs_fs ();
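
The three calls above are the idiom for pivoting into a mounted root: make the new root the working directory, splice it over / with MS_MOVE, then chroot into it. The same sequence closes prepare_namespace() below. Annotated (comments added; code as in the patch):

	sys_chdir("/root");                        /* cwd = the new root */
	sys_mount(".", "/", NULL, MS_MOVE, NULL);  /* splice it over / */
	sys_chroot(".");                           /* re-anchor / at the cwd */
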
 
@@ -486,76 +652,247 @@ static int do_linuxrc(void * shell)
 
 #endif
 
+static void __init handle_initrd(void)
+{
+#ifdef CONFIG_BLK_DEV_INITRD
+       int ram0 = kdev_t_to_nr(MKDEV(RAMDISK_MAJOR,0));
+       int error;
+       int i, pid;
+
+       create_dev("/dev/root.old", ram0, NULL);
+       mount_block_root("/dev/root.old", root_mountflags & ~MS_RDONLY);
+       sys_mkdir("/old", 0700);
+       sys_chdir("/old");
+
+       pid = kernel_thread(do_linuxrc, "/linuxrc", SIGCHLD);
+       if (pid > 0) {
+               while (pid != wait(&i)) {
+                       current->policy |= SCHED_YIELD;
+                       schedule();
+               }
+       }
+
+       sys_mount("..", ".", NULL, MS_MOVE, NULL);
+       sys_umount("/old/dev", 0);
+
+       if (real_root_dev == ram0) {
+               sys_chdir("/old");
+               return;
+       }
+
+       ROOT_DEV = real_root_dev;
+       mount_root();
+
+       printk(KERN_NOTICE "Trying to move old root to /initrd ... ");
+       error = sys_mount("/old", "/root/initrd", NULL, MS_MOVE, NULL);
+       if (!error)
+               printk("okay\n");
+       else {
+               int fd = open("/dev/root.old", O_RDWR, 0);
+               printk("failed\n");
+               printk(KERN_NOTICE "Unmounting old root\n");
+               sys_umount("/old", MNT_DETACH);
+               printk(KERN_NOTICE "Trying to free ramdisk memory ... ");
+               if (fd < 0) {
+                       error = fd;
+               } else {
+                       error = sys_ioctl(fd, BLKFLSBUF, 0);
+                       close(fd);
+               }
+               printk(!error ? "okay\n" : "failed\n");
+       }
+#endif
+}
+
+static int __init initrd_load(void)
+{
+#ifdef CONFIG_BLK_DEV_INITRD
+       create_dev("/dev/ram", MKDEV(RAMDISK_MAJOR, 0), NULL);
+       create_dev("/dev/initrd", MKDEV(RAMDISK_MAJOR, INITRD_MINOR), NULL);
+#endif
+       return rd_load_image("/dev/initrd");
+}
+
 /*
  * Prepare the namespace - decide what/where to mount, load ramdisks, etc.
  */
 void prepare_namespace(void)
 {
+       int do_initrd = 0;
+       int is_floppy = MAJOR(ROOT_DEV) == FLOPPY_MAJOR;
 #ifdef CONFIG_BLK_DEV_INITRD
-       int real_root_mountflags = root_mountflags;
        if (!initrd_start)
                mount_initrd = 0;
        if (mount_initrd)
-               root_mountflags &= ~MS_RDONLY;
+               do_initrd = 1;
        real_root_dev = ROOT_DEV;
 #endif
-       mkdir("/dev", 0700);
-       mkdir("/root", 0700);
-
-#ifdef CONFIG_BLK_DEV_RAM
-#ifdef CONFIG_BLK_DEV_INITRD
-       if (mount_initrd)
-               initrd_load();
-       else
-#endif
-       rd_load();
+       sys_mkdir("/dev", 0700);
+       sys_mkdir("/root", 0700);
+#ifdef CONFIG_DEVFS_FS
+       sys_mount("devfs", "/dev", "devfs", 0, NULL);
+       do_devfs = 1;
 #endif
 
-       /* Mount the root filesystem.. */
+       create_dev("/dev/root", ROOT_DEV, NULL);
+       if (do_initrd) {
+               if (initrd_load() && ROOT_DEV != MKDEV(RAMDISK_MAJOR, 0)) {
+                       handle_initrd();
+                       goto out;
+               }
+       } else if (is_floppy && rd_doload && rd_load_disk(0))
+               ROOT_DEV = MKDEV(RAMDISK_MAJOR, 0);
        mount_root();
-       chdir("/root");
-       ROOT_DEV = current->fs->pwdmnt->mnt_sb->s_dev;
-       printk("VFS: Mounted root (%s filesystem)%s.\n",
-               current->fs->pwdmnt->mnt_sb->s_type->name,
-               (current->fs->pwdmnt->mnt_sb->s_flags & MS_RDONLY) ? " readonly" : "");
+out:
+       sys_umount("/dev", 0);
+       sys_mount(".", "/", NULL, MS_MOVE, NULL);
+       sys_chroot(".");
+       mount_devfs_fs ();
+}
 
-#ifdef CONFIG_BLK_DEV_INITRD
-       root_mountflags = real_root_mountflags;
-       if (mount_initrd && ROOT_DEV != real_root_dev
-           && MAJOR(ROOT_DEV) == RAMDISK_MAJOR && MINOR(ROOT_DEV) == 0) {
-               int error;
-               int i, pid;
-               mkdir("/old", 0700);
-               chdir("/old");
-
-               pid = kernel_thread(do_linuxrc, "/linuxrc", SIGCHLD);
-               if (pid > 0) {
-                       while (pid != wait(&i)) {
-                               current->policy |= SCHED_YIELD;
-                               schedule();
-                       }
-               }
-               if (MAJOR(real_root_dev) != RAMDISK_MAJOR
-                    || MINOR(real_root_dev) != 0) {
-                       error = change_root(real_root_dev,"/initrd");
-                       if (error)
-                               printk(KERN_ERR "Change root to /initrd: "
-                                   "error %d\n",error);
-
-                       chdir("/root");
-                       mount(".", "/", NULL, MS_MOVE, NULL);
-                       chroot(".");
-
-                       mount_devfs_fs ();
-                       return;
-               }
-               chroot("..");
-               chdir("/");
-               return;
-       }
+#ifdef BUILD_CRAMDISK
+
+/*
+ * gzip declarations
+ */
+
+#define OF(args)  args
+
+#ifndef memzero
+#define memzero(s, n)     memset ((s), 0, (n))
 #endif
-       mount(".", "/", NULL, MS_MOVE, NULL);
-       chroot(".");
 
-       mount_devfs_fs ();
+typedef unsigned char  uch;
+typedef unsigned short ush;
+typedef unsigned long  ulg;
+
+#define INBUFSIZ 4096
+#define WSIZE 0x8000    /* window size--must be a power of two, and */
+                       /*  at least 32K for zip's deflate method */
+
+static uch *inbuf;
+static uch *window;
+
+static unsigned insize;  /* valid bytes in inbuf */
+static unsigned inptr;   /* index of next byte to be processed in inbuf */
+static unsigned outcnt;  /* bytes in output buffer */
+static int exit_code;
+static long bytes_out;
+static int crd_infd, crd_outfd;
+
+#define get_byte()  (inptr < insize ? inbuf[inptr++] : fill_inbuf())
+               
+/* Diagnostic functions (stubbed out) */
+#define Assert(cond,msg)
+#define Trace(x)
+#define Tracev(x)
+#define Tracevv(x)
+#define Tracec(c,x)
+#define Tracecv(c,x)
+
+#define STATIC static
+
+static int  fill_inbuf(void);
+static void flush_window(void);
+static void *malloc(int size);
+static void free(void *where);
+static void error(char *m);
+static void gzip_mark(void **);
+static void gzip_release(void **);
+
+#include "../lib/inflate.c"
+
+static void __init *malloc(int size)
+{
+       return kmalloc(size, GFP_KERNEL);
+}
+
+static void __init free(void *where)
+{
+       kfree(where);
+}
+
+static void __init gzip_mark(void **ptr)
+{
 }
+
+static void __init gzip_release(void **ptr)
+{
+}
+
+
+/* ===========================================================================
+ * Fill the input buffer. This is called only when the buffer is empty
+ * and at least one byte is really needed.
+ */
+static int __init fill_inbuf(void)
+{
+       if (exit_code) return -1;
+       
+       insize = read(crd_infd, inbuf, INBUFSIZ);
+       if (insize == 0) return -1;
+
+       inptr = 1;
+
+       return inbuf[0];
+}
+
+/* ===========================================================================
+ * Write the output window window[0..outcnt-1] and update crc and bytes_out.
+ * (Used for the decompressed data only.)
+ */
+static void __init flush_window(void)
+{
+    ulg c = crc;         /* temporary variable */
+    unsigned n;
+    uch *in, ch;
+    
+    write(crd_outfd, window, outcnt);
+    in = window;
+    for (n = 0; n < outcnt; n++) {
+           ch = *in++;
+           c = crc_32_tab[((int)c ^ ch) & 0xff] ^ (c >> 8);
+    }
+    crc = c;
+    bytes_out += (ulg)outcnt;
+    outcnt = 0;
+}
+
+static void __init error(char *x)
+{
+       printk(KERN_ERR "%s", x);
+       exit_code = 1;
+}
+
+static int __init crd_load(int in_fd, int out_fd)
+{
+       int result;
+
+       insize = 0;             /* valid bytes in inbuf */
+       inptr = 0;              /* index of next byte to be processed in inbuf */
+       outcnt = 0;             /* bytes in output buffer */
+       exit_code = 0;
+       bytes_out = 0;
+       crc = (ulg)0xffffffffL; /* shift register contents */
+
+       crd_infd = in_fd;
+       crd_outfd = out_fd;
+       inbuf = kmalloc(INBUFSIZ, GFP_KERNEL);
+       if (inbuf == 0) {
+               printk(KERN_ERR "RAMDISK: Couldn't allocate gzip buffer\n");
+               return -1;
+       }
+       window = kmalloc(WSIZE, GFP_KERNEL);
+       if (window == 0) {
+               printk(KERN_ERR "RAMDISK: Couldn't allocate gzip window\n");
+               kfree(inbuf);
+               return -1;
+       }
+       makecrc();
+       result = gunzip();
+       kfree(inbuf);
+       kfree(window);
+       return result;
+}
+
+#endif  /* BUILD_CRAMDISK */
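
The decompressor reuses ../lib/inflate.c, which is written against a small hook contract rather than a stream API: get_byte()/fill_inbuf() supply input one buffer at a time, and flush_window() drains each 32K output window while updating the crc. crd_load() therefore only has to point the hooks at two file descriptors and call gunzip(). The contract, in brief (descriptive restatement, not new code):

	insize = read(crd_infd, inbuf, INBUFSIZ);  /* refill input on demand */
	write(crd_outfd, window, outcnt);          /* drain output, update crc */
	result = gunzip();                         /* 0 on success */
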
index cd99761be47571c1c64bd5f4b995b5d537be1e3f..1315130e918fe602c89e6ffded716b5d4d6c6314 100644 (file)
@@ -1221,8 +1221,10 @@ static int do_no_page(struct mm_struct * mm, struct vm_area_struct * vma,
         */
        if (write_access && !(vma->vm_flags & VM_SHARED)) {
                struct page * page = alloc_page(GFP_HIGHUSER);
-               if (!page)
+               if (!page) {
+                       page_cache_release(new_page);
                        return -1;
+               }
                copy_highpage(page, new_page);
                page_cache_release(new_page);
                lru_cache_add(page);
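
The fix above closes a page leak on the out-of-memory path: the ->nopage handler returned new_page with a reference held, and previously only the success path dropped that reference (via the page_cache_release() after copy_highpage()). The pairing, in brief (annotation only):

	new_page = vma->vm_ops->nopage(...);  /* returns with a reference held */
	page_cache_release(new_page);         /* every exit path must drop it */
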
index 8116cac13cf41b706da34bf90e3522d75bb3d917..0c0bf99965ca11b449a9f714e3b5f7646e212873 100644 (file)
@@ -1,9 +1,9 @@
 /*
  *  linux/mm/mempool.c
  *
- *  memory buffer pool support. Such pools are mostly used to
- *  guarantee deadlock-free IO operations even during extreme
- *  VM load.
+ *  memory buffer pool support. Such pools are mostly used
+ *  for guaranteed, deadlock-free memory allocations during
+ *  extreme VM load.
  *
  *  started by Ingo Molnar, Copyright (C) 2001
  */
@@ -74,6 +74,71 @@ mempool_t * mempool_create(int min_nr, mempool_alloc_t *alloc_fn,
        return pool;
 }
 
+/**
+ * mempool_resize - resize an existing memory pool
+ * @pool:       pointer to the memory pool which was allocated via
+ *              mempool_create().
+ * @new_min_nr: the new minimum number of elements guaranteed to be
+ *              allocated for this pool.
+ * @gfp_mask:   the usual allocation bitmask.
+ *
+ * This function shrinks/grows the pool. In the case of growing,
+ * it cannot be guaranteed that the pool will be grown to the new
+ * size immediately, but new mempool_free() calls will refill it.
+ *
+ * Note that the caller must guarantee that no mempool_destroy() call is
+ * made while this function is running. mempool_alloc() and mempool_free()
+ * might be called (e.g. from IRQ contexts) while this function executes.
+ */
+void mempool_resize(mempool_t *pool, int new_min_nr, int gfp_mask)
+{
+       int delta;
+       void *element;
+       unsigned long flags;
+       struct list_head *tmp;
+
+       if (new_min_nr <= 0)
+               BUG();
+
+       spin_lock_irqsave(&pool->lock, flags);
+       if (new_min_nr < pool->min_nr) {
+               pool->min_nr = new_min_nr;
+               /*
+                * Free possible excess elements.
+                */
+               while (pool->curr_nr > pool->min_nr) {
+                       tmp = pool->elements.next;
+                       if (tmp == &pool->elements)
+                               BUG();
+                       list_del(tmp);
+                       element = tmp;
+                       pool->curr_nr--;
+                       spin_unlock_irqrestore(&pool->lock, flags);
+
+                       pool->free(element, pool->pool_data);
+
+                       spin_lock_irqsave(&pool->lock, flags);
+               }
+               spin_unlock_irqrestore(&pool->lock, flags);
+               return;
+       }
+       delta = new_min_nr - pool->min_nr;
+       pool->min_nr = new_min_nr;
+       spin_unlock_irqrestore(&pool->lock, flags);
+
+       /*
+        * We refill the pool up to the new threshold - but we don't
+        * (cannot) guarantee that the refill succeeds.
+        */
+       while (delta) {
+               element = pool->alloc(gfp_mask, pool->pool_data);
+               if (!element)
+                       break;
+               mempool_free(element, pool);
+               delta--;
+       }
+}
+
 /**
  * mempool_destroy - deallocate a memory pool
  * @pool:      pointer to the memory pool which was allocated via
@@ -110,7 +175,7 @@ void mempool_destroy(mempool_t *pool)
  * @gfp_mask:  the usual allocation bitmask.
  *
  * this function only sleeps if the alloc_fn function sleeps or
- * returns NULL. Note that due to preallocation guarantees this function
+ * returns NULL. Note that due to preallocation, this function
  * *never* fails.
  */
 void * mempool_alloc(mempool_t *pool, int gfp_mask)
@@ -175,7 +240,7 @@ repeat_alloc:
 
 /**
  * mempool_free - return an element to the pool.
- * @gfp_mask:  pool element pointer.
+ * @element:   pool element pointer.
  * @pool:      pointer to the memory pool which was allocated via
  *             mempool_create().
  *
@@ -200,6 +265,7 @@ void mempool_free(void *element, mempool_t *pool)
 }
 
 EXPORT_SYMBOL(mempool_create);
+EXPORT_SYMBOL(mempool_resize);
 EXPORT_SYMBOL(mempool_destroy);
 EXPORT_SYMBOL(mempool_alloc);
 EXPORT_SYMBOL(mempool_free);
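
Taken together with the existing exports, the new call rounds out the pool lifecycle. A hypothetical caller (my_alloc/my_free and the sizes are illustrative, not from the patch):

	static void *my_alloc(int gfp_mask, void *pool_data)
	{
		return kmalloc(256, gfp_mask);
	}

	static void my_free(void *element, void *pool_data)
	{
		kfree(element);
	}

	static int example(void)
	{
		mempool_t *pool = mempool_create(32, my_alloc, my_free, NULL);

		if (!pool)
			return -ENOMEM;
		mempool_resize(pool, 128, GFP_KERNEL); /* best-effort grow */
		/* ... mempool_alloc()/mempool_free() as usual ... */
		mempool_resize(pool, 32, GFP_KERNEL);  /* frees the excess now */
		mempool_destroy(pool);                 /* pool must be idle */
		return 0;
	}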