Implement a "mapping change" notification for virtual lookup caches, and
make the futex code use that to keep the futex page pinning consistent
across copy-on-write events in the VM space.
x86-64 has an early console implementation which runs before the normal
console is initialized. To avoid duplicated output it needs to be
disabled when the real console starts. This patch adds a function call
for that to the appropriate part of console_init.
Add an AGP driver for the AGP aperture in the northbridge of the AMD Hammer.
The AGP driver works for both 32bit and 64bit kernels.
It also adds some hooks to the AGP driver to allow the x86-64 GART based
IOMMU code to share the aperture with AGP. The hooks are intentionally kept
minimalistic. In addition it needs some Config.in hackery: AGP cannot
be modular in this case, because the IOMMU needs to control its startup and
it runs early when PCI is initialized.
The original AGP driver was done by Dave Jones; I added the IOMMU support.
This fixes a problem with the deadline io scheduler that occurs when the
correct insertion point is at the front of the list. This is something
that we never got right in 2.4 either.
The problem is that the elevator merge function has to return a pointer
to a struct request, and for front insert we really have to return the
head of the list which cannot be expressed as a request of course.
The real issue is that the elevator_merge function actually performs two
functions - it scans for a merge, and if it can't find any, it selects
an insertion point. It's done this way for efficiency reasons, even if
the design isn't all that clean.
So we change the io scheduler merge functions to get passed a pointer to
a list_head pointer instead. This works for both inserts and merges.
In addition, deadline checks if it really should insert at the very
front.
Also don't pass in request to elv_try_last_merge(), the very name of the
function suggests that it's q->last_merge that we are interested in.
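
A minimal userspace sketch of the idea above, not the actual patch: the
merge scan reports its result through a pointer to a list_head pointer, so
"insert at the very front" can be expressed by handing back the list head
itself. The names sketch_merge, req_of, NO_MERGE and BACK_MERGE, and the
simplified struct request, are assumptions made for illustration only.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Simplified, hypothetical request: only what the sketch needs. */
struct request {
    struct list_head queuelist;
    long sector;      /* first sector of the request */
    long nr_sectors;  /* length in sectors */
};

enum { NO_MERGE = 0, BACK_MERGE = 1 };   /* hypothetical return codes */

static inline void list_init(struct list_head *head)
{
    head->next = head->prev = head;
}

/* Link entry in directly after the node 'after'. */
static inline void list_add_after(struct list_head *entry,
                                  struct list_head *after)
{
    entry->next = after->next;
    entry->prev = after;
    after->next->prev = entry;
    after->next = entry;
}

/* Recover the request that embeds a given list node. */
#define req_of(ptr) \
    ((struct request *)((char *)(ptr) - offsetof(struct request, queuelist)))

/*
 * Scan a queue sorted by sector.
 * BACK_MERGE: *insert is the node of the request to merge into.
 * NO_MERGE:   *insert is the node to insert after; this may be the list
 *             head itself, which is how a front insert is expressed.
 */
static inline int sketch_merge(struct list_head *head, long sector,
                               struct list_head **insert)
{
    struct list_head *pos;

    *insert = head;                       /* default: very front */
    for (pos = head->next; pos != head; pos = pos->next) {
        struct request *rq = req_of(pos);

        if (rq->sector + rq->nr_sectors == sector) {
            *insert = pos;
            return BACK_MERGE;
        }
        if (rq->sector > sector)
            break;                        /* went past the sorted position */
        *insert = pos;
    }
    return NO_MERGE;
}
```

With a struct request pointer as the return value there is no request that
means "before the first entry"; with a list_head pointer the head node
covers that case without a special code path.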
Tim Schmielau [Thu, 26 Sep 2002 16:01:59 +0000 (09:01 -0700)]
[PATCH] fix compares of jiffies
On rechecking the current stable kernel code, I found some places where jiffies
were compared in a way that seems to break when they wrap. For these,
I made up patches to use the macros "time_before()" or "time_after()"
that are supposed to handle wraparound correctly.
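
A small userspace sketch of why the macros matter, assuming a 32-bit tick
counter. The names tick_t, tick_after, tick_before and naive_after are
hypothetical; the kernel macros operate on unsigned long jiffies, but the
signed-difference trick shown here is the same idea.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit stand-in for the jiffies counter. */
typedef uint32_t tick_t;

/*
 * True if a is later than b, even if the counter wrapped in between.
 * The unsigned difference is reinterpreted as signed, so this is only
 * valid while the two values are less than 2^31 ticks apart.
 * (Assumes the usual two's-complement conversion to int32_t.)
 */
static inline bool tick_after(tick_t a, tick_t b)
{
    return (int32_t)(tick_t)(b - a) < 0;
}

static inline bool tick_before(tick_t a, tick_t b)
{
    return tick_after(b, a);
}

/* The direct comparison that gives the wrong answer across a wrap. */
static inline bool naive_after(tick_t a, tick_t b)
{
    return a > b;
}
```

For example, with a timeout computed as 0xFFFFFFF0 + 0x20 (which wraps to
0x10) and a current time of 0xFFFFFFF1, the naive comparison reports the
timeout as already expired, while the signed-difference version does not.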
Brian Hall [Thu, 26 Sep 2002 16:01:20 +0000 (09:01 -0700)]
[PATCH] Update for JMTek USBDrive
Attached is a patch against the 2.4.19 linux kernel. It adds an entry
for another version of the JMTek USBDrive (driverless), and also updates
my email address.
Andrew Morton [Thu, 26 Sep 2002 14:51:52 +0000 (07:51 -0700)]
[PATCH] export test_clear_page_dirty() to modules.
- XFS has started to use clear_page_dirty(), so we should export
test_clear_page_dirty() to modules.
This function is used by the inlined clear_page_dirty(). It marks a
page clean and updates the global dirty memory accounting. Anyone
who cleans pagecache pages should use this, so the export makes
sense. Can't implement aops->writepages() without it, really.
- __mark_inode_dirty is no longer called under mapping->private_lock.
Update comment.
Dave Kleikamp [Thu, 26 Sep 2002 07:15:35 +0000 (02:15 -0500)]
JFS: detect and fix invalid directory index values
The directory index values are the unique cookies used to resume
a readdir at the proper place. These are stored with each entry
in a directory. fsck.jfs does not currently validate these entries,
nor even create them when populating the lost+found directory.
This patch causes readdir to detect the invalid cookies, and generate
new ones, if possible.
Don Dugger [Thu, 26 Sep 2002 05:45:17 +0000 (22:45 -0700)]
[PATCH] ia64: Implement ia32 emulation for SG_IO.
Attached is a kernel patch that should fix the SG_IO ioctl call for
IA32 programs. If you could test it out and let me know how it works
that would be a big help. I don't have a test program so I haven't
tested it myself but I think it should be correct, I just lifted
code from the sparc64 port that does the same thing.
Some various small cleanups, optimizations, and fixes.
o Make fifo_batch=32 as default, from testing this appears a good
default value. We still get good throughput, and latency is good.
o Reintroduce the merge_cleanup logic. We need it for deadline for
rehashing requests when they have been merged.
o Cleanup last_merge logic. Move it to the new elv_merged_request(),
this is where it really belongs. Doing it inside the io scheduler core
can cause false positives, when the queue merge functions reject an
otherwise good merge.
o Have deadline_move_requests() account from last entry on the dispatch
queue, if it is non-empty. It doesn't really matter what the last
extracted sector was, if we are not right behind it.
o Clean/optimize deadline_move_requests()
o Account size of a request just a little bit. Streaming transfer isn't
for free, it's just a lot cheaper than a seek.
Stupid me, this is really needed, as IPX supports several datalink_protos
and needs pt->type to find the right interface. Appletalk doesn't care, so
it worked without this. And these are the only snap users in the kernel.
Andrew Morton [Wed, 25 Sep 2002 14:22:24 +0000 (07:22 -0700)]
[PATCH] tighter locking in pdflush
Had a weird oops from Bill Irwin - the pdflush_list was corrupt.
The only thing I can think of is that something sprayed out a wakeup
when it shouldn't. So tighten things up against that, and add some
printks to catch it if it happens again.
Andrew Morton [Wed, 25 Sep 2002 14:22:19 +0000 (07:22 -0700)]
[PATCH] speed up sys_sync()
Well it's a one-liner. sys_sync() only syncs one queue at a time, and
can be slow if you have a lot of disks. So poke pdflush, which knows
how to write all the queues in parallel.
Andrew Morton [Wed, 25 Sep 2002 14:20:23 +0000 (07:20 -0700)]
[PATCH] increase traffic on linux-kernel
[This has four scalps already. Thomas Molina has agreed
to track things as they are identified.]
Infrastructure to detect sleep-inside-spinlock bugs. Really only
useful if compiled with CONFIG_PREEMPT=y. It prints out a whiny
message and a stack backtrace if someone calls a function which might
sleep from within an atomic region.
This patch generates a storm of output at boot, due to
drivers/ide/ide-probe.c:init_irq() calling lots of things which it
shouldn't under ide_lock.
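
A userspace sketch of the kind of check described above, not the kernel
code: a counter tracks how deeply we are nested in atomic regions, and any
function that may block complains if that counter is non-zero. All names
(atomic_depth, sketch_lock, sketch_might_sleep, sketch_blocking_alloc) are
hypothetical, and the sketch prints only a message, not a backtrace.

```c
#include <stdio.h>

/* Hypothetical stand-in for the kernel's per-task atomic nesting count. */
static int atomic_depth;
/* Number of complaints issued so far, kept so callers can inspect it. */
static int sleep_violations;

/* Entering and leaving an atomic region, e.g. taking a spinlock. */
static inline void sketch_lock(void)   { atomic_depth++; }
static inline void sketch_unlock(void) { atomic_depth--; }

/* Complain if a possibly-sleeping function runs inside an atomic region. */
static inline void sketch_might_sleep(const char *file, int line)
{
    if (atomic_depth > 0) {
        sleep_violations++;
        fprintf(stderr,
                "sleeping function called from atomic region at %s:%d\n",
                file, line);
    }
}

#define SKETCH_MIGHT_SLEEP() sketch_might_sleep(__FILE__, __LINE__)

/* Example of a function that may block: it announces that up front. */
static inline void sketch_blocking_alloc(void)
{
    SKETCH_MIGHT_SLEEP();
    /* ... a real implementation could block here ... */
}
```

The check fires whether or not the call would actually have blocked on
this occasion, which is what makes it useful for catching latent bugs.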
Andrew Morton [Wed, 25 Sep 2002 14:20:18 +0000 (07:20 -0700)]
[PATCH] slab reclaim balancing
A patch from Ed Tomlinson which improves the way in which the kernel
reclaims slab objects.
The theory is: a cached object's usefulness is measured in terms of the
number of disk seeks which it saves. Furthermore, we assume that one
dentry or inode saves as many seeks as one pagecache page.
So we reap slab objects at the same rate as we reclaim pages. For each
1% of reclaimed pagecache we reclaim 1% of slab. (Actually, we _scan_
1% of slab for each 1% of scanned pages).
Furthermore we assume that one swapout costs twice as many seeks as one
pagecache page, and twice as many seeks as one slab object. So we
double the pressure on slab when anonymous pages are being considered
for eviction.
The code works nicely, and smoothly. Possibly it does not shrink slab
hard enough, but that is now very easy to tune up and down. It is just:
ratio *= 3;
in shrink_caches().
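
The proportional rule can be sketched as plain arithmetic. This is an
illustration under stated assumptions, not the code in shrink_caches():
the function name slab_scan_target and its parameters are hypothetical.

```c
/*
 * Hypothetical helper: how many slab objects to scan, given how much of
 * the page pool was scanned. Scanning N% of the pages asks for N% of the
 * slab objects; the pressure is doubled when anonymous pages are being
 * considered for eviction.
 */
static inline unsigned long slab_scan_target(unsigned long slab_objects,
                                             unsigned long pages_scanned,
                                             unsigned long total_pages,
                                             int anon_pressure)
{
    unsigned long long target;

    if (total_pages == 0)
        return 0;                       /* nothing to be proportional to */

    /* Widen before multiplying so the product cannot overflow on 32-bit. */
    target = (unsigned long long)slab_objects * pages_scanned / total_pages;
    if (anon_pressure)
        target *= 2;                    /* a swapout is assumed twice as costly */

    return (unsigned long)target;
}
```

Scanning 1000 of 100000 pages (1%) with 50000 slab objects gives a target
of 500 objects, or 1000 when anonymous pages are under consideration.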
Slab caches no longer hold onto completely empty pages. Instead, pages
are freed as soon as they have zero objects. This is possibly a
performance hit for slabs which have constructors, but it's doubtful.
Most allocations after a batch of frees are satisfied from inside
internally-fragmented pages and by the time slab gets back onto using
the wholly-empty pages they'll be cache-cold. slab would be better off
going and requesting a new, cache-warm page and reconstructing the
objects therein. (Once we have the per-cpu hot-page allocator in
place. It's happening).
As a consequence of the above, kmem_cache_shrink() is now unused. No
great loss there - the serialising effect of kmem_cache_shrink and its
semaphore in front of page reclaim was measurably bad.
Still todo:
- batch up the shrinking so we don't call into prune_dcache and
friends at high frequency asking for a tiny number of objects.
- Maybe expose the shrink ratio via a tunable.
- clean up slab.c
- highmem page reclaim in prune_icache: highmem pages can pin
inodes.
Andrew Morton [Wed, 25 Sep 2002 14:20:13 +0000 (07:20 -0700)]
[PATCH] use prepare_to_wait in VM/VFS
This uses the new wakeup machinery in some hot parts of the VFS and
block layers.
wait_on_buffer(), wait_on_page(), lock_page(), blk_congestion_wait().
Also in get_request_wait(), although the benefit for exclusive wakeups
will be lower.