Re: inode highmem imbalance fix [Re: Bug with shared memory.]

Ed Tomlinson (tomlins@cam.org)
Fri, 24 May 2002 07:47:19 -0400


I wonder if the changes I posted putting slab pages on the LRU
would help? For now the patch is against rmap (pre8-ac5). I
would be very interested in hearing any results from testing it.
See the thread with subject:

Re: [RFC][PATCH] using page aging to shrink caches (pre8-ac5)

Comments?

Ed Tomlinson

Andrea Arcangeli wrote:

> On Mon, May 20, 2002 at 06:30:40AM +0200, Andrea Arcangeli wrote:
> >> As the next thing I'll go ahead on the inode/highmem imbalance reported
> >> by Alexey over the weekend. Then the only pending thing before the next -aa is
>
> Here it is; you should apply it together with vm-35, which you also need
> for the bh/highmem balance (or on top of 2.4.19pre8aa3). I tested it
> briefly on UML and it hasn't broken so far, but be careful because it's
> not very well tested yet. Along the lines of what Alexey suggested
> originally, if the goal isn't reached, in a second pass we shrink the
> cache too, but only if the cache is the only reason for the "pinning"
> behaviour of the inode. If, for example, there are dirty blocks of
> metadata or of data belonging to the inode, we call wakeup_bdflush()
> instead and never shrink the cache in that case. If the inode itself is
> dirty as well, we let both passes fail so the work will be scheduled for
> keventd. This logic should ensure we never shrink the cache for no good
> reason and that we free the cache only for the inodes we actually go
> ahead and free. (Basically only pages marked dirty with SetPageDirty,
> as in ramfs, aren't trapped by the logic before calling the invalidate,
> but that's expected of course; those pages cannot be damaged by the
> non-destructive invalidate anyway.)
>
> Comments?
>
> --- 2.4.19pre8aa4/fs/inode.c.~1~ Fri May 24 03:17:10 2002
> +++ 2.4.19pre8aa4/fs/inode.c Fri May 24 05:03:54 2002
> @@ -672,35 +672,87 @@ void prune_icache(int goal)
> {
> LIST_HEAD(list);
> struct list_head *entry, *freeable = &list;
> - int count;
> + int count, pass;
> struct inode * inode;
>
> - spin_lock(&inode_lock);
> + count = pass = 0;
>
> - count = 0;
> - entry = inode_unused.prev;
> - while (entry != &inode_unused)
> - {
> - struct list_head *tmp = entry;
> + spin_lock(&inode_lock);
> + while (goal && pass++ < 2) {
> + entry = inode_unused.prev;
> + while (entry != &inode_unused)
> + {
> + struct list_head *tmp = entry;
>
> - entry = entry->prev;
> - inode = INODE(tmp);
> - if (inode->i_state & (I_FREEING|I_CLEAR|I_LOCK))
> - continue;
> - if (!CAN_UNUSE(inode))
> - continue;
> - if (atomic_read(&inode->i_count))
> - continue;
> - list_del(tmp);
> - list_del(&inode->i_hash);
> - INIT_LIST_HEAD(&inode->i_hash);
> - list_add(tmp, freeable);
> - inode->i_state |= I_FREEING;
> - count++;
> - if (!--goal)
> - break;
> + entry = entry->prev;
> + inode = INODE(tmp);
> + if (inode->i_state & (I_FREEING|I_CLEAR|I_LOCK))
> + continue;
> + if (atomic_read(&inode->i_count))
> + continue;
> + if (pass == 2 && !inode->i_state && !CAN_UNUSE(inode)) {
> + if (inode_has_buffers(inode))
> + /*
> + * If the inode has dirty buffers
> + * pending, start flushing out bdflush.ndirty
> + * worth of data even if there's no dirty-memory
> + * pressure. Do nothing else in this
> + * case: until all dirty buffers are gone
> + * we can do nothing about the inode other than
> + * keep flushing dirty stuff. We could also
> + * flush only the dirty buffers of this inode,
> + * but there's no API to do that asynchronously
> + * and this simpler approach to dealing with the
> + * dirty payload shouldn't make much difference
> + * in practice. Also keep in mind that if somebody
> + * keeps overwriting data in a flood we'd
> + * never manage to drop the inode anyway,
> + * and we really shouldn't do that because
> + * it's a heavily used one.
> + */
> + wakeup_bdflush();
> + else if (inode->i_data.nrpages)
> + /*
> + * If we're here it means the only reason
> + * we cannot drop the inode is its
> + * pagecache, so go ahead and trim it
> + * hard. If it doesn't go away, the pages
> + * are dirty or dirty/pinned, a la
> + * ramfs.
> + *
> + * invalidate_inode_pages() is a
> + * non-blocking operation, but we introduce
> + * a dependency order between the
> + * inode_lock and the pagemap_lru_lock:
> + * the inode_lock must always be taken
> + * first from now on.
> + */
> + invalidate_inode_pages(inode);
> + }
> + if (!CAN_UNUSE(inode))
> + continue;
> + list_del(tmp);
> + list_del(&inode->i_hash);
> + INIT_LIST_HEAD(&inode->i_hash);
> + list_add(tmp, freeable);
> + inode->i_state |= I_FREEING;
> + count++;
> + if (!--goal)
> + break;
> + }
> }
> inodes_stat.nr_unused -= count;
> +
> + /*
> + * The unused list is hardly an LRU, so it makes
> + * more sense to rotate it so we don't always
> + * bang on the same inodes in case they're
> + * unfreeable for whatever reason.
> + */
> + if (entry != &inode_unused) {
> + list_del(&inode_unused);
> + list_add(&inode_unused, entry);
> + }
> spin_unlock(&inode_lock);
>
> dispose_list(freeable);
>
> Andrea
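To make the two-pass decision Andrea describes easier to follow, here is a
small userspace toy model of the control flow. It is not kernel code:
struct toy_inode, can_unuse() and prune_one() are stand-ins invented for
illustration, and the printf calls stand in for wakeup_bdflush(),
invalidate_inode_pages() and the actual freeing. Only the shape of the
logic mirrors the patch: pass one frees only inodes that are already
unpinned; pass two, for inodes that are not themselves dirty, either kicks
bdflush (dirty buffers pending) or invalidates the pagecache (clean pages
are the only pin); anything still pinned is left alone, as the real code
leaves it for keventd.

/*
 * Toy userspace model of the two-pass prune logic described above.
 * NOT kernel code: every type and helper here is a stand-in invented
 * for illustration; only the control flow mirrors the patch.
 */
#include <stdbool.h>
#include <stdio.h>

struct toy_inode {
	const char *name;
	bool dirty;		/* the inode itself is dirty (I_DIRTY-like) */
	bool dirty_buffers;	/* dirty data/metadata buffers pending */
	int nrpages;		/* pagecache pages still pinning the inode */
};

/* stand-in for CAN_UNUSE(): freeable only with no buffers and no pages */
static bool can_unuse(const struct toy_inode *i)
{
	return !i->dirty_buffers && i->nrpages == 0;
}

/* run one inode through the two passes of the patched prune_icache() */
static void prune_one(struct toy_inode *i)
{
	for (int pass = 1; pass <= 2; pass++) {
		if (pass == 2 && !i->dirty && !can_unuse(i)) {
			if (i->dirty_buffers) {
				/* dirty payload: just kick background writeout */
				printf("%s: pass 2 -> wakeup_bdflush()\n", i->name);
			} else if (i->nrpages) {
				/* only clean pagecache pins it: drop the pages */
				printf("%s: pass 2 -> invalidate_inode_pages()\n", i->name);
				i->nrpages = 0;	/* non-destructive invalidate succeeds here */
			}
		}
		if (can_unuse(i)) {
			printf("%s: freed on pass %d\n", i->name, pass);
			return;
		}
	}
	/* both passes failed (dirty inode, dirty buffers, ...): left for later */
	printf("%s: not freed, left for keventd/bdflush to make progress\n", i->name);
}

int main(void)
{
	struct toy_inode samples[] = {
		{ "clean, no cache",	false, false, 0 },	/* freed on pass 1 */
		{ "clean, cached",	false, false, 12 },	/* invalidated, freed on pass 2 */
		{ "dirty buffers",	false, true, 4 },	/* bdflush woken, kept */
		{ "dirty inode",	true, false, 4 },	/* skipped, kept */
	};

	for (int n = 0; n < 4; n++)
		prune_one(&samples[n]);
	return 0;
}

Built with plain gcc and run, the four sample inodes exercise the four
paths: freed immediately, freed after the pagecache is trimmed, kept while
bdflush is woken, and kept because the inode itself is dirty.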

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/