Questions about EROS

Iain McClatchie (iainmcc@ix.netcom.com)
Wed, 23 Jun 1999 17:10:19 -0700


Jonathan,

The discussion about EROS has been quite entertaining. I have three
questions.

1. Massive temporary computation caches
2. Software bugs and robust versus execution states
3. Allocation policy

-----
Massive temporary computation caches

I run an application which uses Binary Decision DAGs (BDDs) in the
optimization of netlists. On one small testcase, this application
grabs 100 MB of memory, which it overwrites almost 1000 times over the
course of 10 minutes. When I imagine running this application with
the BDDs in persistent storage, checkpointed once every five minutes,
I imagine one of two things will happen:

Either the application will grab less than half of the available
physical RAM, in which case half the RAM will store the checkpoint
being written and the other half will store the active process state,
or the application will come to a dead stop every five minutes when it
runs out of memory attempting to copy 96MB of its COW pages. It will
stay stopped like this for 10 to 15 seconds every 5 minutes, which is
about a 5% slowdown, or half the difference between a $550 500MHz PIII
and a $330 450MHz PIII. Either way it appears that persistence will
cost approximately $100 for this admittedly small testcase.

It seems that the checkpointing of massive temporary computation
caches is fairly expensive. Do you folks on EROS have some ideas
about how to mitigate this cost? For instance, how would an
application like Navigator cache the decompressed versions of the JPEG
pictures it presents frequently?

-----

Software bugs and robust versus execution states

I use another wonderful application, MAGIC, to edit the layout for my
circuits. MAGIC crashes every month or two due to software problems,
and yet I have little difficulty maintaining, uncorrupted, the state
of hundreds of interconnected layout cells. It appears to me that
this ease of use stems from having two representations of the layout
cells: one on disk, in ASCII, which is terse, simple, and robust, and
another in memory which is larger, complex, fast, but more prone to
accidental corruption.

How do persistent object systems contain software errors? At first
glance, it seems to me that you would need, at the very least, some
form of this dual data representation, along with its attendant
operator-directed translations between the two, and a way of putting
at least one representation into a seperately controlled, global
namespace. Having done this, you've recapitulated the programming
effort of having a file system and interfaces to that file system from
every program. Wasn't avoiding this effort one of the proposed
benefits of a persistent object system in the first place?

-----

Allocation policy

A few months ago we had a wonderful debate on linux-kernel about the
goodness of reiserfs, which is a new filesystem being developed by
Hans Reiser. I took away from that debate a notion that any decently
designed filesystem can find and read data already on the disk in a
minimum of seeks. The number of seeks, and the overall performance of
the filesystem is determined by their allocation policy: how they lay
the data down on the disk.

The filesystem in a persistent object system has the same problem
presented to it as the filesystem in a UNIX system, except (a) it is
also expected to store many short-lived objects, and (b) the entire
contents of memory are regularly swapped to disk. It seems that both
(a) and, to a lesser extent, (b) would form a kind of noise would tend
to swamp the signal that your filesystem allocator is looking for. If
you think of the allocator as being something like the malloc()
algorithm in a garbage collector, the programs in a UNIX system mostly
seperate the long-lived objects from the short-lived objects before
they get to the allocator.

Does EROS have some sort of system for implicitly or explicitly
recognising long-lived versus short-lived objects?

-Iain
iainmcc@ix.netcom.com

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/