Re: Super Lint

Magnus Danielson (cfmd@swipnet.se)
Wed, 05 Jan 2000 07:48:47 +0100


From: Dale Amon <amon@vnl.com>
Subject: Re: Super Lint
Date: Tue, 4 Jan 2000 22:22:34 +0000 (GMT)

> I'm glad to see Martin Dalecki bringing the discussion
> back around to pragmatics.
>
> It is probably true that perfect analysis of arbitrary
> programs for exploits is theoretically impossible.

... or at least very computationally intensive at times. However, the danger of
an analysis that truly never terminates is there.

> But we are not dealing with arbitrary random programs;
> and we do not have a need for perfect analysis.

The trouble is, what IS a perfect analysis?

Catch intruders?
Catch bugs?
Catch some class of bugs?
Catch anything suspicious?

How do you define an "intruder", a "bug", or "suspicious"?

These terms may feel easy to define, but they are not easy to convert into a
logical statement that can be fed into some system. The definitions we could
actually use for anything would be mere approximations, and as we work to
perfect them we would learn more about how programs can misbehave, and about
variations of that. We might even argue about the meaning and implications of
the word "perfect" and what goals it sets up. I'm sure that Ted and his
colleagues would have gone straight for the "perfect" system if only they could
figure out what that was. Let's not live in illusions: we can only do better,
never reach perfect in a generally acceptable way.

> I personally would lean towards a heuristic rule based
> Super Lint. The rule database could be built up from
> many people's experience as to what constitute warning
> signs of bad coding practices, dangerous structures
> and possible exploits.

This is potentially a very useful tool in our toolbox for getting better
security. Let's go ahead and actually do something here.
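
To make the idea a bit more concrete, here is a very rough sketch of what a
rule database could look like. It assumes nothing more than a table of
function names with a warning attached to each; the names lint_rule, rules,
calls_function and lint_line are made up for this example and do not belong
to any existing tool. A real tool would of course parse the source rather
than match text.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* One heuristic rule: a function name and the warning it triggers.
 * Hypothetical structure, for illustration only. */
struct lint_rule {
    const char *function;
    const char *warning;
};

static const struct lint_rule rules[] = {
    { "gets",    "gets() cannot limit how much it reads" },
    { "strcpy",  "strcpy() does not check the destination size" },
    { "sprintf", "sprintf() does not check the destination size" },
};

/* Return 1 if `text` contains `name` immediately followed by '(' and not
 * preceded by an identifier character (so "fgets(" does not match "gets"). */
static int calls_function(const char *text, const char *name)
{
    size_t len = strlen(name);
    const char *p = text;

    while ((p = strstr(p, name)) != NULL) {
        int starts_word = (p == text) ||
            !(isalnum((unsigned char)p[-1]) || p[-1] == '_');
        if (starts_word && p[len] == '(')
            return 1;
        p++;
    }
    return 0;
}

/* Check one source line against every rule, print a warning for each hit
 * and return the number of warnings printed. */
static int lint_line(const char *file, int lineno, const char *text)
{
    int hits = 0;
    size_t i;

    assert(file != NULL && text != NULL);
    for (i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        if (calls_function(text, rules[i].function)) {
            fprintf(stderr, "%s:%d: warning: %s\n",
                    file, lineno, rules[i].warning);
            hits++;
        }
    }
    return hits;
}
```

Adding a new rule is then just one more line in the table, which is the
property that makes the rule-based approach attractive.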

> It does no harm to print a warning message that says
> that something should be looked at more closely. Why
> bother to analyse in that detail? If you go through a
> source tree and branches, then apply game theory.
> Choose the worst possible value at each point, or even
> just assume that there is a value such that... or
> perhaps run a Monte-Carlo on it. There are lots of
> imperfect but effective and useful techniques that can
> be applied to our imperfect world.

This is also true; the trouble is how to wade through the messages.

Monte-Carlo analysis should be of interest for some tests, whereas others you
would really like to run to completion even if they take some time.
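
As an illustration of the Monte-Carlo approach, here is a minimal sketch that
feeds random inputs to an index computation and counts how often the result
falls outside a table. Everything in it (TABLE_SIZE, index_fn,
monte_carlo_bounds and the two sample functions) is hypothetical and only
there to show the shape of such a test.

```c
#include <assert.h>
#include <stdlib.h>

#define TABLE_SIZE 16

/* Hypothetical function under test: maps two inputs to a table index. */
typedef int (*index_fn)(int a, int b);

/* Try `trials` random input pairs and count how many of them produce an
 * index outside [0, TABLE_SIZE). */
static int monte_carlo_bounds(index_fn fn, int trials, unsigned seed)
{
    int bad = 0;
    int i;

    assert(fn != NULL && trials > 0);
    srand(seed);
    for (i = 0; i < trials; i++) {
        int a = rand() % 256;
        int b = rand() % 256;
        int idx = fn(a, b);

        if (idx < 0 || idx >= TABLE_SIZE)
            bad++;
    }
    return bad;
}

/* Correct: the result is always in 0..TABLE_SIZE-1. */
static int safe_index(int a, int b)
{
    return (a + b) % TABLE_SIZE;
}

/* Off by one: the result can be TABLE_SIZE itself. */
static int buggy_index(int a, int b)
{
    return (a + b) % (TABLE_SIZE + 1);
}
```

Calling monte_carlo_bounds(buggy_index, 10000, 1) should report a non-zero
count almost immediately, while safe_index never trips it. A zero count proves
nothing, of course; it only says that no violation was sampled.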

> I've heard quite a few good ideas. I also think there
> should be a real effort to get gcc and libc to build
> a few protections in. I've always had a deep dislike
> for some of the built in problems in C. Run time
> bounds checking certainly can't be that difficult to
> add. Other languages had it before C was a gleam in the
> eyes of the K&R team. I'd default it to on and have a
> --switch to turn it off for those who need that last
> little smidgen of performance.

This would mean:

1) Having a way to convey bounds through binary format (ELF)

2) Having compiler support for generating bounds (GCC, GAS)

3) Having libc support for accessing bounds (glibc)

4) Having libc routines to use bounds (glibc)

Target routines:

Routines that write to a memory vector of some sort (char *, const char *,
...) in order to have them not write beyond the memory bound.

Routines that read strings out of memory (char *, const char *) in order
to have them not read beyond the memory bound and thus cause a crash or
misbehaviour.

5) Having nothing done to the kernel (linux-kernel ;)

Once there are routines to access bounds (when they exist), those calls could
be exported out of libc to the user, in case the user (which might be another
library, as long as it's outside libc) wants to do checks as well.

The access to the bound should preferably be some inline function I'd say...
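
A minimal sketch of how such an inline accessor and a routine using it might
look. Since no mechanism for conveying bounds through ELF exists here, the
sketch assumes the bound simply travels next to the pointer in a struct; the
names bounded_buf, buf_bound and bounded_strcpy are invented for the example
and are not glibc interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for compiler-provided bound information: the bound is carried
 * alongside the pointer. Hypothetical, for illustration only. */
struct bounded_buf {
    char  *base;   /* start of the vector */
    size_t size;   /* number of bytes that may be written */
};

/* Inline accessor for the bound. */
static inline size_t buf_bound(const struct bounded_buf *b)
{
    return b->size;
}

/* Copy a string into the buffer without writing beyond its bound.
 * Returns 0 on success, or -1 (leaving the buffer untouched) if src,
 * including its terminating NUL, does not fit. */
static int bounded_strcpy(struct bounded_buf *dst, const char *src)
{
    size_t need;

    assert(dst != NULL && src != NULL);
    need = strlen(src) + 1;
    if (need > buf_bound(dst))
        return -1;
    memcpy(dst->base, src, need);
    return 0;
}
```

With an 8-byte buffer, copying a 7-character string succeeds and copying an
8-character string is refused instead of overrunning the vector.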

Notice that we have limited our discussion to vectors declared inside
functions, since those live on the stack and this thread was discussing stack
overruns. However, global vectors, static vectors and dynamically allocated
memory are also of interest, since an overrun there can also cause trouble.
For instance, I have long wished for a function that can simply say whether a
pointer is "good" or "bad" as a sanity check. All this means combining
compiler-derived bounds with malloc-derived bounds. However, some people might
feel that this is going a bit overboard.
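
For the malloc-derived half, a "good or bad pointer" check could be sketched
roughly as below. It assumes a small fixed table of tracked allocations, and
the names tracked_malloc, tracked_free and pointer_is_good are hypothetical
wrappers around the ordinary malloc() and free(), not existing libc calls.
Stack, global and static vectors are not covered by this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_TRACKED 64

/* One recorded allocation; base == NULL marks a free table slot. */
struct region {
    char  *base;
    size_t size;
};

static struct region regions[MAX_TRACKED];

/* malloc() wrapper that records the bound of the allocation.
 * Returns NULL if malloc() fails or the table is full. */
static void *tracked_malloc(size_t size)
{
    int i;

    assert(size > 0);
    for (i = 0; i < MAX_TRACKED; i++) {
        if (regions[i].base == NULL) {
            void *p = malloc(size);

            if (p != NULL) {
                regions[i].base = p;
                regions[i].size = size;
            }
            return p;
        }
    }
    return NULL;
}

/* free() wrapper; pointers that are not in the table are left alone. */
static void tracked_free(void *p)
{
    int i;

    if (p == NULL)
        return;
    for (i = 0; i < MAX_TRACKED; i++) {
        if (regions[i].base == (char *)p) {
            regions[i].base = NULL;
            regions[i].size = 0;
            free(p);
            return;
        }
    }
}

/* Return 1 if the len bytes starting at p lie entirely inside one tracked
 * allocation, 0 otherwise. */
static int pointer_is_good(const void *p, size_t len)
{
    uintptr_t addr = (uintptr_t)p;
    int i;

    for (i = 0; i < MAX_TRACKED; i++) {
        uintptr_t base = (uintptr_t)regions[i].base;

        if (regions[i].base != NULL && addr >= base &&
            addr - base <= regions[i].size &&
            len <= regions[i].size - (addr - base))
            return 1;
    }
    return 0;
}
```

The linear search is only for brevity; anything meant for real use would need
a faster lookup and some thought about threads.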

Once we have a method for bounds checking, the compiler could also offer a
compile option that makes all memory writes automatically bounds checked.
This would however only be used in the development phase and never be turned
on by default, due to the time penalty. It doesn't have to be done when the
first bounds checking is introduced, though; it can be added later.
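
Until the compiler can do this by itself, the effect of such a switch can be
imitated by hand. The sketch below assumes a made-up BOUNDS_CHECK macro
(enabled with -DBOUNDS_CHECK on the compiler command line) and a hypothetical
STORE() macro; it only works on true arrays, where sizeof knows the bound, and
not on plain pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

#ifdef BOUNDS_CHECK
/* Development build: verify the index before every store and abort with a
 * message if it is outside the array. */
#define STORE(arr, idx, val)                                          \
    do {                                                              \
        size_t i_ = (size_t)(idx);                                    \
        if (i_ >= ARRAY_LEN(arr)) {                                   \
            fprintf(stderr, "%s:%d: write past the bound of %s\n",    \
                    __FILE__, __LINE__, #arr);                        \
            abort();                                                  \
        }                                                             \
        (arr)[i_] = (val);                                            \
    } while (0)
#else
/* Production build: a plain store with no time penalty. */
#define STORE(arr, idx, val) ((arr)[(idx)] = (val))
#endif

static int squares[8];

/* Example use: fill a small table through the (optionally checked) store. */
static void fill_squares(void)
{
    size_t i;

    for (i = 0; i < ARRAY_LEN(squares); i++)
        STORE(squares, i, (int)(i * i));
    assert(squares[ARRAY_LEN(squares) - 1] == 49);  /* 7 * 7 */
}
```

The same source then builds both ways, which is the point: the check costs
nothing unless the development switch is given.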

> It might be a good idea to talk to the libc people
> about adding some test hooks for the taint checks.
> In any event, we are probably talking about things
> which will require some cross community cooperation.
> I doubt that should be a serious problem because
> we're all in this together and we all face the same
> risks.

Right. This will act as a late detection scheme for incorrect code while also
protecting against attacks. Both these goals should be useful.

> There is no such thing as perfect security. There is
> no such thing as a 100% guaranteed unbreakable
> daemon (a real one, not a theoretical toy construct).
> But there can be ones that are extremely difficult
> to crack instead of ones that are extremely easy to
> crack. If you've got a rule based system you can
> add a new rule every time a new type of weakness is
> found... (hey, after all we can read the scripts
> as well as the kiddies). And if a particular exploit
> is a one off, then we are no worse off. The daemon
> gets patched and the one off is gone.

Certainly. For instance, nothing in the proposed schemes will check for things
that might actually cause a daemon to go into a spin, enter an obscure state,
or whatever. That requires techniques for protocol validation, and the trouble
here is that no one has (to my knowledge) written an accurate model of the
popular protocols in a system which can actually check the real code. There
are protocol validation tools and they are quite useful (god knows, I have
been teaching my colleagues to use them, with variable success), but we have
still to take them to the real code somehow. The latter is (to say the least)
a bit hairy, and just having the protocols validated in an abstractly reduced
way would be good (you learn a lot by doing that, which teaches you what to
look out for in the real code).
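
To show what validation "in an abstractly reduced way" can mean, here is a toy
sketch: a four-state model of a made-up request/acknowledge exchange, searched
exhaustively for states that can be reached but from which the final state can
never be reached (the "spin" case). The states, the two events (think of event
0 as a timeout and event 1 as an acknowledge) and the function count_trapped
are all assumptions for this example and model no real protocol.

```c
#include <assert.h>

#define NSTATES 4
#define NEVENTS 2

enum { IDLE, WAIT_ACK, DONE, STUCK };

/* next_state[s][e] is the state entered when event e arrives in state s,
 * or -1 if the event is not accepted there. Hypothetical toy model. */
static const int next_state[NSTATES][NEVENTS] = {
    /* IDLE     */ { WAIT_ACK, -1   },
    /* WAIT_ACK */ { STUCK,    DONE },
    /* DONE     */ { -1,       -1   },
    /* STUCK    */ { STUCK,    -1   },
};

/* Count the states that are reachable from `start` but from which `final`
 * cannot be reached. Both sets are grown until nothing changes any more. */
static int count_trapped(const int t[NSTATES][NEVENTS], int start, int final)
{
    int reach[NSTATES] = { 0 };   /* reachable from start */
    int ok[NSTATES] = { 0 };      /* able to reach final */
    int changed, s, e, trapped = 0;

    assert(start >= 0 && start < NSTATES);
    assert(final >= 0 && final < NSTATES);

    reach[start] = 1;
    ok[final] = 1;
    do {
        changed = 0;
        for (s = 0; s < NSTATES; s++) {
            for (e = 0; e < NEVENTS; e++) {
                int n = t[s][e];

                if (n < 0)
                    continue;
                if (reach[s] && !reach[n]) { reach[n] = 1; changed = 1; }
                if (ok[n] && !ok[s])       { ok[s] = 1;    changed = 1; }
            }
        }
    } while (changed);

    for (s = 0; s < NSTATES; s++)
        if (reach[s] && !ok[s])
            trapped++;
    return trapped;
}
```

For the table above, count_trapped(next_state, IDLE, DONE) finds one such
state, STUCK. If the timeout in WAIT_ACK led back to IDLE instead, the model
would have none. Real protocols have far larger state spaces, which is where
the proper validation tools earn their keep.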

> By all means look for a theory that will give us
> perfect protection. But let the grad students do their
> thesis on it and in the mean time let's keep raising
> the bar as fast as we can. We are in a classic arms race.

Certainly, but we as the community must help set the goals and also be
ready to use the results.

Cheers,
Magnus
