Tik-76.270, Research Seminar: Java-based Software Technologies

"Java security issues" / Jarek Krol (73808P)

Keywords:

sandbox, class loader, class file verifier, bytecode verifier, security manager, trusted / untrusted source, access control, digital signature, JDK 1.2 security enhancements.

Material  presented in this document is based  on  the  book
"Java Network Security" by Dave Durbin, Rob Macgregor,  John
Owlett and Andrew Yeomans (Prentice-Hall 0-13-761529-9)

1.   First things first
#######################
Most of the books on Java deal with Java as a programming
language. As a programming language it has much to recommend
it. Its syntax is much like that of C, but with many special
features. It is strongly object-oriented, but it avoids the
more obscure corners of the OO world. For most programming
languages the question "how secure is it?" does not arise.
It is the application that needs to implement the security,
not the language it is written in. However, Java is many
other things in addition to being a programming language:

a)    a set of object-oriented frameworks, primarily for GUI
building and networking
b)   an operating system
c)   a client / server management mechanism
d)   a unifying force that cuts across operating system and
network boundaries

It  is  not  a  surprise  that Java  has  become  so  widely
accepted, so quickly.

There are a number of different components to Java:

a)    Development Environment - The Java Development Kit
(JDK) contains the tools and executable code needed to
compile and test Java programs. However, unlike a normal
language, the JDK includes object frameworks for creating
graphical user interfaces, for networking and for complex
I/O. Normally these things are provided as additions, either
by the operating system or by another software package.

b)   Execution Environment - Java's execution environment is
neither  that  of  a  compiled language nor  an  interpreted
language.  Instead it is a hybrid, implemented by  the  Java
Virtual  Machine (JVM). Java is often said to  be  platform-
independent,  but  first  the JVM must  be  ported  to  each
platform  to  provide  the environment  it  needs.  The  JVM
implementation  is  responsible  for  all  of  the  built-in
security  of  Java,  so  it is important  that  it  is  done
properly.

c)    Interfaces and Architectures - Java applications  live
in  the  real world. This means that they must  be  able  to
interact   with   non-Java  applications.  Some   of   these
interactions are very simple (such as the way a Java  applet
is  invoked in a web page). Others are the subject  of  more
complex   architectural  definitions,  such  as   the   JDBC
interface for relational database support. The mechanism for
adding encryption to Java security, the Java Cryptography
Architecture (JCA), falls into this latter category.

This report focuses on the Execution Environment and its
components, as they are the vital parts in making Java
secure. Digital signatures and the JDK 1.2 security
enhancements are also discussed.

2.   Execution Process
######################
The JVM operates on a stream of bytecode as an interpreter.
This means that it processes bytecode while the program is
running and converts it on the fly to the real machine code
that it executes. Before the JVM can start the
interpretation process, it has to do a number of things to
set up the environment in which the program will run. This
is the point at which the built-in security of Java is
implemented. There are three parts to the process:

a)    The first component of applet checking is the applet
class loader - this separates the classes it loads to avoid
attack: local classes are separated from remote classes and
classes from different applets are separated from each
other. The search order is Java built-in classes first,
local classes next and remote classes last. So if, by
accident or by design, an applet contains a class of the
same name as a built-in class or a local class, it will not
overwrite it.

b)    The second component is the class file verifier - this
runs when the applet is loaded and aims either to confirm
that the bytecode will stay within the sandbox or to reject
it. It is a multipass process that begins by making sure
that the syntax is valid, then checks for stack overflow and
underflow, and finally runs a theorem prover that checks
that access and rights restrictions are observed.

c)   The third component is the security manager - it checks
sensitive accesses at runtime. This is the component that
will not allow a Java applet unauthorized access to the file
system, to the network or to the runtime operating system.

2.1. The Class Loader
---------------------
When the browser finds an "applet" tag in a page, it starts
the JVM, which in turn invokes the applet class loader. The
class loader is just another Java class that contains the
code for fetching the bytecode of the applet and presenting
it to the JVM in an executable form. The bytecode includes a
list of referenced classes; the JVM works through the list,
checks whether each class is already loaded and attempts to
load it if not. It first tries to load the class from the
local disk using platform-specific functions provided by the
browser. If the class cannot be found on the local disk, the
JVM again calls the class loader to retrieve the class from
the web server.

Classes can be divided into three categories:

a)    Classes  forming the core Java API  -  these  are  the
classes  shipped  with  the  JVM  which  provide  access  to
network,   GUI   and   threading   functions   (java.lang.*,
java.applet.*  etc.).  They  are  shipped  with   the   Java
implementation  and are part of the Java  specification.  As
such they are regarded as highly trusted classes and are not
subject to the same degree of scrutiny at runtime as classes
brought into the JVM from an external source.

b)    Classes installed in the local filing system - these
are not a part of the core Java class set but are assumed to
be safe, since the user has at some point installed them
onto his or her machine and presumably accepted the
associated risks.

c)    Classes loaded from other sources - in a web browser
these would be the classes constituting an applet loaded
from a remote web server. These are the least trusted
classes of all, as they are being brought into the safe
environment of the JVM from potentially hostile sources and
often without the specific consent of the user. For this
reason these classes must be subjected to a high degree of
checking before being made available for use in the JVM.

An application can declare any number of class loaders, each
of which could be targeted at specific class types. One of
the class loaders, the primordial class loader, is a
built-in part of the JVM, i.e. it is written in C or
whatever language the JVM is written in and is an integral
part of the JVM. It is the root class loader and is
responsible for loading trusted classes - the core Java
classes and those classes that can be found in the local
file store. Classes loaded by the primordial class loader
are regarded as special insofar as they are not subject to
verification prior to execution - they are assumed to be
well-formed, safe Java classes.

In addition to the primordial class loader, application
writers (including JVM implementors) are free to build more
class loaders to handle the loading of classes from
different sources such as the Internet, an intranet, local
storage or perhaps even ROM in an embedded system. These
class loaders are not part of the JVM; rather, they are part
of the application running on top of the JVM, written in
Java and extending the java.lang.ClassLoader class.

The most obvious example of this is a web browser, which
knows how to load classes from an HTTP web server. The class
loader which does this is generally known as the applet
class loader and is itself a Java class which knows how to
request and load other Java class files from a web server
across a TCP/IP network.

In addition, application writers can implement their own
class loaders by subclassing the ClassLoader class. This
does not hold for an applet, however: the SecurityManager
prevents an applet from creating its own class loader.
Clearly, if an applet could somehow overcome this limitation
it could subvert the class loading process and potentially
take over the whole browser machine.

It is clear that there can be many types of class loader
within the Java environment at any one time. In addition,
there may be many instances of a particular type of class
loader operating at once. Every class present in the JVM has
been loaded by one and only one class loader, and the JVM
remembers which class loader was responsible for loading
each class. It also keeps classes loaded by different
applets separate from each other. If a class subsequently
requires other classes to be loaded, the JVM uses the same
class loader to load those classes.

This  gives rise to the concept of the name space - the  set
of classes that have been loaded by a particular instance of
a class loader. Within this name space duplicate class names
are  prohibited.  More importantly there is  no  cross  name
visibility of classes; a class in one name space (loaded  by
a  particular class loader) cannot access a class in another
name space (loaded by a different class loader). On most
networks, including the Internet, there are many web servers
from which classes can be loaded, and there is nothing to
prevent two web servers from having different classes with
the same name.
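
The separation of name spaces can be observed directly. The
following is a minimal sketch, not the applet class loader
of any real browser: NameSpaceDemo, SiteLoader and the class
name "AppletGame" are hypothetical, and the class file is a
hand-assembled empty class so that the example needs no web
server. Two loader instances each define a class with the
same name, and the JVM treats the results as two distinct
classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical demo; none of these names come from the report.
class NameSpaceDemo {

    /** Builds the bytes of a minimal, empty class file with the given simple name. */
    static byte[] emptyClassFile(String name) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(0xCAFEBABE);                 // magic number
            out.writeShort(0);                        // minor version
            out.writeShort(52);                       // major version (Java 8 format)
            out.writeShort(5);                        // constant pool count = entries + 1
            out.writeByte(7); out.writeShort(2);      // #1 Class -> #2
            out.writeByte(1); out.writeUTF(name);     // #2 Utf8: this class
            out.writeByte(7); out.writeShort(4);      // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object"); // #4 Utf8: superclass
            out.writeShort(0x0021);                   // access flags: public, super
            out.writeShort(1);                        // this_class
            out.writeShort(3);                        // super_class
            out.writeShort(0);                        // no interfaces
            out.writeShort(0);                        // no fields
            out.writeShort(0);                        // no methods
            out.writeShort(0);                        // no attributes
            return buf.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Stand-in for one applet class loader instance serving one site. */
    static class SiteLoader extends ClassLoader {
        private final String site;

        SiteLoader(String site) {
            super(NameSpaceDemo.class.getClassLoader());
            this.site = site;
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            // A real loader would fetch the bytes from its site here.
            if (!name.equals("AppletGame")) {
                throw new ClassNotFoundException(name + " not available from " + site);
            }
            byte[] bytes = emptyClassFile(name);
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    /** Loads a class and converts the checked exception for brevity. */
    static Class<?> load(ClassLoader loader, String name) {
        try {
            return loader.loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Loading "AppletGame" through two SiteLoader instances gives
two Class objects with equal names that are not the same
object, while asking the same instance twice returns the
class it has already defined.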

Since a given instance of a class loader cannot load
multiple classes with the same name, if multiple instances
of the applet class loader were not possible we would very
quickly run into problems when loading classes from multiple
sites. Moreover, it is essential for the security of the JVM
to separate classes from different sites so that they cannot
inadvertently or deliberately cross-reference each other.
This is achieved by having classes from separate web sites
loaded into separate name spaces, which in turn is managed
by having a different instance of the applet class loader
for each site from which applets are loaded.

Another meaningful observation about class loaders is that
they frequently interoperate, one class loader asking
another to load a class for it. If the first class loaded
from a web server requires access to one of the trusted core
classes, such as java.lang.String, then the primordial class
loader will take over, as it knows how to load classes from
trusted packages. The search strategy of class loading can
be formulated in the following way:

a)    ask the primordial class loader to load the class from
trusted packages
b)    if this fails, request the class from the web server
from which the original class was loaded
c)   if this fails, report that the class cannot be located
by throwing a ClassNotFoundException

This  search  strategy ensures that classes are loaded  from
the most trusted source in which they are available.
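
The search order can be traced with a small loader that only
records what reaches it. This is a sketch under stated
assumptions: SearchOrderDemo and FakeAppletLoader are
hypothetical names, there is no real web server behind step
b), and the primordial class loader is represented by the
JVM's built-in loader (a null parent in java.lang.ClassLoader).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo; the names are invented for illustration.
class SearchOrderDemo {

    /** Stand-in for an applet class loader with no real server behind it. */
    static class FakeAppletLoader extends ClassLoader {
        final List<String> askedServerFor = new ArrayList<>();

        FakeAppletLoader() {
            super(null); // null parent: delegate to the JVM's built-in loader first
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            // Step b): only reached when the built-in loader did not find the class.
            askedServerFor.add(name);
            // Step c): this fake server never has the class.
            throw new ClassNotFoundException(name);
        }
    }

    /** Reports which step of the search strategy answered the request. */
    static String describe(FakeAppletLoader loader, String name) {
        try {
            Class<?> c = loader.loadClass(name);
            // Classes from the built-in loader report a null class loader.
            return c.getClassLoader() == null ? "built-in loader" : "other loader";
        } catch (ClassNotFoundException e) {
            return "not found";
        }
    }
}
```

A request for java.lang.String is answered in step a) and
never reaches findClass, so an applet cannot substitute its
own version of a core class simply by using the same name.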

If it is done correctly, a user-built class loader can
significantly enhance the security of an application
deployed on an intranet, particularly if it is used in
conjunction with a firewall or other local security
measures. Some of the situations in which a user-written
class loader could be used are:

a)     to  restrict  searches  for  trusted  classes  to   a
particular directory or path other than the CLASSPATH
b)    to  allow  the JVM to load classes from  a  particular
source such as from EPROM or a non-TCP/IP network
c)    to specify paths that should be searched in advance of
the CLASSPATH
d)   to provide auditing information about access to classes
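
As an illustration of case d), the sketch below records
every class name requested through the loader before
delegating as usual. AuditingLoaderDemo and AuditingLoader
are hypothetical names, and a real audit trail would
presumably go to a log file rather than an in-memory list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo; the names are invented for illustration.
class AuditingLoaderDemo {

    /** Class loader that notes every request and then behaves normally. */
    static class AuditingLoader extends ClassLoader {
        final List<String> log = new ArrayList<>();

        AuditingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            log.add(name); // record the request, whether or not it succeeds
            return super.loadClass(name, resolve);
        }
    }

    /** Requests the given classes and returns the resulting audit log. */
    static List<String> requestsFor(String... names) {
        AuditingLoader loader =
                new AuditingLoader(AuditingLoaderDemo.class.getClassLoader());
        for (String name : names) {
            try {
                loader.loadClass(name);
            } catch (ClassNotFoundException e) {
                // A failed lookup is still part of the audit trail.
            }
        }
        return loader.log;
    }
}
```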

2.2. The Class File Verifier
----------------------------
Java divides the world into two parts: trusted and
untrusted. Trusted code includes the local Java classes
which are shipped as part of the JVM and sometimes other
classes on the local disk. Everything else is untrusted and
thus must be checked by the class file verifier to ensure
that the integrity of the JVM is not threatened.

The class loader and the class file verifier must operate as
a team if they are to succeed in their task of making sure
that only safe and valid code is executed. The class file
verifier is invoked by the class loader to perform a series
of tests on class files which are regarded as potentially
unsafe. These tests check all aspects of a class file, from
its size and structure down to its runtime characteristics.
Only when these tests have been passed is the file made
available for use.

The class file verifier is itself a part of the JVM and as
such it cannot be removed or overridden without replacing
the JVM itself.

At first sight, the job of the class file verifier may
appear to be redundant. After all, bytecode is only
generated by the Java compiler, so if it is not correctly
formatted and valid, surely the compiler needs to be fixed
rather than incurring the overhead of checking each time a
program is run. Unfortunately it is not that simple. The
compiled program is just a file of type ".class" containing
a string of bytes, so it could be created or modified using
any binary editor. Also, nobody can guarantee that only
well-behaved and error-free Java compilers are used. Given
this, the JVM has to treat any code from an external source
as potentially damaged and therefore in need of
verification.

Before examining the operation of the class file verifier in
detail, it is important to note that the Java specification
requires the JVM to behave in a particular way when it
encounters certain problems with class files, which is
usually to throw an error and refuse to use the classes. The
precise implementation varies from one vendor to another and
is not specified. Thus some vendors may make all checks
prior to making the file available, while others may defer
some or all checks until runtime. The process described
below is the way in which Sun's HotJava web browser works -
it has been adopted by most JVM writers because it saves the
effort of reinventing a complex process.

The class file verifier makes four passes over a newly
loaded class file, each pass examining it in closer detail.
Should any of the passes find fault with the code, the class
file is rejected. For reasons to be explained later, not all
of these tests are performed prior to executing the code.
The first three passes are performed prior to execution, and
only if the code passes these tests will it be made
available for use. The fourth pass, really a series of ad
hoc tests, is performed at execution time once the code has
already started to run.

Pass  1  / File Integrity Check - The first and the simplest
test checks the structure of the class file. It ensures that
the file has the appropriate signature (first four bytes are
0xcafebabe) and that each of the structures within the  file
is  of the appropriate length. It checks that the class file
itself  is  neither  too long nor too  short  and  that  the
constant pool contains only valid entries. Of course, class
files may have varying lengths, but each of the structures
(such as the constant pool) has its length included as part
of the file specification. If the file is too long or too
short, the class file verifier throws an error and refuses
to make the class available for use.
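
The signature test at the start of pass 1 is simple enough
to sketch. The code below is an illustration only, not the
verifier's actual implementation: MagicNumberCheck and its
method are hypothetical names, and the length and constant
pool checks that pass 1 also performs are left out.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper; only the 0xCAFEBABE value comes from the text above.
class MagicNumberCheck {
    static final int CLASS_FILE_MAGIC = 0xCAFEBABE;

    /** Returns true if the data starts with the four-byte class file signature. */
    static boolean hasClassFileMagic(byte[] data) {
        if (data.length < 4) {
            return false; // too short to even hold the signature
        }
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            // readInt() reads four bytes in big-endian order, as class files use.
            return in.readInt() == CLASS_FILE_MAGIC;
        } catch (IOException e) {
            return false;
        }
    }
}
```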

Pass 2 / Class Integrity Check - The second pass performs
all other checking which is possible without examining the
actual bytecode instructions themselves. Specifically, it
ensures that: the class has a superclass unless it is the
Object class, the superclass is not a final class, this
class does not attempt to override a final method in its
superclass, constant pool entries are well formed, and all
method and field references have legal names and signatures.
However, in this pass no check is made as to whether fields,
methods or classes actually exist, merely that their names
and signatures are legal according to the language
specification.

Pass 3 / Bytecode Integrity Check - This is the pass in
which the bytecode verifier runs, and it is the most complex
pass of the class file verifier. The individual bytecodes
are examined to determine how the code will actually behave
at runtime. This includes data-flow analysis, stack checking
and static type checking for method arguments and bytecode
operands. It is the bytecode verifier which is responsible
for checking that the bytecodes have the correct number and
type of operands, that datatypes are not accessed illegally,
that the stack does not overflow or underflow and that
methods are called with the appropriate parameter types.
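
The effect of the stack check can be reproduced with a class
file that no compiler would emit. The sketch below
hand-assembles a class whose single method executes
"ireturn" with an empty operand stack and then asks the JVM
to link it. StackUnderflowDemo, OneShotLoader and the class
name "BadStack" are hypothetical; the outcome shown assumes
a JVM that verifies classes from user-defined class loaders,
which is the usual default.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical demo; "BadStack" and all names below are invented.
class StackUnderflowDemo {

    /** Hand-assembles a class whose only method returns an int it never pushed. */
    static byte[] badClassFile() {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(0xCAFEBABE);               // magic number
            out.writeShort(0);                      // minor version
            out.writeShort(52);                     // major version (Java 8 format)
            out.writeShort(8);                      // constant pool count = entries + 1
            out.writeByte(7); out.writeShort(2);    // #1 Class -> #2
            out.writeByte(1); out.writeUTF("BadStack");         // #2 this class
            out.writeByte(7); out.writeShort(4);    // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object"); // #4 superclass
            out.writeByte(1); out.writeUTF("f");    // #5 method name
            out.writeByte(1); out.writeUTF("()I");  // #6 descriptor: no args, returns int
            out.writeByte(1); out.writeUTF("Code"); // #7 attribute name
            out.writeShort(0x0021);                 // class flags: public, super
            out.writeShort(1);                      // this_class
            out.writeShort(3);                      // super_class
            out.writeShort(0);                      // no interfaces
            out.writeShort(0);                      // no fields
            out.writeShort(1);                      // one method
            out.writeShort(0x0009);                 // method flags: public static
            out.writeShort(5);                      // method name index
            out.writeShort(6);                      // descriptor index
            out.writeShort(1);                      // one attribute: Code
            out.writeShort(7);                      // attribute name index
            out.writeInt(13);                       // attribute length in bytes
            out.writeShort(1);                      // max_stack
            out.writeShort(0);                      // max_locals
            out.writeInt(1);                        // code length
            out.writeByte(0xAC);                    // ireturn, but the stack is empty
            out.writeShort(0);                      // empty exception table
            out.writeShort(0);                      // no attributes on Code
            out.writeShort(0);                      // no class attributes
            return buf.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Minimal loader that exposes defineClass for the demo. */
    static class OneShotLoader extends ClassLoader {
        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    /** Defines the class, forces linking, and reports what the JVM did. */
    static String tryToLink() {
        byte[] bytes = badClassFile();
        // The file is structurally well formed, so defining it succeeds.
        Class<?> defined = new OneShotLoader().define("BadStack", bytes);
        try {
            // Initialization forces linking, which includes bytecode verification.
            Class.forName("BadStack", true, defined.getClassLoader());
            return "accepted";
        } catch (VerifyError e) {
            return "VerifyError";
        } catch (LinkageError | ClassNotFoundException e) {
            return e.getClass().getSimpleName();
        }
    }
}
```

The structure of the file is valid, so the earlier passes
have nothing to object to; it is only the analysis of the
instruction stream that rejects the class.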

Pass 4 / Runtime Integrity Check - The JVM must make a
tradeoff between security and efficiency. For that reason,
the bytecode verifier does not exhaustively check for the
existence of fields and classes in pass 3. If it did, the
JVM would need to load all classes required by an applet or
application prior to running it. This would result in a very
heavy overhead which is not strictly required. Loading a
class involves possible network access and running the class
file verifier for the class, and it may well be that the
lines of code that use the class are never executed in the
normal course of the program's execution, in which case
loading and checking the class would be a waste of time. For
that reason, class files are only loaded when they are
required, that is when a method call is executed or a field
in an object of that class is modified. This is determined
at runtime and so that is when the fourth pass of the
verifier is executed.
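
Class loading itself is hard to observe from inside a
program, but class initialization follows the same "only
when required" rule and leaves a visible trace. The sketch
below therefore uses initialization as a stand-in for the
lazy behaviour described above; LazyLoadingDemo and Helper
are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo; initialization is used as an observable proxy for loading.
class LazyLoadingDemo {
    static final List<String> events = new ArrayList<>();

    /** A class the program may or may not need at runtime. */
    static class Helper {
        static {
            events.add("Helper initialized"); // runs on first real use only
        }

        static int answer() {
            return 42;
        }
    }

    /** Touches Helper only when asked to. */
    static int run(boolean useHelper) {
        events.add("run started");
        if (useHelper) {
            return Helper.answer(); // first use triggers Helper's initialization
        }
        return 0;
    }
}
```

Calling run(false) leaves no trace of Helper in the event
list; the entry only appears once run(true) actually
executes the call.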

2.3. The Security Manager
-------------------------
The third component involved in loading and running a Java
program is the security manager. Even when untrusted code
has been verified, it is still subject to runtime
restrictions. The security manager is responsible for
enforcing these restrictions.

The security manager is similar to the class loader in that
it is a Java class (java.lang.SecurityManager) that any
application can extend for its own purposes. In a browser
the security manager is provided by the browser manufacturer
and is the component of the JVM which prevents applets from
reading or writing to the file system, accessing the network
in an unsafe way, making inquiries about the runtime
environment, printing and so on.

By default, in a stand-alone JVM implementation there is no
security manager (since there is no mechanism to load
classes from untrusted sources). It is, however, possible
for an application writer to implement a security manager to
enforce a particular security policy. For example, there  is
a  checkRead  method which receives a file reference  as  an
argument. If the security manager is to prevent the  program
from reading this particular file, checkRead should be coded
to throw a security exception.
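
A checkRead override of this kind might look as follows.
This is a minimal sketch: ReadBlockingManager and its
one-file policy are hypothetical, and the manager is called
directly rather than installed, because
java.lang.SecurityManager is deprecated in recent JDK
releases, where the compiler may warn about its use and
installing a manager may be refused.

```java
// Hypothetical policy for illustration: one protected file, everything else allowed.
@SuppressWarnings("removal")
class ReadBlockingManagerDemo {

    /** Security manager that refuses reads of a single named file. */
    static class ReadBlockingManager extends SecurityManager {
        private final String protectedFile;

        ReadBlockingManager(String protectedFile) {
            this.protectedFile = protectedFile;
        }

        @Override
        public void checkRead(String file) {
            if (protectedFile.equals(file)) {
                throw new SecurityException("read denied: " + file);
            }
            // Returning quietly means the access is allowed.
        }
    }

    /** Asks a manager protecting one file whether another file may be read. */
    static boolean mayRead(String protectedFile, String requestedFile) {
        SecurityManager manager = new ReadBlockingManager(protectedFile);
        try {
            manager.checkRead(requestedFile); // called directly, not installed
            return true;
        } catch (SecurityException e) {
            return false;
        }
    }
}
```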

Although any Java program, applet or application, can extend
the SecurityManager class, the JVM will allow only one
security manager to be active at a time. To make a security
manager active you have to invoke a static system method:
java.lang.System.setSecurityManager(). This can be done only
once in an application environment; any subsequent
invocation results in an exception. In the case of an
applet, the web
browser has already installed a security manager as part  of
the  JVM  initialization.  This  means,  assuming  that  the
trusted  classes are not subverted, that an  applet  has  no
choice  but  to  live with the limitations of  the  security
manager provided by the browser.

The installed security manager is only really active on
request: it does not check anything unless it is called by
other system functions. For example, when an untrusted class
has been loaded and wants a network connection, the calling
code creates a new Socket object, using one of the
constructors the class provides. The constructor invokes the
checkConnect method of the installed SecurityManager
subclass instance. In this case the security manager has a
number of things to consider:

a)    It needs to know if the top level class is trusted or
not. That is, was it loaded over the network by the applet
class loader, or was it installed locally and loaded from
the trusted class path?
b)    As an extension of the first point, if the security
manager is checking a file access or network connection
request, it not only needs to know if the applet is trusted
but also if it was loaded from the network or from a local
file. This is because there are variations in the level of
access allowed for these functions.
c)    It may run some further checks specific to the type of
access requested. Here it has to check whether the host to
which the socket connection is being attempted is the same
host from which the calling class was loaded.

If all of these checks are successful the security manager
can permit the connection to go ahead.
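
The decision can be modelled in a few lines. The sketch
below is a simplified, hypothetical model and not the logic
of any particular browser: ConnectCheckSketch and the Origin
enum are invented, and the rule that trusted local code may
connect anywhere is an assumption made for the example.

```java
// Hypothetical model of the checks listed above; not the real SecurityManager API.
class ConnectCheckSketch {

    /** Where the calling class came from (points a and b). */
    enum Origin { TRUSTED_LOCAL, NETWORK }

    /** Returns quietly if the connection is allowed, throws otherwise. */
    static void checkConnect(Origin origin, String loadedFromHost, String targetHost) {
        if (origin == Origin.TRUSTED_LOCAL) {
            return; // assumption: trusted local code may connect anywhere
        }
        // Point c): network code may only call back to the host it came from.
        boolean sameHost = loadedFromHost != null
                && loadedFromHost.equalsIgnoreCase(targetHost);
        if (!sameHost) {
            throw new SecurityException("connection to " + targetHost + " denied");
        }
    }

    /** Convenience wrapper that turns the exception into a boolean. */
    static boolean allowed(Origin origin, String loadedFromHost, String targetHost) {
        try {
            checkConnect(origin, loadedFromHost, targetHost);
            return true;
        } catch (SecurityException e) {
            return false;
        }
    }
}
```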

Although the three elements of JVM security - class loader,
class file verifier and security manager - each have unique
functions, they have to cooperate tightly. The security
manager relies on the class loader to keep untrusted classes
and local classes in separate name spaces and to prevent the
local trusted classes from being overwritten. Conversely,
the class loader relies on the security manager to prevent
an applet from installing its own class loader, which could
flag untrusted code as trusted. And everything relies on the
class file verifier to make sure that class confusion is
avoided and that class protection directives are honored.

3.   Digital Signature
######################
Using the Java Cryptography Extensions (JCE) it is possible
for a Java application or applet to create its own digital
signatures. This allows more sophisticated programs to be
written, but the more common scenario is that an applet is
about to do something that the sandbox restrictions normally
forbid. In this case, the browser user needs to be convinced
that the applet is from a trustworthy source. This is
achieved by digitally signing the applet.

The signature on the applet links the code to the programmer
or administrator who created or packaged it. However, the
user has to be able to check that the signature is valid.
The signer makes this possible by providing a public key
certificate.

One characteristic of dynamic loading of class files is that
a typical applet may involve a number of small network
transfers. It may also involve the retrieval of other files,
graphic images for example. Given the indifferent
performance of many WWW connections, this can be a serious
performance hit. JDK 1.1 provides relief for this by
introducing the JAR (Java Archive) format for packaging
everything into a single file. JAR also allows for
compression, which can further improve performance.

The files that make up the payload of a JAR are packed into
a copy of the original directory structure. The MANIFEST.MF
file contains details of the "payload" of the JAR. The
digest values recorded in the manifest are calculated from
the contents of the payload files they refer to. They are
used to validate the payload files when they are unpacked.
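
The digest mechanism can be sketched with the standard
java.util.jar and java.security classes. The example is an
illustration under assumptions: ManifestDigestDemo is a
hypothetical name, SHA-256 is chosen here for the digest
and is not a detail taken from the report, and real JAR
tools compute and check these entries themselves.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical demo; the digest algorithm is an assumption for illustration.
class ManifestDigestDemo {
    static final String DIGEST_ATTR = "SHA-256-Digest";

    /** Computes the Base64-encoded digest of a payload file's contents. */
    static String digest(byte[] payload) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            return Base64.getEncoder().encodeToString(md.digest(payload));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds a manifest with one per-entry section holding the digest. */
    static Manifest manifestFor(String entryName, byte[] payload) {
        Manifest manifest = new Manifest();
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        Attributes section = new Attributes();
        section.putValue(DIGEST_ATTR, digest(payload));
        manifest.getEntries().put(entryName, section);
        return manifest;
    }

    /** Validates unpacked bytes against the digest recorded in the manifest. */
    static boolean matches(Manifest manifest, String entryName, byte[] unpacked) {
        Attributes section = manifest.getAttributes(entryName);
        return section != null
                && digest(unpacked).equals(section.getValue(DIGEST_ATTR));
    }
}
```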

JAR signing makes it possible to generate digital signatures
for any of the files in the archive. In fact, files can be
signed by more than one signer. So, for example, an applet
could be signed by the developer who created it and then
also signed by the IT department of the company that uses
it. When the user loads the applet, he or she not only knows
that the applet comes from a trustworthy source but also
knows that it has been approved for corporate use.

When  files in a JAR are signed, two new files are added  to
the manifest directory:

a)    Signer file - this is very much like the manifest file
itself, except that the digests in it are calculated from
the manifest file entries, not from the actual contents of
the payload files. The signer file may contain fewer entries
than the manifest file, because the signer does not have to
sign every file in the archive. The file name is
<signer>.SF, where <signer> is an arbitrary name for the
creator of the signature. If the JAR has been signed by more
than one signer, each will have a separate .SF file.

b)    Digital signature file - this is a binary file
containing the digital signature in the PKCS#7 format. The
signature file name depends on the signature algorithm used.
For example, a DSA signature would be in a file named
<signer>.DSA (other possibilities are <signer>.RSA for a
signature using MD5 digest and RSA encryption and
<signer>.PGP for a Pretty Good Privacy signature).
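
The underlying operation, signing the bytes of the signer
file and later checking them against the signer's public
key, can be sketched with java.security.Signature. This is
a simplified illustration: SignerFileSignatureDemo is a
hypothetical name, the key pair is generated on the spot
instead of coming from a certificate, the algorithm is an
assumption, and the result is a raw signature rather than
the PKCS#7 block a real signed JAR contains.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Hypothetical demo; a real signed JAR stores a PKCS#7 block, not a raw signature.
class SignerFileSignatureDemo {
    static final String ALGORITHM = "SHA256withRSA"; // assumption for illustration

    /** Generates a throwaway key pair standing in for the signer's keys. */
    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Signs the contents of the signer file with the private key. */
    static byte[] sign(PrivateKey key, byte[] signerFile) {
        try {
            Signature signature = Signature.getInstance(ALGORITHM);
            signature.initSign(key);
            signature.update(signerFile);
            return signature.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Checks the signature using the public key from the signer's certificate. */
    static boolean verify(PublicKey key, byte[] signerFile, byte[] signatureBytes) {
        try {
            Signature signature = Signature.getInstance(ALGORITHM);
            signature.initVerify(key);
            signature.update(signerFile);
            return signature.verify(signatureBytes);
        } catch (GeneralSecurityException e) {
            return false; // malformed signatures count as invalid
        }
    }
}
```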

4.   Coming Next from JavaSoft: JDK 1.2
#######################################
At  the  time of writing of this document JDK 1.2  is  still
under  development  and only limited information  about  the
security model is publicly available. What is known is  that
Sun  will  develop  the  sandbox model  with  the  following
objectives in mind:

a)    To provide fine-grained access control. Under the
present scheme a customized SecurityManager and ClassLoader
have to be written. The intention is that the JDK and the
Java Runtime Environment (JRE) will provide much of this
programming by default.

b)    To enable an easily configurable security policy. When
the  HotJava browser was introduced it provided some limited
capabilities for modifying the restrictions of the  sandbox.
However,  later Java browsers have removed all such controls
leaving the restrictive virtual machine of today. The
runtime environment needs to be fitted with controls that
allow a user or administrator to define their security
policy.

c)    To allow checks to be extended to other Java programs.
Under the present scheme, local code is always treated as
being trusted, whereas applet code is not. The new model
will apply consistently to local code as well, whether it
consists of classes permanently installed on a browser that
interact with applets or is part of a Java application. This
does not eliminate the concept of system code. There must
always be a layer of trusted code that applet and local
classes invoke when they need access to protected resources.
What it does mean is that applets and applications can be
subjected to the same set of controls.

The  JDK  1.2 will extend the concept of protection domains.
These  are logical boundaries within which a given  security
policy  applies. A protection domain is defined by a set  of
permissions which act as a set of filters to tie together:

a)    The  code source, made up of origin (where a piece  of
code  comes  from) and a principal (who the code  is  signed
by).
b)   Resources (protected systems or network elements)

The way the permissions are applied will mirror the current
SecurityManager function. That is, every attempt to access a
protected resource will be routed to the access control
function, which will examine the permissions of its
protection domain and either return quietly or throw an
exception (in fact it will have to trace back the execution
thread to check all of the protection domains, so that
unauthorized code cannot beat the system by calling an
authorized function).
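
In the java.security API as it later shipped, a protection
domain and the "return quietly or throw" check can be
sketched as below. This is an illustration written with
hindsight, not something the report could rely on:
ProtectionDomainSketch, the URL and the file path are
hypothetical, the code source is unsigned, and the tracing
back through the execution thread is not modelled.

```java
import java.io.FilePermission;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.security.CodeSource;
import java.security.Permission;
import java.security.Permissions;
import java.security.ProtectionDomain;
import java.security.cert.Certificate;

// Hypothetical demo; origin and file names are invented for illustration.
class ProtectionDomainSketch {

    /** Builds a domain for code from one origin with read-only access to one file. */
    static ProtectionDomain readOnlyDomain(String origin, String file) {
        try {
            URL url = URI.create(origin).toURL();
            // No certificates: the code source is treated as unsigned.
            CodeSource source = new CodeSource(url, (Certificate[]) null);
            Permissions permissions = new Permissions();
            permissions.add(new FilePermission(file, "read"));
            return new ProtectionDomain(source, permissions);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Returns quietly if the domain grants the permission, throws otherwise. */
    static void check(ProtectionDomain domain, Permission wanted) {
        if (!domain.implies(wanted)) {
            throw new SecurityException("denied: " + wanted);
        }
    }

    /** Convenience wrapper that turns the exception into a boolean. */
    static boolean allowed(ProtectionDomain domain, String file, String action) {
        try {
            check(domain, new FilePermission(file, action));
            return true;
        } catch (SecurityException e) {
            return false;
        }
    }
}
```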

The  elements  for the protection domain will  initially  be
controlled by a policy configuration file. So, for  example,
an  entry  in  the file could be specified that would  grant
applet  code from a specific site, signed by a named trusted
signer, read-only permissions to a specific file.
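
In the policy file syntax that later shipped with JDK 1.2,
such an entry looks roughly like the fragment below. It is
shown only as an illustration of the idea; the signer alias,
the site and the file path are hypothetical.

```
// Hypothetical entry: signer alias, URL and path are invented.
grant signedBy "trustedSigner", codeBase "http://www.example.com/applets/*" {
    permission java.io.FilePermission "/home/user/data.txt", "read";
};
```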

Each of the elements of the protection domain can be defined
as tightly or as loosely as required. This means that at one
extreme it will be possible to define a protection domain
that re-creates the operation of the sandbox by specifying
an origin of "any URL" and a principal of "unsigned".