RFC 2196

Site Security Handbook

Pages: 75
FYI 8
Obsoletes: 1244

4.  Security Services and Procedures

   This chapter guides the reader through a number of topics that should
   be addressed when securing a site.  Each section touches on a
   security service or capability that may be required to protect the
   information and systems at a site.  The topics are presented at a
   fairly high level to introduce the reader to the concepts.

   Throughout the chapter, you will find significant mention of
   cryptography.  It is outside the scope of this document to delve into
   details concerning cryptography, but the interested reader can obtain
   more information from books and articles listed in the reference
   section of this document.

4.1  Authentication

   For many years, the prescribed method for authenticating users has
   been through the use of standard, reusable passwords.  Originally,
   these passwords were used by users at terminals to authenticate
   themselves to a central computer.  At the time, there were no
   networks (internally or externally), so the risk of disclosure of the
   clear text password was minimal.  Today, systems are connected
   together through local networks, and these local networks are further
   connected together and to the Internet.  Users are logging in from
   all over the globe; their reusable passwords are often transmitted
   across those same networks in clear text, ripe for anyone in-between
   to capture.  And indeed, the CERT* Coordination Center and other
   response teams are seeing a tremendous number of incidents involving
   packet sniffers which are capturing the clear text passwords.

   With the advent of newer technologies like one-time passwords (e.g.,
   S/Key), PGP, and token-based authentication devices, people are using
   password-like strings as secret tokens and PINs.  If these secret
   tokens and PINs are not properly selected and protected, the
   authentication will be easily subverted.

4.1.1  One-Time passwords

   As mentioned above, given today's networked environments, it is
   recommended that sites concerned about the security and integrity of
   their systems and networks consider moving away from standard,
   reusable passwords.  There have been many incidents involving Trojan
   network programs (e.g., telnet and rlogin) and network packet
   sniffing programs.  These programs capture clear text
   hostname/account name/password triplets.  Intruders can use the
   captured information for subsequent access to those hosts and
   accounts.  This is possible because 1) the password is used over and
   over (hence the term "reusable"), and 2) the password passes across
   the network in clear text.

   Several authentication techniques have been developed that address
   this problem.  Among these techniques are challenge-response
   technologies that provide passwords that are only used once (commonly
   called one-time passwords). There are a number of products available
   that sites should consider using. The decision to use a product is
   the responsibility of each organization, and each organization should
   perform its own evaluation and selection.
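
   One family of one-time password schemes, of which S/Key is an
   example, is built on a hash chain.  The following is a minimal
   sketch of that idea, not an implementation of any product.  It
   assumes SHA-256 as the hash function (S/Key itself used MD4), and
   the names hash_chain and OneTimePasswordVerifier are hypothetical.

```python
import hashlib
import hmac


def hash_chain(seed: bytes, count: int) -> bytes:
    """Apply the hash function `count` times to the seed."""
    value = seed
    for _ in range(count):
        value = hashlib.sha256(value).digest()
    return value


class OneTimePasswordVerifier:
    """Server side: stores only the most recently accepted chain value."""

    def __init__(self, initial: bytes):
        self.current = initial

    def verify(self, candidate: bytes) -> bool:
        # A valid one-time password hashes forward to the stored value.
        hashed = hashlib.sha256(candidate).digest()
        if hmac.compare_digest(hashed, self.current):
            # Next time, the user must present the preimage of this one.
            self.current = candidate
            return True
        return False


# Hypothetical usage: the server is initialized with the 100th value;
# the user's first password is the 99th value.
seed = b"hypothetical secret pass phrase"
server = OneTimePasswordVerifier(hash_chain(seed, 100))
first_password = hash_chain(seed, 99)
print(server.verify(first_password))  # accepted
print(server.verify(first_password))  # rejected: already used
```

   A captured password is useless for a later login because the server
   then expects a value that can only be derived from the seed, not
   from anything that crossed the network.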

4.1.2  Kerberos

   Kerberos is a distributed network security system which provides for
   authentication across unsecured networks.  If requested by the
   application, integrity and encryption can also be provided.  Kerberos
   was originally developed at the Massachusetts Institute of Technology
   (MIT) in the mid-1980s.  There are two major releases of Kerberos,
   versions 4 and 5, which are, for practical purposes, incompatible.

   Kerberos relies on a symmetric key database using a key distribution
   center (KDC) which is known as the Kerberos server.  Users and
   services (known as "principals") are granted electronic "tickets"
   after properly communicating with the KDC.  These tickets are used
   for authentication between principals.  All tickets include a time
   stamp which limits the time period for which the ticket is valid.
   Therefore, Kerberos clients and servers must have a secure time
   source, and be able to keep time accurately.
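
   The dependence on accurate clocks can be illustrated with a small
   sketch.  This is not the Kerberos ticket format; the Ticket class,
   its field names, and the 300 second skew tolerance are assumptions
   made for illustration only.

```python
from dataclasses import dataclass

MAX_CLOCK_SKEW = 300.0  # seconds; an assumed tolerance for this sketch


@dataclass
class Ticket:
    principal: str
    start_time: float  # seconds since the epoch
    end_time: float    # seconds since the epoch


def ticket_is_valid(ticket: Ticket, now: float,
                    skew: float = MAX_CLOCK_SKEW) -> bool:
    """Accept a ticket only inside its validity window, allowing for skew."""
    return ticket.start_time - skew <= now <= ticket.end_time + skew
```

   If the verifying host's clock is wrong by more than the tolerance, a
   valid ticket is rejected or an expired one is accepted, which is why
   a trustworthy time source matters.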

   The practical side of Kerberos is its integration with the
   application level.  Typical applications like FTP, telnet, POP, and
   NFS have been integrated with the Kerberos system.  There are a
   variety of implementations which have varying levels of integration.
   Please see the Kerberos FAQ available at
   http://www.ov.com/misc/krb-faq.html for the latest information.

4.1.3  Choosing and Protecting Secret Tokens and PINs

   Choose secret tokens carefully.  Like passwords, they should be
   robust against brute force efforts to guess them.  That is, they
   should not be single words in any language, or any common, industry,
   or cultural acronyms, etc.  Ideally, they will be longer rather than
   shorter and consist of pass phrases that combine upper and lower case
   characters, digits, and other characters.

   Once chosen, the protection of these secret tokens is very important.
   Some are used as PINs to hardware devices (like token cards) and
   these should not be written down or placed in the same location as
   the device with which they are associated.  Others, such as a secret
   Pretty Good Privacy (PGP) key, should be protected from unauthorized
   access.

   One final word on this subject.  When using cryptography products,
   like PGP, take care to determine the proper key length and ensure
   that your users are trained to do likewise.  As technology advances,
   the minimum safe key length continues to grow.  Make sure your site
   keeps up with the latest knowledge on the technology so that you can
   ensure that any cryptography in use is providing the protection you
   believe it is.

4.1.4  Password Assurance

   While the need to eliminate the use of standard, reusable passwords
   cannot be overstated, it is recognized that some organizations may
   still be using them.  While it's recommended that these organizations
   transition to the use of better technology, in the meantime, we have
   the following advice to help with the selection and maintenance of
   traditional passwords. But remember, none of these measures provides
   protection against disclosure due to sniffer programs.

   (1)  The importance of robust passwords - In many (if not most) cases
        of system penetration, the intruder needs to gain access to an
        account on the system. One way that goal is typically
        accomplished is through guessing the password of a legitimate
        user.  This is often accomplished by running an automated
        password cracking program, which utilizes a very large
        dictionary, against the system's password file.  The only way to
        guard against passwords being disclosed in this manner is
        through the careful selection of passwords which cannot be
        easily guessed (i.e., combinations of numbers, letters, and
        punctuation characters).  Passwords should also be as long as
        the system supports and users can tolerate.

   (2)  Changing default passwords - Many operating systems and
        application programs are installed with default accounts and
        passwords.  These must be changed immediately to something that
        cannot be guessed or cracked.

   (3)  Restricting access to the password file - In particular, a site
        wants to protect the encrypted password portion of the file so
        that would-be intruders don't have it available for cracking.
        One effective technique is to use shadow passwords where the
        password field of the standard file contains a dummy or false
        password.  The file containing the legitimate passwords is
        protected elsewhere on the system.

   (4)  Password aging - When and how to expire passwords is still a
        subject of controversy among the security community.  It is
        generally accepted that a password should not be maintained once
        an account is no longer in use, but it is hotly debated whether
        a user should be forced to change a good password that's in
        active use.  The arguments for changing passwords relate to the
        prevention of the continued use of penetrated accounts.
        However, the opposition claims that frequent password changes
        lead to users writing down their passwords in visible areas
        (such as pasting them to a terminal), or to users selecting very
        simple passwords that are easy to guess.  It should also be
        stated that an intruder will probably use a captured or guessed
        password sooner rather than later, in which case password aging
        provides little if any protection.

        While there is no definitive answer to this dilemma, a password
        policy should directly address the issue and provide guidelines
        for how often a user should change the password.  Certainly, an
        annual change in their password is usually not difficult for
        most users, and you should consider requiring it.  It is
        recommended that passwords be changed at least whenever an
        account is compromised or there is a critical change in
        personnel (especially if it is an administrator!).  In addition,
        if a privileged account password is compromised, all passwords
        on the system should be changed.

   (5)  Password/account blocking - Some sites find it useful to disable
        accounts after a predefined number of failed attempts to
        authenticate.  If your site decides to employ this mechanism, it
        is recommended that the mechanism not "advertise" itself. After
        disabling, even if the correct password is presented, the
        message displayed should remain that of a failed login attempt.
        Implementing this mechanism will require that legitimate users
        contact their system administrator to request that their account
        be reactivated.

   (6)  A word about the finger daemon - By default, the finger daemon
        displays considerable system and user information. For example,
        it can display a list of all users currently using a system, or
        all the contents of a specific user's .plan file.  This
        information can be used by would-be intruders to identify
        usernames and guess their passwords. It is recommended that
        sites consider modifying finger to restrict the information
        displayed.
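
   The account blocking behavior described in item (5) can be sketched
   as follows.  The threshold of five failures, the message strings,
   and the AccountStore class are assumptions for illustration.  The
   sketch compares plain-text passwords only to stay short; a real
   system would store and compare salted hashes.

```python
import hmac

MAX_FAILURES = 5  # assumed threshold; the site chooses the real number
GENERIC_FAILURE = "Login incorrect"


class AccountStore:
    def __init__(self, passwords: dict[str, str]):
        # Plain text is used only to keep the sketch short.
        self._passwords = passwords
        self._failures: dict[str, int] = {}

    def login(self, user: str, password: str) -> str:
        expected = self._passwords.get(user)
        blocked = self._failures.get(user, 0) >= MAX_FAILURES
        ok = expected is not None and hmac.compare_digest(
            expected.encode(), password.encode())
        if ok and not blocked:
            self._failures[user] = 0
            return "Login successful"
        # A blocked account gets the same message as a wrong password,
        # so the mechanism does not advertise itself.
        self._failures[user] = self._failures.get(user, 0) + 1
        return GENERIC_FAILURE

    def reactivate(self, user: str) -> None:
        """Called by the administrator at the user's request."""
        self._failures[user] = 0
```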

4.2  Confidentiality

   There will be information assets that your site will want to protect
   from disclosure to unauthorized entities.  Operating systems often
   have built-in file protection mechanisms that allow an administrator
   to control who on the system can access, or "see," the contents of a
   given file.  A stronger way to provide confidentiality is through
   encryption.  Encryption is accomplished by scrambling data so that it
   is very difficult and time consuming for anyone other than the
   authorized recipients or owners to obtain the plain text.  Authorized
   recipients and the owner of the information will possess the
   corresponding decryption keys that allow them to easily unscramble
   the text to a readable (clear text) form.  We recommend that sites
   use encryption to provide confidentiality and protect valuable
   information.

   The use of encryption is sometimes controlled by governmental and
   site regulations, so we encourage administrators to become informed
   of laws or policies that regulate its use before employing it.  It is
   outside the scope of this document to discuss the various algorithms
   and programs available for this purpose, but we do caution against
   the casual use of the UNIX crypt program as it has been found to be
   easily broken.  We also encourage everyone to take time to understand
   the strength of the encryption in any given algorithm/product before
   using it.  Most well-known products are well-documented in the
   literature, so this should be a fairly easy task.

4.3  Integrity

   As an administrator, you will want to make sure that information
   (e.g., operating system files, company data, etc.) has not been
   altered in an unauthorized fashion.  This means you will want to
   provide some assurance as to the integrity of the information on your
   systems.  One way to provide this is to produce a checksum of the
   unaltered file, store that checksum offline, and periodically (or
   when desired) check to make sure the checksum of the online file
   hasn't changed (which would indicate the data has been modified).

   Some operating systems come with checksumming programs, such as the
   UNIX sum program.  However, these may not provide the protection you
   actually need.  Files can be modified in such a way as to preserve
   the result of the UNIX sum program!  Therefore, we suggest that you
   use a cryptographically strong program, such as the message digesting
   program MD5 [ref], to produce the checksums you will be using to
   assure integrity.
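
   The checksum procedure can be sketched with a cryptographic hash
   from the Python standard library.  The sketch assumes SHA-256 in
   place of MD5, since practical collisions have since been found for
   MD5; the function names are hypothetical.  The baseline returned by
   build_baseline is what would be stored offline.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def build_baseline(paths: list[Path]) -> dict[str, str]:
    """Record the digest of each unaltered file."""
    return {str(p): file_digest(p) for p in paths}


def changed_files(baseline: dict[str, str]) -> list[str]:
    """List files that are missing or whose digest no longer matches."""
    changed = []
    for name, digest in baseline.items():
        p = Path(name)
        if not p.is_file() or file_digest(p) != digest:
            changed.append(name)
    return changed
```

   The comparison is only meaningful if the baseline itself is kept
   where an intruder cannot rewrite it.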

   There are other applications where integrity will need to be assured,
   such as when transmitting an email message between two parties. There
   are products available that can provide this capability.  Once you
   identify that this is a capability you need, you can go about
   identifying technologies that will provide it.

4.4  Authorization

   Authorization refers to the process of granting privileges to
   processes and, ultimately, users.  This differs from authentication
   in that authentication is the process used to identify a user.  Once
   identified (reliably), the privileges, rights, property, and
   permissible actions of the user are determined by authorization.

   Explicitly listing the authorized activities of each user (and user
   process) with respect to all resources (objects) is impossible in a
   reasonable system.  In a real system certain techniques are used to
   simplify the process of granting and checking authorization(s).

   One approach, popularized in UNIX systems, is to assign to each
   object three classes of user: owner, group and world.  The owner is
   either the creator of the object or the user assigned as owner by the
   super-user.  The owner permissions (read, write and execute) apply
   only to the owner.  A group is a collection of users which share
   access rights to an object.  The group permissions (read, write and
   execute) apply to all users in the group (except the owner).  The
   world refers to everybody else with access to the system.  The world
   permissions (read, write and execute) apply to all users (except the
   owner and members of the group).
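
   The owner/group/world rule can be sketched as a function over the
   permission bits.  The function name and parameters are hypothetical,
   and the sketch ignores the super-user, who bypasses these checks on
   real systems.  Note that exactly one class applies to a given user.

```python
import stat


def may_access(mode: int, file_uid: int, file_gid: int,
               uid: int, gids: set[int], want: str) -> bool:
    """Return True if the user may perform `want` ('r', 'w' or 'x')."""
    bits = {
        "owner": {"r": stat.S_IRUSR, "w": stat.S_IWUSR, "x": stat.S_IXUSR},
        "group": {"r": stat.S_IRGRP, "w": stat.S_IWGRP, "x": stat.S_IXGRP},
        "world": {"r": stat.S_IROTH, "w": stat.S_IWOTH, "x": stat.S_IXOTH},
    }
    # Pick the single class that applies: owner first, then group,
    # then world.  The other classes are not consulted.
    if uid == file_uid:
        user_class = "owner"
    elif file_gid in gids:
        user_class = "group"
    else:
        user_class = "world"
    return bool(mode & bits[user_class][want])
```

   For example, with mode 0o640 the owner may read and write, group
   members may only read, and everyone else is denied.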

   Another approach is to attach to an object a list which explicitly
   contains the identity of all permitted users (or groups).  This is an
   Access Control List (ACL).  The advantages of ACLs are that they are
   easily maintained (one central list per object) and it's very easy to
   visually check who has access to what. The disadvantages are the
   extra resources required to store such lists, as well as the vast
   number of such lists required for large systems.
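
   A per-object ACL can be sketched as a mapping from names to
   permitted actions.  The class below is hypothetical and, as a
   simplification, lets user and group names share one namespace.

```python
from dataclasses import dataclass, field


@dataclass
class AccessControlList:
    # Maps a user or group name to the set of permitted actions.
    entries: dict[str, set[str]] = field(default_factory=dict)

    def grant(self, name: str, *actions: str) -> None:
        self.entries.setdefault(name, set()).update(actions)

    def permits(self, user: str, groups: set[str], action: str) -> bool:
        # Anything not explicitly listed is denied.
        for name in {user} | groups:
            if action in self.entries.get(name, set()):
                return True
        return False
```

   Printing the entries of one such object shows at a glance who has
   access to it, which is the maintenance advantage noted above.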

4.5  Access

4.5.1  Physical Access

   Restrict physical access to hosts, allowing access only to those
   people who are supposed to use the hosts.  Hosts include "trusted"
   terminals (i.e., terminals which allow unauthenticated use such as
   system consoles, operator terminals and terminals dedicated to
   special tasks), and individual microcomputers and workstations,
   especially those connected to your network.  Make sure people's work
   areas mesh well with access restrictions; otherwise they will find
   ways to circumvent your physical security (e.g., jamming doors open).

   Keep original and backup copies of data and programs safe.  Apart
   from keeping them in good condition for backup purposes, they must be
   protected from theft.  It is important to keep backups in a separate
   location from the originals, not only for damage considerations, but
   also to guard against thefts.

   Portable hosts are a particular risk.  Make sure it won't cause
   problems if one of your staff's portable computers is stolen.
   Consider developing guidelines for the kinds of data that should be
   allowed to reside on the disks of portable computers as well as how
   the data should be protected (e.g., encryption) when it is on a
   portable computer.

   Other areas where physical access should be restricted are wiring
   closets and important network elements like file servers, name server
   hosts, and routers.

4.5.2  Walk-up Network Connections

   By "walk-up" connections, we mean network connection points located
   to provide a convenient way for users to connect a portable host to
   your network.

   Consider whether you need to provide this service, bearing in mind
   that it allows any user to attach an unauthorized host to your
   network.  This increases the risk of attacks via techniques such as
   IP address spoofing, packet sniffing, etc.  Users and site management
   must appreciate the risks involved.  If you decide to provide walk-up
   connections, plan the service carefully and define precisely where
   you will provide it so that you can ensure the necessary physical
   access security.

   A walk-up host should be authenticated before its user is permitted
   to access resources on your network.  As an alternative, it may be
   possible to control physical access. For example, if the service is
   to be used by students, you might only provide walk-up connection
   sockets in student laboratories.

   If you are providing walk-up access for visitors to connect back to
   their home networks (e.g., to read e-mail, etc.) in your facility,
   consider using a separate subnet that has no connectivity to the
   internal network.

   Keep an eye on any area that contains unmonitored access to the
   network, such as vacant offices.  It may be sensible to disconnect
   such areas at the wiring closet, and consider using secure hubs and
   monitoring attempts to connect unauthorized hosts.

4.5.3  Other Network Technologies

   Technologies considered here include X.25, ISDN, SMDS, DDS and Frame
   Relay.  All are provided via physical links which go through
   telephone exchanges, providing the potential for them to be diverted.
   Crackers are certainly interested in telephone switches as well as in
   data networks!

   With switched technologies, use Permanent Virtual Circuits or Closed
   User Groups whenever this is possible.  Technologies which provide
   authentication and/or encryption (such as IPv6) are evolving rapidly;
   consider using them on links where security is important.

4.5.4  Modems

4.5.4.1  Modem Lines Must Be Managed

   Although modems provide convenient access to a site for its users,
   they can also provide an effective detour around the site's
   firewalls.  For this reason it is essential to maintain proper
   control of modems.

   Don't allow users to install a modem line without proper
   authorization.  This includes temporary installations (e.g., plugging
   a modem into a facsimile or telephone line overnight).

   Maintain a register of all your modem lines and keep your register up
   to date.  Conduct regular (ideally automated) site checks for
   unauthorized modems.

4.5.4.2  Dial-in Users Must Be Authenticated

   A username and password check should be completed before a user can
   access anything on your network.  Normal password security
   considerations are particularly important (see section 4.1.1).

   Remember that telephone lines can be tapped, and that it is quite
   easy to intercept messages to cellular phones.  Modern high-speed
   modems use more sophisticated modulation techniques, which makes them
   somewhat more difficult to monitor, but it is prudent to assume that
   hackers know how to eavesdrop on your lines.  For this reason, you
   should use one-time passwords if at all possible.

   It is helpful to have a single dial-in point (e.g., a single large
   modem pool) so that all users are authenticated in the same way.

   Users will occasionally mis-type a password.  Set a short delay - say
   two seconds - after the first and second failed logins, and force a
   disconnect after the third.  This will slow down automated password
   attacks.  Don't tell the user whether the username, the password, or
   both, were incorrect.
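
   The delay and disconnect policy can be sketched as a small loop.
   The function name, its parameters, and the return strings are
   hypothetical; the caller supplies the credential check, and the
   sleep function is injectable so the policy can be exercised without
   waiting.

```python
import time
from typing import Callable, Iterable

FAILURE_DELAY = 2.0  # seconds, as suggested in the text
MAX_ATTEMPTS = 3


def dial_in_session(attempts: Iterable[tuple[str, str]],
                    check: Callable[[str, str], bool],
                    sleep: Callable[[float], None] = time.sleep) -> str:
    """Process login attempts for one call and report the outcome."""
    failures = 0
    for username, password in attempts:
        if check(username, password):
            return "connected"
        failures += 1
        if failures >= MAX_ATTEMPTS:
            break  # third failure: drop the call
        sleep(FAILURE_DELAY)  # delay after the first and second failures
    return "disconnected"
```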

4.5.4.3  Call-back Capability

   Some dial-in servers offer call-back facilities (i.e., the user dials
   in and is authenticated, then the system disconnects the call and
   calls back on a specified number).  Call-back is useful because if
   someone guesses a username and password, the intruder is
   disconnected and the system calls back the actual user whose
   password was cracked; an unexpected call from a server is suspicious
   at best.  This does mean users may only log in from one location
   (where the server is configured to dial them back), and of course
   there may be phone charges associated with their call-back location.

   This feature should be used with caution; it can easily be bypassed.
   At a minimum, make sure that the return call is never made from the
   same modem as the incoming one.  Overall, although call-back can
   improve modem security, you should not depend on it alone.

4.5.4.4  All Logins Should Be Logged

   All logins, whether successful or unsuccessful, should be logged.
   However, do not keep correct passwords in the log.  Rather, log them
   simply as a successful login attempt.  Be equally careful with
   incorrect passwords: since most bad passwords are mistyped by
   authorized users, they vary by only a single character from the
   actual password.  Therefore, if you can't keep such a log secure,
   don't log incorrect passwords at all.

   If Calling Line Identification is available, take advantage of it by
   recording the calling number for each login attempt.  Be sensitive to
   the privacy issues raised by Calling Line Identification.  Also be
   aware that Calling Line Identification is not to be trusted (since
   intruders have been known to break into phone switches and forward
   phone numbers or make other changes); use the data for informational
   purposes only, not for authentication.

4.5.4.5  Choose Your Opening Banner Carefully

   Many sites use a system default contained in a message of the day
   file for their opening banner. Unfortunately, this often includes the
   type of host hardware or operating system present on the host.  This
   can provide valuable information to a would-be intruder. Instead,
   each site should create its own specific login banner, taking care to
   only include necessary information.

   Display a short banner, but don't offer an "inviting" name (e.g.,
   University of XYZ, Student Records System).  Instead, give your site
   name, a short warning that sessions may be monitored, and a
   username/password prompt.  Verify possible legal issues related to
   the text you put into the banner.

   For high-security applications, consider using a "blind" password
   (i.e., give no response to an incoming call until the user has typed
   in a password).  This effectively simulates a dead modem.

4.5.4.6  Dial-out Authentication

   Dial-out users should also be authenticated, particularly since your
   site will have to pay their telephone charges.

   Never allow dial-out from an unauthenticated dial-in call, and
   consider whether you will allow it from an authenticated one.  The
   goal here is to prevent callers using your modem pool as part of a
   chain of logins.  This can be hard to detect, particularly if a
   hacker sets up a path through several hosts on your site.

   At a minimum, don't allow the same modems and phone lines to be used
   for both dial-in and dial-out.  This can be implemented easily if you
   run separate dial-in and dial-out modem pools.

4.5.4.7  Make Your Modem Programming as "Bullet-proof" as Possible

   Be sure modems can't be reprogrammed while they're in service.  At a
   minimum, make sure that three plus signs won't put your dial-in
   modems into command mode!

   Program your modems to reset to your standard configuration at the
   start of each new call.  Failing this, make them reset at the end of
   each call.  This precaution will protect you against accidental
   reprogramming of your modems. Resetting at both the end and the
   beginning of each call will assure an even higher level of confidence
   that a new caller will not inherit a previous caller's session.

   Check that your modems terminate calls cleanly.  When a user logs out
   from an access server, verify that the server hangs up the phone line
   properly.  It is equally important that the server forces logouts
   from whatever sessions were active if the user hangs up unexpectedly.

4.6  Auditing

   This section covers the procedures for collecting data generated by
   network activity, which may be useful in analyzing the security of a
   network and responding to security incidents.

4.6.1  What to Collect

   Audit data should include any attempt to achieve a different security
   level by any person, process, or other entity in the network.  This
   includes login and logout, super user access (or the non-UNIX
   equivalent), ticket generation (for Kerberos, for example), and any
   other change of access or status.  It is especially important to note
   "anonymous" or "guest" access to public servers.

   The actual data to collect will differ for different sites and for
   different types of access changes within a site.  In general, the
   information you want to collect includes: username and hostname, for
   login and logout; previous and new access rights, for a change of
   access rights; and a timestamp.  Of course, there is much more
   information which might be gathered, depending on what the system
   makes available and how much space is available to store that
   information.

   One very important note: do not gather passwords.  This creates an
   enormous potential security breach if the audit records should be
   improperly accessed.  Do not gather incorrect passwords either, as
   they often differ from valid passwords by only a single character or
   transposition.
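
   A sketch of an audit record builder follows.  The field names and
   the JSON encoding are assumptions, not a prescribed format.  The
   function refuses any field whose name suggests a password, so that
   neither correct nor incorrect passwords reach the log by accident.

```python
import json
import time


def audit_record(event: str, username: str, hostname: str,
                 **details: str) -> str:
    """Build one audit line; passwords are refused rather than recorded."""
    if any("password" in key.lower() for key in details):
        raise ValueError("audit records must not contain passwords")
    record = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "event": event,
        "username": username,
        "hostname": hostname,
    }
    record.update(details)
    return json.dumps(record, sort_keys=True)
```

   A change of access rights might be recorded with hypothetical extra
   fields such as previous_rights and new_rights.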

4.6.2  Collection Process

   The collection process should be enacted by the host or resource
   being accessed.  Depending on the importance of the data and the need
   to have it local in instances in which services are being denied,
   data could be kept local to the resource until needed or be
   transmitted to storage after each event.

   There are basically three ways to store audit records: in a
   read/write file on a host, on a write-once/read-many device (e.g., a
   CD-ROM or a specially configured tape drive), or on a write-only
   device (e.g., a line printer).  Each method has advantages and
   disadvantages.

   File system logging is the least resource intensive of the three
   methods and the easiest to configure.  It allows instant access to
   the records for analysis, which may be important if an attack is in
   progress.  File system logging is also the least reliable method.  If
   the logging host has been compromised, the file system is usually the
   first thing to go; an intruder could easily cover up traces of the
   intrusion.

   Collecting audit data on a write-once device is slightly more effort
   to configure than a simple file, but it has the significant advantage
   of greatly increased security because an intruder could not alter the
   data showing that an intrusion has occurred.  The disadvantage of
   this method is the need to maintain a supply of storage media and the
   cost of that media.  Also, the data may not be instantly available.

   Line printer logging is useful in systems where permanent and
   immediate logs are required.  A real time system is an example of
   this, where the exact point of a failure or attack must be recorded.
   A laser printer, or other device which buffers data (e.g., a print
   server), may suffer from lost data if buffers contain the needed data
   at a critical instant.  The disadvantage of, literally, "paper
   trails" is the need to keep the printer fed and the need to scan
   records by hand.  There is also the issue of where to store the,
   potentially, enormous volume of paper which may be generated.

   For each of the logging methods described, there is also the issue of
   securing the path between the device generating the log and the
   actual logging device (i.e., the file server, tape/CD-ROM drive,
   printer).
   If that path is compromised, logging can be stopped or spoofed or
   both.  In an ideal world, the logging device would be directly
   attached by a single, simple, point-to-point cable.  Since that is
   usually impractical, the path should pass through the minimum number
   of networks and routers.  Even if logs can be blocked, spoofing can
   be prevented with cryptographic checksums (it probably isn't
   necessary to encrypt the logs because they should not contain
   sensitive information in the first place).
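
   One way to realize cryptographic checksums on log entries is a
   keyed hash (HMAC) over each entry.  The sketch below assumes a
   secret key shared by the logging host and the log collector, and
   chains each tag to the previous one; the function names are
   hypothetical.

```python
import hashlib
import hmac


def sign_entry(key: bytes, previous_tag: bytes, message: str) -> bytes:
    """Tag a log entry, binding it to the tag of the entry before it."""
    data = previous_tag + message.encode("utf-8")
    return hmac.new(key, data, hashlib.sha256).digest()


def verify_log(key: bytes, entries: list[tuple[str, bytes]]) -> bool:
    """Check every (message, tag) pair in order."""
    previous = b""
    for message, tag in entries:
        expected = sign_entry(key, previous, message)
        if not hmac.compare_digest(expected, tag):
            return False
        previous = tag
    return True
```

   With this chaining, an entry that is forged, altered, reordered, or
   removed from the middle of the log fails verification.  Entries cut
   from the end go unnoticed by this sketch, so blocking of logs still
   has to be detected by other means.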

4.6.3  Collection Load

   Collecting audit data may result in a rapid accumulation of bytes so
   storage availability for this information must be considered in
   advance.  There are a few ways to reduce the required storage space.
   First, data can be compressed, using one of many methods. Or, the
   required space can be minimized by keeping data for a shorter period
   of time with only summaries of that data kept in long-term archives.
   One major drawback to the latter method involves incident response.
   Often, an incident has been ongoing for some period of time when a
   site notices it and begins to investigate. At that point in time,
   it's very helpful to have detailed audit logs available. If these are
   just summaries, there may not be sufficient detail to fully handle
   the incident.
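
   Compression of closed log files can be sketched with the Python
   standard library.  The function name and the ".gz" naming are
   assumptions.  Because compression keeps the full detail, it avoids
   the drawback of summaries described above.

```python
import gzip
import shutil
from pathlib import Path


def compress_log(path: Path) -> Path:
    """Compress a closed log file to <name>.gz and remove the original."""
    target = path.with_name(path.name + ".gz")
    with path.open("rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    path.unlink()
    return target
```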

4.6.4  Handling and Preserving Audit Data

   Audit data should be some of the most carefully secured data at the
   site and in the backups.  If an intruder were to gain access to audit
   logs, the systems themselves, in addition to the data, would be at
   risk.

   Audit data may also become key to the investigation, apprehension,
   and prosecution of the perpetrator of an incident.  For this reason,
   it is advisable to seek the advice of legal counsel when deciding how
   audit data should be treated.  This should happen before an incident
   occurs.

   If a data handling plan is not adequately defined prior to an
   incident, it may mean that there is no recourse in the aftermath of
   an event, and it may create liability resulting from improper
   treatment of the data.

4.6.5  Legal Considerations

   Due to the content of audit data, there are a number of legal
   questions that arise which might need to be addressed by your legal
   counsel. If you collect and save audit data, you need to be prepared
   for consequences resulting both from its existence and its content.

   One area concerns the privacy of individuals.  In certain instances,
   audit data may contain personal information.  Searching through the
   data, even for a routine check of the system's security, could
   represent an invasion of privacy.

   A second area of concern involves knowledge of intrusive behavior
   originating from your site.  If an organization keeps audit data, is
   it responsible for examining it to search for incidents?  If a host
   in one organization is used as a launching point for an attack
   against another organization, can the second organization use the
   audit data of the first organization to prove negligence on the part
   of that organization?

   The above examples are not meant to be comprehensive, but should
   motivate your organization to consider the legal issues involved with
   audit data.

4.7  Securing Backups

   The procedure of creating backups is a classic part of operating a
   computer system.  Within the context of this document, backups are
   addressed as part of the overall security plan of a site.  There are
   several aspects to backups that are important within this context:

   (1)  Make sure your site is creating backups.
   (2)  Make sure your site is using offsite storage for backups. The
        storage site should be carefully selected for both its security
        and its availability.
   (3)  Consider encrypting your backups to provide additional protection
        of the information once it is off-site.  However, be aware that
        you will need a good key management scheme so that you'll be
        able to recover data at any point in the future.  Also, make
        sure you will have access to the necessary decryption programs
        at such time in the future as you need to perform the
        decryption.
   (4)  Don't always assume that your backups are good.  There have been
        many instances of computer security incidents that have gone on
        for long periods of time before a site has noticed the incident.
        In such cases, backups of the affected systems are also tainted.
   (5)  Periodically verify the correctness and completeness of your
        backups.
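
   Item (5) can be sketched as follows: record a manifest of file
   digests when the backup is made, then compare a test restore against
   it.  This is only an illustrative sketch; the function names and the
   choice of SHA-256 are assumptions of this example, not something this
   document prescribes.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file path (relative to root) to its digest."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_backup(manifest: Dict[str, str],
                  restored_root: Path) -> Dict[str, List[str]]:
    """Compare a restored backup with the manifest made at backup time."""
    restored = build_manifest(restored_root)
    return {
        "missing": sorted(set(manifest) - set(restored)),
        "unexpected": sorted(set(restored) - set(manifest)),
        "corrupted": sorted(
            name for name in manifest.keys() & restored.keys()
            if manifest[name] != restored[name]
        ),
    }
```

   A check of this kind only shows that the backup matches what was on
   the system when the backup was made.  It cannot show that the system
   was free of compromise at that time, which is the concern raised in
   item (4).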

5.  Security Incident Handling

   This chapter of the document will supply guidance to be used before,
   during, and after a computer security incident occurs on a host,
   network, site, or multi-site environment.  The operative philosophy
   in the event of a breach of computer security is to react according
   to a plan.  This is true whether the breach is the result of an
   external intruder attack, unintentional damage, a student testing
   some new program to exploit a software vulnerability, or a
   disgruntled employee.  Each of the possible types of events, such as
   those just listed, should be addressed in advance by adequate
   contingency plans.

   Traditional computer security, while quite important in the overall
   site security plan, usually pays little attention to how to actually
   handle an attack once one occurs.  The result is that when an attack
   is in progress, many decisions are made in haste and can be damaging
   to tracking down the source of the incident, collecting evidence to
   be used in prosecution efforts, preparing for the recovery of the
   system, and protecting the valuable data contained on the system.

   One of the most important, but often overlooked, benefits of
   efficient incident handling is an economic one.  Having both
   technical and managerial personnel respond to an incident requires
   considerable resources.  If staff are trained to handle incidents
   efficiently, less staff time is required when one occurs.

   Because networks are connected world-wide, most incidents are not
   restricted to a single site.  Operating system vulnerabilities apply
   (in some cases) to several millions of systems, and many
   vulnerabilities are exploited within the network itself.  Therefore,
   it is vital that all sites with involved parties be informed as soon
   as possible.

   Another benefit is related to public relations.  News about computer
   security incidents tends to be damaging to an organization's stature
   among current or potential clients.  Efficient incident handling
   minimizes the potential for negative exposure.

   A final benefit of efficient incident handling is related to legal
   issues.  It is possible that in the near future organizations may be
   held responsible because one of their nodes was used to launch a
   network attack.  In a similar vein, people who develop patches or
   workarounds may be sued if the patches or workarounds are
   ineffective, resulting in compromise of the systems, or if the
   patches or workarounds themselves damage systems.  Knowing about
   operating system vulnerabilities and patterns of attacks, and then
   taking appropriate measures to counter these potential threats, is
   critical to circumventing possible legal problems.

   The sections in this chapter provide an outline and starting point
   for creating your site's policy for handling security incidents.  The
   sections are:

   (1)  Preparing and planning (what are the goals and objectives in
        handling an incident).
   (2)  Notification (who should be contacted in the case of an
        incident).
          - Local managers and personnel
          - Law enforcement and investigative agencies
          - Computer security incidents handling teams
          - Affected and involved sites
          - Internal communications
          - Public relations and press releases
   (3)  Identifying an incident (is it an incident and how serious is
        it).
   (4)  Handling (what should be done when an incident occurs).
          - Notification (who should be notified about the incident)
          - Protecting evidence and activity logs (what records should be
            kept from before, during, and after the incident)
          - Containment (how can the damage be limited)
          - Eradication (how to eliminate the reasons for the incident)
          - Recovery (how to reestablish service and systems)
          - Follow Up (what actions should be taken after the incident)
   (5)  Aftermath (what are the implications of past incidents).
   (6)  Administrative response to incidents.

   The remainder of this chapter will detail the issues involved in each
   of the important topics listed above, and provide some guidance as to
   what should be included in a site policy for handling incidents.

5.1  Preparing and Planning for Incident Handling

   Part of handling an incident is being prepared to respond to an
   incident before the incident occurs in the first place.  This
   includes establishing a suitable level of protections as explained in
   the preceding chapters.  Doing this should help your site prevent
   incidents as well as limit potential damage resulting from them when
   they do occur.  Protection also includes preparing incident handling
   guidelines as part of a contingency plan for your organization or
   site.  Having written plans eliminates much of the ambiguity which
   occurs during an incident, and will lead to a more appropriate and
   thorough set of responses.  It is vitally important to test the
   proposed plan before an incident occurs through "dry runs".  A team
   might even consider hiring a tiger team to act in parallel with the
   dry run.  (Note: a tiger team is a team of specialists that try to
   penetrate the security of a system.)

   Learning to respond efficiently to an incident is important for a
   number of reasons:

   (1)  Protecting the assets which could be compromised
   (2)  Protecting resources which could be utilized more
        profitably if an incident did not require their services
   (3)  Complying with (government or other) regulations
   (4)  Preventing the use of your systems in attacks against other
        systems (which could cause you to incur legal liability)
   (5)  Minimizing the potential for negative exposure

   As in any set of pre-planned procedures, attention must be paid to a
   set of goals for handling an incident.  These goals will be
   prioritized differently depending on the site.  A specific set of
   objectives can be identified for dealing with incidents:

   (1)  Figure out how it happened.
   (2)  Find out how to avoid further exploitation of the same
        vulnerability.
   (3)  Avoid escalation and further incidents.
   (4)  Assess the impact and damage of the incident.
   (5)  Recover from the incident.
   (6)  Update policies and procedures as needed.
   (7)  Find out who did it (if appropriate and possible).

   Due to the nature of the incident, there might be a conflict between
   analyzing the original source of a problem and restoring systems and
   services.  Overall goals (like assuring the integrity of critical
   systems) might be the reason for not analyzing an incident.  Of
   course, this is an important management decision; but all involved
   parties must be aware that without analysis the same incident may
   happen again.

   It is also important to prioritize the actions to be taken during an
   incident well in advance of the time an incident occurs.  Sometimes
   an incident may be so complex that it is impossible to do everything
   at once to respond to it; priorities are essential.  Although
   priorities will vary from institution to institution, the following
   suggested priorities may serve as a starting point for defining your
   organization's response:

   (1)  Priority one -- protect human life and people's
        safety; human life always has precedence over all
        other considerations.

   (2)  Priority two -- protect classified and/or sensitive
        data.  Prevent exploitation of classified and/or
        sensitive systems, networks or sites.  Inform affected
        classified and/or sensitive systems, networks or sites
        about penetrations that have already occurred.
        (Be aware of regulations by your site or by government.)

   (3)  Priority three -- protect other data, including
        proprietary, scientific, managerial and other data,
        because loss of data is costly in terms of resources.
        Prevent exploitations of other systems, networks or
        sites and inform already affected systems, networks or
        sites about successful penetrations.

   (4)  Priority four -- prevent damage to systems (e.g., loss
        or alteration of system files, damage to disk drives,
        etc.).  Damage to systems can result in costly down
        time and recovery.

   (5)  Priority five -- minimize disruption of computing
        resources (including processes).  It is better in many
        cases to shut a system down or disconnect from a network
        than to risk damage to data or systems. Sites will have
        to evaluate the trade-offs between shutting down and
        disconnecting, and staying up. There may be service
        agreements in place that may require keeping systems
        up even in light of further damage occurring. However,
        the damage and scope of an incident may be so extensive
        that service agreements may have to be over-ridden.
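
   One way to make such a priority list usable during an incident is to
   encode it so that pending actions can be sorted consistently.  The
   sketch below assumes the five suggested priorities above; the class
   and function names are hypothetical, and a site would substitute its
   own ordering.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Priority(IntEnum):
    """The five suggested priorities; a lower value is handled first."""
    HUMAN_SAFETY = 1
    SENSITIVE_DATA = 2
    OTHER_DATA = 3
    SYSTEM_DAMAGE = 4
    DISRUPTION = 5


@dataclass(frozen=True)
class Action:
    description: str
    priority: Priority


def order_actions(actions: List[Action]) -> List[Action]:
    """Sort pending actions by priority; ties keep their original order."""
    return sorted(actions, key=lambda action: action.priority)
```

   Because sorted() is stable, actions that share a priority remain in
   the order in which they were recorded.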

   An important implication for defining priorities is that once human
   life and national security considerations have been addressed, it is
   generally more important to save data than system software and
   hardware.  Although it is undesirable to have any damage or loss
   during an incident, systems can be replaced. However, the loss or
   compromise of data (especially classified or proprietary data) is
   usually not an acceptable outcome under any circumstances.

   Another important concern is the effect on others, beyond the systems
   and networks where the incident occurs.  Within the limits imposed by
   government regulations it is always important to inform affected
   parties as soon as possible.  Due to the legal implications of this
   topic, it should be included in the planned procedures to avoid
   further delays and uncertainties for the administrators.

   Any plan for responding to security incidents should be guided by
   local policies and regulations.  Government and private sites that
   deal with classified material have specific rules that they must
   follow.

   The policies chosen by your site on how it reacts to incidents will
   shape your response.  For example, it may make little sense to create
   mechanisms to monitor and trace intruders if your site does not plan
   to take action against the intruders if they are caught.  Other
   organizations may have policies that affect your plans.  Telephone
   companies often release information about telephone traces only to
   law enforcement agencies.

   Handling incidents can be tedious and require any number of routine
   tasks that could be handled by support personnel.  To free the
   technical staff, it may be helpful to identify support staff who will
   help with tasks such as photocopying and faxing.

5.2  Notification and Points of Contact

   It is important to establish contacts with various personnel before a
   real incident occurs.  Many times, incidents are not real
   emergencies. Indeed, often you will be able to handle the activities
   internally. However, there will also be many times when others
   outside your immediate department will need to be included in the
   incident handling.  These additional contacts include local managers
   and system administrators, administrative contacts for other sites on
   the Internet, and various investigative organizations.  Getting to
   know these contacts before incidents occur will help to make your
   incident handling process more efficient.

   For each type of communication contact, specific "Points of Contact"
   (POC) should be defined.  These may be technical or administrative in
   nature and may include legal or investigative agencies as well as
   service providers and vendors.  When establishing these contacts, it
   is important to decide how much information will be shared with each
   class of contact. It is especially important to define, ahead of
   time, what information will be shared with the users at a site, with
   the public (including the press), and with other sites.

   Settling these issues is especially important for the local person
   responsible for handling the incident, since that is the person
   responsible for the actual notification of others.  A list of
   contacts in each of these categories is an important time saver for
   this person during an incident.  It can be quite difficult to find an
   appropriate person during an incident when many urgent events are
   ongoing.  It is strongly recommended that all relevant telephone
   numbers (also electronic mail addresses and fax numbers) be included
   in the site security policy.  The names and contact information of
   all individuals who will be directly involved in the handling of an
   incident should be placed at the top of this list.
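
   The contact list described above can be kept in a simple structured
   form.  The following sketch is only one possible layout: the field
   names and the category labels are assumptions of this example, and
   any entries a site adds would be its own.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Contact:
    name: str
    category: str          # e.g. "local", "law enforcement", "vendor"
    phone: str
    email: str = ""
    fax: str = ""
    direct_handler: bool = False  # directly involved in handling


def ordered_contacts(contacts: List[Contact]) -> List[Contact]:
    """Place direct incident handlers first, then sort by category."""
    return sorted(
        contacts,
        key=lambda c: (not c.direct_handler, c.category, c.name),
    )
```

   Sorting on "not c.direct_handler" puts the individuals directly
   involved in handling an incident at the top of the list, as
   recommended above.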

5.2.1  Local Managers and Personnel

   When an incident is under way, a major issue is deciding who is in
   charge of coordinating the activity of the multitude of players.  A
   major mistake that can be made is to have a number of people who are
   each working independently, but are not working together.  This will
   only add to the confusion of the event and will probably lead to
   wasted or ineffective effort.

   The single POC may or may not be the person responsible for handling
   the incident.  There are two distinct roles to fill when deciding who
   shall be the POC and who will be the person in charge of the
   incident.  The person in charge of the incident will make decisions
   as to the interpretation of policy applied to the event.  In
   contrast, the POC must coordinate the effort of all the parties
   involved with handling the event.

   The POC must be a person with the technical expertise to successfully
   coordinate the efforts of the system managers and users involved in
   monitoring and reacting to the attack. Care should be taken when
   identifying who this person will be.  It should not necessarily be
   the same person who has administrative responsibility for the
   compromised systems since often such administrators have knowledge
   only sufficient for the day-to-day use of the computers, and lack
   in-depth technical expertise.

   Another important function of the POC is to maintain contact with law
   enforcement and other external agencies to assure that multi-agency
   involvement occurs.  The level of involvement will be determined by
   management decisions as well as legal constraints.

   A single POC should also be the single person in charge of collecting
   evidence, since as a rule of thumb, the more people that touch a
   potential piece of evidence, the greater the possibility that it will
   be inadmissible in court. To ensure that evidence will be acceptable
   to the legal community, collecting evidence should be done following
   predefined procedures in accordance with local laws and legal
   regulations.

   One of the most critical tasks for the POC is the coordination of all
   relevant processes.  Responsibilities may be distributed over the
   whole site, involving multiple independent departments or groups.
   This will require a well-coordinated effort in order to achieve
   overall success.  The situation becomes even more complex if multiple
   sites are involved.  When this happens, rarely will a single POC at
   one site be able to adequately coordinate the handling of the entire
   incident.  Instead, appropriate incident response teams should be
   involved.

   The incident handling process should provide some escalation
   mechanisms.  In order to define such a mechanism, sites will need to
   create an internal classification scheme for incidents. Associated
   with each level of incident will be the appropriate POC and
   procedures.  As an incident is escalated, there may be a change in
   the POC which will need to be communicated to all others involved in
   handling the incident.  When a change in the POC occurs, the old POC
   should brief the new POC on all background information.
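
   An escalation mechanism of this kind can be sketched as a table that
   associates each incident level with a POC and a procedure.  The three
   levels, role names, and procedure names below are hypothetical
   placeholders; each site will need to define its own classification
   scheme.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class IncidentLevel:
    name: str
    poc: str        # role acting as POC at this level (hypothetical)
    procedure: str  # procedure to follow at this level (hypothetical)


# Hypothetical three-level scheme; a real site defines its own levels.
SCHEME: Dict[int, IncidentLevel] = {
    1: IncidentLevel("minor", "help desk lead", "procedure-minor"),
    2: IncidentLevel("serious", "senior system administrator",
                     "procedure-serious"),
    3: IncidentLevel("critical", "site security officer",
                     "procedure-critical"),
}


@dataclass(frozen=True)
class Handover:
    old_poc: str
    new_poc: str
    procedure: str
    briefing_required: bool  # True when the POC changes


def escalate(current: int, target: int,
             scheme: Dict[int, IncidentLevel] = SCHEME) -> Handover:
    """Describe the handover needed to raise an incident's level."""
    if target <= current:
        raise ValueError("target level must be higher than current level")
    old, new = scheme[current], scheme[target]
    return Handover(old.poc, new.poc, new.procedure, old.poc != new.poc)
```

   The briefing_required flag marks the case described above, in which
   the old POC must brief the new POC and the change must be
   communicated to everyone handling the incident.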

   Lastly, users must know how to report suspected incidents. Sites
   should establish reporting procedures that will work both during and
   outside normal working hours. Help desks are often used to receive
   these reports during normal working hours, while beepers and
   telephones can be used for out-of-hours reporting.

5.2.2  Law Enforcement and Investigative Agencies

   In the event of an incident that has legal consequences, it is
   important to establish contact with investigative agencies (e.g., the
   FBI and Secret Service in the U.S.) as soon as possible.  Local law
   enforcement, local security offices, and campus police departments
   should also be informed as appropriate.  This section describes many
   of the issues that will be confronted, but it is acknowledged that
   each organization will have its own local and governmental laws and
   regulations that will impact how they interact with law enforcement
   and investigative agencies. The most important point to make is that
   each site needs to work through these issues.

   A primary reason for determining these points of contact well in
   advance of an incident is that once a major attack is in progress,
   there is little time to call these agencies to determine exactly who
   the correct point of contact is.  Another reason is that it is
   important to cooperate with these agencies in a manner that will
   foster a good working relationship, and that will be in accordance
   with the working procedures of these agencies.  Knowing the working
   procedures in advance, and the expectations of your point of contact,
   is a big step in this direction.  For example, it is important to
   gather evidence that will be admissible in any subsequent legal
   proceedings, and this will require prior knowledge of how to gather
   such evidence.  A final reason for establishing contacts as soon as
   possible is that it is impossible to know the particular agency that
   will assume jurisdiction in any given incident.  Making contacts and
   finding the proper channels early on will make responding to an
   incident go considerably more smoothly.

   If your organization or site has a legal counsel, you need to notify
   this office soon after you learn that an incident is in progress.  At
   a minimum, your legal counsel needs to be involved to protect the
   legal and financial interests of your site or organization.  There
   are many legal and practical issues, a few of which are:


   (1)  Whether your site or organization is willing to risk negative
        publicity or exposure to cooperate with legal prosecution
        efforts.

   (2)  Downstream liability--if you leave a compromised system as is so
        it can be monitored and another computer is damaged because the
        attack originated from your system, your site or organization
        may be liable for damages incurred.

   (3)  Distribution of information--if your site or organization
        distributes information about an attack in which another site or
        organization may be involved, or about a vulnerability in a
        product that may affect the ability to market that product, your
        site or organization may again be liable for any damages
        (including damage to reputation).

   (4)  Liabilities due to monitoring--your site or organization may be
        sued if users at your site or elsewhere discover that your site
        is monitoring account activity without informing users.

   Unfortunately, there are no clear precedents yet on the liabilities
   or responsibilities of organizations involved in a security incident
   or who might be involved in supporting an investigative effort.
   Investigators will often encourage organizations to help trace and
   monitor intruders.  Indeed, most investigators cannot pursue computer
   intrusions without extensive support from the organizations involved.
   However, investigators cannot provide protection from liability
   claims, and these kinds of efforts may drag out for months and may
   take a lot of effort.

   On the other hand, an organization's legal counsel may advise extreme
   caution and suggest that tracing activities be halted and an intruder
   shut out of the system.  This, in itself, may not provide protection
   from liability, and may prevent investigators from identifying the
   perpetrator.

   The balance between supporting investigative activity and limiting
   liability is tricky. You'll need to consider the advice of your legal
   counsel and the damage the intruder is causing (if any) when making
   your decision about what to do during any particular incident.

   Your legal counsel should also be involved in any decision to contact
   investigative agencies when an incident occurs at your site.  The
   decision to coordinate efforts with investigative agencies is most
   properly that of your site or organization.  Involving your legal
   counsel will also foster the multi-level coordination between your
   site and the particular investigative agency involved, which in turn
   results in an efficient division of labor.  Another result is that
   you are likely to obtain guidance that will help you avoid future
   legal mistakes.

   Finally, your legal counsel should evaluate your site's written
   procedures for responding to incidents.  It is essential to obtain a
   "clean bill of health" from a legal perspective before you actually
   carry out these procedures.

   It is vital, when dealing with investigative agencies, to verify that
   the person who calls asking for information is a legitimate
   representative from the agency in question.  Unfortunately, many well
   intentioned people have unknowingly leaked sensitive details about
   incidents, allowed unauthorized people into their systems, etc.,
   because a caller has masqueraded as a representative of a government
   agency. (Note: this word of caution actually applies to all external
   contacts.)

   A similar consideration is using a secure means of communication.
   Because many network attackers can easily re-route electronic mail,
   avoid using electronic mail to communicate with other agencies (as
   well as others dealing with the incident at hand). Non-secured phone
   lines (the phones normally used in the business world) are also
   frequent targets for tapping by network intruders, so be careful!

   There is no one established set of rules for responding to an
   incident when the local government becomes involved.  Normally (in
   the U.S.), except by legal order, no agency can force you to monitor,
   to disconnect from the network, to avoid telephone contact with the
   suspected attackers, etc. Each organization will have a set of local
   and national laws and regulations that must be adhered to when
   handling incidents. It is recommended that each site be familiar with
   those laws and regulations, and identify and get to know the contacts
   for agencies with jurisdiction well in advance of handling an
   incident.

5.2.3  Computer Security Incident Handling Teams

   There are currently a number of Computer Security Incident
   Response teams (CSIRTs) such as the CERT Coordination Center, the
   German DFN-CERT, and other teams around the globe.  Teams exist for
   many major government agencies and large corporations.  If such a
   team is available, notifying it should be of primary consideration
   during the early stages of an incident.  These teams are responsible
   for coordinating computer security incidents over a range of sites
   and larger entities.  Even if the incident is believed to be
   contained within a single site, it is possible that the information
   available through a response team could help in fully resolving the
   incident.

   If it is determined that the breach occurred due to a flaw in the
   system's hardware or software, the vendor (or supplier) and a
   Computer Security Incident Handling team should be notified as soon
   as possible.  This is especially important because many other systems
   are vulnerable, and these vendor and response team organizations can
   help disseminate assistance to other affected sites.

   In setting up a site policy for incident handling, it may be
   desirable to create a subgroup, much like those teams that already
   exist, that will be responsible for handling computer security
   incidents for the site (or organization).  If such a team is created,
   it is essential that communication lines be opened between this team
   and other teams.  Once an incident is under way, it is difficult to
   open a trusted dialogue between other teams if none has existed
   before.

5.2.4  Affected and Involved Sites

   If an incident has an impact on other sites, it is good practice to
   inform them.  It may be obvious from the beginning that the incident
   is not limited to the local site, or it may emerge only after further
   analysis.

   Each site may choose to contact other sites directly, or it can pass
   the information to an appropriate incident response team. It is often
   very difficult to find the responsible POC at remote sites, and the
   incident response team will be able to facilitate contact by making
   use of already established channels.

   The legal and liability issues arising from a security incident will
   differ from site to site.  It is important to define a policy for the
   sharing and logging of information about other sites before an
   incident occurs.

   Information about specific people is especially sensitive, and may be
   subject to privacy laws.  To avoid problems in this area, irrelevant
   information should be deleted and a statement of how to handle the
   remaining information should be included.  A clear statement of how
   this information is to be used is essential.  No one who informs a
   site of a security incident wants to read about it in the public
   press.  Incident response teams are valuable in this respect.  When
   they pass information to responsible POCs, they are able to protect
   the anonymity of the original source. But, be aware that, in many
   cases, the analysis of logs and information at other sites will
   reveal addresses of your site.

   All the problems discussed above should not be taken as reasons not
   to involve other sites.  In fact, the experiences of existing teams
   reveal that most sites informed about security problems are not even
   aware that their site had been compromised.  Without timely
   information, other sites are often unable to take action against
   intruders.

5.2.5  Internal Communications

   It is crucial during a major incident to communicate why certain
   actions are being taken, and how the users (or departments) are
   expected to behave. In particular, it should be made very clear to
   users what they are allowed to say (and not say) to the outside world
   (including other departments). For example, it wouldn't be good for
   an organization if users replied to customers with something like,
   "I'm sorry the systems are down, we've had an intruder and we are
   trying to clean things up." It would be much better if they were
   instructed to respond with a prepared statement like, "I'm sorry our
   systems are unavailable, they are being maintained for better service
   in the future."

   Communications with customers and contract partners should be handled
   in a sensible, but sensitive way. One can prepare for the main issues
   by preparing a checklist. When an incident occurs, the checklist can
   be used with the addition of a sentence or two for the specific
   circumstances of the incident.

   Public relations departments can be very helpful during incidents.
   They should be involved in all planning and can provide well
   constructed responses for use when contact with outside departments
   and organizations is necessary.

5.2.6  Public Relations - Press Releases

   There has been a tremendous growth in the amount of media coverage
   dedicated to computer security incidents in the United States. Such
   press coverage is bound to extend to other countries as the Internet
   continues to grow and expand internationally.  Readers from countries
   where such media attention has not yet occurred can learn from the
   experiences in the U.S. and should be forewarned and prepared.

   One of the most important issues to consider is when, by whom, and
   how much information to release to the general public through the
   press.  There are many factors to weigh in deciding this.  First
   and foremost, if a public relations office exists for the site, it is
   important to use this office as liaison to the press.  The public
   relations office is trained in the type and wording of information
   released, and will help to assure that the image of the site is
   protected during and after the incident (if possible).  A public
   relations office has the advantage that you can communicate candidly
   with it, and it provides a buffer between the constant press
   attention and the need of the POC to maintain control over the
   incident.

   If a public relations office is not available, the information
   released to the press must be carefully considered.  If the
   information is sensitive, it may be advantageous to provide only
   minimal or overview information to the press.  It is quite possible
   that any information provided to the press will be quickly reviewed
   by the perpetrator of the incident.  Also note that misleading the
   press can often backfire and cause more damage than releasing
   sensitive information.

   While it is difficult to determine in advance what level of detail to
   provide to the press, some guidelines to keep in mind are:

   (1)  Keep the technical level of detail low.  Detailed
        information about the incident may provide enough
        information for others to launch similar attacks on
        other sites, or even damage the site's ability to
        prosecute the guilty party once the event is over.

   (2)  Keep the speculation out of press statements.
        Speculation about who is causing the incident or about
        the motives is very likely to be in error and may cause
        an inflamed view of the incident.

   (3)  Work with law enforcement professionals to assure that
        evidence is protected.  If prosecution is involved,
        assure that the evidence collected is not divulged to
        the press.

   (4)  Try not to be forced into a press interview before you are
        prepared.  The popular press is famous for the "2 am"
        interview, where the hope is to catch the interviewee off
        guard and obtain information otherwise not available.

   (5)  Do not allow the press attention to detract from the
        handling of the event.  Always remember that the successful
        closure of an incident is of primary importance.
