RFC 1244

Site Security Handbook

Pages: 101
Obsoleted by: 2196

Part 3 of 4 – Pages 56 to 81

noToC RFC1244 - Page 56 prevText

4.  Types of Security Procedures

4.1  System Security Audits

   Most businesses undergo some sort of annual financial auditing as a
   regular part of their business life.  Security audits are an
   important part of running any computing environment.  Part of the
   security audit should be a review of any policies that concern system
   security, as well as the mechanisms that are put in place to enforce
   them.

   4.1.1   Organize Scheduled Drills

      Although not something that would be done each day or week,
      scheduled drills may be conducted to determine if the procedures
      defined are adequate for the threat to be countered.  If your
      major threat is one of natural disaster, then a drill would be
      conducted to verify your backup and recovery mechanisms.  On the
      other hand, if your greatest threat is from external intruders
      attempting to penetrate your system, a drill might be conducted to
      actually try a penetration to observe the effect of the policies.

      Drills are a valuable way to test that your policies and
      procedures are effective.  On the other hand, drills can be time-
      consuming and disruptive to normal operations.  It is important to
      weigh the benefits of the drills against the possible time loss
      which may be associated with them.

   4.1.2  Test Procedures

      If the choice is made not to use scheduled drills to examine
      your entire security procedure at one time, it is important to
      test individual procedures frequently.  Examine your backup
      procedure to make sure you can recover data from the tapes.  Check
      log files to be sure that information which is supposed to be
      logged to them is being logged to them, etc.

      When a security audit is mandated, great care should be used in
      devising tests of the security policy.  It is important to clearly
      identify what is being tested, how the test will be conducted, and
      results expected from the test.  This should all be documented and
      included in or as an adjunct to the security policy document
      itself.

      It is important to test all aspects of the security policy, both
      procedural and automated, with a particular emphasis on the
      automated mechanisms used to enforce the policy.  Tests should be
      defined to ensure a comprehensive examination of policy features,
      that is, if a test is defined to examine the user logon process,
      it should be explicitly stated that both valid and invalid user
      names and passwords will be used to demonstrate proper operation
      of the logon program.
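
      A test plan of this kind can be written down as a table of
      inputs and expected results.  The following is a minimal sketch
      in Python, not taken from any particular system: check_logon and
      the ACCOUNTS table are hypothetical stand-ins for the site's real
      logon program, and the sample credentials are illustrative only.

```python
import hmac

# Hypothetical account table standing in for the real password database.
# Plaintext passwords appear here only to keep the illustration short.
ACCOUNTS = {"alice": "Tr1p&wire", "bob": "oak;Leaf9"}


def check_logon(user: str, password: str) -> bool:
    """Return True only for a known user presenting the right password."""
    expected = ACCOUNTS.get(user)
    if expected is None:
        return False
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(expected.encode(), password.encode())


# Each case states the input and the result the policy expects.
TEST_CASES = [
    ("alice", "Tr1p&wire", True),      # valid user, valid password
    ("alice", "wrong-pass", False),    # valid user, invalid password
    ("mallory", "Tr1p&wire", False),   # invalid user, another user's password
    ("", "", False),                   # empty credentials
]


def run_logon_tests(logon=check_logon):
    """Return the cases whose actual result differs from the expected one."""
    failures = []
    for user, password, expected in TEST_CASES:
        actual = logon(user, password)
        if actual != expected:
            failures.append((user, expected, actual))
    return failures
```

      A real test would pass the actual logon program in place of
      check_logon; an empty result from run_logon_tests means every
      documented case behaved as the policy expects.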

      Keep in mind that there is a limit to the reasonableness of tests.
      The purpose of testing is to ensure confidence that the security
      policy is being correctly enforced, and not to "prove" the
      absoluteness of the system or policy.  The goal should be to
      obtain some assurance that the reasonable and credible controls
      imposed by your security policy are adequate.

4.2  Account Management Procedures

   Procedures to manage accounts are important in preventing
   unauthorized access to your system.  It is necessary to decide
   several things: Who may have an account on the system?  How long may
   someone have an account without renewing his or her request?  How do
   old accounts get removed from the system?  The answers to all these
   questions should be explicitly set out in the policy.

   In addition to deciding who may use a system, it may be important to
   determine what each user may use the system for (is personal use
   allowed, for example).  If you are connected to an outside network,
   your site or the network management may have rules about what the
   network may be used for.  Therefore, it is important for any security
   policy to define an adequate account management procedure for both
   administrators and users.  Typically, the system administrator would
   be responsible for creating and deleting user accounts and generally
   maintaining overall control of system use.  To some degree, account
   management is also the responsibility of each system user in the
   sense that the user should observe any system messages and events
   that may be indicative of a policy violation.  For example, a message
   at logon that indicates the date and time of the last logon should be
   reported by the user if it indicates an unreasonable time of last
   logon.

4.3  Password Management Procedures

   A policy on password management may be important if your site wishes
   to enforce secure passwords.  These procedures may range from asking
   or forcing users to change their passwords occasionally to actively
   attempting to break users' passwords and then informing the user of
   how easy it was to do.  Another part of password management policy
   covers who may distribute passwords - can users give their passwords
   to other users?

   Section 2.3 discusses some of the policy issues that need to be
   decided for proper password management.  Regardless of the policies,
   password management procedures need to be carefully set up to avoid
   disclosing passwords.  The choice of initial passwords for accounts
   is critical.  In some cases, users may never log in to activate an
   account; thus, the initial password should not be easily guessed.
   Default passwords should never be assigned to
   accounts: always create new passwords for each user.  If there are
   any printed lists of passwords, these should be kept off-line in
   secure locations; better yet, don't list passwords.

   4.3.1  Password Selection

      Perhaps the most vulnerable part of any computer system is the
      account password.  Any computer system, no matter how secure it is
      from network or dial-up attack, Trojan horse programs, and so on,
      can be fully exploited by an intruder if he or she can gain access
      via a poorly chosen password.  It is important to define a good
      set of rules for password selection, and distribute these rules to
      all users.  If possible, the software which sets user passwords
      should be modified to enforce as many of the rules as possible.

      A sample set of guidelines for password selection is shown below:

         - DON'T use your login name in any form (as-is,
           reversed, capitalized, doubled, etc.).

         - DON'T use your first, middle, or last name in any form.

         - DON'T use your spouse's or child's name.

         - DON'T use other information easily obtained about you.
           This includes license plate numbers, telephone numbers,
           social security numbers, the make of your automobile,
           the name of the street you live on, etc.

         - DON'T use a password of all digits, or all the same
           letter.

         - DON'T use a word contained in English or foreign
           language dictionaries, spelling lists, or other
           lists of words.

         - DON'T use a password shorter than six characters.

         - DO use a password with mixed-case alphabetics.

         - DO use a password with non-alphabetic characters (digits
           or punctuation).

         - DO use a password that is easy to remember, so you don't
           have to write it down.

         - DO use a password that you can type quickly, without
           having to look at the keyboard.
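
      Several of the guidelines above are mechanical enough for the
      password-setting software to check.  The sketch below, in Python,
      is one possible shape for such a check and is not drawn from any
      existing program; the function name password_problems and its
      parameters (names, wordlist) are assumptions made for
      illustration.  The guidelines about ease of remembering and
      typing cannot be checked this way and are left to the user.

```python
import string


def password_problems(password, login, names=(), wordlist=()):
    """Return a list of guideline violations; an empty list means none found."""
    problems = []
    lowered = password.lower()

    # Login name as-is, reversed, or doubled (compared case-insensitively).
    login_l = login.lower()
    if login_l and lowered in {login_l, login_l[::-1], login_l * 2}:
        problems.append("based on login name")

    # First, middle, last, spouse's, or child's name, supplied by the caller.
    if any(n and n.lower() in lowered for n in names):
        problems.append("contains a personal name")

    if password.isdigit():
        problems.append("all digits")
    if password.isalpha() and len(set(lowered)) == 1:
        problems.append("all the same letter")
    if lowered in {w.lower() for w in wordlist}:
        problems.append("dictionary word")
    if len(password) < 6:
        problems.append("shorter than six characters")

    # The two DO rules: mixed case, and at least one non-alphabetic character.
    if not (any(c.islower() for c in password)
            and any(c.isupper() for c in password)):
        problems.append("not mixed case")
    if not any(c in string.digits + string.punctuation for c in password):
        problems.append("no digit or punctuation")
    return problems
```

      A caller would reject the new password whenever the returned
      list is non-empty and show the list to the user.  The wordlist
      would in practice be loaded from whatever dictionaries and
      spelling lists the site has available.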

      Methods of selecting a password which adheres to these guidelines
      include:

         - Choose a line or two from a song or poem, and use the
           first letter of each word.

         - Alternate between one consonant and one or two vowels, up
           to seven or eight characters.  This provides nonsense
           words which are usually pronounceable, and thus easily
           remembered.

         - Choose two short words and concatenate them together with
           a punctuation character between them.
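
      The three methods above can be sketched in a few lines of
      Python.  This is an illustration under stated assumptions rather
      than a recommended generator: the function names, the consonant
      and punctuation sets, and the default length are all choices made
      here, and the standard secrets module is used as the source of
      randomness.

```python
import secrets

# Illustrative character sets; a site may prefer different ones.
CONSONANTS = "bcdfghjklmnprstvwz"
VOWELS = "aeiou"
PUNCTUATION = "!#$%&*+-=?@"


def initials(phrase):
    """First letter of each word in a line from a song or poem."""
    return "".join(word[0] for word in phrase.split())


def pronounceable(length=8):
    """Alternate one consonant with one or two vowels, cut to `length`."""
    chars = []
    while len(chars) < length:
        chars.append(secrets.choice(CONSONANTS))
        for _ in range(secrets.choice((1, 2))):
            chars.append(secrets.choice(VOWELS))
    return "".join(chars[:length])


def two_words(first, second):
    """Join two short words with a randomly chosen punctuation character."""
    return first + secrets.choice(PUNCTUATION) + second
```

      For example, initials("Twinkle twinkle little star how I wonder")
      yields "TtlshIw".  Results from any of these methods still
      need to be checked against the selection guidelines, since a
      generated string may happen to be all one case or lack a
      non-alphabetic character.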

      Users should also be told to change their password periodically,
      usually every three to six months.  This makes sure that an
      intruder who has guessed a password will eventually lose access,
      as well as invalidating any list of passwords he/she may have
      obtained.  Many systems enable the system administrator to force
      users to change their passwords after an expiration period; this
      software should be enabled if your system supports it [5, CURRY].
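
      Where the system does not provide expiration itself, the check
      is simple to approximate.  The sketch below assumes the site
      keeps a record of when each password was last changed; the
      180-day limit and the function names are illustrative
      assumptions, not values taken from any particular system.

```python
from datetime import date, timedelta

# Assumed policy value: about six months, the upper end of the range above.
MAX_PASSWORD_AGE = timedelta(days=180)


def password_expired(last_changed, today, max_age=MAX_PASSWORD_AGE):
    """True when the password is older than the allowed maximum age."""
    return today - last_changed > max_age


def accounts_needing_change(last_changed_by_user, today):
    """Return the sorted user names whose passwords have expired."""
    return sorted(user for user, changed in last_changed_by_user.items()
                  if password_expired(changed, today))
```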

      Some systems provide software which forces users to change their
      passwords on a regular basis.  Many of these systems also include
      password generators which provide the user with a set of passwords
      to choose from.  The user is not permitted to make up his or her
      own password.  There are arguments both for and against systems
      such as these.  On the one hand, by using generated passwords,
      users are prevented from selecting insecure passwords.  On the
      other hand, unless the generator is good at making up
      easy-to-remember passwords, users will begin writing them down in
      order to remember them.

   4.3.2  Procedures for Changing Passwords

      How password changes are handled is important to keeping passwords
      secure.  Ideally, users should be able to change their own
      passwords on-line.  (Note that password changing programs are a
      favorite target of intruders.  See section 4.4 on configuration
      management for further information.)

      However, there are exception cases which must be handled
      carefully.  Users may forget passwords and not be able to get onto
      the system.  The standard procedure is to assign the user a new
      password.  Care should be taken to make sure that the real person
      is requesting the change and gets the new password.  One common
      trick used by intruders is to call or send a message to a system
      administrator and request a new password.  Some external form of
      verification should be used before the password is assigned.  At
      some sites, users are required to show up in person with ID.

      There may also be times when many passwords need to be changed.
      If a system is compromised by an intruder, the intruder may be
      able to steal a password file and take it off the system.  Under
      these circumstances, one course of action is to change all
      passwords on the system.  Your site should have procedures for how
      this can be done quickly and efficiently.  What course you choose
      may depend on the urgency of the problem.  In the case of a known
      attack with damage, you may choose to forcibly disable all
      accounts and assign users new passwords before they come back onto
      the system.  In some places, users are sent a message telling them
      that they should change their passwords, perhaps within a certain
      time period.  If the password isn't changed before the time period
      expires, the account is locked.

      Users should be aware of what the standard procedure is for
      passwords when a security event has occurred.  One well-known
      spoof reported by the Computer Emergency Response Team (CERT)
      involved messages sent to users, supposedly from local system
      administrators, requesting them to immediately change their
      password to a new value provided in the message [24].  These
      messages were not from the administrators, but from intruders
      trying to steal accounts.  Users should be warned to immediately
      report any suspicious requests such as this to site
      administrators.

4.4  Configuration Management Procedures

   Configuration management is generally applied to the software
   development process.  However, it is certainly applicable in an
   operational sense as well.  Consider that since many of the
   system level programs are intended to enforce the security policy, it
   is important that these be "known" as correct.  That is, one should
   not allow system level programs (such as the operating system, etc.)
   to be changed arbitrarily.  At the very least, the procedures should
   state who is authorized to make changes to systems, under what
   circumstances, and how the changes should be documented.
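
   One way to keep system level programs "known" as correct is to
   record a digest of each file while it is trusted and compare
   against that record later.  The Python sketch below illustrates the
   idea under stated assumptions: the choice of SHA-256 and the
   function names are made here for illustration and are not
   prescribed by this document, and the baseline itself must be stored
   where an intruder cannot alter it.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 digest of a file's contents, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_baseline(paths):
    """Record the digest of each file while it is known to be correct."""
    return {str(p): file_digest(p) for p in paths}


def changed_files(baseline):
    """Return files that are missing or whose contents differ from baseline."""
    changed = []
    for name, recorded in sorted(baseline.items()):
        path = Path(name)
        if not path.is_file() or file_digest(path) != recorded:
            changed.append(name)
    return changed
```

   Any file reported by changed_files should correspond to a
   documented, authorized change; the baseline is then regenerated as
   part of that change.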

   In some environments, configuration management is also desirable as
   applied to physical configuration of equipment.  Maintaining valid
   and authorized hardware configuration should be given due
   consideration in your security policy.

   4.4.1  Non-Standard Configurations

      Occasionally, it may be beneficial to have a slightly non-standard
      configuration in order to thwart the "standard" attacks used by
      some intruders.  The non-standard parts of the configuration might
      include different password encryption algorithms, different
      configuration file locations, and rewritten or functionally
      limited system commands.

      Non-standard configurations, however, also have their drawbacks.
      By changing the "standard" system, these modifications make
      software maintenance more difficult by requiring extra
      documentation to be written, software modification after operating
      system upgrades, and, usually, someone with special knowledge of
      the changes.

      Because of the drawbacks of non-standard configurations, they are
      often only used in environments with a "firewall" machine (see
      section 3.9.1).  The firewall machine is modified in non-standard
      ways since it is susceptible to attack, while internal systems
      behind the firewall are left in their standard configurations.

5.  Incident Handling

5.1  Overview

   This section of the document will supply some guidance to be applied
   when a computer security event is in progress on a machine, network,
   site, or multi-site environment.  The operative philosophy in the
   event of a breach of computer security, whether it be an external
   intruder attack or a disgruntled employee, is to plan for adverse
   events in advance.  There is no substitute for creating contingency
   plans for the types of events described above.

   Traditional computer security, while quite important in the overall
   site security plan, usually focuses heavily on protecting systems from
   attack, and perhaps monitoring systems to detect attacks.  Little
   attention is usually paid to how to actually handle the attack when
   it occurs.  The result is that when an attack is in progress, many
   decisions are made in haste and can be damaging to tracking down the
   source of the incident, collecting evidence to be used in prosecution
   efforts, preparing for the recovery of the system, and protecting the
   valuable data contained on the system.

   5.1.1  Have a Plan to Follow in Case of an Incident

      Part of handling an incident is being prepared to respond before
      the incident occurs.  This includes establishing a suitable level
      of protections, so that if the incident becomes severe, the damage
      which can occur is limited.  Protection includes preparing
      incident handling guidelines or a contingency response plan for
      your organization or site.  Having written plans eliminates much
      of the ambiguity which occurs during an incident, and will lead to
      a more appropriate and thorough set of responses.  Protection
      also includes preparing a method of notification, so you will know
      who to call and the relevant phone numbers.  Training is part of
      protection as well: it is important, for example, to conduct "dry
      runs," in which your computer security personnel, system
      administrators, and managers simulate handling an incident.

      Learning to respond efficiently to an incident is important for
      numerous reasons.  The most important benefit is directly to human
      beings--preventing loss of human life.  Some computing systems are
      life critical systems, systems on which human life depends (e.g.,
      by controlling some aspect of life-support in a hospital or
      assisting air traffic controllers).

      An important but often overlooked benefit is an economic one.
      Having both technical and managerial personnel respond to an
      incident requires considerable resources, resources which could be
      utilized more profitably if an incident did not require their
      services.  If these personnel are trained to handle an incident
      efficiently, less of their time is required to deal with that
      incident.

      A third benefit is protecting classified, sensitive, or
      proprietary information.  One of the major dangers of a computer
      security incident is that information may be irrecoverable.
      Efficient incident handling minimizes this danger.  When
      classified information is involved, other government regulations
      may apply and must be integrated into any plan for incident
      handling.

      A fourth benefit is related to public relations.  News about
      computer security incidents tends to be damaging to an
      organization's stature among current or potential clients.
      Efficient incident handling minimizes the potential for negative
      exposure.

      A final benefit of efficient incident handling is related to legal
      issues.  It is possible that in the near future organizations may
      be sued because one of their nodes was used to launch a network
      attack.  In a similar vein, people who develop patches or
      workarounds may be sued if the patches or workarounds are
      ineffective, resulting in damage to systems, or if the patches or
      workarounds themselves damage systems.  Knowing about operating
      system vulnerabilities and patterns of attacks and then taking
      appropriate measures is critical to circumventing possible legal
      problems.

   5.1.2  Order of Discussion in this Section Suggests an Order for
          a Plan

      This chapter is arranged such that a list may be generated from
      the Table of Contents to provide a starting point for creating a
      policy for handling ongoing incidents.  The main points to be
      included in a policy for handling incidents are:

         o Overview (what are the goals and objectives in handling the
           incident).
         o Evaluation (how serious is the incident).
         o Notification (who should be notified about the incident).
         o Response (what should the response to the incident be).
         o Legal/Investigative (what are the legal and prosecutorial
           implications of the incident).
         o Documentation Logs (what records should be kept from before,
           during, and after the incident).

      Each of these points is important in an overall plan for handling
      incidents.  The remainder of this chapter will detail the issues
      involved in each of these topics, and provide some guidance as to
      what should be included in a site policy for handling incidents.

   5.1.3  Possible Goals and Incentives for Efficient Incident
          Handling

      As in any set of pre-planned procedures, attention must be placed
      on a set of goals to be obtained in handling an incident.  These
      goals will be placed in order of importance depending on the site,
      but one such set of goals might be:

         Assure integrity of (life) critical systems.
         Maintain and restore data.
         Maintain and restore service.
         Figure out how it happened.
         Avoid escalation and further incidents.
         Avoid negative publicity.
         Find out who did it.
         Punish the attackers.

      It is important to prioritize actions to be taken during an
      incident well in advance of the time an incident occurs.
      Sometimes an incident may be so complex that it is impossible to
      do everything at once to respond to it; priorities are essential.
      Although priorities will vary from institution to institution, the
      following suggested priorities serve as a starting point for
      defining an organization's response:

         o Priority one -- protect human life and people's
           safety; human life always has precedence over all
           other considerations.

         o Priority two -- protect classified and/or sensitive
           data (as regulated by your site or by government
           regulations).

         o Priority three -- protect other data, including
           proprietary, scientific, managerial and other data,
           because loss of data is costly in terms of resources.

         o Priority four -- prevent damage to systems (e.g., loss
           or alteration of system files, damage to disk drives,
           etc.); damage to systems can result in costly down
           time and recovery.

         o Priority five -- minimize disruption of computing
           resources; it is better in many cases to shut a system
           down or disconnect from a network than to risk damage
           to data or systems.

      An important implication for defining priorities is that once
      human life and national security considerations have been
      addressed, it is generally more important to save data than system
      software and hardware.  Although it is undesirable to have any
      damage or loss during an incident, systems can be replaced; the
      loss or compromise of data (especially classified data), however,
      is usually not an acceptable outcome under any circumstances.

      Part of handling an incident is being prepared to respond before
      the incident occurs.  This includes establishing a suitable level
      of protections so that if the incident becomes severe, the damage
      which can occur is limited.  Protection includes preparing
      incident handling guidelines or a contingency response plan for
      your organization or site.  Written plans eliminate much of the
      ambiguity which occurs during an incident, and will lead to a more
      appropriate and thorough set of responses.  Second, part of
      protection is preparing a method of notification so you will know
      who to call and how to contact them.  For example, every member of
      the Department of Energy's CIAC Team carries a card with every
      other team member's work and home phone numbers, as well as pager
      numbers.  Third, your organization or site should establish backup
      procedures for every machine and system.  Having backups
      eliminates much of the threat of even a severe incident, since
      backups preclude serious data loss.  Fourth, you should set up
      secure systems.  This involves eliminating vulnerabilities,
      establishing an effective password policy, and other procedures,
      all of which will be explained later in this document.  Finally,
      conducting training activities is part of protection.  It is
      important, for example, to conduct "dry runs," in which your
      computer security personnel, system administrators, and managers
      simulate handling an incident.

   5.1.4  Local Policies and Regulations Providing Guidance

      Any plan for responding to security incidents should be guided by
      local policies and regulations.  Government and private sites that
      deal with classified material have specific rules that they must
      follow.

      The policies your site makes about how it responds to incidents
      (as discussed in sections 2.4 and 2.5) will shape your response.
      For example, it may make little sense to create mechanisms to
      monitor and trace intruders if your site does not plan to take
      action against the intruders if they are caught.  Other
      organizations may have policies that affect your plans.  Telephone
      companies often release information about telephone traces only to
      law enforcement agencies.

      Section 5.5 also notes that if any legal action is planned, there
      are specific guidelines that must be followed to make sure that
      any information collected can be used as evidence.

5.2  Evaluation

   5.2.1  Is It Real?

      This stage involves determining the exact problem.  Of course
      many, if not most, signs often associated with virus infections,
      system intrusions, etc., are simply anomalies such as hardware
      failures.  To assist in identifying whether there really is an
      incident, it is usually helpful to obtain and use any detection
      software which may be available.  For example, widely available
      software packages can greatly assist someone who thinks there may
      be a virus in a Macintosh computer.  Audit information is also
      extremely useful, especially in determining whether there is a
      network attack.  It is extremely important to obtain a system
      snapshot as soon as one suspects that something is wrong.  Many
      incidents cause a dynamic chain of events to occur, and an initial
      system snapshot may do more good in identifying the problem and
      any source of attack than most other actions which can be taken at
      this stage.  Finally, it is important to start a log book.
      Recording system events, telephone conversations, time stamps,
      etc., can lead to a more rapid and systematic identification of
      the problem, and is the basis for subsequent stages of incident
      handling.
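
      A log book can be as simple as an append-only list of
      timestamped entries.  The Python sketch below shows one possible
      form; the function name log_entry, the entry kinds, and the
      timestamp format are assumptions made for illustration.

```python
from datetime import datetime, timezone


def log_entry(logbook, kind, text, when=None):
    """Append a timestamped entry; existing entries are never edited."""
    when = when or datetime.now(timezone.utc)
    entry = f"{when.isoformat(timespec='seconds')} [{kind}] {text}"
    logbook.append(entry)
    return entry
```

      Recording the time in UTC avoids confusion when several sites in
      different time zones later compare their records.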

      There are certain indications or "symptoms" of an incident which
      deserve special attention:

         o System crashes.
         o New user accounts (e.g., the account RUMPLESTILTSKIN
           has inexplicably been created), or high activity on
           an account that has had virtually no activity for
           months.
         o New files (usually with novel or strange file names,
           such as data.xx or k).
         o Accounting discrepancies (e.g., in a UNIX system you
           might notice that the accounting file called
           /usr/admin/lastlog has shrunk, something that should
           make you very suspicious that there may be an
           intruder).
         o Changes in file lengths or dates (e.g., a user should
           be suspicious if he/she observes that the .EXE files in
           an MS-DOS computer have inexplicably grown
           by over 1800 bytes).
         o Attempts to write to system (e.g., a system manager
           notices that a privileged user in a VMS system is
           attempting to alter RIGHTSLIST.DAT).
         o Data modification or deletion (e.g., files start to
           disappear).
         o Denial of service (e.g., a system manager and all
           other users become locked out of a UNIX system, which
           has been changed to single user mode).
         o Unexplained, poor system performance (e.g., system
           response time becomes unusually slow).
         o Anomalies (e.g., "GOTCHA" is displayed on a display
           terminal or there are frequent unexplained "beeps").
         o Suspicious probes (e.g., there are numerous
           unsuccessful login attempts from another node).
         o Suspicious browsing (e.g., someone becomes a root user
           on a UNIX system and accesses file after file in one
           user's account, then another's).
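
      Several of these symptoms (new files, disappearing files, a
      shrinking accounting file, executables that change length) can be
      noticed by comparing two snapshots of file sizes and dates.  The
      following Python sketch shows the comparison only; the function
      names and the append_only parameter are hypothetical, and the
      snapshot would need to be taken and stored before an incident
      for the comparison to mean anything.

```python
import os


def snapshot(paths):
    """Map each existing path to its (size in bytes, modification time)."""
    result = {}
    for path in paths:
        if os.path.exists(path):
            info = os.stat(path)
            result[path] = (info.st_size, info.st_mtime)
    return result


def compare(before, after, append_only=()):
    """Return human-readable findings between two snapshots."""
    findings = []
    for path in sorted(set(before) | set(after)):
        if path not in before:
            findings.append(f"new file: {path}")
        elif path not in after:
            findings.append(f"file disappeared: {path}")
        else:
            old_size, old_mtime = before[path]
            new_size, new_mtime = after[path]
            # Files that should only ever grow, such as accounting logs.
            if path in append_only and new_size < old_size:
                findings.append(f"log shrank: {path}")
            elif new_size != old_size:
                findings.append(
                    f"size changed by {new_size - old_size:+d} bytes: {path}")
            elif new_mtime != old_mtime:
                findings.append(f"date changed: {path}")
    return findings
```

      A finding from compare is a reason to look further, not a
      conclusion; many changes have ordinary explanations.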

      None of these indications is absolute "proof" that an incident is
      occurring, nor are all of these indications normally observed when
      an incident occurs.  If you observe any of these indications,
      however, it is important to suspect that an incident might be
      occurring, and act accordingly.  There is no formula for
      determining with 100 percent accuracy that an incident is
      occurring (possible exception: when a virus detection package
      indicates that your machine has the nVIR virus and you confirm
      this by examining contents of the nVIR resource in your Macintosh
      computer, you can be very certain that your machine is infected).
      It is best at this point to collaborate with other technical and
      computer security personnel to make a decision as a group about
      whether an incident is occurring.

   5.2.2  Scope

      Along with the identification of the incident is the evaluation of
      the scope and impact of the problem.  It is important to correctly
      identify the boundaries of the incident in order to effectively
      deal with it.  In addition, the impact of an incident will
      determine its priority in allocating resources to deal with the
      event.  Without an indication of the scope and impact of the
      event, it is difficult to determine a correct response.

      In order to identify the scope and impact, a set of criteria
      should be defined which is appropriate to the site and to the type
      of connections available.  Some of the issues are:

         o Is this a multi-site incident?
         o Are many computers at your site affected by this
           incident?
         o Is sensitive information involved?
         o What is the entry point of the incident (network,
           phone line, local terminal, etc.)?
         o Is the press involved?
         o What is the potential damage of the incident?
         o What is the estimated time to close out the incident?
         o What resources could be required
           to handle the incident?

5.3  Possible Types of Notification

   When you have confirmed that an incident is occurring, the
   appropriate personnel must be notified.  Who is notified and how the
   notification is made are very important in keeping the event under
   control from both a technical and an emotional standpoint.

   5.3.1  Explicit

      First of all, any notification to either local or off-site
      personnel must be explicit.  This requires that any statement (be
      it an electronic mail message, phone call, or fax) provides
      information about the incident that is clear, concise, and fully
      qualified.  When you are notifying others that will help you to
      handle an event, a "smoke screen" will only divide the effort and
      create confusion.  If a division of labor is suggested, it is
      helpful to provide information to each section about what is being
      accomplished in other efforts.  This will not only reduce
      duplication of effort, but allow people working on parts of the
      problem to know where to obtain other information that would help
      them resolve a part of the incident.

   5.3.2  Factual

      Another important consideration when communicating about the
      incident is to be factual.  Attempting to hide aspects of the
      incident by providing false or incomplete information may not only
      prevent a successful resolution to the incident, but may even
      worsen the situation.  This is especially true when the press is
      involved.  When an incident severe enough to gain press attention
      is ongoing, it is likely that any false information you provide
      will not be substantiated by other sources.  This will reflect
      badly on the site and may create enough ill-will between the site
      and the press to damage the site's public relations.

   5.3.3  Choice of Language

      The choice of language used when notifying people about the
      incident can have a profound effect on the way that information is
      received.  When you use emotional or inflammatory terms, you raise
      the expectations of damage and negative outcomes of the incident.
      It is important to remain calm both in written and spoken
      notifications.

      Another issue associated with the choice of language is the
      notification to non-technical or off-site personnel.  It is
      important to accurately describe the incident without undue alarm
      or confusing messages.  While it is more difficult to describe the
      incident to a non-technical audience, it is often more important.
      A non-technical description may be required for upper-level
      management, the press, or law enforcement liaisons.  The
      importance of these notifications should not be underestimated;
      they may make the difference between handling the incident
      properly and escalating to some higher level of damage.

   5.3.4  Notification of Individuals

         o Point of Contact (POC) people (Technical, Administrative,
           Response Teams, Investigative, Legal, Vendors, Service
           providers), and which POCs are visible to whom.
         o Wider community (users).
         o Other sites that might be affected.

      Finally, there is the question of who should be notified during
      and after the incident.  There are several classes of individuals
      that need to be considered for notification.  These are the
      technical personnel, administration, appropriate response teams
      (such as CERT or CIAC), law enforcement, vendors, and other
      service providers.  These issues are important for the central
      point of contact, since that is the person responsible for the
      actual notification of others (see section 5.3.6 for further
      information).  A list of people in each of these categories is an
      important time saver for the POC during an incident.  It is much
      more difficult to find an appropriate person during an incident
      when many urgent events are ongoing.

      In addition to the people responsible for handling part of the
      incident, there may be other sites affected by the incident (or
      perhaps simply at risk from the incident).  A wider community of
      users may also benefit from knowledge of the incident.  Often, a
      report of the incident once it is closed out is appropriate for
      publication to the wider user community.

   5.3.5  Public Relations - Press Releases

      One of the most important issues to consider is when, by whom, and
      how much information is released to the general public through the
      press.  There are many factors to weigh in making this decision.
      First and foremost, if a public relations office exists for the
      site, it is important to use this office as liaison to the press.
      The public relations office is trained in the type and wording of
      information released, and will help to assure that the image of
      the site is protected during and after the incident (if possible).
      A public relations office has the advantage that you can
      communicate candidly with it, and it provides a buffer between the
      constant press attention and the need of the POC to maintain
      control over the incident.

      If a public relations office is not available, the information
      released to the press must be carefully considered.  If the
      information is sensitive, it may be advantageous to provide only
      minimal or overview information to the press.  It is quite
      possible that any information provided to the press will be
      quickly reviewed by the perpetrator of the incident.  On the other
      hand, as discussed above, misleading the press can often backfire
      and cause more damage than releasing sensitive information.

      While it is difficult to determine in advance what level of detail
      to provide to the press, some guidelines to keep in mind are:

         o Keep the technical level of detail low.  Detailed
           information about the incident may provide enough
           information for copy-cat events or even damage the
           site's ability to prosecute once the event is over.
         o Keep the speculation out of press statements.
           Speculation about who is causing the incident or about the
           motives is very likely to be in error and may cause
           an inflamed view of the incident.
         o Work with law enforcement professionals to assure that
           evidence is protected.  If prosecution is involved,
           assure that the evidence collected is not divulged to
           the press.
         o Try not to be forced into a press interview before you are
           prepared.  The popular press is famous for the "2am"
           interview, where the hope is to catch the interviewee off
           guard and obtain information otherwise not available.
         o Do not allow the press attention to detract from the
           handling of the event.  Always remember that the successful
           closure of an incident is of primary importance.

   5.3.6  Who Needs to Get Involved?

      There now exist a number of incident response teams (IRTs) such
      as the CERT and the CIAC. (See sections 3.9.7.3.1 and 3.9.7.3.4.)
      Teams exist for many major government agencies and large
      corporations.  If such a team is available for your site, the
      notification of this team should be of primary importance during
      the early stages of an incident.  These teams are responsible for
      coordinating computer security incidents over a range of sites and
      larger entities.  Even if the incident is believed to be contained
      to a single site, it is possible that the information available
      through a response team could help in closing out the incident.

      In setting up a site policy for incident handling, it may be
      desirable to create an incident handling team (IHT), much like
      those teams that already exist, that will be responsible for
      handling computer security incidents for the site (or
      organization).  If such a team is created, it is essential that
      communication lines be opened between this team and other IHTs.
      Once an incident is under way, it is difficult to open a trusted
      dialogue with other IHTs if none has existed before.

5.4  Response

   A major topic still untouched here is how to actually respond to an
   event.  The response to an event will fall into the general
   categories of containment, eradication, recovery, and follow-up.

   Containment

      The purpose of containment is to limit the extent of an attack.
      For example, it is important to limit the spread of a worm attack
      on a network as quickly as possible.  An essential part of
      containment is decision making (i.e., determining whether to shut
      a system down, to disconnect from a network, to monitor system or
      network activity, to set traps, to disable functions such as
      remote file transfer on a UNIX system, etc.).  Sometimes this
      decision is trivial: shut the system down if the system is
      classified or sensitive, or if proprietary information is at risk!
      In other cases, it is worthwhile to risk having some damage to the
      system if keeping the system up might enable you to identify an
      intruder.

      Containment should involve carrying out predetermined
      procedures.  Your organization or site should, for
      example, define acceptable risks in dealing with an incident, and
      should prescribe specific actions and strategies accordingly.
      Finally, notification of cognizant authorities should occur during
      this stage.

   Eradication

      Once an incident has been detected, it is important to first think
      about containing the incident.  Once the incident has been
      contained, it is now time to eradicate the cause.  Software may be
      available to help you in this effort.  For example, eradication
      software is available to eliminate most viruses which infect small
      systems.  If any bogus files have been created, it is time to
      delete them at this point.  In the case of virus infections, it is
      important to clean and reformat any disks containing infected
      files.  Finally, ensure that all backups are clean.  Many systems
      infected with viruses become periodically reinfected simply
      because people do not systematically eradicate the virus from
      backups.

   Recovery

      Once the cause of an incident has been eradicated, the recovery
      phase defines the next stage of action.  The goal of recovery is
      to return the system to normal.  In the case of a network-based
      attack, it is important to install patches for any operating
      system vulnerability which was exploited.

   Follow-up

      One of the most important stages of responding to incidents is
      also the most often omitted---the follow-up stage.  This stage is
      important because it helps those involved in handling the incident
      develop a set of "lessons learned" (see section 6.3) to improve
      future performance in such situations.  This stage also provides
      information which justifies an organization's computer security
      effort to management, and yields information which may be
      essential in legal proceedings.

      The most important element of the follow-up stage is performing a
      postmortem analysis.  Exactly what happened, and at what times?
      How well did the staff involved with the incident perform?  What
      kind of information did the staff need quickly, and how could they
      have gotten that information as soon as possible?  What would the
      staff do differently next time?  A follow-up report is valuable
      because it provides a reference to be used in case of other
      similar incidents.  Creating a formal chronology of events
      (including time stamps) is also important for legal reasons.
      Similarly, it is also important to obtain, as quickly as
      possible, a monetary estimate of the amount of damage the
      incident caused in terms of any loss of software and files,
      hardware damage, and manpower costs to restore altered files,
      reconfigure affected systems, and so forth.  This estimate may
      become the basis for subsequent prosecution activity by the FBI,
      the U.S. Attorney General's Office, etc.
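
      As an illustration only, the monetary estimate described above
      can be tallied with a few lines of Python.  This is a minimal
      sketch: the cost categories follow the text, while the names
      (LaborItem, estimate_damage) and every figure shown are
      hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LaborItem:
    task: str           # e.g. "restore altered files"
    hours: float        # staff hours spent on the task
    hourly_rate: float  # cost per staff hour (assumed figure)


def estimate_damage(labor, hardware_cost=0.0, software_cost=0.0):
    """Return (labor_total, grand_total) for one incident."""
    labor_total = sum(item.hours * item.hourly_rate for item in labor)
    return labor_total, labor_total + hardware_cost + software_cost


# Hypothetical figures, for illustration only.
labor = [
    LaborItem("restore altered files", hours=12, hourly_rate=50.0),
    LaborItem("reconfigure affected systems", hours=8, hourly_rate=50.0),
]
labor_total, total = estimate_damage(labor, software_cost=400.0)
print(f"labor: {labor_total:.2f}  total: {total:.2f}")
```

      Keeping each labor item separate, rather than recording only a
      total, may make the estimate easier to explain if it is later
      examined.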

   5.4.1  What Will You Do?

      o Restore control.
      o Relation to policy.
      o Which level of service is needed?
      o Monitor activity.
      o Constrain or shut down system.

   5.4.2  Consider Designating a "Single Point of Contact"

      When an incident is under way, a major issue is deciding who is in
      charge of coordinating the activity of the multitude of players.
      A major mistake that can be made is to have a number of "points of
      contact" (POC) that are not pulling their efforts together.  This
      will only add to the confusion of the event, and will probably
      lead to wasted or ineffective effort.

      The single point of contact may or may not be the person "in
      charge" of the incident.  There are two distinct roles to fill
      when deciding who shall be the point of contact and the person in
      charge of the incident.  The person in charge will make decisions
      as to the interpretation of policy applied to the event.  The
      responsibility for the handling of the event falls onto this
      person.  In contrast, the point of contact must coordinate the
      effort of all the parties involved with handling the event.

      The point of contact must be a person with the technical expertise
      to successfully coordinate the effort of the system managers and
      users involved in monitoring and reacting to the attack.  Often
      the management structure of a site is such that the administrator
      of a set of resources is not a technically competent person with
      regard to handling the details of the operations of the computers,
      but is ultimately responsible for the use of these resources.

      Another important function of the POC is to maintain contact with
      law enforcement and other external agencies (such as the CIA, DoD,
      U.S.  Army, or others) to assure that multi-agency involvement
      occurs.

      Finally, if legal action in the form of prosecution is involved,
      the POC may be able to speak for the site in court.  The
      alternative is to have multiple witnesses that will be hard to
      coordinate in a legal sense, and will weaken any case against the
      attackers.  A single POC may also be the single person in charge
      of evidence collected, which will keep the number of people
      accounting for evidence to a minimum.  As a rule of thumb, the
      more people that touch a potential piece of evidence, the greater
      the possibility that it will be inadmissible in court.  The
      section below (Legal/Investigative) will provide more details for
      consideration on this topic.

5.5  Legal/Investigative

   5.5.1  Establishing Contacts with Investigative Agencies

      It is important to establish contacts with personnel from
      investigative agencies such as the FBI and Secret Service as soon
      as possible, for several reasons.  Local law enforcement and local
      security offices or campus police organizations should also be
      informed when appropriate.  A primary reason is that once a major
      attack is in progress, there is little time to call various
      personnel in these agencies to determine exactly who the correct
      point of contact is.  Another reason is that it is important to
      cooperate with these agencies in a manner that will foster a good
      working relationship, and that will be in accordance with the
      working procedures of these agencies.  Knowing the working
      procedures in advance and the expectations of your point of
      contact is a big step in this direction.  For example, it is
      important to gather evidence that will be admissible in a court of
      law.  If you don't know in advance how to gather admissible
      evidence, your efforts to collect evidence during an incident are
      likely to be of no value to the investigative agency with which
      you deal.  A final reason for establishing contacts as soon as
      possible is that it is impossible to know the particular agency
      that will assume jurisdiction in any given incident.  Making
      contacts and finding the proper channels early will make
      responding to an incident go considerably more smoothly.

      If your organization or site has a legal counsel, you need to
      notify this office soon after you learn that an incident is in
      progress.  At a minimum, your legal counsel needs to be involved
      to protect the legal and financial interests of your site or
      organization.  There are many legal and practical issues, a few of
      which are:

         1. Whether your site or organization is willing to risk
            negative publicity or exposure to cooperate with legal
            prosecution efforts.

         2. Downstream liability--if you leave a compromised system
            as is so it can be monitored and another computer is damaged
            because the attack originated from your system, your site or
            organization may be liable for damages incurred.

         3. Distribution of information--if your site or organization
            distributes information about an attack in which another
            site or organization may be involved or the vulnerability
            in a product that may affect ability to market that
            product, your site or organization may again be liable
            for any damages (including damage of reputation).

         4. Liabilities due to monitoring--your site or organization
            may be sued if users at your site or elsewhere discover
            that your site is monitoring account activity without
            informing users.

      Unfortunately, there are no clear precedents yet on the
      liabilities or responsibilities of organizations involved in a
      security incident or who might be involved in supporting an
      investigative effort.  Investigators will often encourage
      organizations to help trace and monitor intruders -- indeed, most
      investigators cannot pursue computer intrusions without extensive
      support from the organizations involved.  However, investigators
      cannot provide protection from liability claims, and these kinds
      of efforts may drag out for months and may take lots of effort.

      On the other side, an organization's legal counsel may advise
      extreme caution and suggest that tracing activities be halted and
      an intruder shut out of the system.  This in itself may not
      provide protection from liability, and may prevent investigators
      from identifying anyone.

      The balance between supporting investigative activity and limiting
      liability is tricky; you'll need to consider the advice of your
      counsel and the damage the intruder is causing (if any) in making
      your decision about what to do during any particular incident.

      Your legal counsel should also be involved in any decision to
      contact investigative agencies when an incident occurs at your
      site.  The decision to coordinate efforts with investigative
      agencies is most properly that of your site or organization.
      Involving your legal counsel will also foster the multi-level
      coordination between your site and the particular investigative
      agency involved which in turn results in an efficient division of
      labor.  Another result is that you are likely to obtain guidance
      that will help you avoid future legal mistakes.

      Finally, your legal counsel should evaluate your site's written
      procedures for responding to incidents.  It is essential to obtain
      a "clean bill of health" from a legal perspective before you
      actually carry out these procedures.

   5.5.2  Formal and Informal Legal Procedures

      One of the most important considerations in dealing with
      investigative agencies is verifying that the person who calls
      asking for information is a legitimate representative from the
      agency in question.  Unfortunately, many well-intentioned people
      have unknowingly leaked sensitive information about incidents,
      allowed unauthorized people into their systems, etc., because a
      caller has masqueraded as an FBI or Secret Service agent.  A
      similar consideration is using a secure means of communication.
      Because many network attackers can easily reroute electronic mail,
      avoid using electronic mail to communicate with other agencies (as
      well as others dealing with the incident at hand).  Non-secured
      phone lines (e.g., the phones normally used in the business world)
      are also frequent targets for tapping by network intruders, so be
      careful!

      There is no established set of rules for responding to an incident
      when the U.S. Federal Government becomes involved.  Except by
      court order, no agency can force you to monitor, to disconnect
      from the network, to avoid telephone contact with the suspected
      attackers, etc.  As discussed in section 5.5.1, you should
      consult your legal counsel on the matter, especially before
      taking an action that your organization has never taken.  The
      particular agency involved may ask you to leave an attacked
      machine on and to monitor activity on this machine, for example.
      Your complying with this request will ensure continued cooperation
      of the agency--usually the best route towards finding the source
      of the network attacks and, ultimately, terminating these attacks.
      Additionally, you may need some information or a favor from the
      agency involved in the incident.  You are likely to get what you
      need only if you have been cooperative.  Of particular importance
      is avoiding unnecessary or unauthorized disclosure of information
      about the incident, including any information furnished by the
      agency involved.  The trust between your site and the agency
      hinges upon your ability to avoid compromising the case the agency
      will build; keeping "tight lipped" is imperative.

      Sometimes your needs and the needs of an investigative agency will
      differ.  Your site may want to get back to normal business by
      closing an attack route, but the investigative agency may want you
      to keep this route open.  Similarly, your site may want to close a
      compromised system down to avoid the possibility of negative
      publicity, but again the investigative agency may want you to
      continue monitoring.  When there is such a conflict, there may be
      a complex set of tradeoffs (e.g., interests of your site's
      management, amount of resources you can devote to the problem,
      jurisdictional boundaries, etc.).  An important guiding principle
      is related to what might be called "Internet citizenship" [22,
      IAB89, 23] and its responsibilities.  Your site can shut a system
      down, and this will relieve you of the stress, resource demands,
      and danger of negative exposure.  The attacker, however, is likely
      to simply move on to another system, temporarily leaving others
      blind to the attacker's intention and actions until another path
      of attack can be detected.  Providing that there is no damage to
      your systems and others, the most responsible course of action is
      to cooperate with the participating agency by leaving your
      compromised system on.  This will allow monitoring (and,
      ultimately, the possibility of terminating the source of the
      threat to systems just like yours).  On the other hand, if there
      is damage to computers illegally accessed through your system, the
      choice is more complicated: shutting down the intruder may prevent
      further damage to systems, but might make it impossible to track
      down the intruder.  If there has been damage, the decision about
      whether it is important to leave systems up to catch the intruder
      should involve all the organizations affected.  Further
      complicating the issue of network responsibility is the
      consideration that if you do not cooperate with the agency
      involved, you will be less likely to receive help from that agency
      in the future.

5.6  Documentation Logs

   When you respond to an incident, document all details related to the
   incident.  This will provide valuable information to yourself and
   others as you try to unravel the course of events.  Documenting all
   details will ultimately save you time.  If you don't document every
   relevant phone call, for example, you are likely to forget a good
   portion of information you obtain, requiring you to contact the
   source of information once again.  This wastes your and others'
   time, something you can ill afford.  At the same time, recording
   details will provide evidence for prosecution efforts, providing the
   case moves in this direction.  Documenting an incident also will help
   you perform a final assessment of damage (something your management
   as well as law enforcement officers will want to know), and will
   provide the basis for a follow-up analysis in which you can engage in
   a valuable "lessons learned" exercise.

   During the initial stages of an incident, it is often infeasible to
   determine whether prosecution is viable, so you should document as if
   you are gathering evidence for a court case.  At a minimum, you
   should record:

      o All system events (audit records).
      o All actions you take (time tagged).
      o All phone conversations (including the person with whom
        you talked, the date and time, and the content of the
        conversation).
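
   If an electronic record is kept alongside the written log, the three
   kinds of entries listed above might be captured as follows.  This is
   a sketch under assumptions: the one-JSON-object-per-line file
   layout, the field names, and the function names are illustrative and
   do not come from this handbook.

```python
import json
from datetime import datetime, timezone

KINDS = ("system_event", "action", "phone_call")


def make_entry(kind, summary, who=None, when=None):
    """Build one time-tagged log entry as a plain dictionary."""
    if kind not in KINDS:
        raise ValueError(f"unknown entry kind: {kind}")
    if kind == "phone_call" and not who:
        raise ValueError("phone calls must record the other party")
    when = when or datetime.now(timezone.utc)
    return {
        "time": when.isoformat(),
        "kind": kind,
        "who": who,
        "summary": summary,
    }


def append_entry(path, entry):
    """Append the entry as one JSON line; earlier lines are not rewritten."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry, sort_keys=True) + "\n")
```

   Opening the file in append mode keeps the record chronological.  It
   does not by itself make the record tamper-evident, so the custody
   procedures for the written log still apply.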

   The most straightforward way to maintain documentation is keeping a
   log book.  This allows you to go to a centralized, chronological
   source of information when you need it, instead of requiring you to
   page through individual sheets of paper.  Much of this information is
   potential evidence in a court of law.  Thus, when you initially
   suspect that an incident will result in prosecution or when an
   investigative agency becomes involved, you need to regularly (e.g.,
   every day) turn in photocopied, signed copies of your logbook (as
   well as media you use to record system events) to a document
   custodian who can store these copied pages in a secure place (e.g., a
   safe).  When you submit information for storage, you should in return
   receive a signed, dated receipt from the document custodian.  Failure
   to observe these procedures can result in invalidation of any
   evidence you obtain in a court of law.

6.  Establishing Post-Incident Procedures

6.1  Overview

   In the wake of an incident, several actions should take place.  These
   actions can be summarized as follows:

      1. An inventory should be taken of the systems' assets,
         i.e., a careful examination should determine how the
         system was affected by the incident,

      2. The lessons learned as a result of the incident
         should be included in a revised security plan to
         prevent the incident from re-occurring,

      3. A new risk analysis should be developed in light of the
         incident,

      4. An investigation and prosecution of the individuals
         who caused the incident should commence, if it is
         deemed desirable.

   All four steps should provide feedback to the site security policy
   committee, leading to prompt re-evaluation and amendment of the
   current policy.

6.2  Removing Vulnerabilities

   Removing all vulnerabilities once an incident has occurred is
   difficult.  The key to removing vulnerabilities is knowledge and
   understanding of the breach.  In some cases, it is prudent to remove
   all access or functionality as soon as possible, and then restore
   normal operation in limited stages.  Bear in mind that removing all
   access while an incident is in progress will obviously notify all
   users, including the alleged problem users, that the administrators
   are aware of a problem; this may have a deleterious effect on an
   investigation.  However, allowing an incident to continue may also
   open the likelihood of greater damage, loss, aggravation, or
   liability (civil or criminal).

   If it is determined that the breach occurred due to a flaw in the
   systems' hardware or software, the vendor (or supplier) and the CERT
   should be notified as soon as possible.  Including relevant telephone
   numbers (also electronic mail addresses and fax numbers) in the site
   security policy is strongly recommended.  To aid prompt
   acknowledgment and understanding of the problem, the flaw should be
   described in as much detail as possible, including details about how
   to exploit the flaw.

   As soon as the breach has occurred, the entire system and all its
   components should be considered suspect.  System software is the most
   probable target.  Preparation is key to recovering from a possibly
   tainted system.  This includes checksumming all tapes from the vendor
   using a checksum algorithm which (hopefully) is resistant to
   tampering [10].  (See sections 3.9.4.1, 3.9.4.2.)  Assuming original
   vendor distribution tapes are available, an analysis of all system
   files should commence, and any irregularities should be noted and
   referred to all parties involved in handling the incident.  It can be
   very difficult, in some cases, to decide which backup tapes to
   recover from; consider that the incident may have continued for
   months or years before discovery, and that the suspect may be an
   employee of the site, or otherwise have intimate knowledge or access
   to the systems.  In all cases, the pre-incident preparation will
   determine what recovery is possible.  In the worst case, restoration
   from the original manufacturer's media and a re-installation of the
   systems will be the most prudent solution.
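
   The checksumming and comparison steps described above can be
   sketched in Python as follows.  The handbook does not name an
   algorithm; SHA-256 from the standard hashlib module is used here as
   an assumption, and the function names are illustrative.

```python
import hashlib
import os


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_baseline(root):
    """Map each regular file under root (relative path) to its digest."""
    baseline = {}
    for directory, _subdirs, names in os.walk(root):
        for name in names:
            full = os.path.join(directory, name)
            if os.path.isfile(full) and not os.path.islink(full):
                baseline[os.path.relpath(full, root)] = file_digest(full)
    return baseline


def compare(baseline, current):
    """Return (changed, missing, added) relative paths, each sorted."""
    changed = sorted(
        p for p in baseline if p in current and baseline[p] != current[p]
    )
    missing = sorted(p for p in baseline if p not in current)
    added = sorted(p for p in current if p not in baseline)
    return changed, missing, added
```

   A baseline is only useful if it was built from known-good media
   before the incident and stored where an intruder could not alter it.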

   Review the lessons learned from the incident and always update the
   policy and procedures to reflect changes necessitated by the
   incident.

   6.2.1  Assessing Damage

      Before cleanup can begin, the actual system damage must be
      discerned.  This can be quite time consuming, but should yield
      some insight into the nature of the incident, and aid
      investigation and prosecution.  It is best to compare the system
      against previous backups or original tapes when possible; advance
      preparation is the key.  If the system supports centralized
      logging (most do), go back over the logs and look for
      abnormalities.  If process accounting and connect time accounting
      are enabled, look for patterns of system usage.  To a lesser
      extent, disk usage may shed light on the incident.  Accounting
      can provide much helpful information in an analysis of an
      incident and subsequent prosecution.
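
      One simple way to look for patterns in connect time accounting
      is to compare login hours during the suspect period against
      those seen earlier.  The sketch below assumes the accounting
      data has already been reduced to (user, hour) pairs; that record
      format and the function names are hypothetical, and real
      accounting output will need site-specific parsing first.

```python
from collections import defaultdict


def usual_hours(baseline_sessions):
    """Map each user to the set of login hours (0-23) seen earlier."""
    hours = defaultdict(set)
    for user, hour in baseline_sessions:
        hours[user].add(hour)
    return hours


def unusual_sessions(baseline_sessions, suspect_sessions):
    """Return suspect sessions by unknown users or at hours new to the user."""
    known = usual_hours(baseline_sessions)
    return [
        (user, hour)
        for user, hour in suspect_sessions
        if hour not in known.get(user, set())
    ]
```

      A session flagged this way is only a lead for further review,
      since legitimate users also change their habits.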

   6.2.2  Cleanup

      Once the damage has been assessed, it is necessary to develop a
      plan for system cleanup.  In general, bringing up services in the
      order of demand to allow a minimum of user inconvenience is the
      best practice.  Understand that the proper recovery procedures for
      the system are extremely important and should be specific to the
      site.

      It may be necessary to go back to the original distributed tapes
      and recustomize the system.  To facilitate this worst case
      scenario, a record of the original systems setup and each
      customization change should be kept current with each change to
      the system.

   6.2.3  Follow up

      Once you believe that a system has been restored to a "safe"
      state, it is still possible that holes and even traps could be
      lurking in the system.  In the follow-up stage, the system should
      be monitored for items that may have been missed during the
      cleanup stage.  It would be prudent to utilize some of the tools
      mentioned in section 3.9.8.2 (e.g., COPS) as a start.  Remember,
      these tools don't replace continual system monitoring and good
      systems administration procedures.

   6.2.4  Keep a Security Log

      As discussed in section 5.6, a security log can be most valuable
      during this phase of removing vulnerabilities.  There are two
      considerations here; the first is to keep logs of the procedures
      that have been used to make the system secure again.  This should
      include command procedures (e.g., shell scripts) that can be run
      on a periodic basis to recheck the security.  Second, keep logs of
      important system events.  These can be referenced when trying to
      determine the extent of the damage of a given incident.
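
      As one example of a procedure that can be rerun periodically,
      the sketch below lists regular files that are world-writable or
      have the set-user-ID bit set.  It is written in Python rather
      than as a shell script, assumes a UNIX-style permission model,
      and uses an illustrative function name; which conditions are
      worth checking is a site decision.

```python
import os
import stat


def risky_files(root):
    """Yield (path, reason) for world-writable or setuid regular files."""
    for directory, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(directory, name)
            try:
                mode = os.lstat(path).st_mode
            except OSError:
                continue  # file vanished or cannot be examined; skip it
            if not stat.S_ISREG(mode):
                continue  # ignore symlinks, devices, and other non-files
            if mode & stat.S_IWOTH:
                yield path, "world-writable"
            if mode & stat.S_ISUID:
                yield path, "setuid"
```

      Saving the output of each run in the security log makes it
      possible to compare one run with the next.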

6.3  Capturing Lessons Learned

   6.3.1  Understand the Lesson

      After an incident, it is prudent to write a report describing the
      incident, method of discovery, correction procedure, monitoring
      procedure, and a summary of lessons learned.  This will aid in the
      clear understanding of the problem.  Remember, it is difficult to
      learn from an incident if you don't understand the source.

   6.3.2  Resources

      6.3.2.1  Other Security Devices, Methods

         Security is a dynamic process, not a static one.  Each site
         depends on the nature of the security available to it and on
         the array of devices and methods that help promote security.
         Keeping up with the security area of the computer industry and
         its methods helps a security manager take advantage of the
         latest technology.

      6.3.2.2  Repository of Books, Lists, Information Sources

         Keep an on-site collection of books, lists, information
         sources, etc., as guides and references for securing the
         system.  Keep this collection up to date.  Remember, as systems
         change, so do security methods and problems.

      6.3.2.3  Form a Subgroup

         Form a subgroup of system administration personnel that will be
         the core security staff.  This will allow discussions of
         security problems and multiple views of the site's security
         issues.  This subgroup can also act to develop the site
         security policy and make suggested changes as necessary to
         ensure site security.

6.4  Upgrading Policies and Procedures

   6.4.1  Establish Mechanisms for Updating Policies, Procedures,
          and Tools

      If an incident is based on poor policy and the policy is not
      changed, then one is doomed to repeat the past.  Once a site has
      recovered from an incident, site policy and procedures should be
      reviewed and changed as needed to prevent similar incidents.  Even
      without an incident, it would be prudent to review policies and
      procedures on a regular basis.  Reviews are imperative due to
      today's changing computing environments.

   6.4.2  Problem Reporting Procedures

      A problem reporting procedure should be implemented to describe,
      in detail, the incident and the solutions to the incident.  Each
      incident should be reviewed by the site security subgroup so that
      the incident is understood and changes to the site policy and
      procedures can be suggested.
