NAME
   XNETMON -- an X windows based SNMP network management station
   from SNMP Research.

KEYWORDS
   alarm, control, manager, map, routing, security, status;
   DECnet, ethernet, IP, OSI, ring, star; NMS, SNMP, X;
   DOS, UNIX, VMS; sourcelib.

ABSTRACT
   The XNETMON application implements a powerful network management
   station based on the X window system.  It provides network
   managers with tools for fault management, configuration
   management, performance management, and security management.  It
   can be used successfully with many types of networks, including
   those based on various LAN media and wide area networks.  XNETMON
   has been used with multiprotocol devices, including those which
   support TCP/IP, DECnet, and OSI protocols.

   The fault management tool displays the map of the network
   configuration, with node and link state shown in one of several
   colors to indicate current status.  Alarms may be enabled to
   alert the operator of events occurring in the network.  Events
   are logged to disk.  The configuration management tool may be
   used to edit the network management information base stored in
   the network management station to reflect changes occurring in
   the network.  Other features include graphs and tabular tools for
   use in fault and performance management, and mechanisms by which
   additional variables, such as vendor-specific variables, may be
   added.  The XNETMON application comes complete with source code,
   including a powerful set of portable libraries for generating and
   parsing SNMP messages.  Output data from XNETMON may be
   transferred via flat files for additional report generation by a
   variety of statistical packages.

MECHANISM
   The XNETMON application is based on the Simple Network Management
   Protocol (SNMP).  Polling is performed via the powerful SNMP
   get-next operator and the SNMP get operator.  Trap-directed
   polling is used to regulate the focus and intensity of the
   polling.

CAVEATS
   None.

BUGS
   None known.

LIMITATIONS
   The monitored and managed nodes must implement SNMP over UDP per
   RFC 1098 or must be reachable via a proxy agent.

HARDWARE REQUIRED
   X windows workstation with UDP socket library.  Monochrome is
   acceptable, but color is far superior.

SOFTWARE REQUIRED
   X windows version 11 release 3 or later.

AVAILABILITY
   This is a commercial product available under license from:

      SNMP Research
      P.O. Box 8593
      Knoxville, TN 37996-4800
      (615) 573-1434 (Voice)
      (615) 573-9197 (FAX)
      Attn: Dr. Jeff Case

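As an illustration of the trap-directed polling strategy described
under MECHANISM above, the following sketch shows one way a
management station might concentrate its polling on nodes that have
recently emitted traps.  The sketch is not part of XNETMON; it is an
illustrative Python fragment in which the node names, intervals, and
the poll() and trap_received() hooks are all hypothetical
placeholders for real SNMP operations.

   import time

   BACKGROUND_INTERVAL = 300   # seconds between routine polls of a node
   FOCUSED_INTERVAL = 30       # polling interval while a node is "in focus"
   FOCUS_DURATION = 900        # how long a trap keeps a node in focus

   last_poll = {"gw1": 0.0, "host-a": 0.0, "host-b": 0.0}   # node -> time
   last_trap = {}                                           # node -> time

   def poll(node):
       # Placeholder: a real station would issue SNMP get/get-next requests.
       print("polling", node)

   def trap_received(node):
       # Called by the trap listener when a trap arrives from 'node'.
       last_trap[node] = time.time()

   def poll_cycle():
       now = time.time()
       for node, polled_at in last_poll.items():
           in_focus = now - last_trap.get(node, 0.0) < FOCUS_DURATION
           interval = FOCUSED_INTERVAL if in_focus else BACKGROUND_INTERVAL
           if now - polled_at >= interval:
               poll(node)
               last_poll[node] = now
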
NAME
   xnetperfmon -- a graphical network performance and fault
   management tool from SNMP Research.

KEYWORDS
   manager, status; DECnet, ethernet, IP, OSI, ring, star; NMS,
   SNMP, X; DOS, UNIX, VMS; sourcelib.

ABSTRACT
   Xnetperfmon may be used to plot SNMP variables in a graphical
   display.  These graphs are often useful for fault and performance
   management.  Variables may be plotted as gauges versus time.
   Alternatively, counters may be plotted as delta count/delta time
   (rates).  The user may easily customize the variables to be
   plotted, labels, step size, update interval, and the like.  The
   scales automatically adjust whenever a point to be plotted would
   go off scale.

MECHANISM
   The xnetperfmon application communicates with remote agents or
   proxy agents via the Simple Network Management Protocol (SNMP).

CAVEATS
   All plots for a single invocation of xnetperfmon must be for
   variables provided by a single network management agent.
   However, multiple invocations of xnetperfmon may be active on a
   single display simultaneously, or proxy agents may be used to
   summarize information at a common point.

BUGS
   None known.

LIMITATIONS
   None reported.

HARDWARE REQUIRED
   Systems supporting X windows.

SOFTWARE REQUIRED
   X Version 11 release 2 or later.

AVAILABILITY
   This is a commercial product available under license from:

      SNMP Research
      P.O. Box 8593
      Knoxville, TN 37996-4800
      (615) 573-1434 (Voice)
      (615) 573-9197 (FAX)
      Attn: Dr. Jeff Case

NAME
   xup

KEYWORDS
   status; ping, X; HP.

ABSTRACT
   Xup uses X windows to display the status of an "interesting" set
   of hosts.

MECHANISM
   Xup uses ping to determine host status.

CAVEATS
   Polling for status increases network load.

BUGS
   None known.

LIMITATIONS
   None reported.

HARDWARE REQUIRED
   Runs only on HP series 300 and 800 workstations.

SOFTWARE REQUIRED
   Version 10 of X windows.

AVAILABILITY
   A standard command for the HP 300 and 800 workstations.

Appendix:  Network Management Tutorial

This tutorial is an overview of the practice of network management.
Reading this section is no substitute for knowing your system, and
knowing how it is used.  Do not wait until things break to learn what
they ought to do or how they usually work: a crisis is not the time
for determining how "normal" packet traces should look.  Furthermore,
it takes little imagination to realize that you do not want to be
digging through manuals while your boss is screaming for network
service to be restored.

We assume an acquaintance with the TCP/IP protocol suite and the
Internet architecture.  There are many available references on these
topics, several of which are listed below in Section 7.

Since many of the details of network management are system-specific,
this tutorial is a bit superficial.  There is, however, a more
fundamental problem in prescribing network management practices:
network management is not a well-understood endeavor.  At present,
the cutting edge of network management is the use of distributed
systems to collect and exchange status information, and then to
display the data as histograms or trend lines.  It is not clear that
we know what data should be collected, how to analyze it when we get
it, or how to structure our collection systems.  For now, automated,
real-time control of internets is an aspiration, rather than a
reality.  The communications systems that we field are apparently
more complex than we can comprehend, which no doubt accounts in part
for their frequently surprising behavior.

The first section of this tutorial lists the overall goals and
functions of network management.  It presents several aspects of
network management, including system monitoring, fault detection and
isolation, performance testing, configuration management, and
security.  These discussions are followed by a bibliographic section.
The tutorial closes with some final advice for network managers.

1.  Network Management Goals and Functions

An organization's view of network management goals is shaped by two
factors:

   1.  people in the organization depend on the system working, and

   2.  LANs, routers, lines, and other communications resources have
       costs.

From the organizational vantage point, the ultimate goal of network
management is to provide a consistent, predictable, acceptable level
of service from the available data communications resources.  To
achieve this, a network manager must first be able to perform fault
detection, isolation, and correction.  He must also be able to effect
configuration changes with a minimum of disruption, and measure the
utilization of system components.

People actually managing networks have a different focus.  Network
managers are usually evaluated by the availability and performance of
their communications systems, even though many factors of net
performance are beyond their control.  To them, the most important
requirement of a network management tool is that it allow the
detection and diagnosis of faults before users can call to complain:
users (and bosses) can often be placated just by knowing that a
network problem has been diagnosed.  Another vital network management
function is the ability to collect data that justify current or
future expenditures for the data communications plant and staff.

Following a section on system monitoring, this tutorial addresses
fault, performance, configuration, and security management.  By fault
management, we mean the detection, diagnosis, and correction of
network malfunctions.  Under the subject of performance management,
we include support for predictable, efficient service, as well as
capacity planning and capacity testing.  Configuration management
includes support for orderly configuration changes (usually, system
growth), and local administration of component names and addresses.
Security management includes both protecting system components from
damage and protecting sensitive information from unintentional or
malicious disclosure or corruption.

Readers familiar with the ISO management standards and drafts will
note that we have borrowed heavily from the "OSI Management
Framework," but that we have omitted the "account management"
function.  Account management seems a bit out of place with the other
network management functions.  The logging required by account
management is likely to be done by specialized, dedicated subsystems
that are distinct from other network management components.  Hence,
this tutorial does not cover account management.  Rest assured,
however, that account management, if required, will be adequately
supported and staffed.

For those with a DoD background, security may also seem out of place
as a subtopic of network management.  Without doubt, communications
security is an important issue that should be considered in its own
right.  Because of the requirements of trust for security mechanisms,
security components will probably not be integrated subcomponents of
a larger network management system.  Nevertheless, because a network
manager has a responsibility to protect his system from undue
security risks, this tutorial includes a discussion on internet
security.

2.  System Monitoring

System monitoring is a fundamental aspect of network management.  One
can divide system monitoring into two rough categories: error
detection and baseline monitoring.

System errors, such as misformatted frames or dropped packets, are
not in themselves cause for concern.  Spikes in error rates, however,
should be investigated.  It is sound practice to log error rates over
time, so that increases can be recognized.  Furthermore, logging
error rates as a function of traffic rates can be used to detect
congestion.  Investigate unusual error rates and other anomalies as
they are detected, and keep a notebook to record your discoveries.

Day-to-day traffic should be monitored, so that the operational
baselines of a system and its components can be determined.  As well
as being essential for performance management, baseline determination
and traffic monitoring are the keys to early fault detection.

A preliminary step to developing baseline measurements is
construction of a system map: a graphical representation of the
system components and their interfaces.  Then, measurements of
utilization (i.e., use divided by capacity) are needed.  Problems are
most likely to arise, and system tuning efforts are most likely to be
beneficial, at highly utilized components.

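As a small, hedged illustration of the utilization measurements just
described, the Python sketch below computes link utilization from two
samples of an interface's outbound octet counter.  The counter
values, sample interval, and link speed are invented; in practice the
samples might come from SNMP interface statistics or from a tool such
as netstat.

   def utilization(octets_t0, octets_t1, seconds, link_bps):
       """Fraction of link capacity used between two counter samples."""
       bits_sent = (octets_t1 - octets_t0) * 8.0
       return bits_sent / (seconds * link_bps)

   # Hypothetical samples: 4,500,000 octets in 300 s on a 10 Mb/s ethernet.
   u = utilization(120000000, 124500000, 300, 10000000)
   print("link utilization = %.1f%%" % (u * 100))     # about 1.2%
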
It is worthwhile to develop a source/destination traffic matrix,
including a breakdown of traffic between the local system and other
internet sites.  Both volume and type of traffic should be logged,
along with their evolution over time.  Of particular interest for
systems with diskless workstations is memory swapping and other disk
server access.  For all systems, broadcast traffic and routing
traffic should be monitored.  Sudden increases in the variance of
delay or the volume of routing traffic may indicate thrashing or
other soft failures.

In monitoring a system, long-term averages are of little use.  Hourly
averages are a better indicator of system use.  Variance in
utilization and delay should also be tracked.  Sudden spikes in
variance are tell-tale signs that a problem is looming or exists.
So, too, are trends of increased packet or line errors, broadcasts,
routing traffic, or delay.

3.  Fault Detection and Isolation

When a system fails, caution is in order.  A net manager should make
an attempt to diagnose the cause of a system crash before rebooting.
In many cases, however, a quick diagnosis will not be possible.  For
some high priority applications, restoring at least some level of
service will have priority over fault repair or even complete fault
diagnosis.  This necessitates prior planning.  A net manager must
know the vital applications at his site.  If applications require it,
he must also have a fall-back plan for bringing them online.
Meanwhile, repeated crashes or hardware failures are unambiguous
signs of a problem that must be corrected.

A network manager should prepare for fault diagnosis by becoming
familiar with how diagnostic tools respond to network failure.  In
times of relative peace, a net manager should occasionally unplug the
network connection from an unused workstation and then "debug" the
problem.

When diagnosing a fault or anomaly, it is vital to proceed in an
orderly manner, especially since network faults will usually generate
spurious as well as accurate error messages.  Keep in mind that the
network itself is failing.  Do not place too much trust in anything
obtained remotely.  Furthermore, it is unlikely to be significant
that remote information, such as DNS names or NFS files, cannot be
obtained.

Even spurious messages can be revealing, because they provide clues
to the problem.  From the data at hand, develop working hypotheses
about probable causes of the problems you detect.  Direct your
further data gathering efforts so that the information you get will
either refute or support your hypotheses.

An orderly approach to debugging is facilitated if it is guided by a
model of network behavior.  The following portions of this section
present such a model, along with a procedure for checking network
connectivity.  The section concludes with some hints for diagnosing a
particularly tricky class of connectivity problem.

3.1  A Network Model as a Diagnostic Framework

The point of having a model of how things work is to have a basis for
developing educated guesses about how things go wrong.  The problem
of cascading faults -- faults generating other faults -- makes use of
a conceptual model a virtual necessity.

In general, only problems in a component's hardware or operating
system will generate simultaneous faults in multiple protocol layers.
Otherwise, faults will propagate vertically (up the protocol stack)
or horizontally (between peer-level communications components).
Applying a conceptual model that includes the architectural relations
of network components can help to order an otherwise senseless
barrage of error messages and symptoms.

The model does not have to be formal or complex to bring structure to
debugging efforts.  A useful start is something as simple as the
following:

   1.  Applications programs use transport services: TCP/UDP.
       Before using a service, applications that accept host names
       as parameters must translate the names into IP addresses.
       Translation may be based on a static table lookup (the
       /etc/hosts file in UNIX hosts), the DNS, or yellow pages.
       Nslookup and DiG are tools for monitoring the activities of
       the DNS.

   2.  Transport protocol implementations use IP services.  The
       local IP module makes the initial decision on forwarding.  An
       IP datagram is forwarded directly to the destination host if
       the destination is on the same network as the source.
       Otherwise, the datagram is forwarded to a gateway attached to
       the network.  On BSD hosts, the contents of a host's routing
       table are visible by use of the "netstat" command.*

   3.  IP implementations translate the IP address of a datagram's
       next hop (either the destination host or a gateway) to a
       local network address.  For ethernets, the Address Resolution
       Protocol (ARP) is commonly used for this translation.  On BSD
       systems, an interface's IP address and other configuration
       options can be viewed by use of the "ifconfig" command, while
       the contents of a host's ARP cache may be viewed by use of
       the "arp" command.

   4.  IP implementations in hosts and gateways route datagrams
       based on subnet and net identifiers.  Subnetting is a means
       of allocating and preserving IP address space, and of
       insulating users from the topological details of a
       multi-network campus.  Sites that use subnetting reserve
       portions of the IP address's host identifier to indicate
       particular networks at their campus.  Subnetting is highly
       system-dependent.  The details are a critical, though local,
       issue.  As for routing between separate networks, a variety
       of gateway-to-gateway protocols are used.  Traceroute is a
       useful tool for investigating routing problems.  The tool,
       "query," can be used to examine RIP routing tables.

A neophyte network manager should expand the above description so
that it accurately describes his particular system, and learn the
tools and techniques for monitoring the operations at each of the
above stages.

_________________________
* Initial forwarding may actually be complex and vulnerable to
  multiple points of failure.  For example, when sending an IP
  datagram, 4.3BSD hosts first look for a route to the particular
  host.  If none has been specified for the destination, then a
  search is made for a route to the network of the destination.  If
  this search also fails, then as a last resort, a search is made for
  a route to a "default" gateway.  Routes to hosts, networks, and the
  "default" gateway may be static, loaded at boot time and perhaps
  updated by operator commands.  Alternatively, they may be dynamic,
  loaded from redirects and routing protocol updates.

3.2  A Simple Procedure for Connectivity Check

In this section, we describe a procedure for isolating a TCP/IP
connectivity problem.**  In this procedure, a series of tests
methodically examine connectivity from a host, starting with nearby
resources and working outward.

_________________________
** Thanks to James VanBokkelen, president of FTP Software, for
   sharing with us a portion of a PC/TCP support document, the basis
   for the above connectivity procedure.

The steps in our connectivity-testing procedure are:

   1.  As an initial sanity check, ping your own IP address and the
       loopback address.

   2.  Next, try to ping other IP hosts on the local subnet.  Use
       numeric addresses when starting off, since this eliminates
       the name resolvers and host tables as potential sources of
       problems.  The lack of an answer may indicate either that the
       destination host did not respond to ARP (if it is used on
       your LAN), or that a datagram was forwarded (and hence, the
       destination IP address was resolved to a local media address)
       but that no ICMP Echo Reply was received.  This could
       indicate a length-related problem, or misconfigured IP
       Security.

   3.  If an IP router (gateway) is in the system, ping both its
       near and far-side addresses.

   4.  Make sure that your local host recognizes the gateway as a
       relay.  (For BSD hosts, use netstat.)

   5.  Still using numeric IP addresses, try to ping hosts beyond
       the gateway.  If you get no response, run hopcheck or
       traceroute, if available.  Note whether your packets even go
       to the gateway on their way to the destination.  If not,
       examine the methods used to instruct your host to use this
       gateway to reach the specified destination net (e.g., is the
       default route in place?  Alternatively, are you successfully
       wire-tapping the IGP messages broadcast on the net you are
       attached to?)

       If traceroute is not available, ping, netstat, arp, and a
       knowledge of the IP addresses of all the gateway's interfaces
       can be used to isolate the cause of the problem.  Use netstat
       to determine your next hop to the destination.  Ping that IP
       address to ensure the router is up.  Next, ping the router
       interface on the far subnet.  If the router returns "network
       unreachable" or other errors, investigate the router's
       routing tables and interface status.  If the pings succeed,
       ping the close interface of the succeeding next hop gateway,
       and so on.  Remember that the routing along the outbound and
       return paths may be different.

   6.  Once ping is working with numeric addresses, use ping to try
       to reach a few remote hosts by name.  If ping fails when host
       names are used, check the operation of the local name-mapping
       system (i.e., with nslookup or DiG).  If you want to use
       "shorthand" forms ("myhost" instead of "myhost.mydomain.com"),
       be sure that the alias tables are correctly configured.

   7.  Once basic reachability has been established with ping, try
       some TCP-based applications: FTP and TELNET are supported on
       almost all IP hosts, but FINGER is a simpler protocol.  The
       Berkeley-specific protocols (RSH, RCP, REXEC, and LPR)
       require extra configuration on the server host before they
       can work, and so are poor choices for connectivity testing.

If problems arise in steps 2-7 above, rerunning the tests while
executing a line monitor (e.g., etherfind, netwatch, or tcpdump) can
help to pinpoint the problem.  (A sketch that automates the
numeric-address portion of this procedure appears at the end of this
section.)

The above procedure is sound and useful, especially if little is
known about the cause of the connectivity problem.  It is not,
however, guaranteed to be the shortest path to diagnosis.  In some
cases, a binary search on the problem might be more effective (i.e.,
try a test "in the middle," in a spot where the failure modes are
well defined).  In other cases, available information might so
strongly suggest a particular failure that immediately testing for it
is in order.  This last "approach," which might be called "hunting
and pecking," should be used with caution: chasing one will o' the
wisp after another can waste much time and effort.

Note that line problems are still among the most common causes of
connectivity loss.  Problems in transmission across local media are
outside the scope of this tutorial.  But, if a host or workstation
loses or cannot establish connectivity, check its physical
connection.

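The numeric-address portion of the procedure above lends itself to
automation.  The following Python sketch (not part of the original
procedure) pings an ordered list of targets -- loopback, the local
address, a neighbor on the local subnet, the near and far sides of
the gateway, and a host beyond it -- and reports the first point at
which connectivity fails.  The addresses are placeholders, and the
sketch assumes a BSD-style ping that accepts a "-c" packet-count
option.

   import subprocess

   # Hypothetical addresses; substitute the values for your own system.
   targets = [
       ("loopback",             "127.0.0.1"),
       ("local IP address",     "192.0.2.10"),
       ("host on local subnet", "192.0.2.20"),
       ("gateway, near side",   "192.0.2.1"),
       ("gateway, far side",    "198.51.100.1"),
       ("host beyond gateway",  "198.51.100.25"),
   ]

   def reachable(addr):
       # One echo request; rely on ping's exit status for the verdict.
       result = subprocess.run(["ping", "-c", "1", addr],
                               stdout=subprocess.DEVNULL,
                               stderr=subprocess.DEVNULL)
       return result.returncode == 0

   for label, addr in targets:
       if reachable(addr):
           print("ok      %-22s %s" % (label, addr))
       else:
           print("FAILED  %-22s %s  <- start debugging here" % (label, addr))
           break
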
3.3  Limited Connectivity

An interesting class of problems can result in a particularly
mysterious failure: TELNET or other low-volume TCP connections work,
but large file transfers fail.  FTP transfers may start, but then
hang.  There are several possible culprits in this problem.  The most
likely suspects are IP implementations that cannot fragment or
reassemble datagrams, and TCP implementations that do not perform
dynamic window sizing (a.k.a. Van Jacobson's "Slow Start" algorithm).
Another possibility is mixing incompatible frame formats on an
ethernet.

Even today, some IP implementations in the Internet cannot correctly
handle fragmentation or reassembly.  They will work fine for small
packets, but drop all large packets.

The problem can also be caused by buffer exhaustion at gateways that
connect interfaces of widely differing bandwidth.  Datagrams from a
TCP connection that traverses a bottleneck will experience queue
delays, and will be dropped if buffer resources are depleted.  The
congestion can be made worse if the TCP implementation at the traffic
source does not use the recommended algorithms for computing
retransmission times, since spuriously retransmitted datagrams will
only add to the congestion.*  Fragmentation, even if correctly
implemented, will compound this problem, since processing delays and
congestion will be increased at the bottleneck.

_________________________
* To avoid this problem, TCP implementations on the Internet must use
  "exponential backoff" between successive retransmissions, Karn's
  algorithm for filtering samples used to estimate round-trip delay
  between TCP peers, and Jacobson's algorithm for incorporating
  variance into the "retransmission time-out" computation for TCP
  segments.  See Section 4.2.3.1 of RFC 1122, "Requirements for
  Internet Hosts -- Communication Layers."

Serial Line Internet Protocol (SLIP) links are especially vulnerable
to this and other congestion problems.  SLIP lines are typically an
order of magnitude slower than other gateway interfaces.  Also, SLIP
lines are at times configured with MTUs (Maximum Transfer Unit, the
maximum length of an IP datagram for a particular subnet) as small as
256 bytes, which virtually guarantees fragmentation.  To alleviate
this problem, TCP implementations behind slow lines should advertise
small windows.  Also, if possible, SLIP lines should be configured
with an MTU no less than 576 bytes.  The tradeoff to weigh is whether
interactive traffic will be penalized too severely by transmission
delays of lengthy datagrams from concurrent file transfers.

Misuse of ethernet trailers can also cause the problem of hanging
file transfers.  "Trailers" refers to an ethernet frame format
optionally employed by BSD systems to minimize buffer copying by
system software.  BSD systems with ethernet interfaces can be
configured to send large frames so that their address and control
data are at the end of a frame (hence, a "trailer" instead of a
"header").  After a memory page is allocated and loaded with a
received ethernet frame, the ethernet data will begin at the start of
the memory page boundary.  Hence, the ethernet control information
can be logically stripped from the end merely by adjusting the page's
length field.  By manipulating virtual memory mapping, this same page
(sans ethernet control information) can then be passed to the local
IP module without additional allocation and loading of memory.  The
disadvantage of using trailers is that they are non-standard: many
implementations cannot parse trailers.

The hanging FTP problem will appear if a gateway is not configured to
recognize trailers, but a host or gateway immediately "upstream" on
an ethernet uses them.  Short datagrams will not be formatted with
trailers, and so will be processed correctly.  When the bulk data
transfer starts, however, full-sized frames will be sent, and will
use the trailer format.  To the gateway that receives them, they
appear simply as misformatted frames, and are quietly dropped.  The
solution, obviously, is to ensure that all hosts and gateways on an
ethernet are consistent in their use of trailers.  Note that RFC
1122, "Internet Host Requirements," places very strict restrictions
on the use of trailers.

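The cost of a small MTU is easy to quantify.  The short sketch below,
which assumes a 20-byte IP header and no IP options, counts the
fragments that a full-sized 1500-byte datagram would produce over
links with 256-byte and 576-byte MTUs.

   def fragments(datagram_len, mtu, ip_header=20):
       """Number of IP fragments needed to carry one datagram over a link."""
       if datagram_len <= mtu:
           return 1
       payload = datagram_len - ip_header          # data carried in the datagram
       per_fragment = (mtu - ip_header) // 8 * 8   # all but the last fragment
       return -(-payload // per_fragment)          # i.e., ceiling division

   for mtu in (256, 576):
       print("MTU %4d: a 1500-byte datagram becomes %d fragments"
             % (mtu, fragments(1500, mtu)))

With a 256-byte MTU the datagram becomes seven fragments, each of
which must arrive for reassembly to succeed; raising the MTU to 576
bytes cuts this to three.
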
4.  Performance Testing

Performance management encompasses two rather different activities.
One is passive system monitoring to detect problems and determine
operational baselines.  The goal is to measure system and component
utilization and so locate bottlenecks, since bottlenecks should
receive the focus of performance tuning efforts.  Also, performance
data is usually required by upper level management to justify the
costs of communications systems.  This is essentially identical to
system monitoring, and is addressed at greater length in Section 2,
above.

Another aspect of performance management is active performance
testing and capacity planning.  Some work in this area can be based
on analysis.  For example, a rough estimate of gateway capacity can
be deduced from a simple model given by Charles Hedrick in his
"Introduction to Administration of an Internet-based Local Network,"
which is

   per-packet processing time =
      switching time + (packet size in bits) / (line speed in bps).

Another guideline for capacity planning is that in order to avoid
excessive queuing delays, a system should be sized at about double
its expected load.  In other words, system capacity should be high
enough that utilization is no greater than 50%.

Although there are more sophisticated analytic models of
communications systems than those above, their added complexity does
not usually gain a corresponding accuracy.  Most analytic models of
communications nets require assumptions about traffic load
distributions and service rates that are not merely problematic, but
are patently false.  These errors tend to result in underestimating
queuing delays.  Hence, it is often necessary to actually load and
measure the performance of a real communications system if one is to
get accurate performance predictions.  Obviously, this type of
testing is performed on isolated systems or during off hours.  The
results can be used to evaluate parameter settings or predict
performance during normal operations.

Simulations can be used to supplement the testing of real systems.
To be believable, however, simulations require validation, which, in
turn, requires measurements from a real system.  Whether testing or
simulating a system's performance, actual traffic traces should be
incorporated as input to traffic generators.  The performance of a
communications system will be greatly influenced by its load
characteristics (burstiness, volume, etc.), which are themselves
highly dependent on the applications that are run.

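To make these guidelines concrete, the sketch below applies Hedrick's
approximation and the 50% sizing rule to a hypothetical gateway; the
switching time, packet size, and line speed are invented for
illustration.

   def per_packet_time(switching_s, packet_bytes, line_bps):
       """Hedrick's approximation: switching time plus serialization time."""
       return switching_s + (packet_bytes * 8.0) / line_bps

   # Hypothetical gateway: 2 ms switching time, 576-byte packets, 56 kb/s line.
   t = per_packet_time(0.002, 576, 56000)
   capacity = 1.0 / t                     # packets per second, best case
   print("per-packet time : %.1f ms" % (t * 1000.0))    # roughly 84 ms
   print("raw capacity    : %.1f packets/s" % capacity)
   print("planning target : %.1f packets/s (50%% rule)" % (capacity / 2))
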
When tuning a net, in addition to the usual configuration parameters,
consider the impact of the location of gateways and print and file
servers.  A few rules of thumb can guide the location of shared
system resources.  First, there is the principle of locality: a
system will perform better if most traffic is between nearby
destinations.  The second rule is to avoid creating bottlenecks.  For
example, multiple disk servers may be called for to support a large
number of workstations.  Furthermore, to avoid LAN and disk server
congestion, workstations should be configured with enough memory to
avoid frequent swapping.

As a final note on performance management, proceed cautiously if your
ethernet interface allows you to customize its collision recovery
algorithm.  This is almost always a bad idea.  The best that it can
accomplish is to give a few favored hosts a disproportionate share of
the ethernet bandwidth, perhaps at the cost of a reduction in total
system throughput.  Worse, it is possible that differing collision
recovery algorithms may exhibit a self-synchronizing behavior, so
that excess collisions are generated.

5.  Configuration Management

Configuration management is the setting, collecting, and storing of
the state and parameters of network resources.  It overlaps all other
network management functions.  Hence, some aspects of configuration
management have already been addressed (e.g., tuning for
performance).  In this section, we will focus on configuration
management activities needed to "hook up" a net or campus to a larger
internet.  We will not, of course, include specific details on
installing or maintaining internetted communications systems.  We
will, however, skim over some of the TCP/IP configuration highlights.

Configuration management includes "name management" -- the control
and allocation of system names and addresses, and the translation
between names and addresses.  Name-to-address translation is
performed by "name servers."  We conclude this section with a few
strictures on the simultaneous use of two automated name servers, the
Domain Name System (DNS) and Yellow Pages (YP).

5.1  Required Host Configuration Data for TCP/IP Internets

In a TCP/IP internet, each host needs several items of information
for internet communications.  Some will be host-specific, while other
information will be common to all hosts on a subnet.  In a soon to be
published RFC document,* R. Droms identifies the following
configuration data required by internet hosts:

_________________________
* Draft "Dynamic Configuration of Internet Hosts."

   +  An IP address, a host-specific value that can be hard-coded or
      obtained via BOOTP, the Reverse Address Resolution Protocol
      (RARP), or Dynamic RARP (DRARP).

   +  Subnet properties, such as the subnet mask and the Maximum
      Transmission Unit (MTU); obviously, these values are not
      host-specific.

   +  Addresses of "entry" gateways to the internet.  Addresses of
      default gateways are usually hard-coded; though the ICMP
      "redirect" message can be used to refine a host's routing
      tables, there is currently no dynamic TCP/IP mechanism or
      protocol for a host to locate a gateway.  An IETF working
      group is busy on this problem.

   +  For hosts in internets using the Domain Name System (DNS) for
      name-to-address translation, the location of a local DNS
      server is needed; this information is not host-specific, and
      is usually hard-coded.

   +  Host name (domain name, for hosts using DNS); obviously
      host-specific; either hard-coded or obtained in a boot
      procedure.

   +  For diskless hosts, various boot services.  BOOTP is the
      standard Internet protocol for downloading boot configuration
      information.  The Trivial File Transfer Protocol (TFTP) is
      typically used for downloading boot images.  Sun computers use
      the "bootparams" RPC mechanism for downloading initial
      configuration data to a host.

There are ongoing developments, most notably the work of the Dynamic
Host Configuration Working Group of the IETF, to support dynamic,
automatic gathering of the above data.  In the meantime, most systems
will rely on hand-crafted configuration files.

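The subnet mask in the list above is what drives the forwarding
decision described in Section 3.1: destinations covered by the local
interface's network are delivered directly, everything else goes to a
gateway.  The following sketch shows that decision with Python's
standard ipaddress module and invented addresses.

   import ipaddress

   # Hypothetical host configuration: address, mask, and default gateway.
   local_if = ipaddress.ip_interface("192.0.2.10/24")   # mask 255.255.255.0
   gateway = ipaddress.ip_address("192.0.2.1")

   def next_hop(destination):
       dest = ipaddress.ip_address(destination)
       if dest in local_if.network:      # same (sub)net: deliver directly
           return dest
       return gateway                    # otherwise, use the default gateway

   print(next_hop("192.0.2.77"))         # 192.0.2.77  (direct delivery)
   print(next_hop("198.51.100.9"))       # 192.0.2.1   (via the gateway)
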
5.3  Connecting to THE Internet

The original TCP/IP Internet (spelled with an upper-case "I") is
still active, and still growing.  An interesting aspect of the
Internet is that it spans many independently administered systems.
Connection to the Internet requires: a registered network number, for
use in IP addresses; a registered autonomous system number (ASN), for
use in internet routing; and a registered domain name.  Fielding a
primary and backup DNS server is a condition for registering a domain
name.

The Defense Data Network (DDN) Network Information Center (NIC) is
responsible for registering network numbers, autonomous system
numbers, and domain names.  Regional nets will have their own
policies and requirements for Internet connections, but all use the
NIC for this registration service.  Contact the NIC for further
information, at:

   DDN Network Information Center
   SRI International, Room EJ291
   333 Ravenswood Avenue
   Menlo Park, CA 94025

   Email:  HOSTMASTER@NIC.DDN.MIL
   Phone:  1-415-859-3695
           1-800-235-3155 (toll-free hotline)

5.4  YP and DNS: Dueling Name Servers

The Domain Name System (DNS) provides name service: it translates
host names into IP addresses (this mapping is also called
"resolution").  Two widespread DNS implementations are "bind" and
"named."  The Sun Yellow Pages (YP) system can be configured to
provide an identical service, by providing remote, keyed access to
the "hosts.byname" map.  Unfortunately, if both DNS and the YP
hosts.byname map are installed, they can interact in disruptive ways.

The problem has been noted in systems in which DNS is used as a
fallback, to resolve hostnames that YP cannot.  If DNS is slow in
responding, the timeout in the program ypserv may expire, which
triggers a repeated request.  This can result in disaster if DNS was
initially slow because of congestion: the slower things get, the more
requests are generated, which slows things even more.  A symptom of
this problem is that failures by the DNS server or network will
trigger numerous requests to DNS.  Reportedly, the bug in YP that
results in the avalanche of DNS requests has been repaired in SunOS
4.1.

The problem, however, is more fundamental than an implementation
error.  The YP map hosts.byname and the DNS contain the same class of
information.  One can get an answer to the same query from each
system.  These answers may well be different: there is no mechanism
to maintain consistency between the systems.  More critical, however,
is the lack of a mechanism or procedure to establish which system is
authoritative.  Hence, running the DNS and YP name services in
parallel is pointless.  If the systems stay consistent, then only one
is needed.  If they differ, there is no way to choose which is
correct.

The YP hosts.byname service and DNS are comparable, but incompatible.
If possible, a site should not run both services.  Because of
Internet policy, sites with Internet connections MUST use the DNS.
If YP is also used, then it should be configured with its YP-to-DNS
fallback disabled.

Hacking a system so that it uses DNS rather than the YP hosts.byname
map is not trivial, and should not be attempted by novices.  The
approach is to rebuild the shared C link-library, so that system
calls to gethostbyname() and gethostbyaddr() will use DNS rather than
YP.  To complete the change, programs that do not dynamically link
the shared C library (rcp, arp, etc.) must also be rebuilt.  Modified
shared C libraries for Sun 3s and Sun 4s are available via anonymous
FTP from host uunet.uu.net, in the sun-fixes directory.  Note that
use of DNS routines rather than YP for general name resolution is not
a supported SunOS feature at this time.

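Whichever name service a site settles on, the interface that
applications actually see is the gethostbyname()/gethostbyaddr() pair
mentioned above.  A quick way to check what a host's resolver is
really returning is to exercise that interface directly, as in the
Python sketch below (the host name is a placeholder); comparing its
answer with what nslookup or DiG reports for the same name is a
simple consistency check between YP and the DNS.

   import socket

   name = "myhost.mydomain.com"     # placeholder; use a name from your site

   try:
       addr = socket.gethostbyname(name)                       # forward lookup
       print("%s resolves to %s" % (name, addr))
       official, aliases, addrs = socket.gethostbyaddr(addr)   # reverse lookup
       print("%s maps back to %s" % (addr, official))
   except socket.error as err:
       print("lookup failed: %s" % err)
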
6.  Internet Security

The guidelines and advice in this section pertain to enhancing the
protection of data that are merely "sensitive."  By themselves, these
measures are insufficient for protecting "classified" data.
Implementing the policies required to protect classified data is
subject to stringent, formal review procedures, and is regulated by
agencies such as the Defense Investigative Service (DIS) and the
National Security Agency (NSA).

A network manager must realize that he is responsible for protecting
his system and its users.  Furthermore, though the Internet may
appear to be a grand example of a cooperative joint enterprise,
recent incidents have made it clear that not all Internet denizens
are benign.

A network manager should be aware that the network services he runs
have a large impact on the security risks to which his system is
exposed.  The prudent network manager will be very careful as to what
services his site provides to the rest of the Internet, and what
access restrictions are enforced.  For example, the protocol "finger"
may provide more information about a user than should be given to the
world at large.  Worse, most implementations of the protocol TFTP
give access to all world-readable files.

This section highlights several basic security considerations for
Internet sites.  It then lists several sources of information and
advice on improving the security of systems connected to the
Internet.

6.1  Basic Internet Security

Two major Internet security threats are denial of service and
unauthorized access.

Denial of service threats often take the form of protocol spoofers or
other malicious traffic generators.  These problems can be detected
through system monitoring logs.  If an attack is suspected,
immediately contact your regional net office (e.g., SURANET, MILNET).
In addition, DDN users should contact the SCC, while other Internet
users should contact CERT (see below).  A cogent description of your
system's symptoms will be needed.  At your own site, be prepared to
isolate the problems (e.g., by limiting disk space available to the
message queue of a mail system under attack).  As a last resort,
coping with an attack may require taking down an Internet connection.
It is better, however, not to be too quick to quarantine your site,
since information for coping with the attack may come via the
Internet.

Unauthorized access is a potentially more ominous security threat.
The main avenues are attacks against passwords and attacks against
privileged system processes.

An appallingly common means of gaining entry to systems is by use of
the initial passwords to root, sysdiag, and other management accounts
that systems are shipped with.  Only slightly less vulnerable are
common or trivial passwords, since these are readily subverted by
dictionary attacks.*  Obvious steps can reduce the risk of password
attacks: passwords should be short-lived, at least eight characters
long, with a mix of upper and lower case, and preferably random.  The
distasteful aspect of memorizing a random string can be alleviated if
the password is pronounceable.

_________________________
* Exotic fantasy creatures and women's names are well represented in
  most password dictionaries.

Improving passwords does not remove all risks.  Passwords transmitted
over an ethernet are visible to all attached systems.  Furthermore,
gateways have the potential to intercept passwords used by any FTP or
TELNET connections that traverse them.  It is a bad idea for the root
account to be accessed by FTP or TELNET if the connections must cross
untrusted elements.

Attacks against system processes are another avenue of unauthorized
access.  The principle is that by subverting a system process, the
attacker can then gain its access privileges.

One approach to reducing this risk is to make system programs harder
to subvert.  For example, the widespread attack in November 1988 by a
self-replicating computer program ("worm," analogous to a tapeworm)
subverted the "fingerd" process, by loading an intrusive bootstrap
program (known variously as a "grappling hook" or "vector" program),
and then corrupting the stack space so that a subroutine's return
address was overwritten with the address of the bootstrap program.**
The security hole in fingerd consisted of an input routine that did
not have a length check.  Security fixes to "fingerd" include the use
of a revised input routine.

_________________________
** An early account of the Internet Worm incident of November 1988
   is given by Eugene Spafford in the January 89 issue of "Computer
   Communications Review."  Several other articles on the worm
   incident are in the June 89 issue of the "Communications of the
   ACM."

A more general protection is to apply the principle of "least
privilege."  Where possible, system routines should run under
separate user IDs, and should have no more privilege than is
necessary for them to function.

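The password advice above can be illustrated with a small generator
that alternates consonants and vowels to keep the result roughly
pronounceable, mixes case, and appends digits.  It is a toy sketch,
not a vetted password generator, and local security policy should
govern what is actually used.

   import secrets

   CONSONANTS = "bcdfghjklmnprstvwz"
   VOWELS = "aeiou"

   def pronounceable_password(syllables=4):
       """Alternate consonants and vowels, mix case, and append two digits."""
       chars = []
       for _ in range(syllables):
           c = secrets.choice(CONSONANTS)
           chars.append(c.upper() if secrets.randbelow(2) else c)
           chars.append(secrets.choice(VOWELS))
       chars.append(str(secrets.randbelow(10)))
       chars.append(str(secrets.randbelow(10)))
       return "".join(chars)

   print(pronounceable_password())      # e.g., something like "RaKoTiNu47"
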
To further protect against attacks on system processes, system
managers should regularly check their system programs to ensure that
they have not been tampered with or modified in any way.  Checksums
should be used for this purpose.  Using the operating system to check
a file's last date of modification is insufficient, since the date
itself can be compromised.  Finally, to avoid the unauthorized
replacement of system code, care should be exercised in assigning
protection to its directory paths.

Some system programs actually have "trap doors" that facilitate
subversion.  A trap door is the epitome of an undocumented feature:
it is a hidden capability of a system program that allows a
knowledgeable person to gain access to a system.  The Internet Worm
exploited what was essentially a trap door in the BSD sendmail
program.

Ensuring against trap doors in software as complex as sendmail may be
infeasible.  In an ideal world, the BSD sendmail program would be
replaced by an entire mail subsystem (i.e., perhaps including mail
user agents, mail transfer agents, and text preparation and filing
programs).  Any site using sendmail should at least obtain the less
vulnerable, toughened distribution from ucbarpa.berkeley.edu, in file
~ftp/4.3/sendmail.tar.Z.  Sites running SunOS should note that the
4.0.3 release closed the security holes exploited by the Internet
Worm.  Fixes for a more obscure security hole in SunOS are available
from host uunet.uu.net in ~ftp/sun-fixes; these improvements have
been incorporated in SunOS 4.1.

Sendmail has problems other than size and complexity.  Its use of
root privileges, its approach to alias expansion, and several other
design characteristics present potential avenues of attack.  For UNIX
sites, an alternative mail server to consider is MMDF, which is now
at version 2.  MMDF is distributed as part of the SCO UNIX
distribution, and is also available in the user contributed portion
of 4.3BSD.  Though free, MMDF is licensed, and resale is restricted.
Sites running MMDF should be on the mmdf email list; requests to join
this list should be sent to: mmdf2-request@relay.cs.net.

Programs that masquerade as legitimate system code but which contain
trap doors or other aids to unauthorized access are known as trojan
horses.  Computer "viruses," intrusive software that infects
seemingly innocent programs and propagates when the infected programs
are executed or copied, are a special case of trojan horse
programs.*

_________________________
* Virus attacks have been seen against PCs, but as yet have rarely
  been directed against UNIX systems.

To guard against trojan horse attacks, be wary of programs downloaded
from remote sources.  At minimum, do not download executables from
any but the most trusted sources.  Also, as noted above, to avoid
proliferation of "infected" software, checksums should be computed,
recorded, and periodically verified.

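A minimal sketch of the checksum discipline just recommended, using
Python's standard hashlib as a stand-in for whatever checksum program
is available locally: the list of watched programs is hypothetical,
and the recorded checksums should be kept offline or on read-only
media so that they cannot be altered along with the programs they
protect.

   import hashlib

   def checksum(path):
       """Hex digest of a file's contents."""
       digest = hashlib.sha256()
       with open(path, "rb") as f:
           for block in iter(lambda: f.read(65536), b""):
               digest.update(block)
       return digest.hexdigest()

   # Hypothetical list of system programs to watch.
   WATCHED = ["/bin/login", "/usr/lib/sendmail", "/usr/etc/in.fingerd"]

   def record(outfile="checksums.txt"):
       with open(outfile, "w") as out:
           for path in WATCHED:
               out.write("%s %s\n" % (checksum(path), path))

   def verify(infile="checksums.txt"):
       for line in open(infile):
           recorded, path = line.split()
           if checksum(path) != recorded:
               print("WARNING: %s has been modified" % path)
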
6.2  Security Information Clearing-Houses

The Internet community can get security assistance from the Computer
Emergency Response Team (CERT), established by DARPA in November
1988.  The Coordination Center for the CERT (CERT/CC) is located at
the Software Engineering Institute at Carnegie Mellon University.
The CERT is intended to respond to computer security threats such as
the November '88 worm attack that invaded many defense and research
computers.  Consult RFC 1135 (Reynolds, J., "The Helminthiasis of the
Internet", USC/ISI, December 1989) for further information.

CERT assists Internet sites in response to security attacks or other
emergency situations.  It can immediately tap experts to diagnose and
solve the problems, as well as establish and maintain communications
with the affected computer users and with government authorities as
appropriate.  Specific responses will be taken in accordance with the
nature of the problem and the magnitude of the threat.

CERT is also an information clearing-house for the identification and
repair of security vulnerabilities, informal assessments of existing
systems in the research community, improvement to emergency response
capability, and both vendor and user security awareness.  This
security information is distributed by periodic bulletins, and is
posted to the USENET news group comp.security.announce.  In addition,
the security advisories issued by CERT, as well as other useful
security-related information, are available via anonymous FTP from
cert.sei.cmu.edu.

For immediate response to attacks or incidents, CERT mans a 24-hour
hotline at (412) 268-7090.  To subscribe to CERT's security
announcement bulletin, or for further information, contact:

   CERT
   Software Engineering Institute
   Carnegie Mellon University
   Pittsburgh, PA 15213-3890
   (412) 268-7080
   cert@cert.sei.cmu.edu

For DDN users, the Security Coordination Center (SCC) serves a
function similar to CERT.  The SCC is the DDN's clearing-house for
host/user security problems and fixes, and works with the DDN Network
Security Officer.  The SCC also distributes the DDN Security
Bulletin, which communicates information on network and host security
exposures, fixes, and concerns to security and management personnel
at DDN facilities.  It is available online, via kermit or anonymous
FTP, from nic.ddn.mil, in SCC:DDN-SECURITY-yy-nn.TXT (where "yy" is
the year and "nn" is the bulletin number).  The SCC provides
immediate assistance with DDN-related host security problems; call
(800) 235-3155 (6:00 a.m. to 5:00 p.m. Pacific Time) or send e-mail
to SCC@NIC.DDN.MIL.  For 24 hour coverage, call the MILNET Trouble
Desk at (800) 451-7413 or AUTOVON 231-1713.

The CERT/CC and the SCC communicate on a regular basis and support
each other when problems occur.  These two organizations are examples
of the incident response centers that are forming, each serving its
own constituency or focusing on a particular area of technology.

Other network groups that discuss security issues are:
comp.protocols.tcp-ip, comp.virus (mostly PC-related, but
occasionally covers Internet topics), misc.security, and the BITNET
Listserv list called VIRUS-L.

7.  Internet Information

There are many available references on the TCP/IP protocol suite, the
internet architecture, and the DDN Internet.  A soon to be published
FYI RFC document, "Where to Start: A Bibliography of General
Internetworking Information," provides a bibliography of online and
hard copy documents, reference materials, and multimedia training
tools that address general networking information and "how to use the
Internet."  It presents a representative collection of materials that
will help the reader become familiar with the concepts of
internetworking.  Inquiries on the current status of this document
can be sent to user-doc@nnsc.nsf.net or by postal mail to:

   Corporation for National Research Initiatives
   1895 Preston White, Suite 100
   Reston, VA 22091
   Attn: IAB Secretariat

Two texts on networking are especially noteworthy.  Internetworking
With TCP/IP, by Douglas Comer, is an informative description of the
TCP/IP protocol suite and its underlying architecture.  The UNIX
System Administration Handbook, by Nemeth, Snyder, and Seebass, is a
"must have" for system administrators who are responsible for UNIX
hosts.  In addition to covering UNIX, it provides a wealth of
tutorial material on networking, the Internet, and network
management.

A great deal of information on the Internet is available online.  An
automated, online reference service is available from CSNET.  To
obtain a bibliography of their online offerings, send the email
message

   request: info
   topic: help
   request: end

to info-server@sh.cs.net.

The DDN NIC also offers automated access to many NIC documents,
online files, and WHOIS information via electronic mail.  To use the
service, send an email message with your request specified in the
SUBJECT field of the message.  For a sampling of the type of
offerings available through this service, send the following message:

   To: SERVICE@NIC.DDN.MIL
   Subject: help
   Msg: <none>

The DDN Protocol Implementations and Vendors Guide, published by the
DDN Network Information Center (DDN NIC),* is an online reference to
products and implementations associated with the DoD Defense Data
Network (DDN) group of communication protocols, with emphasis on
TCP/IP and OSI protocols.  It contains information on protocol policy
and evaluation procedures, a discussion of software and hardware
implementations, and analysis tools with a focus on protocol and
network analyzers.  To obtain the guide, invoke FTP at your local
host and connect to host NIC.DDN.MIL (internet address 26.0.0.73 or
10.0.0.51).  Log in using username 'anonymous' with password 'guest'
and get the file NETINFO:VENDORS-GUIDE.DOC.

_________________________
* Products mentioned in the guide are not specifically endorsed or
  recommended by the Defense Communications Agency (DCA).

The DDN Protocol Guide is also available in hardcopy form.  To obtain
a hardcopy version of the guide, contact the DDN Network Information
Center:

   By U.S. mail:
      SRI International
      DDN Network Information Center
      333 Ravenswood Avenue, Room EJ291
      Menlo Park, CA 94025

   By e-mail:  NIC@NIC.DDN.MIL

   By phone:   1-415-859-3695
               1-800-235-3155 (toll-free hotline)

For further information about the guide, or for information on how to
list a product in a subsequent edition of the guide, contact the DDN
NIC.

There are many additional online sources on Internet management.  RFC
1118, "A Hitchhiker's Guide to the Internet," by Ed Krol, is a useful
introduction to the Internet routing algorithms.  For more of the
nitty-gritty on laying out and configuring a campus net, see Charles
Hedrick's "Introduction to Administration of an Internet-based Local
Network," available via anonymous FTP from cs.rutgers.edu (sometimes
listed in host tables as aramis.rutgers.edu), in subdirectory runet,
file tcp-ip-admin.  Finally, anyone responsible for systems connected
to the Internet must be thoroughly versed in the Host Requirements
RFCs (RFC 1122 and RFC 1123) and "Requirements for Internet
Gateways," RFC 1009.

8.  The Final Words on Internet Management

Keep smiling, no matter how bad things may seem.  You are the expert.
They need you.

9.  Security Considerations

Security issues are discussed in Section 6.

10.  Author's Address

   Robert H. Stine
   SPARTA, Inc.
   7926 Jones Branch Drive
   Suite 1070
   McLean, VA 22102

   EMail: STINE@SPARTA.COM