February 1, 1990
Revised September 21, 1990

Proposal to Restructure the Network Supporting BITNET

Historically the physical and logical networks supporting BITNET have been structured on leased communications lines connecting member sites. New sites connect to the network based upon an existing BITNET member's willingness and ability to support the request for connection. The cost of the leased communications line, which is paid for by the new BITNET site, is also a consideration in choosing which existing BITNET sites are possible connection points. The cost of the communications line may be different based on factors of distance and tariff considerations.

This method of connecting new sites has worked quite well, but the lack of a formal structure for the network is creating limitations which impair the efficiency and growth of BITNET. The lack of a structure increases the complexity of routing table generation and hampers efforts to implement network management tools. The ability to create BITNET links by using the national and regional IP networks increases the confusion over where BITNET connections should be made.

The idea of reorganizing BITNET into a more formal structure has been discussed for a number of years, but the costs of the additional leased lines needed to implement such a plan made the project economically infeasible. Now, BITNET has the ability to use the national and regional IP networks; a plan to restructure BITNET by using the IP networks appears to be cost effective and implementable. This plan could also be implemented using private communications facilities and IP routers instead of using the national and regional networks. The costs of an implementation using a private IP network would need to be carefully considered.

Restructuring Proposal

The restructuring is based on the concept of 'regionalization', the separation of the network into geographic areas or regions. Each region would have two 'core' sites.
Each core site would have RSCS-over-IP connections to every other core site. The core sites would form a 'backbone'. The national and regional IP networks are the physical facilities that would be used by the core sites to form the BITNET backbone. By generating appropriate BITNET routing tables, the number of nodes and the amount of traffic handled by the core sites for a given region can be statically balanced.

Within a region, the core sites could connect to a number of 'mid-level' sites, again by use of RSCS-over-IP. This type of structure has the ability to provide an alternate path into a region if a core site were out of service. The member sites or 'end' sites within a region could connect to the mid-level sites. Traditional leased line connections may exist at any level within the structure but these types of connections will continue to have limitations. That is, if a host with traditional leased lines is down, no other path may exist to the sites supported by that leased line.

The following diagram attempts to present graphically the structure of the regional model.
Diagram for Regional Model
(Example of a single Region)

[Diagram: the layout of the original drawing did not survive. Its labels show two core sites, two mid-level sites, and a number of end sites, joined by links marked RSCS/IP or BSC.]

In the diagram, the end site shown as a single node can be a collection of nodes connected by a mixture of leased lines and IP.

Benefits of the restructuring

The purpose of the regionalization is to impose a structure on the logical network supporting BITNET. This structure will reduce the burden on the current hub sites by decreasing the number of files which must transit these sites. Overall network service will be improved because the number of 'hops' a file must take to reach its destination will be reduced or be no greater than in the current BITNET topology. The impact on BITNET when a key BITNET node or Internet connectivity fails will be reduced because of the increased number of connections between core sites. As the intra-regional mid-level structure develops, the ability of the core sites to dynamically reroute traffic around a disabled core node will provide improved network access.
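The effect of a full mesh among core sites on failure tolerance can be sketched with a little arithmetic. The example below is illustrative only: it assumes seven regions with two core sites each, as in the implementation plan, and it uses made-up labels (R1A, R1B, ...) rather than real node names.

```python
from itertools import combinations

def core_sites(regions, per_region=2):
    """Hypothetical labels such as 'R1A' and 'R1B' for each region's core sites."""
    return [f"R{r}{chr(ord('A') + i)}"
            for r in range(1, regions + 1) for i in range(per_region)]

def full_mesh(sites):
    """One RSCS-over-IP link for every unordered pair of core sites."""
    return set(combinations(sorted(sites), 2))

def surviving_links(links, failed):
    """Links that remain when the sites in `failed` are out of service."""
    return {(a, b) for a, b in links if a not in failed and b not in failed}

sites = core_sites(7)                       # assumption: 7 regions x 2 core sites
links = full_mesh(sites)
after_failure = surviving_links(links, {"R1A"})
r1b_links = [link for link in after_failure if "R1B" in link]

print("core sites:", len(sites))
print("backbone links:", len(links))
print("links left after R1A fails:", len(after_failure))
print("links still held by R1B:", len(r1b_links))
```

Under these assumptions the backbone has 14 core sites and 91 links. If one core site of a region fails, the region's other core site still has a direct link to each of the 12 core sites in the other regions, so the region keeps a path to the backbone.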
The three-level (core, mid-level, end site) structure of the region can be expanded to include additional levels and paths as needed within the region, thereby providing for dynamic rerouting within a region as well.

Implementation Plan

The current BITNET network would be divided into seven regions. Each region would have the required two core sites. No attempt has been made to define the mid-level sites within the regions. The mid-level structure will develop at different rates within the different regions. The rate will depend on many factors including the level of IP connectivity, amount of BITNET traffic generated, and amount of network expertise available. The regions and core sites can be implemented without having the mid-level structure defined.

The proposed regions and associated core sites are as follows:

Region 1 - NorthEast
    MITVMA:   Massachusetts Institute of Technology
    YALEVM:   Yale University

Region 2 - MidEast
    CORNELLC: Cornell University
    CUNYVMV2: City University of New York

Region 3 - MidAtlantic
    PUNFSV2:  Princeton University
    PSUVM:    Penn State University

Region 4 - SouthEast
    UMDD:     University of Maryland
    VTBIT:    Virginia Polytechnic Institute and State University

Region 5 - MidWest
    UICVM:    University of Illinois at Chicago
    UGA:      University of Georgia

Region 6 - MidSouth
    RICEVM1:  Rice University
    UIUCVMD:  University of Illinois at Urbana-Champaign

Region 7 - West
    UCBCMSA:  University of California at Berkeley
    USCVM:    University of Southern California

Core Site Guidelines

For a machine to be considered for use as a core site it should meet the following requirements:

1. The machine must run VM with FAL, VMNET, and RSCS Version 2.

2. The system must have the capacity to support the number of RSCS-over-IP connections needed for connectivity to other core sites. This capacity should take into account IP connectivity capacity, processor capacity, spool capacity and manpower requirements.

3. Core sites may need to add additional software facilities and modifications to improve throughput and overall network management. An example of these might be installation of the BITNET II transmission algorithm to improve throughput of RSCS over IP.

4. One if not both core sites in a region should run a LISTSERV which is part of the LISTSERV backbone. Because LISTSERV files are a large component of BITNET traffic, having the core sites on the LISTSERV backbone will enhance the operation of this pervasive application.

5. Core sites are encouraged to consider running NETSERV and acting as INTERBIT sites. Running these facilities will reduce network load and provide greater service to the region.

6. With BITNET making use of the national and regional IP networks, the core sites will need to establish procedures for dealing with their IP service and its supplier. These procedures should include out-of-hours procedures, local operating procedures, and problem determination procedures.

These same general guidelines can be scaled appropriately and applied to the mid-level sites.

All the suggested core sites meet the guidelines with minor exceptions. All of the core sites chosen are VM machines in the 308x class or larger. All core sites chosen run FAL or WISCNET, with all but one of the WISCNET sites planning conversion to FAL in the near future. All of the core sites expect to run RSCS Version 2 as soon as feasible. All core sites chosen are accessible via the regional and national IP networks.

Clearly the BITNET core sites will assume a leadership role in their regions. Expertise in RSCS, FAL, and TCP/IP, as well as operating systems will be of value to the core site in problem solving. This expertise will also be useful in guiding the development of networking within the core site's BITNET region.

One of the goals of this plan is to minimize the disruption to the functioning of BITNET. Current leased line connections would not be required to be altered.
All current BITNET sites would be assigned to one of the regions. This is true for the international connections as well. The international links would be associated with the region of the host system. New RSCS-over-IP connections would be made only within a region; inter-regional traffic flow would be via the BITNET backbone.

Some of the current leased line connections would no longer be used. For example, George Washington University (GWUVM) has leased line connections to Penn State (PSUVM) and University of Maryland (UMDD). Under this regionalization plan, GWUVM is part of Region 4 and its outbound traffic would flow toward the core site at UMDD. GWUVM's connection to PSUVM would be unused.

Another example of a change in the current BITNET topology would be the RSCS-over-IP link between Princeton (PUNFSV2) and Columbia University (CUVMB). This link would be discontinued because it would be inter-regional. CUVMB would be part of Region 2 and PUNFSV2 would be part of Region 3.

Connection of Cooperating Networks

Cooperating networks, such as EARN and NetNorth, may be connected to BITNET in different ways under this proposal. A cooperating network might be connected to BITNET using a traditional point-to-point connection to one BITNET site and appearing as an extension of the region that the cooperating network physically connects to. An example of this type of connection would be the cooperating network EARN appearing as an extension of Region 2 because of FRMOP22's physical connection to CUNYVMV2.

Alternatively, a cooperating network might connect by having multiple RSCS-over-IP connections to BITNET core sites. This could be accomplished in three ways:

A designated cooperating network site could have multiple connections to different core sites. An example of this might be UTORVM of NetNorth being the designated NetNorth site and having connections to CORNELLC in Region 2, RICEVM1 in Region 6, and UCBCMSA in Region 7.
The second way of accomplishing the connection of a cooperating network is to have multiple sites in the cooperating network connected to multiple core sites. An example of this method of connection might be to have UTORVM of NetNorth connected to CORNELLC in Region 2, MCGILLB of NetNorth connected to RICEVM1 in Region 6, and UALTAVM of NetNorth connected to UCBCMSA in Region 7. This second method of connection should provide greater bandwidth and multiple access between BITNET and the cooperating network. RSCS-over-IP connections between cooperating network sites and non-core site BITNET members would not be permitted.

The third way a cooperating network might connect to BITNET would be as a new region or regions. An example of this might be that NetNorth would designate UTORVM and UALTAVM as BITNET core sites. NetNorth would form Region 8 and all BITNET connections from NetNorth would be via these core sites.

The connection of a cooperating network to BITNET should be handled on a network-by-network basis because each cooperating network has different requirements and resources. Which implementation is chosen will depend on the structure, traffic flow, and policies of the cooperating network.

Conclusion

If this proposal is adopted and the implementation plan accepted, the restructuring of BITNET's network can be complete in a short period of time, estimated at three months. Every attempt would be made to keep the disruption of BITNET services to an absolute minimum during the restructuring. The restructured BITNET would provide better service to all BITNET members and allow for orderly growth.

Restructuring Proposal for BITNET - Additional Comments

Clarification of certain points of the original proposal has been requested. The following is an attempt to provide additional information which may be of value in reading the proposal.

General information and definition of terms

BITNET is a store-and-forward network based on the IBM NJE protocols.
The machine resources needed to process files increase as the traffic load increases. The amount of traffic handled by a site depends, in part, on its position in the network topology. Sites having a large number of BITNET connections have been referred to as 'hub' sites. Sites with only one network connection are referred to as 'leaf' sites.

BITNET traffic is transported through the network based on static routing tables which indicate what the next 'hop' is for a given destination. Programs are used to generate a customized routing table for each BITNET host. Information about BITNET network topology is updated periodically (monthly) and new static routing tables are distributed to BITNET member sites.

The route generation programs attempt to maintain a symmetric path between any two sites. To demonstrate this, assume that A, B, C, and D are BITNET sites and the links connecting them are of equal speed and function.

        ----- B -----
        |           |
        A           D
        |           |
        ----- C -----

The route table generation programs might choose to route files from A to D via C. On the reverse path, files from D to A go via C, not B. This is what is meant by a symmetric path. Symmetric paths are not required by RSCS. Symmetric paths are maintained to reduce confusion for the users and to help avoid routing loops.

Another topology found in BITNET is:

        A <- B <- C <- D

where A is a hub, D is a leaf and the arrows indicate the flow of traffic to reach other sites. Assume that D makes an additional connection with E, which is another hub site. Because of the algorithms used in the routing table generation programs and the requirement for symmetric links, the connection of D and E can reverse the flow of traffic to some or all of the other nodes.

        A -> B -> C -> D -> E

Because resource requirements increase with traffic load, D may be unable or unwilling to expend the resources and the result will be that overall network service is decreased.
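The path reversal described above can be illustrated with a small sketch. It is not the BITNET route generation program: it assumes simple fewest-hop routing with alphabetical tie-breaking, and it adds two hypothetical hubs, H1 and H2, to stand for the part of the network that lies between A and E before D links to E.

```python
from collections import deque

def build_graph(links):
    """Undirected graph: node -> set of directly linked nodes."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def next_hop_table(graph, host):
    """Static table for one host: destination -> next hop on a fewest-hop path."""
    table = {nbr: nbr for nbr in graph[host]}
    seen = {host} | set(table)
    queue = deque(sorted(table))
    while queue:
        node = queue.popleft()
        for nbr in sorted(graph[node]):
            if nbr not in seen:
                seen.add(nbr)
                table[nbr] = table[node]  # inherit the first hop used to reach node
                queue.append(nbr)
    return table

def all_tables(links):
    """One static routing table per host, as distributed to member sites."""
    graph = build_graph(links)
    return {host: next_hop_table(graph, host) for host in graph}

def trace(tables, src, dst):
    """Follow the static tables hop by hop from src to dst."""
    path = [src]
    while path[-1] != dst:
        path.append(tables[path[-1]][dst])
    return path

def transit_pairs(tables, node):
    """Count ordered (source, destination) pairs whose files pass through node."""
    others = [h for h in tables if h != node]
    return sum(1 for s in others for d in others
               if s != d and node in trace(tables, s, d))

# A is a hub and D is a leaf; H1 and H2 are hypothetical hubs between A and E.
chain = [("A", "B"), ("B", "C"), ("C", "D"),
         ("A", "H1"), ("H1", "H2"), ("H2", "E")]
before = all_tables(chain)
after = all_tables(chain + [("D", "E")])    # D adds a link to hub E

for host in ("A", "B", "C"):
    print(host, "reaches E via", before[host]["E"], "before and",
          after[host]["E"], "after")
print("pairs transiting D:", transit_pairs(before, "D"), "->",
      transit_pairs(after, "D"))
```

In this toy topology B and C switch their next hop for E from the A side to the D side once the new link exists, while A does not. D, which carried no transit traffic as a leaf, ends up on the path for six source and destination pairs.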
If BITNET connections are established without analysis and understanding, this type of problem can occur. It is true that this class of problem has always existed within BITNET. The use of NJE over TCP/IP allows for a large increase in the number of possible connections.

Advantages of the Regional Structure

The proposed regional structure for BITNET attempts to address a number of the problems created or aggravated by using TCP/IP connections, in conjunction with the NSFNET, for BITNET transport.

The regional structure allows for easier and quicker analysis of network routing because only a subset of the total nodes in the network has to be considered.

The multi-level structure of the regional network makes the placement of hosts, based on their ability to expend resources, clearer.

The regional structure provides a 'fire wall' to limit routing damage should a routing error, such as path reversal, occur.

The fact that all inter-regional traffic is via the regional core sites allows regions to have very different internal structures which can better meet the needs of that region.

The regional structure allows for the splitting of existing regions and the addition of new regions in an orderly way to handle growth and new technical development.

The regional structure provides a way of doing overall network management, such as collection of traffic statistics, in a meaningful way.

The overall performance of the network is improved by the structure because no nodes, in general, are farther away, measured in hops, than before and many are closer.

The Future and a full mesh BITNET

"Why not connect every BITNET site directly to every other?"

This is the full mesh concept; it assumes that most, if not all, sites have national network connectivity. This is not yet true and may not be true for several years, even at the current rate of growth. The current reality is that the current implementations of TCP/IP (FAL), RSCS and VMNET have limitations.
These limits prevent more than a few hundred TCP connections from being open at any one time.

"Why not open and close TCP connections as needed?"

The overhead of opening and closing a TCP/NJE connection to transport a single RSCS message is very high and would lead to degraded network service.

"Why not use UDP in place of TCP to lower the overhead?"

The discussion of UDP versus TCP will go on for as long as they both exist. What is needed are some prototypes for testing the differences and performance of the various implementations under true network conditions. There is no agreement on one versus the other even among the current crop of network experts. The FRED software being developed by David Lippke at UT DALLAS is a UDP-based implementation of NJE over IP (and more). The design documents for FRED are expected to be released shortly. FRED offers the hope of a feasible full mesh BITNET (for those BITNET nodes which have IP connectivity), but there are issues of performance and throughput which can only be measured in a true BITNET environment.

"Why not use SMTP for everything?"

The question is interesting, but doesn't address the service provided by RSCS messages and commands or the ease of use of file transfer for users. The future mail systems such as X.400 may address these issues; only time will tell.

Conclusion

For the near term, a BITNET regional structure provides a manageable and usable network topology. The regional structure allows for the incorporation of IP services, when available, to increase bandwidth and reduce costs. The regional structure provides for growth in an orderly way and provides a method for the introduction of new features and technologies. The regional structure provides BITNET with the ability to grow and deploy new technologies while maintaining a consistently high level of service.

Lee Varian (LVARIAN@PUCC)
Peter Olenick (Q0239@PUCC)
Michael Gettes (GETTES@PUCC)