Active Systems Area Networks

Project Description

Motivation
An Example
ASAN Computing Model
ASAN Components
 

Motivation

Two major trends provide the context and motivation for the proposed research. The first is that modern network-based applications must run on systems built around a CPU-centric model, with node architectures optimized for interaction with the memory hierarchies of uniprocessors or small-scale multiprocessors. Consequently, network-based applications that produce, transport, and process large data sets suffer substantial performance losses when these data sets must be moved through the memory and I/O hierarchies of multiple nodes.

The second major trend is the evolution of computational environments driven by collaborative, large-scale scientific and engineering computations. The hardware base for such applications is heterogeneous, and at any given time an application may use geographically distributed clusters of workstations/PCs, high-end graphics engines, specialized multiprocessors, terabit storage facilities, and various distinct end-user platforms. Both trends point to several distinctive features of data-intensive network applications that challenge system architects.
 

  • The generation, processing, and display of the data may take place at distinct points in a WAN or SAN and thus data streams must be transmitted to and processed at multiple hosts while sustaining a minimum frame rate.
  • Processing and display deadlines cannot be met without network service guarantees.
  • Applications and the requirements they are asked to meet are dynamic and therefore the resources used by the applications must be coordinated and managed at run-time.
  • Heterogeneous systems will use multiple communication substrates with different properties and service guarantees.
An Example


    Scientific applications such as the interactive, parallel atmospheric modeling application depicted above form a significant class of motivating applications. For example, one transformation necessary for visualization translates data from the spectral to the grid domain, since the model itself performs its computations in one domain, whereas end users wish to view data in the other. Other transformations filter the data stream prior to transmission while display calculations such as dithering must process graphical data prior to display. Such applications make effective use of computational clusters of workstations accessed remotely by end users.

    Pipelined solutions to such stream-oriented computations are well known. The key idea here is that such computations can be performed within the network interfaces as the data is being transmitted, thereby avoiding costly traversals of the cache and I/O hierarchy.
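
As a rough illustration of the pipelined style described above, the following Python sketch chains stream stages so that each frame is transformed as it flows past, rather than being stored as a complete data set between stages. It is a host-side analogy only and not ASAN code: the stage functions `subsample` and `quantize` are hypothetical stand-ins for the filtering and display steps mentioned above, and `run_stage` and `build_pipeline` are illustrative names.

```python
from typing import Callable, Iterable, Iterator, List

Frame = List[float]
Stage = Callable[[Frame], Frame]


def subsample(frame: Frame) -> Frame:
    """Hypothetical filter stage: keep every second sample."""
    return frame[::2]


def quantize(frame: Frame) -> Frame:
    """Hypothetical display stage: snap each sample to 0.0 or 1.0."""
    return [1.0 if x >= 0.5 else 0.0 for x in frame]


def run_stage(fn: Stage, upstream: Iterator[Frame]) -> Iterator[Frame]:
    """Apply one stage to frames as they arrive from the previous stage."""
    for frame in upstream:
        yield fn(frame)


def build_pipeline(source: Iterable[Frame], stages: List[Stage]) -> Iterator[Frame]:
    """Chain stages lazily; each frame is pulled through all stages on demand."""
    stream: Iterator[Frame] = iter(source)
    for fn in stages:
        stream = run_stage(fn, stream)
    return stream
```

Because the stages are generators, no stage holds more than the frame it is currently working on, which is the property the in-network approach aims for in hardware.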

    The implications of processing data as they pass through the network interfaces (NIs) are threefold:

    • Speed: The NIs must be capable of processing the data fast enough to keep up with the data stream.
    • Control: The NIs must be capable of operating autonomously and concurrently with the host CPUs.
    • Configurability: The NIs must be reprogrammable so that computations can be dynamically placed within them.

    We expect to address each of these requirements as follows:

    • Speed: The use of Field Programmable Gate Arrays (FPGAs) in the NIs.
    • Control: The use of embedded microprocessors in the NIs.
    • Configurability: A run-time reconfiguration infrastructure and associated API, currently under development, to support dynamic reconfiguration (hardware and software) of the NIs.

    Computing Model

    The ASAN computing model is based on the following operational view.

    It is straightforward to envision support for data stream computations in these active NIs. For example, consider an image stream that is captured at one host in a network and processed at multiple locations. In general, the image processing operations may be performed on the source CPU, source NI, destination NI, or destination CPU, as identified by points 1-4 in the figure. Inter-processor communication takes place by first establishing a virtual connection between source and destination. The connection abstraction provides a convenient means to specify the computation requirements of the stream and permits the placement of these computations at some point along the path. This problem is similar to the mapping actions performed on parallel programs at the time they are initialized. Thus, the establishment of a connection concerns not just the allocation of communication resources, but also (1) the identification of the computations performed on stream data, (2) the allocation of appropriate computational resources, in this case configurable hardware, so that the data may be operated on in real time, and (3) the identification of QoS parameters that will be used to guide the scheduling algorithm in the NIs. When the computations are allocated to the NIs at connection establishment, the NI configurable hardware must be programmed to implement the computations as the data is streamed through the interface.
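
To make the three concerns of connection establishment concrete, here is a minimal Python sketch of a connection record and a placement step. Everything in it is an assumption made for illustration: the names `Point`, `Computation`, `QoS`, `Connection`, and `establish`, the `fpga_cells` cost unit, and the greedy placement policy are hypothetical and are not taken from the ASAN design. For brevity the sketch only considers the two NIs and falls back to the destination CPU, although the text above also allows placement on the source CPU.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Point(Enum):
    """The four candidate processing points along a connection."""
    SOURCE_CPU = 1
    SOURCE_NI = 2
    DEST_NI = 3
    DEST_CPU = 4


@dataclass
class Computation:
    name: str
    fpga_cells: int  # hypothetical cost in units of configurable logic


@dataclass
class QoS:
    min_frames_per_sec: float  # hypothetical QoS parameters for the NI scheduler
    deadline_ms: float


@dataclass
class Connection:
    computations: List[Computation]
    qos: QoS
    placement: Dict[str, Point] = field(default_factory=dict)


NI_PATH = (Point.SOURCE_NI, Point.DEST_NI)


def establish(computations: List[Computation], qos: QoS,
              free_cells: Dict[Point, int]) -> Connection:
    """Greedy placement (an assumed policy): put each computation on the
    earliest NI with enough free logic, preserving stream order; anything
    that does not fit runs on the destination CPU."""
    conn = Connection(list(computations), qos)
    remaining = dict(free_cells)
    start = 0  # earliest NI index still allowed, so stage order is kept
    for comp in conn.computations:
        target = Point.DEST_CPU
        for i in range(start, len(NI_PATH)):
            ni = NI_PATH[i]
            if remaining.get(ni, 0) >= comp.fpga_cells:
                remaining[ni] -= comp.fpga_cells
                target, start = ni, i
                break
        else:
            start = len(NI_PATH)  # later stages must also run on the CPU
        conn.placement[comp.name] = target
    return conn
```

A real implementation would also have to check the QoS parameters against the NI scheduler and program the configurable hardware; the sketch only records the decision.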

    The key idea is to operate on data as it "passes through" the interface. The principal benefits are the ability to perform selected computations in the NIs "close" to and synchronously with the data streams they affect, thereby (1) avoiding unnecessary intermediate and host resident storage of data items, (2) decreasing loads on hosts (CPU, memory, I/O infrastructure) and/or reducing costly host CPU interruptions, and (3) avoiding disturbances of host caches. Our current design calls for the use of FPGA devices for the implementation of the configurable hardware component.

    ASAN Components

    The ASAN Project is exploring several different aspects of SAN functionality. We are currently developing the following major components.

    QUIC

    This component is the Quality of Service (QoS) infrastructure that provides communication scheduling services. The QUIC layer moves scheduling services into the network interfaces, is dynamically extensible, and can provide several classes of delivery guarantees. QUIC also supports placing selected application-specific computations in the NIs.

    GRIM

    Generally Reliable Inter-processor Messages (GRIM) is a lightweight, low-level reliable message layer that encompasses optimistic flow control, reliable multicast, and dynamic reconfiguration in the presence of faults.

    RAPID

    Reconfigurable Architecture for Processor Interconnect Devices (RAPID) is a network interface for SANs that combines reconfigurable logic, in the form of field programmable gate arrays (FPGAs), with embedded microprocessors. The first version of the NI will consist of Compaq PCI Pamette cards with Myrinet Mezzanine cards. The initial testbed will use 8 such interfaces in a 16-node quad Pentium Pro cluster. Experience with this testbed will inform a second-generation custom RAPID card optimized for the support of streaming computations.

    VCM

    The Virtual Communication Machine (VCM) software architecture aims to provide adequate support for dynamically placing and executing application-specific code, called "extension modules", in the NI. Extension modules can be used for interfacing to, dynamically programming, and controlling the FPGA resources from within the RAPID NI rather than from the host CPU. Overall, the VCM provides an extensible environment with a common set of abstractions for controlling FPGA, processor, and memory resources within an NI.
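
The idea of dynamically placed extension modules can be pictured with the following small Python sketch. It is purely illustrative and does not reflect the actual VCM interface: the class `ToyNI` and its `load`, `unload`, and `process` methods are hypothetical names, and a module is modeled simply as a function from bytes to bytes.

```python
from typing import Callable, Dict

Handler = Callable[[bytes], bytes]


class ToyNI:
    """Toy stand-in for an NI that hosts dynamically loaded extension modules."""

    def __init__(self) -> None:
        self._modules: Dict[str, Handler] = {}

    def load(self, name: str, handler: Handler) -> None:
        """Place an extension module in the NI at run time."""
        self._modules[name] = handler

    def unload(self, name: str) -> None:
        """Remove a module; unknown names are ignored."""
        self._modules.pop(name, None)

    def process(self, name: str, payload: bytes) -> bytes:
        """Run the named module on a payload, or pass the data through
        unchanged if no such module is loaded."""
        handler = self._modules.get(name)
        return payload if handler is None else handler(payload)


def invert(payload: bytes) -> bytes:
    """Hypothetical extension module: invert every byte of an image row."""
    return bytes(255 - b for b in payload)
```

In this picture, loading and unloading modules while streams are active corresponds to the dynamic placement of computations that the VCM is intended to support.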

     


    Center for Experimental Research in Computer Systems