3.5   Loader Architecture

The loader has a three-layered structure. At the generic level is a library of routines that are common to all configurations. At the object-module format level are several OMF managers to handle different object-module formats. At the architecture level are several RUs (relocation units), each supporting a specific architecture. Figure 3-1 illustrates the relationship between the three levels.   

 

3.5.1   Object-Module Configuration in Target Memory

Since the object-file sections hold a program (a set of routines), and since this set of routines must be installed in the real target memory, there is a strong relationship between section configuration in the file and in the target memory. How the loader locates the sections in memory depends on whether the file type is relocatable or fully linked.

Relocatable Files

The loader uses a three-segment model for loading relocatable modules. It coalesces the module sections into three segments in target memory: a text segment, a data segment, and a bss segment. It combines sections of the same or compatible type into one segment (see Figure 3-2). Note that the order of coalescence is strictly the order in which the text and literal sections exist in the file.

Because the loader creates only three segments, it handles literal sections in the same way as text sections. If literal sections are present in a relocatable object module, the loader coalesces them with the text sections. The sections are similar in that contents of both types of sections must not be modified while the program is executing.
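The coalescing rule can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the loader's own code: the DEMO_ types and the demoCoalesce() routine are hypothetical names, the real section kinds depend on the OMF, and alignment padding between sections is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical section kinds; the real kinds depend on the OMF. */
typedef enum
    {
    DEMO_SEC_TEXT, DEMO_SEC_LITERAL, DEMO_SEC_DATA, DEMO_SEC_BSS
    } DEMO_SEC_KIND;

/* The three target-memory segments of the relocatable model. */
typedef enum
    {
    DEMO_SEG_TEXT, DEMO_SEG_DATA, DEMO_SEG_BSS, DEMO_SEG_COUNT
    } DEMO_SEG_KIND;

typedef struct
    {
    DEMO_SEC_KIND kind;     /* kind of section found in the file */
    size_t        size;     /* section size in bytes */
    DEMO_SEG_KIND segment;  /* out: segment the section is placed in */
    size_t        offset;   /* out: offset of the section in that segment */
    } DEMO_SECTION;

/* Map a section kind to a segment; literal sections join the text segment. */
static DEMO_SEG_KIND demoSegmentFor (DEMO_SEC_KIND kind)
    {
    switch (kind)
        {
        case DEMO_SEC_TEXT:
        case DEMO_SEC_LITERAL: return DEMO_SEG_TEXT;
        case DEMO_SEC_DATA:    return DEMO_SEG_DATA;
        default:               return DEMO_SEG_BSS;
        }
    }

/* Walk the sections in file order and accumulate the three segment sizes. */
void demoCoalesce (DEMO_SECTION *sections, size_t count,
                   size_t segSizes[DEMO_SEG_COUNT])
    {
    size_t i;

    assert (sections != NULL && segSizes != NULL);
    for (i = 0; i < DEMO_SEG_COUNT; i++)
        segSizes[i] = 0;
    for (i = 0; i < count; i++)
        {
        DEMO_SEG_KIND seg = demoSegmentFor (sections[i].kind);

        sections[i].segment = seg;
        sections[i].offset  = segSizes[seg];  /* appended in file order */
        segSizes[seg]      += sections[i].size;
        }
    }
```

Because the sections are visited in file order, a literal section that follows a text section in the file also follows it in the text segment.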


NOTE: The loader also treats other constant data sections, such as ELF .rodata sections, as literal sections and coalesces them into the text segment.

Fully Linked Files

The loader retains the file configuration for fully linked files. Each section is represented separately in target memory. This allows an application to take advantage of a target system with non-contiguous RAM or with high-speed static RAM and dynamic RAM.

3.5.2   Module Management

The generic module management library keeps track of what modules have been loaded in memory, where their segments are, how much memory they use, and which symbols belong to which modules.

During loading, a temporary data structure, SEG_INFO, describes the contents of an object module. The SEG_INFO structure contains the following elements:

When the loader finishes loading the object module, it transfers the SEG_INFO contents to a more permanent representation. Each module representation is a member of a linked list and is located by means of a MODULE_ID handle. The module representation contains the following information:

Each segment in the linked list has a representation that contains:

Type definitions for both MODULE_ID and SEG_INFO are in module.h.

The ability to unload a module is an important outgrowth of module management. Unloading a module implies freeing the memory associated with the module segments and removing the module symbols from the target server symbol table.
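The bookkeeping and unload steps can be sketched as follows. This is only an illustration: DEMO_MODULE and the demoModule routines are hypothetical and do not reproduce the MODULE_ID or SEG_INFO definitions in module.h. Host malloc() stands in for target memory, and removal of the module symbols from the target server symbol table is not modeled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical module record kept in a singly linked list. */
typedef struct demoModule
    {
    struct demoModule *next;         /* next loaded module */
    char               name[32];     /* module name */
    void              *segs[3];      /* text, data, bss (stand-ins) */
    size_t             segSizes[3];  /* size of each segment in bytes */
    } DEMO_MODULE;

/* Release the segments of a module, then the record itself. */
static void demoModuleFree (DEMO_MODULE *mod)
    {
    int i;

    for (i = 0; i < 3; i++)
        free (mod->segs[i]);
    free (mod);
    }

/* Record a newly loaded module at the head of the list. */
DEMO_MODULE *demoModuleAdd (DEMO_MODULE **list, const char *name,
                            const size_t sizes[3])
    {
    DEMO_MODULE *mod;
    int i;

    assert (list != NULL && name != NULL && sizes != NULL);
    if ((mod = calloc (1, sizeof (*mod))) == NULL)
        return NULL;
    strncpy (mod->name, name, sizeof (mod->name) - 1);
    for (i = 0; i < 3; i++)
        {
        mod->segSizes[i] = sizes[i];
        mod->segs[i] = malloc (sizes[i] > 0 ? sizes[i] : 1);
        if (mod->segs[i] == NULL)
            {
            demoModuleFree (mod);
            return NULL;
            }
        }
    mod->next = *list;
    *list = mod;
    return mod;
    }

/* Total memory used by the segments of one module. */
size_t demoModuleMemUsed (const DEMO_MODULE *mod)
    {
    return mod->segSizes[0] + mod->segSizes[1] + mod->segSizes[2];
    }

/* Unlink a module and free its segments; return -1 if it is not listed. */
int demoModuleUnload (DEMO_MODULE **list, DEMO_MODULE *mod)
    {
    DEMO_MODULE **link;

    assert (list != NULL);
    for (link = list; *link != NULL; link = &(*link)->next)
        if (*link == mod)
            {
            *link = mod->next;
            demoModuleFree (mod);
            return 0;
            }
    return -1;
    }
```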

When creating a new OMF manager, you can use the code from existing loaders as a model for providing module management.

3.5.3   Symbol Management

The loader processes the object-module symbols by reading the file symbol table. Part of this processing is required for relocation. Additional processing is necessary to add the module symbols to the target server symbol table.

The target server symbol table holds all the symbols related to the remote system, including symbols from the target agent and symbols from dynamically loaded user modules. Maintaining this table within the target server on the host saves memory on the target and gives host tools access to all public symbols (and local symbols, if desired) that are present on the target system. The target server symbol table holds one entry per symbol, with the following information:

The target server symbol table is filled at target server start-up time with the core-file symbol table. (For more information on this process, see Bootstrapping the Target Server Symbol Table.)

Symbol Processing

When a relocatable file is loaded, its symbol table is analyzed and the symbols are processed depending on the categories they belong to:

  • Defined Symbols
    These symbols refer to objects (routines, variables, or constants) that are present in the module text, data, or bss segments. They may be static (their scope limited to the module itself) or global (accessible to objects outside the module). Defined symbols need no relocation and may be added directly to the target server symbol table if desired.

  • Undefined Symbols
    These symbols are referred to in the module text or data but are not present in the module. They must be global. For such symbols, the loader must locate the symbol in the target server symbol table in order to relocate the module. If the symbol is not found, it is considered unknown and its name is added to a list that is returned to the tool requesting the load.


    WARNING: While a module containing undefined symbols is downloaded anyway, the consequence is that the module may be partly or totally unusable since no relocation can be done on the unknown symbols. This partial linkage permits testing pieces of an application during development as long as the tested pieces do not hold references to undefined symbols.

  • Common Symbols
    These symbols are handled as a special case. Consider the example below:

        #include <stdio.h>

        int willBeCommon;    /* uninitialized global: becomes a common symbol */

        void main (void)
            {
            ...
            }

    The symbol willBeCommon is uninitialized, so it is technically an undefined symbol. Such a definition is recorded in the object file as a common symbol. ANSI C allows multiple files to use uninitialized symbols to refer to the same variable and expects the loader to resolve these references consistently.

    It is often helpful in an incremental environment to be able to share a symbol among several files and to load the files in any order; however, it is also extremely risky to treat common variables in this way. When linked, a common definition could resolve to any defined symbol in the symbol table depending on what symbols are defined in already-loaded modules. The default for Tornado treats common symbols as undefined. However, you can set the loader options to permit common symbols to be loaded; then only the first instance of the symbol will be added to the symbol table and other instances will use the same address. For more information, see 3.5.4 Loader Options.


    NOTE: External symbols are public (visible to everyone both within and outside the file that declares them). The term external is not used in this document to mean "declared outside the file" (as opposed to "declared inside the file").
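The handling of undefined symbols described in this section can be sketched as a lookup that collects the names it cannot resolve. The DEMO_SYMBOL type and the demoResolve() routine are hypothetical, and a flat array stands in for the target server symbol table; the loader's actual data structures are not shown in this document.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical symbol table entry: a name and its target address. */
typedef struct
    {
    const char   *name;
    unsigned long addr;
    } DEMO_SYMBOL;

/* Return the entry with the given name, or NULL if the table has none. */
static const DEMO_SYMBOL *demoSymFind (const DEMO_SYMBOL *table, size_t len,
                                       const char *name)
    {
    size_t i;

    for (i = 0; i < len; i++)
        if (strcmp (table[i].name, name) == 0)
            return &table[i];
    return NULL;
    }

/*
 * Resolve each undefined name against the table. The address of a symbol
 * that is found is stored in addrs[i]; a name that is not found is appended
 * to unknown[]. Returns the number of unknown symbols.
 */
size_t demoResolve (const DEMO_SYMBOL *table, size_t tableLen,
                    const char **undef, size_t undefLen,
                    unsigned long *addrs, const char **unknown)
    {
    size_t i;
    size_t nUnknown = 0;

    assert (table != NULL && undef != NULL && addrs != NULL && unknown != NULL);
    for (i = 0; i < undefLen; i++)
        {
        const DEMO_SYMBOL *sym = demoSymFind (table, tableLen, undef[i]);

        if (sym != NULL)
            addrs[i] = sym->addr;
        else
            {
            addrs[i] = 0;                    /* no relocation is possible */
            unknown[nUnknown++] = undef[i];  /* reported to the caller */
            }
        }
    return nUnknown;
    }
```

A nonzero return value does not abort the load in this sketch, which mirrors the behavior in the warning above: the module is still downloaded, but references to the unknown symbols are left unrelocated.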

    Symbol Type Definitions

    The loader defines a set of symbol types (SYM_TYPE) in symbol.h in order to handle the symbols more easily. The loader uses the types to classify the module symbols. The symbol types defined by the loader are:

    SYM_UNDEF
    The symbol is not defined in the module.

    SYM_LOCAL
    The scope of the symbol is limited to the module.

    SYM_GLOBAL
    The symbol is accessible to anyone.

    SYM_ABS
    The symbol has an absolute value that must be kept as it is.

    SYM_TEXT
    The symbol belongs to the text segment.

    SYM_DATA
    The symbol belongs to the data segment.

    SYM_BSS
    The symbol belongs to the bss segment.

    SYM_COMM
    The symbol is a common symbol.

3.5.4   Loader Options

    The loader behavior may be tuned by using options. Many of these options may be combined, although some are mutually exclusive. The option names given below are the names used internally by the loader. They are defined in loadlib.h.

    Symbol Scope

    As explained in 3.5.3 Symbol Management, any defined symbol may be added to the target server symbol table. Several options allow the user to specify exactly which symbols should be added to the table:

    LOAD_LOCAL_SYMBOLS
    Add only local symbols to the target server symbol table. Note that this may include some debugging symbols, depending on the OMF.

    LOAD_GLOBAL_SYMBOLS
    Add only global symbols to the target server symbol table (the default).

    LOAD_ALL_SYMBOLS
    Add both local and global symbols to the target server symbol table.

    LOAD_NO_SYMBOLS
    Add no symbols to the target server symbol table. (This option is mutually exclusive with the other symbol-scope options.) With this option, there is no way to know the contents of the module.


    WARNING: These options affect not only the visibility of the symbols but also the load operations that occur after the module is loaded. For instance, if you load a module with the LOAD_NO_SYMBOLS flag set, none of its global symbols is added to the target server symbol table. Not only are they hidden from the other modules, but they are also unreachable; no reference is possible to these symbols. If you load another object module that refers to a symbol (for example, a routine) in the previous module, the loader cannot find the reference and the symbol is considered unknown. It is also impossible to call a routine within such a module symbolically from the shell. The only way to call the routine is by using its address.
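The effect of the symbol scope options can be sketched as a filter applied to each defined symbol. The DEMO_ flag values below are placeholders chosen for this example; the real option values are defined in loadlib.h and are not reproduced here. Treating the "all" option as the union of the local and global bits is likewise an assumption of the sketch.

```c
#include <assert.h>

/* Placeholder values for illustration only; see loadlib.h for the real ones. */
#define DEMO_LOAD_NO_SYMBOLS     0x1u
#define DEMO_LOAD_LOCAL_SYMBOLS  0x2u
#define DEMO_LOAD_GLOBAL_SYMBOLS 0x4u
#define DEMO_LOAD_ALL_SYMBOLS    (DEMO_LOAD_LOCAL_SYMBOLS | DEMO_LOAD_GLOBAL_SYMBOLS)

/* Return nonzero if a defined symbol should be added to the symbol table. */
int demoShouldAddSymbol (unsigned int options, int isGlobal)
    {
    unsigned int scope = options & (DEMO_LOAD_NO_SYMBOLS | DEMO_LOAD_ALL_SYMBOLS);

    /* "no symbols" cannot be combined with the other scope options */
    assert (!((scope & DEMO_LOAD_NO_SYMBOLS) && (scope & DEMO_LOAD_ALL_SYMBOLS)));

    if (scope & DEMO_LOAD_NO_SYMBOLS)
        return 0;
    if (scope == 0)
        scope = DEMO_LOAD_GLOBAL_SYMBOLS;   /* the documented default */
    return (scope & (isGlobal ? DEMO_LOAD_GLOBAL_SYMBOLS
                              : DEMO_LOAD_LOCAL_SYMBOLS)) != 0;
    }
```

With the "no symbols" option, the filter rejects every symbol, which is why a module loaded later cannot resolve references to routines in such a module.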

    Module Visibility

    When a module is loaded, the target server normally records it. Any tool can request and receive information related to the module. This default behavior can be changed by using the following option:

    LOAD_HIDDEN_MODULE
    Do not keep the module information. Note that this option does not hide the module symbols. To do so, it must be combined with LOAD_NO_SYMBOLS.

    Module Type

    By default, the loader treats every file as a relocatable object module that requires a relocation stage (see 3.5.1 Object-Module Configuration in Target Memory). To load files that do not require a relocation stage, the loader must be told that the file type is fully linked. The following options are available:

    LOAD_FULLY_LINKED
    Required for any fully linked file. When it is used, no relocation is done.

    LOAD_NO_DOWNLOAD
    This option suppresses downloading the file to target memory. The relocation is done in the target server cache only.

    LOAD_CORE_FILE
    Used exclusively at connection time to load the core file. This option is a combination of LOAD_NO_DOWNLOAD and LOAD_FULLY_LINKED.


    CAUTION: The loader does not allocate memory on the target system when a fully-linked file is loaded as it does for relocatable files. The file segments are located at the addresses described in the file header; this depends on the OMF.
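One way to picture how the module type options combine is as a small plan of which load stages run. The DEMO_ values and the DEMO_LOAD_PLAN structure are hypothetical; only the relationships stated in this section are modeled (fully linked files are neither relocated nor given newly allocated memory, the no-download option suppresses the download, and the core-file option is the combination of the two).

```c
#include <assert.h>

/* Placeholder values for illustration only; see loadlib.h for the real ones. */
#define DEMO_LOAD_FULLY_LINKED 0x1u
#define DEMO_LOAD_NO_DOWNLOAD  0x2u
#define DEMO_LOAD_CORE_FILE    (DEMO_LOAD_NO_DOWNLOAD | DEMO_LOAD_FULLY_LINKED)

/* Hypothetical summary of the stages a load goes through. */
typedef struct
    {
    int relocate;   /* run the relocation stage */
    int allocate;   /* allocate target memory for the segments */
    int download;   /* copy the segments to target memory */
    } DEMO_LOAD_PLAN;

DEMO_LOAD_PLAN demoPlan (unsigned int options)
    {
    DEMO_LOAD_PLAN plan;
    int fullyLinked;

    /* this sketch only knows the module type options */
    assert ((options & ~DEMO_LOAD_CORE_FILE) == 0);

    fullyLinked   = (options & DEMO_LOAD_FULLY_LINKED) != 0;
    plan.relocate = !fullyLinked;   /* fully linked: no relocation */
    plan.allocate = !fullyLinked;   /* addresses come from the file header */
    plan.download = (options & DEMO_LOAD_NO_DOWNLOAD) == 0;
    return plan;
    }
```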

    Common Symbols

    Three mutually exclusive options provide the ability to specify how the loader handles common symbols. They are listed in order of decreasing strictness:

    LOAD_COMMON_MATCH_NONE
    Keep common symbols isolated, visible from the object module only. This option prevents any matching with already-existing symbols (in other words, the relocations that refer to these symbols are kept local). Common symbols are added to the symbol table unless LOAD_NO_SYMBOLS is set. This is the default option.

    LOAD_COMMON_MATCH_USER
    Seek a matching symbol in the target server symbol table, but consider only symbols in user modules. If no matching symbol exists, act like LOAD_COMMON_MATCH_NONE.

    LOAD_COMMON_MATCH_ALL
    Seek a matching symbol in the target server symbol table, but consider all symbols. If no matching symbol exists, act like LOAD_COMMON_MATCH_NONE.

    If several matching symbols exist for options LOAD_COMMON_MATCH_USER and LOAD_COMMON_MATCH_ALL, the order of precedence is: symbols in the bss segment, then symbols in the data segment, then symbols in the text segment. If several matching symbols exist within a single segment type, the symbol most recently added to the target server symbol table is used.
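The matching rules above can be sketched as a single pass over the symbol table. The DEMO_ types and the demoCommonMatch() routine are hypothetical, and the sketch assumes the table is stored oldest entry first so that a later index means a more recently added symbol.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the three common-symbol options. */
typedef enum
    {
    DEMO_MATCH_NONE, DEMO_MATCH_USER, DEMO_MATCH_ALL
    } DEMO_COMMON_POLICY;

/* Declared in order of precedence: a lower value is preferred. */
typedef enum { DEMO_IN_BSS, DEMO_IN_DATA, DEMO_IN_TEXT } DEMO_SEG;

typedef struct
    {
    const char *name;
    DEMO_SEG    seg;     /* segment the symbol belongs to */
    int         isUser;  /* nonzero if defined by a user module */
    } DEMO_ENTRY;

/*
 * Return the index of the entry a common symbol should match, or -1 if the
 * symbol must stay local to its module. The table is assumed to be ordered
 * from oldest to newest entry.
 */
int demoCommonMatch (const DEMO_ENTRY *table, size_t len, const char *name,
                     DEMO_COMMON_POLICY policy)
    {
    int best = -1;
    size_t i;

    assert (table != NULL && name != NULL);
    if (policy == DEMO_MATCH_NONE)
        return -1;
    for (i = 0; i < len; i++)
        {
        if (strcmp (table[i].name, name) != 0)
            continue;
        if (policy == DEMO_MATCH_USER && !table[i].isUser)
            continue;
        /* "<=" lets a newer entry win a tie within the same segment type */
        if (best < 0 || table[i].seg <= table[best].seg)
            best = (int) i;
        }
    return best;
    }
```

A return value of -1 under either matching policy corresponds to falling back to the isolated behavior, where the common symbol remains visible from its own module only.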

    Special Options

    The loader accepts special options, with scope limited to specific cases:

    LOAD_BAL_OPTIM (i960 targets only)
    Replace the i960 CALL opcode with the BAL opcode. This makes the loader slower, and is not a default option.

    LOAD_FILE_OUTPUT
    Dump the module text and data segments to a file instead of to target memory. This option is used only for testing. LOAD_FILE_OUTPUT is mutually exclusive with LOAD_FULLY_LINKED.