ARM shared libraries

About this recipe

In this recipe you will learn:

what an ARM shared library is;
how the shared library mechanism works;
how to instruct the ARM linker to make a shared library;
how to make a toy shared library from the string section of the ANSI C library.

About ARM shared libraries

ARM shared libraries support the sharing of utility, service or library functions between several concurrently executing client applications in a single address space. Such shared code is necessarily reentrant.

If a function is reentrant, each of its concurrently active clients must have a separate copy of the data it manipulates for them. The data cannot be associated with the code itself unless the data is read-only. In the ARM shared library architecture, a dedicated register (called sb) is used to address (indirectly) the static data associated with a client.
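The idea can be sketched in C as follows. This is an illustration only: the ClientData type and the count_call function are hypothetical names, and the explicit pointer argument stands in for the sb register, which compiled code would use implicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-client static data. In the real mechanism the sb
   register addresses it; here an explicit pointer plays that role. */
typedef struct {
    unsigned calls;   /* state that would otherwise be a C 'static' */
} ClientData;

/* Reentrant: every piece of writable state is reached through 'sb',
   so none of it lives alongside the (shareable, read-only) code. */
unsigned count_call(ClientData *sb)
{
    assert(sb != NULL);
    return ++sb->calls;
}
```

Two clients that each pass their own ClientData see independent counters, even though they execute the same code.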

An ARM shared library is read-only, reentrant and usually position independent. A shared library made exclusively from object code compiled by the ARM C compiler will have all three of these attributes. Library components implemented in ARM Assembly Language need not be reentrant and position independent but, in practice, only position independence can be dispensed with.

A library with all three of these attributes is an ideal candidate for packing into a system ROM.

Some shared library mechanisms associate a shared library's data with the library itself and put only a place holder in the stub. At run time, a copy of the library's initialised static data is copied into the client's place holder by the dynamic linker or by library initialisation code.

The ARM shared library mechanism supports these ways of working provided the data is free of values which require link-time (or run time) relocation. In other words, it can be supported provided the input data areas are free of relocation directives.

How ARM shared libraries work

Stubs and proxy functions

When a client application is linked with a shared library, it is linked not with the library itself but with a stub object containing:

an entry vector;
a copy of the library's static data or a place holder for it.

Each member of the entry vector is a proxy for a function in the matching shared library.

When a client first calls a proxy function, the call is directed to a dynamic linker. This is a small function (typically about 50-60 ARM instructions) which:

locates the matching shared library;
if required, copies an initial image of the library's static data from the library to the place holding area in the stub;
patches the entry vector so each proxy function points at the corresponding library function;
resumes the call.

Once an entry vector has been patched, all future proxy calls proceed directly to the target library function with only minimal indirection delay and no intervention by the dynamic linker.
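The patch-on-first-call behaviour described above can be modelled in portable C with a table of function pointers. This is a toy model, not the real dynamic linker (which is ARM assembly and patches code in the stub); all of the names below are hypothetical, and locating the library and copying static data are omitted.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*Fn)(int);

/* Stand-ins for two functions living in the "shared library". */
static int lib_double(int x) { return 2 * x; }
static int lib_negate(int x) { return -x; }
static const Fn library_entries[2] = { lib_double, lib_negate };

static int resolve_double(int x);
static int resolve_negate(int x);

/* The stub's entry vector: every slot starts out pointing at a resolver. */
static Fn entry_vector[2] = { resolve_double, resolve_negate };
static int link_count = 0;   /* how many times the toy linker has run */

/* Toy dynamic linker: patch every slot, not just the one that was called. */
static void dynlink(void)
{
    size_t i;
    for (i = 0; i < 2; i++)
        entry_vector[i] = library_entries[i];
    link_count++;
}

/* First call lands here: link, then resume the original call. */
static int resolve_double(int x) { dynlink(); return entry_vector[0](x); }
static int resolve_negate(int x) { dynlink(); return entry_vector[1](x); }

/* Proxies that client code calls. */
int proxy_double(int x) { return entry_vector[0](x); }
int proxy_negate(int x) { return entry_vector[1](x); }
int dynlink_runs(void)  { return link_count; }
```

After the first proxy call, every later call through either proxy costs one indirection through the vector and never re-enters dynlink.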

Of course, making an inter-link-unit call like this is more expensive than making a straightforward local procedure call, but not by much. It is also the only supported way to call a function more than 32MBytes away.

Locating a library which matches the stub

Locating a matching shared library is specific to a target system and you must provide code to do the location, but the remainder of the dynamic linking process is generic to all target systems. Consequently, in order to use ARM shared libraries, you have to design and implement a library location mechanism and adapt the dynamic linker to it. In practice, this is quite straightforward:

the ARM Linker provides support for parameterising a location mechanism;
a basic dynamic linker with neither location nor failure reporting mechanisms is a mere 42 ARM instructions.

Please refer to ARM shared library format for a full explanation of parameter blocks.

How the dynamic linker works

The dynamic linker is entered via a proxy call with r0 pointing at the dynamic linker's 16-byte entry stub. Following this stub code is a copy of the parameter block for the shared library.

Stored in the parameter block is the identity of the library - perhaps a 32-bit unique identifier or perhaps a string name. Either way, it can be passed to the library location mechanism. You have to decide how to identify your shared libraries and, hence, what to put in their parameter blocks.

The library location function is required to return the address of the start of the library's offset table.

A primitive location mechanism might be to search a ROM for a matching string. This would identify the start of the parameter block of the matching shared library. Immediately preceding it will be negative offsets to library entry points and a non-negative count word containing the number of entry points. By working backwards through memory and counting, you can be sure you have found the entry vector and can return the address of its count word to the dynamic linker.
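The backwards walk might look like the following C sketch. It covers only the counting step, not the string search, and it rests on assumptions: the image is viewed as an array of 32-bit words, the offsets sit directly before the parameter block (no data size and offset words in between), and find_count_word is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Given an image as 32-bit words and the index of the first word of a
 * parameter block, walk backwards over the negative entry offsets and
 * return the index of the count word, or -1 if the layout does not
 * check out.
 */
long find_count_word(const int32_t *image, size_t param_index)
{
    size_t i = param_index;
    int32_t seen = 0;

    assert(image != NULL);
    while (i > 0 && image[i - 1] < 0) {   /* negative word: an entry offset */
        i--;
        seen++;
    }
    if (i == 0)
        return -1;                        /* ran off the start of the image */

    /* image[i - 1] is non-negative: it must equal the number of offsets. */
    return image[i - 1] == seen ? (long)(i - 1) : -1;
}
```

Checking that the count word agrees with the number of negative words traversed is what guards against a chance match on the search string elsewhere in the ROM.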

More sophisticated location schemes are possible, for example:

You might include in your library a header containing code to execute when the library is first loaded (into RAM) or initialised (in ROM) which registers the library's name with a library manager. Obviously, the library manager has to be locatable without using the library manager, so either its address has to be known or its function has to be supported by an underlying system call.
Acorn's RISC OS operating system supports a module mechanism which is sometimes used to implement shared libraries. A RISC OS module may, by declaring so in its module header, be called when software interrupts (SWIs) in a declared range occur. When such a module is loaded, it extends the range of SWIs interpreted by RISC OS. We can use this mechanism to locate a shared library by storing the identity of a library location SWI in the library's parameter block and by implementing this SWI in the library module's header.

Instructing the linker to make a shared library

Prerequisites

A shared library can be made from any number of object files, including reentrant stubs of other shared libraries, but two simple rules must be followed:

each object file must conform to a reentrant version of the ARM Procedure Call Standard and each code area must have the REENTRANT attribute;
there may be no unresolved references resulting from linking together the component objects.

An immediate consequence of the second rule is that it is impossible to make two shared libraries which refer to one another: to make the second library and its stub would require the stub of the first, but to make the first and its stub would require the stub of the second.

The first rule is not 100% necessary and is difficult to enforce. The ARM Linker warns you if it finds a non-reentrant code area in the list of objects to be linked into a shared library but it will build the library and its matching stub anyway. You have to decide whether the warning is real, or merely a formality.

Linker outputs

The ARM linker generates a shared library as two files:

a plain binary file containing the read-only, reentrant, usually position independent, shared code;
an AOF format stub file with which client applications can be linked.

The linker can also generate a reentrant stub suitable for inclusion in another shared library.

The library image file contains, in order:

read only code sections from your input objects;
if so requested, a read only copy of the initialised static data from the input objects;
a table of (negative) offsets from the end of the library to its entry points;
if so requested, the size and offset of the static data image;
a copy of the library's parameter block.

You request a copy of the initialised static data to be included in a library when you describe to the linker how to make a shared library. If you request this, the linker writes the length and offset of the data image immediately after the entry vector. During linking, armlink defines symbols SHL$$data$$Size and SHL$$data$$Base to have these values; components of your library may refer to these symbols. Instead of including the static data in the stub, armlink includes a zero-initialised place holding area of the same size. It also writes the length and (relocatable) address of this zero-initialised place holding area immediately after the dynamic linker's entry veneer, giving the dynamic linker sufficient information to initialise the place holder at run time. During linking, the linker symbols SHL$$data$$Size and $$0$$Base describe this length and relocatable address.
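At run time, initialising the place holder amounts to a block copy driven by the recorded size and offset. The C sketch below shows the shape of that step; init_static_data is a hypothetical name, and measuring the offset from the start of the library image is an assumption made for the illustration, not something the text above specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy 'size' bytes of initial static data, found 'offset' bytes into
 * the library image (assumed base: start of image), into the client's
 * zero-initialised place holder.
 */
void init_static_data(void *placeholder, const unsigned char *library,
                      size_t offset, size_t size)
{
    assert(placeholder != NULL && library != NULL);
    memcpy(placeholder, library + offset, size);
}
```

A plain byte copy is sufficient only because the data image is free of relocation directives; nothing in it needs adjusting for the client's address.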

Obviously, any data included in your shared library must be free of relocation directives. Please refer to ARM shared library format for a full explanation of what kind of data can be included in a shared library.

You specify a parameter block when you describe to the linker how to make a shared library. You might, for example, include the name of the library in its parameter block, to aid its location. An identical copy of the parameter block is included in the library's entry vector in the stub file.

Describing a shared library to the linker

To describe a shared library to the linker you have to prepare a file which describes:

the name of the library;
the library parameter block;
what data areas to include;
what entry points to export.

For precise details of how to do this, please refer to ARM shared library format. Below is an intuitive example you can work with and adapt:

; First, give the name of the file to contain the library -
; strlib - and its parameter block - the single word 0x40000...
> strlib \
  0x40000
; ...then include all suitable data areas...
+ ()
; ... finally export all the entry points...
; ... mostly omitted here for brevity of exposition.
memcpy
...
strtok

The name of this file is passed to armlink as the argument to the -SHL command line option (please refer to the chapter The ARM Linker (armlink) for further details).

Making a toy string library

This section refers to the files collected in the strlib subdirectory of the examples directory of the release.

The header files config.h and interns.h let you compile cl/string.c locally. Little-endian code is assumed. If you want to make a big-endian string library you should edit config.h. Similarly, if you want to alter which functions are included or whether static data is initialised by copying from the library, then you should edit config.h. You do not need to edit interns.h. If you use config.h unchanged you will build a little-endian library which includes a data image and which exports all of its functions.

Compiling the string library

To compile string.c, use the following command:

armcc -li -apcs /reent -zps1 -c -I. ../../cl/string.c

The -li flag tells armcc to compile for a little-endian ARM.

The -apcs /reent flag tells armcc to compile reentrant code.

The -zps1 flag turns off software stack limit checking and allows the string library to be independent of all other objects and libraries. With software stack limit checking turned on, the library would depend on the stack limit checking functions which, in turn, depend on other sections of the C run time library. While such dependencies do not much obstruct the construction of full scale, production quality shared libraries, they are major impediments to a simple demonstration of the underlying mechanisms.

The -I. flag tells armcc to look for needed header files in the current directory.

Linking the string library

To make a shared library and matching stub from string.o, use the following linker command:

armlink -o strstub.o -shl strshl -s syms string.o

strlib's stub will be put in strstub.o as directed by the -o option.

The file strshl contains instructions for making a shared library called strlib. A shortened version of it was shown in the earlier section "Describing a shared library to the linker."

The option -s syms asks for a listing of symbol values in a file called syms. You may later need to look up the value of EFT$$Offset (it will be 0xA38 if you have changed nothing). As supplied, the dynamic linker expects a library's external function table (EFT) to be at the address 0x40000. So, unless you extend the dynamic linker with a library location mechanism (please refer to the discussion in the earlier section How the dynamic linker works), you will have to load strlib at the address 0x40000-EFT$$Offset.

Making the test program and dynamic linker

Now you should assemble the dynamic linker and compile the test code:

armasm -li dynlink.s dynlink.o
armcc -li -c strtest.c

You can extend the test code to probe lots of string functions, but this is left as an exercise to help you understand what is going on.

To make the test program you must link together the test code, the dynamic linker, the string library stub and the appropriate ARM C library (so that references to library members other than the string functions can be resolved):

armlink -d -o strtest strtest.o dynlink.o strstub.o ../../lib/armlib.32l

Running the test program with the shared string library

Now you are ready to try everything under the control of command-line armsd:

A.R.M. Source-level Debugger version ...
ARMulator V1.30, 4 Gb memory, MMU present, Demon 1.1,...
Object program file strtest
armsd: getfile strlib 0x40000-0xa38
armsd: go

strerror(42) returns unknown shared string-library error 0x0000002A

Program terminated normally at PC = 0x00008354 (__rt_exit + 0x24)
+0024 0x00008354: 0xef000011 .... :    swi      0x11
armsd: q
Quitting

Before starting strtest you must load the shared string library by using:

getfile strlib 0x40000-0xa38

strlib is the name of the file containing the library; 0x40000 is the hard wired address at which the dynamic linker expects to find the external function table; and 0xa38 is the value of EFT$$Offset, the offset of the external function table from the start of the library.
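The load address arithmetic can be checked with a few lines of C. The function name library_load_address is hypothetical; the two constants are the ones used in this recipe.

```c
#include <assert.h>
#include <stdint.h>

/* Address at which to load a library so that its external function
   table (EFT) lands exactly at eft_address. */
uint32_t library_load_address(uint32_t eft_address, uint32_t eft_offset)
{
    assert(eft_offset <= eft_address);   /* otherwise the result would wrap */
    return eft_address - eft_offset;
}
```

With eft_address 0x40000 and eft_offset 0xA38 the result is 0x3F5C8, which is the value of the expression 0x40000-0xa38 given to getfile. If you change the library so that EFT$$Offset differs, the load address must be recomputed.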

When strtest runs, it calls strerror(42) which causes the dynamic linker to be entered, the static data to be copied, the stub vector to be patched and the call to be resumed. You can watch this in more detail by setting a breakpoint on __rt_dynlink and single stepping.

Suggested further exercises

Library location mechanisms

Locating a library's EFT at 0x40000 is not very satisfactory, so an obvious exercise is to extend the dynamic linker to locate a library by looking for it. Try, for example, adding a header to the start of the library which contains:

offset to the next loaded library or 0
the total length of the library
the offset to the external function table
the string name of the library

Hint: when you link this area with the other library contents you have to ensure that it will precede all other areas in the library. Please refer to Area placement and sorting rules for further details.

Your dynamic linker could now search a list of libraries loaded at 0x40000 onwards.
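One possible shape for such a search is sketched below in C. Everything here is an assumption made for the exercise: the LibHeader layout, the 16-byte name field, the convention that both offsets are measured from the start of the header, and the name find_eft. A real implementation would live in the assembly-language dynamic linker.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header layout; the field order follows the list above. */
typedef struct {
    uint32_t next_offset;   /* offset to the next loaded library, or 0 */
    uint32_t total_length;  /* total length of the library */
    uint32_t eft_offset;    /* offset to the external function table */
    char     name[16];      /* NUL-terminated library name (size assumed) */
} LibHeader;

/*
 * Walk the chain starting at 'first' and return the address of the EFT
 * of the library called 'name', or NULL if no loaded library matches.
 */
const unsigned char *find_eft(const unsigned char *first, const char *name)
{
    const unsigned char *p = first;

    assert(first != NULL && name != NULL);
    for (;;) {
        LibHeader h;
        memcpy(&h, p, sizeof h);          /* avoids alignment assumptions */
        if (strncmp(h.name, name, sizeof h.name) == 0)
            return p + h.eft_offset;
        if (h.next_offset == 0)
            return NULL;                  /* end of chain, no match */
        p += h.next_offset;
    }
}
```

In this recipe's setting, first would be the address 0x40000 and the returned pointer would take the place of the hard wired EFT address in the dynamic linker.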

Self-loading libraries

You could extend the header mechanism described in the previous subsection so that a library could copy itself to the next free location above 0x40000. This would allow libraries to be loaded at 0x8000 and 'executed' there. Of course, you would want your header to begin with a branch to the code which will copy the library from 0x8000 to its destination above 0x40000.

Multiple shared libraries

Once you have built location and loading mechanisms, you can build more than one shared library. Try making one of your own and linking a test program with the stubs of two or more libraries.

Inter-library calls

Once you have multiple libraries working, you can try making one library call functions in another (but remember that if library A refers to library B then library B may not refer to library A). To do this you will have to make a reentrant stub for the library you wish to refer to and link this into the library making the reference.