Tuesday, October 30, 2018

GCC Linker and Undefined Reference errors

The GNU ld linker is a single-pass linker, so the order in which static libraries appear on the command line matters. If the libraries are listed in the wrong order, the link fails with "undefined reference to symbol" errors.

Here I describe the problem using an example.

main.cpp

#include <iostream>
#include "cat.h"

class Cat;

int main() {
    Cat cat;
    std::cout << cat.toString();
}

cat.h

#ifndef CLASS_CAT
#define CLASS_CAT

#include <string>


class Cat {
public:
    Cat();
    ~Cat ();
    std::string toString();
};

#endif

cat.cpp

#include "cat.h"
#include "dog.h"

class Dog;

Cat::Cat() {}

Cat::~Cat() {}

std::string Cat::toString() {
    Dog dog;
    return dog.toString();
}

dog.h

#ifndef CLASS_DOG
#define CLASS_DOG

#include <string>


class Dog {
public:
    Dog();
    ~Dog();
    std::string toString();
};

#endif

dog.cpp

#include "dog.h"

Dog::Dog() {}

Dog::~Dog() {}

std::string Dog::toString() {
    return "Dog";
}

Reproducing the problem


We want to manually build and statically link these files and produce an executable. First, we build the two libraries libcat.a and libdog.a.

g++ -c cat.cpp -o cat.o
ar cr libcat.a cat.o

g++ -c dog.cpp -o dog.o
ar cr libdog.a dog.o

Then we invoke the linker to produce the executable with the following command.

g++ main.cpp -L. -ldog -lcat

This fails with undefined reference errors for the following three symbols.

Dog::Dog()
Dog::toString()
Dog::~Dog()

Describing the cause


The reason lies in the way the GNU ld linker works. The linker takes the compiled main.cpp, sees that the following Cat symbols are missing and adds them to its list of undefined symbols.

Cat::Cat()
Cat::~Cat()
std::string Cat::toString()

When it links the library libdog.a, it searches in there for the missing Cat symbols and it finds none. Additionally, because none of the Dog symbols are used, the Dog symbols are tossed away for good.
When it links the library libcat.a, it finds all Cat symbols it was missing. Unfortunately, the class Cat requires the following additional Dog symbols.

Dog::Dog()
Dog::~Dog()
std::string Dog::toString()

The GNU ld linker is a single-pass linker and by default it doesn't go back to scan libdog.a again. Instead it terminates with undefined reference errors for these three symbols.
The linker keeps track of the symbols which have been used and the symbols which are still undefined. If a static library is included too early and some of its symbols are not used, they are permanently tossed away.

Proposing solutions


There are several ways to solve the problem.

Solution 1


The best solution, assuming there are no cyclic dependencies, is to reorder the libraries so that each one comes before the libraries it depends on. Here libcat.a depends on libdog.a, so it must be listed first.

g++ main.cpp -L. -lcat -ldog

Solution 2


The second option is to scan the linked libraries multiple times using the options “--start-group” and “--end-group”. This is commonly used to link static libraries with cyclic dependencies.

g++ main.cpp -L. -Wl,--start-group -ldog -lcat -Wl,--end-group

Useful commands


nm


We can use GNU nm (short for “names”) to list the symbols in an object file. The -C option demangles symbol names, i.e. it shows them as they appear in the linker errors.

nm -C dog.o

On macOS, older versions of nm do not support the -C option, but we can pipe the output through c++filt, which does the same.

nm dog.o | c++filt

To display all object files or libraries which refer to a symbol “func”, we can use the following.

nm -C -A *.o | grep "func"
nm -C -A lib*.a | grep "func"

To display only the undefined symbols in a file we can use the following.

nm -C -u dog.o

To display only the defined symbols, we can use the following.

nm -C --defined-only dog.o

ldd


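To list the shared libraries an executable depends on, run the following command (this assumes the executable was built with -o main). Static libraries such as libcat.a are linked into the executable itself and do not appear in the output.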

ldd main
