Header files and #include

These notes are compiled from "Headers and Includes: Why and How"

Why we need header files

  • speeds up compile time
    • if everything is in a single file, then everything must be fully recompiled every time you make any little change
  • keeps your code more organized
    • easier to find the code you are looking for
  • allows you to separate interface from implementation
    • make the interface visible to other .cpp files, while keeping the implementation in its own .cpp file

Compile Process

  1. the compiler generates an intermediate file (an object file) for each source file it compiles
    1. before compiling a file, each #include line is "replaced" with the actual contents of the file being included
    2. files with header extensions might be ignored by the compiler if you try to compile them directly
  2. the linker then links all the object files together, which generates the final binary
// in myclass.h
class MyClass
{
public:
    void foo();
    int bar;
};

// in myclass.cpp
#include "myclass.h"
void MyClass::foo()
{
}

// in main.cpp
#include "myclass.h" // defines MyClass
int main()
{
    MyClass a; // no longer produces an error, because MyClass is defined
    return 0;
}
  • Header files should use a .h__ extension (.h / .hpp / .hxx). Which of those you use doesn't matter
  • C++ source files should use a .c__ extension (.cpp / .cxx / .cc). Which of those you use doesn't matter
  • C Source files should use .c (.c only)

header files are #included and not compiled, whereas source files are compiled and not #included

The one exception is that it is sometimes (although very rarely) useful to include a source file. This scenario has to do with instantiating templates and is outside the scope of this article
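
To make the compile process above concrete, here is a minimal single-file sketch of what the compiler effectively sees once an #include line has been replaced by the header's contents. The names mathutil.h, mathutil.cpp, add and compute are hypothetical and exist only for illustration; in a real project the three parts would live in separate files, and the linker would connect the call to the definition.

```cpp
// ---- pasted in place of: #include "mathutil.h" (hypothetical header) ----
int add(int a, int b);           // declaration only: the interface

// ---- the rest of the including source file ----
int compute()
{
    return add(2, 3);            // compiles because the declaration is visible
}

// ---- normally in mathutil.cpp, compiled into its own object file ----
int add(int a, int b)            // definition: the implementation
{
    return a + b;
}
```

Only the declaration has to be visible where add is called; the definition can be compiled separately, which is why changing it does not force every other source file to be recompiled.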

Include guards

Including the same code more than once in a single source file causes an error

// myclass.h

class MyClass
{
    void DoSomething() { }
};

// main.cpp
#include "myclass.h" // defines MyClass
#include "myclass.h" // Compiler error - MyClass already defined

A less obvious example, where the second include is indirect:

// x.h
class X { };
// a.h
#include "x.h"
class A { X x; };
// b.h
#include "x.h"
class B { X x; };
// main.cpp
#include "a.h" // also includes "x.h"
#include "b.h" // includes x.h again! ERROR

Because of this scenario, many people are told not to put #include in header files. However, this is bad advice and you should not follow it. Instead, remember:

  1. Only #include things you need to include
  2. Guard against incidental multiple includes with include guards
    1. skipping over the entire header if it was already included
// x.h
#ifndef X_H_INCLUDED // if x.h hasn't been included yet...
#define X_H_INCLUDED // #define this so the compiler knows it has been included
class X { };
#endif
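
As a rough illustration of why the guard works, the sketch below shows a single file as it might look after a guarded x.h has been pasted in twice. The member value and the function read_value are assumptions added only so there is something to check; the guard macro is spelled X_H_INCLUDED here because names containing a double underscore are reserved for the implementation.

```cpp
// ---- first #include "x.h" ----
#ifndef X_H_INCLUDED             // not defined yet, so this block is kept
#define X_H_INCLUDED
class X
{
public:
    int value = 42;              // hypothetical member, for illustration
};
#endif

// ---- second #include "x.h" ----
#ifndef X_H_INCLUDED             // already defined: everything up to #endif is skipped
#define X_H_INCLUDED
class X
{
public:
    int value = 42;
};
#endif

// X was defined exactly once, so this compiles without a redefinition error
int read_value()
{
    X x;
    return x.value;
}
```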

The "right way" to include

Be aware of the following two kinds of dependencies:

  1. stuff that can be forward declared
  2. stuff that needs to be #included

Which kind of dependency to use (for a class A that may depend on a class B)

  • do nothing if
    • A makes no references at all to B
    • The only reference to B is in a friend declaration
  • forward declare B if
    • A contains a B pointer or reference
      • B* myb, B& myb
    • a function has a B object/pointer/reference as a parameter or return type
      • B MyFunction(B myb)
  • include "b.h" if
    • B is a parent class of A
    • A contains a B object
      • B myb
// myclass.h
//=================================
// include guard
#ifndef MYCLASS_H_INCLUDED
#define MYCLASS_H_INCLUDED

//=================================
// forward declared dependencies
class Foo;
class Bar;

//=================================
// included dependencies
#include <vector>
#include "parent.h"

//=================================
// the actual class
class MyClass : public Parent // Parent object, so #include "parent.h"
{
public:
    std::vector<int> avector; // vector object, so #include <vector>
    Foo* foo; // Foo pointer, so forward declare Foo
    void Func(Bar& bar); // Bar reference, so forward declare Bar

    friend class MyFriend; // friend declaration is not a dependency
                           // don't do anything about MyFriend
};

#endif // MYCLASS_H_INCLUDED
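
The forward-declaration rules above can be checked in a single file. The following is a minimal sketch, not part of the original article: the members ptr, attach, make and id, and the helper attached_id, are hypothetical names chosen for illustration. It shows which uses of B compile while B is still only forward declared.

```cpp
class B;                         // forward declaration: B is an incomplete type here

class A
{
public:
    B* ptr = nullptr;            // OK: pointer to an incomplete type
    void attach(B& b);           // OK: reference parameter
    B make() const;              // OK: by-value return in a declaration only
    // B member;                 // would NOT compile: needs the full definition of B
};

// Full definition of B (in a real project this would come from #include "b.h" in a.cpp)
class B
{
public:
    int id = 7;
};

// The function bodies use B's members, so they need the full definition
void A::attach(B& b) { ptr = &b; }
B A::make() const { return B{}; }

int attached_id()
{
    A a;
    B b;
    a.attach(b);
    return a.ptr->id;
}
```

The split mirrors the header/source division: the class definition of A (the header part) gets by with a forward declaration, while the member function bodies (the source part) require the include.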

Why that is the "right way" to include

  • The general idea is that it makes "myclass.h" fully self-contained, so no other area of the program (other than MyClass's implementation/source file) needs to know how MyClass works internally
    • If some other class needs to use MyClass, it can just #include "myclass.h" and be done with it!
  • Alternative method: #include all of MyClass's dependencies before #including "myclass.h"
//  I want to use MyClass
#include "myclass.h" // will always work, no matter what MyClass looks like.
// You're done
// (provided myclass.h follows my outline above and does
// not make unnecessary #includes)

Why the alternative method is bad: every file that uses MyClass has to list all of its dependency headers and keep them in the correct order

//  I want to use MyClass
#include "myclass.h"
// ERROR 'Parent' undefined

#include "parent.h"
#include "myclass.h"
// ERROR 'std::vector' undefined

#include "parent.h"
#include <vector>
#include "myclass.h"
// ERROR 'Support' undefined

#include "parent.h"
#include <vector>
#include "support.h"
#include "myclass.h"
// ERROR 'Support' undefined
// "parent.h" uses Support, and therefore you must #include "support.h" before "parent.h"

It is all about encapsulation. Files that want to use MyClass don't need to be aware of what MyClass uses in order for it to work, and don't need to #include any MyClass dependencies. It's all very OO friendly, very easy to use, and very easy to maintain

Circular Dependencies

A circular dependency is when two (or more) classes depend on each other

// a.h
#include "b.h"
class A { B* b; };
// b.h
#include "a.h"
class B { A* a; };

Here is what the circular inclusion does, even with include guards in place

// a.cpp
#include "a.h"

The compiler will do the following:
#include "a.h"

// start compiling a.h
#include "b.h"

// start compiling b.h
#include "a.h"

// compilation of a.h skipped because it's guarded

// resume compiling b.h
class B { A* a; }; // <--- ERROR, A is undeclared

Even though you're #including "a.h", the compiler does not see class A until after class B has been compiled.
Solution: forward declare when you're only using a pointer or reference
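
A minimal sketch of that solution, written as one file with comments marking where each header would begin. The tag members and the helper link_and_sum are hypothetical additions so the result can be checked; neither header includes the other.

```cpp
// ---- a.h ----
class B;                         // forward declaration instead of #include "b.h"

class A
{
public:
    B* b = nullptr;
    int tag = 1;                 // hypothetical member, for illustration
};

// ---- b.h ----
class A;                         // forward declaration instead of #include "a.h"

class B
{
public:
    A* a = nullptr;
    int tag = 2;                 // hypothetical member, for illustration
};

// ---- a source file that includes both headers ----
int link_and_sum()
{
    A a;
    B b;
    a.b = &b;                    // both classes are fully defined here,
    b.a = &a;                    // so members can be accessed through the pointers
    return a.b->tag + b.a->tag;
}
```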

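The situation below is conceptually impossible (it is not logical): each object would have to contain a full copy of the other, which contains the first again, and so on. The solution is to have one or both classes contain a pointer or reference to the other, rather than a full object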

// a.h (guarded)
#include "b.h"
class A
{
    B b; // B is an object, can't be forward declared
};
// b.h (guarded)
#include "a.h"
class B
{
    A a; // A is an object, can't be forward declared
};
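
One way the repaired version might look, assuming it is acceptable for A to refer to B through a pointer while B keeps a full A object. The member name owner and the helper owner_is_set are hypothetical.

```cpp
// ---- a.h ----
class B;                         // a pointer only needs a forward declaration

class A
{
public:
    B* owner = nullptr;          // pointer instead of a full B object
};

// ---- b.h (would #include "a.h", since it holds a full A object) ----
class B
{
public:
    A a;
};

// Returns 1 when the A inside a B points back at that same B
int owner_is_set()
{
    B b;
    b.a.owner = &b;
    return b.a.owner == &b ? 1 : 0;
}
```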

Function inlining

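The body of an inline function needs to exist in every .cpp file that calls it, otherwise you get linker errors. That means the body has to go in the header, which can turn a forward-declared dependency into an #include dependency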

class B
{
public:
    void Func(const A& a) // parameter, so forward declare is okay
    {
        a.DoSomething(); // but now that we've dereferenced it, it
                         // becomes an #include dependency
                         // = we now have a potential circular inclusion
    }
};

The key is that while inline functions need to exist in the header, they do not need to exist in the class definition itself

// b.h  (assume it's guarded)
//------------------
class A; // forward declared dependency
//------------------
class B
{
public:
    void Func(const A& a); // okay, A is forward declared
};
//------------------
// now B is already defined
#include "a.h" // can use A with include dependency without error
inline void B::Func(const A& a)
{
    a.DoSomething(); // okay! a.h has been included
}

Even if a.h includes b.h, the additional #includes don't come up until AFTER class B is fully defined, and they are therefore harmless.
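
The same ordering can be tried out in a single file. In this sketch the contents of a.h are written out in place of the #include line; the int return values and the helper call_through_b are assumptions added so there is a result to check.

```cpp
// ---- b.h, first part ----
class A;                         // forward declared dependency

class B
{
public:
    int Func(const A& a);        // declaration only: A may still be incomplete
};

// ---- contents of a.h, pasted where #include "a.h" would appear ----
class A
{
public:
    int DoSomething() const { return 5; }   // hypothetical return value
};

// ---- b.h, last part: the inline body, placed after A is fully defined ----
inline int B::Func(const A& a)
{
    return a.DoSomething();      // OK: A is a complete type at this point
}

int call_through_b()
{
    A a;
    B b;
    return b.Func(a);
}
```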

But putting function bodies at the end of my header is ugly. Is there a way to avoid that?

// b.h

// blah blah

class B { /* blah blah */ };

#include "b_inline.h" // or I sometimes use "b.hpp"
// b_inline.h (or b.hpp -- whatever)

#include "a.h"
#include "b.h" // not necessary, but harmless
// you can do this to make this "feel" like a source
// file, even though it isn't

inline void B::Func(const A& a)
{
    a.DoSomething();
}

This separates the interface from the implementation, while still allowing the implementation to be inlined

Forward declaring templates

Forward declaring is pretty straight-forward when it comes to simple classes, but when dealing with template classes, things aren't so simple

// a.h
#include "b.h" // included dependencies

template <typename T> // the class template
class Tem
{
    /*...*/
    B b;
};

// class most commonly used with 'int'
typedef Tem<int> A; // typedef as 'A'

// b.h
// forward declared dependencies
class A; // error!

// the class
class B
{
    /* ... */
    A* ptr;
};

Because 'A' isn't really a class, but rather a typedef, the compiler will bark at you.
We also can't just #include "a.h" here, because of the circular dependency problem

We need to forward typedef A

template <typename T> class Tem;  // forward declare our template
typedef Tem<int> A; // then typedef 'A' (forward typedef A instead of forward declare A)

A cleaner solution is to create an alternative header which has the forward declarations of your templated classes and their typedefs

// a.h
#include "b.h"
template <typename T>
class Tem
{
    /*...*/
    B b;
};

// a_fwd.h
template <typename T> class Tem;
typedef Tem<int> A;

// b.h
#include "a_fwd.h"
class B
{
    /*...*/
    A* ptr;
};

This allows B to include a header which forward declares A without including the entire class definition
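
Put together in one file, the forward typedef can be exercised as follows. This is a sketch under assumptions: the member value and the helper roundtrip are hypothetical, and the three headers are shown in the order a source file including "a.h" would effectively see them.

```cpp
// ---- a_fwd.h ----
template <typename T> class Tem; // forward declare the template
typedef Tem<int> A;              // forward typedef 'A'

// ---- b.h (would #include "a_fwd.h") ----
class B
{
public:
    A* ptr = nullptr;            // OK: pointer to the still-incomplete Tem<int>
};

// ---- a.h (would #include "b.h") ----
template <typename T>
class Tem
{
public:
    T value{};                   // hypothetical member, for illustration
    B b;                         // full B object, so B must already be defined
};

int roundtrip()
{
    A a;                         // Tem<int> is complete here
    a.value = 9;
    a.b.ptr = &a;                // B points back at the A that contains it
    return a.b.ptr->value;
}
```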