C programming/Structures and databus width

From Teknologisk videncenter
Revision as of 15:00, 19 February 2012 by Heth (talk | contribs)

When defining structures in C on CPUs with a data bus wider than 8 bits, you risk wasting large amounts of RAM, depending on the CPU platform and the C compiler.

Examples

The same source code is used in the different setups below.

Source code

#include <stdio.h>
int main( void ) {
        
        struct ma1 {
                char c1;
                int i1;
                char c2;
                long l1;
                char c3;
                int i2;
                int i3;
        };
        struct ma2 {
                long l1;
                int i1;
                int i2;
                int i3;
                char c1;
                char c2;
                char c3;
        };

        struct ma1 m1;
        struct ma2 m2;

        printf("Size of m1: %i\n",(int) sizeof(m1) );
        printf("Size of m2: %i\n",(int) sizeof(m2) );
        return(0);
}

64 Bit bus width

GCC

  • Compiler: gcc version 4.4.5
  • OS: Ubuntu 11.04 (GNU/Linux 2.6.35-24-generic x86_64)
  • CPU: Intel(R) Xeon(TM) CPU 3.20GHz stepping 03

The two structs contain the same amount of data but occupy different amounts of RAM.

heth@mars2:/tmp$ ./ma3
Size of m1: 40
Size of m2: 24

Explanation

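64-bit Intel CPUs work natively in 64-bit chunks starting at addresses 0x0, 0x8, 0x10 and so on. For historical reasons, even 64-bit Intel CPUs can also work in 32-bit chunks starting at 0x0, 0x4, 0x8 and so on. In this setup the compiler therefore places each variable at an address that is a multiple of the variable's own size, and fills the gaps with unused padding bytes.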
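
The alignment rule can be turned into a small calculation. The following is a minimal sketch, not the compiler's actual algorithm: the helper names align_up and layout_size are hypothetical, and it assumes that every member is aligned to its own size and that the struct is padded to a multiple of its largest member, as in the x86_64 GCC setup above (char = 1, int = 4, long = 8 bytes).

```c
#include <stddef.h>

/* Round offset up to the next multiple of align (align must be > 0). */
static size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) / align * align;
}

/* Estimate the size of a struct from its member sizes, listed in
 * declaration order. Assumption: each member is aligned to its own
 * size, and the struct ends on a multiple of its largest member. */
static size_t layout_size(const size_t *sizes, size_t count)
{
    size_t offset = 0;
    size_t max_align = 1;

    for (size_t i = 0; i < count; i++) {
        offset = align_up(offset, sizes[i]);  /* padding before the member */
        offset += sizes[i];                   /* the member itself */
        if (sizes[i] > max_align)
            max_align = sizes[i];
    }
    return align_up(offset, max_align);       /* padding at the end */
}
```

Given the member sizes of ma1 in declaration order (1, 4, 1, 8, 1, 4, 4) the function returns 40, and given those of ma2 (8, 4, 4, 4, 1, 1, 1) it returns 24, matching the program output above.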

Best practice for defining structures
Start with the variables that occupy the most RAM and work down to the variables that occupy the least RAM.
Use the CPU's native bus width for variables as far as possible. Smaller variables may cost performance, because the CPU might have to mask out the neighbouring data when reading from RAM, and to read and mask the existing data before writing to RAM.

Struct m1

The bad way
Occupies 40 bytes of RAM but holds only 23 bytes of data.

When defining the structure as ma1:

        struct ma1 {
                char c1; // 4 bytes used: c1 occupies address 0x0; 0x1, 0x2 and 0x3 are padding
                         // (can't be used because the next variable is 32 bits and must start at 0x0, 0x4, 0x8...)
                int  i1; // 4 bytes used: i1 occupies addresses 0x4 to 0x7
                char c2; // 8 bytes used: c2 occupies address 0x8; 0x9 to 0xf are padding
                         // (the next variable is a long and must start at 0x0, 0x8, 0x10...)
                long l1; // 8 bytes used: l1 occupies addresses 0x10 to 0x17
                char c3; // 4 bytes used: c3 occupies address 0x18; 0x19, 0x1a and 0x1b are padding
                int  i2; // 4 bytes used: i2 occupies addresses 0x1c to 0x1f
                int  i3; // 4 bytes used: i3 occupies addresses 0x20 to 0x23
                         // 0x24 to 0x27 are padding, because the struct must end on a multiple of 8
        };

The ma1 struct occupies addresses 0x0 to 0x27, which is 0x28 = 40 bytes.
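
The offsets of the members can be checked on any platform with the standard offsetof macro. This is a minimal sketch: the function name print_ma1_layout is hypothetical, and the values it prints depend on the compiler and the CPU.

```c
#include <stddef.h>
#include <stdio.h>

struct ma1 {
    char c1;
    int  i1;
    char c2;
    long l1;
    char c3;
    int  i2;
    int  i3;
};

/* Print the start offset of every member of struct ma1 in hex,
 * then the total size. Returns the total size in bytes. */
static size_t print_ma1_layout(void)
{
    printf("c1 at 0x%zx\n", offsetof(struct ma1, c1));
    printf("i1 at 0x%zx\n", offsetof(struct ma1, i1));
    printf("c2 at 0x%zx\n", offsetof(struct ma1, c2));
    printf("l1 at 0x%zx\n", offsetof(struct ma1, l1));
    printf("c3 at 0x%zx\n", offsetof(struct ma1, c3));
    printf("i2 at 0x%zx\n", offsetof(struct ma1, i2));
    printf("i3 at 0x%zx\n", offsetof(struct ma1, i3));
    printf("size  0x%zx\n", sizeof(struct ma1));
    return sizeof(struct ma1);
}
```

Called from main on the x86_64 GCC setup above, this should report the offsets 0x0, 0x4, 0x8, 0x10, 0x18, 0x1c and 0x20 and a size of 0x28. Other platforms may report different values.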

Struct m2

The better way
Occupies 24 bytes of RAM for 23 bytes of data, because the variables are declared in descending order of size.

        struct ma2 {
                long l1; // Occupies 0x0 to 0x7
                int  i1; // Occupies 0x8 to 0xb
                int  i2; // Occupies 0xc to 0xf
                int  i3; // Occupies 0x10 to 0x13
                char c1; // Occupies 0x14
                char c2; // Occupies 0x15
                char c3; // Occupies 0x16
                         // Padding: 0x17
        };
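
The cost of the member order can be measured by subtracting the size of the data from the size of each struct. This is a minimal sketch: the names PAYLOAD_BYTES, ma1_padding and ma2_padding are hypothetical, and the results depend on the compiler and the CPU.

```c
#include <stddef.h>

struct ma1 {
    char c1;
    int  i1;
    char c2;
    long l1;
    char c3;
    int  i2;
    int  i3;
};

struct ma2 {
    long l1;
    int  i1;
    int  i2;
    int  i3;
    char c1;
    char c2;
    char c3;
};

/* Bytes of real data in both structs: three chars, three ints, one long. */
#define PAYLOAD_BYTES (3 * sizeof(char) + 3 * sizeof(int) + sizeof(long))

/* Number of padding bytes the compiler added to each struct. */
static size_t ma1_padding(void) { return sizeof(struct ma1) - PAYLOAD_BYTES; }
static size_t ma2_padding(void) { return sizeof(struct ma2) - PAYLOAD_BYTES; }
```

On the x86_64 GCC setup above, ma1_padding() gives 17 and ma2_padding() gives 1, since the payload is 23 bytes in both cases.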

Embedded controllers

ARM Cortex M3

  • CPU: STM32F107VC
  • Compiler: ARM C/C++ Compiler, 4.1 [Build 894]
  • OS: None

The Cortex M3 is a 32-bit microcontroller, but it allocates memory in 8-byte chunks and so behaves like the 64-bit Intel CPU described above.
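
Because the layout depends on the platform and the compiler, it can be useful on an embedded target to let the build fail when a struct grows beyond its RAM budget. This is a minimal sketch and not part of the measurements above: it assumes a compiler that supports C11 static_assert, and the limit MA2_MAX_BYTES is a hypothetical value chosen for illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

struct ma2 {
    long l1;
    int  i1;
    int  i2;
    int  i3;
    char c1;
    char c2;
    char c3;
};

/* Hypothetical RAM budget for one struct ma2, in bytes. */
#define MA2_MAX_BYTES 24

/* Checked at compile time: the build stops if padding (or a new
 * member) pushes the struct above the budget on this target. */
static_assert(sizeof(struct ma2) <= MA2_MAX_BYTES,
              "struct ma2 is larger than its RAM budget");
```

The check costs nothing at run time, so it can stay in the source for every target the code is built for.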