C programming/Structures and databus width

From Teknologisk videncenter
Revision as of 08:58, 22 February 2012 by Heth (talk | contribs) (Source code)

When defining structures in C on CPUs with a data bus wider than 8 bits, you risk wasting large amounts of RAM. How much is wasted depends on the CPU platform and the C compiler.

Examples

The same source code is used in the different setups below.

Source code

#include <stdio.h>
int main( void ) {
        
        struct struct_A {
                char c1;
                int i1;
                char c2;
                long long l1;
                char c3;
                int i2;
                int i3;
        };
        struct struct_B {
                long long l1;
                int i1;
                int i2;
                int i3;
                char c1;
                char c2;
                char c3;
        };

        struct struct_A s_A;
        struct struct_B s_B;

        printf("Size of s_A: %i\n",(int) sizeof(s_A) );
        printf("Size of s_B: %i\n",(int) sizeof(s_B) );
        return(0);
}

64 Bit bus width

GCC

  • Compiler: gcc version 4.4.5
  • OS: Ubuntu 11.04 (GNU/Linux 2.6.35-24-generic x86_64)
  • CPU: Intel(R) Xeon(TM) CPU 3.20GHz stepping 03

The two structs contain the same amount of data but occupy different amounts of RAM.

heth@mars2:/tmp$ ./show_struct
Size of s_A: 40
Size of s_B: 24

Explanation

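64-bit Intel CPUs work natively in 64-bit chunks starting at addresses 0x0, 0x8, 0x10, ... For historical reasons, even 64-bit Intel CPUs can also work in 32-bit chunks starting at 0x0, 0x4, 0x8, ... With this compiler, a 32-bit variable is therefore placed at an address divisible by 4 and a 64-bit variable at an address divisible by 8; any bytes skipped to reach such an address are padding.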

Best practice for defining structures
Start with the variables that occupy the most RAM and work down to the variables that occupy the least.
Use the CPU's native bus width for variables as far as possible. Smaller variables may cost performance, because the CPU might have to mask out the unwanted bits after reading from RAM, and read, mask and merge the data before writing it back to RAM.

struct_A

The bad way
Occupies 40 bytes of RAM to store only 23 bytes of data

When defining the structure as struct_A:

        struct struct_A {
                char c1;      // 4 bytes used: c1 occupies address 0x0; 0x1, 0x2 and 0x3 are padding
                              // (cannot be used because the next variable is 32 bits and must start at 0x0, 0x4, 0x8...)
                int  i1;      // 4 bytes used: i1 occupies addresses 0x4 to 0x7
                char c2;      // 8 bytes used: c2 occupies address 0x8; 0x9 to 0xf are padding
                              // (the next variable is a long long and must start at 0x0, 0x8, 0x10...)
                long long l1; // 8 bytes used: l1 occupies addresses 0x10 to 0x17
                char c3;      // 4 bytes used: c3 occupies address 0x18; 0x19, 0x1a and 0x1b are padding
                int  i2;      // 4 bytes used: i2 occupies addresses 0x1c to 0x1f
                int  i3;      // 4 bytes used: i3 occupies addresses 0x20 to 0x23
                              // 0x24 to 0x27 are padding, because the struct must end on an 8-byte boundary (0x0, 0x8...)
        };

The struct_A instance s_A occupies addresses 0x0 to 0x27, which is 0x28 byte locations (40 decimal).

struct_B

The better way
Occupies 24 bytes of RAM to store 23 bytes of data (variables in descending order of size).

        struct struct_B {
                long long l1; // Occupies 0x0 to 0x7
                int i1;       // Occupies 0x8 to 0xb
                int i2;       // Occupies 0xc to 0xf
                int i3;       // Occupies 0x10 to 0x13
                char c1;      // Occupies 0x14
                char c2;      // Occupies 0x15
                char c3;      // Occupies 0x16
                              // Padding: 0x17
        };

Embedded controllers

ARM Cortex M3

  • CPU: STM32F107VC
  • Compiler: ARM C/C++ Compiler, 4.1 [Build 894]
  • OS: None

Cortex M3 is a 32-bit microcontroller, but with this compiler memory is allocated in 8-byte chunks, so the structures behave as on the 64-bit Intel CPU described above.