C programming/Structures and databus width
When defining structures in C on CPUs with a data bus wider than 8 bits, you risk wasting large amounts of RAM, depending on the CPU platform and the C compiler.
Examples
The same source code is used in the different setups below.
Source code
#include <stdio.h>
int main( void ) {
struct struct_A {
char c1;
int i1;
char c2;
long long l1;
char c3;
int i2;
int i3;
};
struct struct_B {
long long l1;
int i1;
int i2;
int i3;
char c1;
char c2;
char c3;
};
struct struct_A s_A;
struct struct_B s_B;
printf("Size of s_A: %i\n",(int) sizeof(s_A) );
printf("Size of s_B: %i\n",(int) sizeof(s_B) );
return(0);
}
64 Bit bus width
GCC
- Compiler: gcc version 4.4.5
- OS: Ubuntu 11.04 (GNU/Linux 2.6.35-24-generic x86_64)
- CPU: Intel(R) Xeon(TM) CPU 3.20GHz stepping 03
The two structs contain the same amount of data but occupy different amounts of RAM.
heth@mars2:/tmp$ ./show_struct
Size of s_A: 40
Size of s_B: 24
Explanation
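64-bit Intel CPUs work natively in 64-bit chunks starting at addresses 0x0, 0x8, 0x10 and so on. For historical reasons, even 64-bit Intel CPUs can also work in 32-bit chunks starting at 0x0, 0x4, 0x8 and so on. On this platform the compiler therefore places each variable at an address that is a multiple of its own size and fills any gaps with unused padding bytes.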
- Best practice for defining structures
- Start with the variables that occupy the most RAM and work down to the variables that occupy least RAM.
- Use the CPU's native bus width for variables as far as possible. Smaller variables may cost performance, because the CPU might have to mask the data after reading it from RAM, and read and mask the existing data before writing to RAM.
struct_A
- The bad way
- Occupies 40 bytes of RAM while holding only 23 bytes of data
When defining the structure as struct_A:
struct struct_A {
char c1; // 4 bytes used: c1 occupies address 0x0; addresses 0x1, 0x2 and 0x3 are padding
// (They can't be used because the next variable is 32 bits wide and must start at 0x0, 0x4, 0x8...)
int i1; // 4 bytes used: i1 occupies addresses 0x4 to 0x7
char c2; // 8 bytes used: c2 occupies address 0x8; addresses 0x9, 0xa and 0xb are padding
// Because the next variable is a long long and must start at 0x0, 0x8, 0x10..., addresses 0xc to 0xf are padding too
long long l1; // 8 bytes used: l1 occupies addresses 0x10 to 0x17
char c3; // 4 bytes used: c3 occupies address 0x18; addresses 0x19, 0x1a and 0x1b are padding
int i2; // 4 bytes used: i2 occupies addresses 0x1c to 0x1f
int i3; // 4 bytes used: i3 occupies addresses 0x20 to 0x23
// 4 bytes used: the struct must end on a 0x0 or 0x8 boundary, so addresses 0x24 to 0x27 are padding
};
The struct_A instantiated in s_A occupies addresses 0x0 to 0x27, which is 0x28 byte locations (0x28 = 40 decimal).
struct_B
- The better way
- Occupies 24 bytes of RAM using 23 bytes. (Descending order of variables in size)
struct struct_B {
long long l1; // Occupies 0x0 to 0x7
int i1; // Occupies 0x8 to 0xb
int i2; // Occupies 0xc to 0xf
int i3; // Occupies 0x10 to 0x13
char c1; // Occupies 0x14
char c2; // Occupies 0x15
char c3; // Occupies 0x16
// Padding: 0x17
};
Embedded controllers
ARM Cortex M3
- CPU: STM32F107VC
- Compiler: ARM C/C++ Compiler, 4.1 [Build 894]
- OS: None
The Cortex-M3 is a 32-bit microcontroller, but a long long is still aligned on an 8-byte boundary, so the two structures get the same sizes as on the 64-bit Intel CPU described above.
GCC packed attribute
#include <stdio.h>
int main( void ) {
struct struct_A {
char c1;
int i1;
char c2;
long long l1;
char c3;
int i2;
int i3;
};
struct struct_B {
long long l1;
int i1;
int i2;
int i3;
char c1;
char c2;
char c3;
};
struct struct_C {
char c1;
int i1;
char c2;
long long l1;
char c3;
int i2;
int i3;
}__attribute__((__packed__)) ;
struct struct_A s_A;
struct struct_B s_B;
struct struct_C s_C;
printf("Size of s_A: %i\n",(int) sizeof(s_A) );
printf("Size of s_B: %i\n",(int) sizeof(s_B) );
printf("Size of s_C: %i\n",(int) sizeof(s_C) );
return(0);
}
Running this program:
heth@heth:~/bin/bmp$ gcc struct.c -o struct
heth@heth:~/bin/bmp$ ./struct
Size of s_A: 40
Size of s_B: 24
Size of s_C: 23