C programming/Structures and databus width

From Teknologisk videncenter
Latest revision as of 13:36, 8 July 2019

When defining structures in C on CPUs with a data bus wider than 8 bits, you risk wasting large amounts of RAM, depending on the CPU platform and the C compiler.

Examples

The same source code is used in the different setups below.

Source code

#include <stdio.h>
int main( void ) {
        
        struct struct_A {
                char c1;
                int i1;
                char c2;
                long long l1;
                char c3;
                int i2;
                int i3;
        };
        struct struct_B {
                long long l1;
                int i1;
                int i2;
                int i3;
                char c1;
                char c2;
                char c3;
        };

        struct struct_A s_A;
        struct struct_B s_B;

        printf("Size of s_A: %i\n",(int) sizeof(s_A) );
        printf("Size of s_B: %i\n",(int) sizeof(s_B) );
        return(0);
}

64-bit bus width

GCC

  • Compiler: gcc version 4.4.5
  • OS: Ubuntu 11.04 (GNU/Linux 2.6.35-24-generic x86_64)
  • CPU: Intel(R) Xeon(TM) CPU 3.20GHz stepping 03

The two structs, which contain the same amount of data, occupy different amounts of RAM.

heth@mars2:/tmp$ ./show_struct
Size of s_A: 40
Size of s_B: 24

Explanation

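64-bit Intel CPUs work natively on 64-bit chunks starting at addresses 0x0, 0x8, 0x10, ... For historical reasons, even 64-bit Intel CPUs can also work on 32-bit chunks starting at 0x0, 0x4, 0x8, ... In the setup above, the compiler therefore places each struct member at an address that is a multiple of the member's own size and fills the gaps with padding bytes.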

Best practice for defining structures
Start with the variables that occupy the most RAM and work down to the variables that occupy the least RAM.
Use the CPU's native bus width for variables as far as possible. Smaller variables may cost performance, because the CPU might have to mask out the surrounding data after reading from RAM, and read and mask the existing data before writing to RAM.

struct_A

The bad way
Occupies 40 bytes of RAM but holds only 23 bytes of data

When defining the structure as struct_A:

        struct struct_A {
                char c1;      // 4 bytes used: c1 occupies address 0x0; 0x1, 0x2 and 0x3 are padding
                              // (they can't be used because the next member is 32 bits wide and must start at 0x0, 0x4, 0x8...)
                int  i1;      // 4 bytes used: i1 occupies addresses 0x4 to 0x7
                char c2;      // 8 bytes used: c2 occupies address 0x8; 0x9 to 0xf are padding
                              // (the next member is a long long and must start at 0x0, 0x8, 0x10...)
                long long l1; // 8 bytes used: l1 occupies addresses 0x10 to 0x17
                char c3;      // 4 bytes used: c3 occupies address 0x18; 0x19, 0x1a and 0x1b are padding
                int  i2;      // 4 bytes used: i2 occupies addresses 0x1c to 0x1f
                int  i3;      // 4 bytes used: i3 occupies addresses 0x20 to 0x23
                              // 4 bytes used: 0x24 to 0x27 are padding, because the struct size must be a multiple of 8
        };

The struct_A instance s_A occupies offsets 0x0 to 0x27, which is 0x28 byte locations. 0x28 = 40 decimal.

struct_B

The better way
Occupies 24 bytes of RAM for 23 bytes of data. (Members in descending order of size)
        struct struct_B {
                long long l1; // Occupies 0x0 to 0x7
                int i1;       // Occupies 0x8 to 0xb
                int i2;       // Occupies 0xc to 0xf
                int i3;       // Occupies 0x10 to 0x13
                char c1;      // Occupies 0x14
                char c2;      // Occupies 0x15
                char c3;      // Occupies 0x16
                              // Padding: 0x17
        };

Embedded controllers

ARM Cortex M3

  • CPU: STM32F107VC
  • Compiler: ARM C/C++ Compiler, 4.1 [Build 894]
  • OS: None

The Cortex-M3 is a 32-bit microcontroller, but with this compiler the structs are laid out in 8-byte chunks, so the result is the same as on the 64-bit Intel CPU described above.

GCC packed attribute

#include <stdio.h>
int main( void ) {

        struct struct_A {
                char c1;
                int i1;
                char c2;
                long long l1;
                char c3;
                int i2;
                int i3;
        };
        struct struct_B {
                long long l1;
                int i1;
                int i2;
                int i3;
                char c1;
                char c2;
                char c3;
        };

        struct struct_C {
                char c1;
                int i1;
                char c2;
                long long l1;
                char c3;
                int i2;
                int i3;
        }__attribute__((__packed__)) ;

        struct struct_A s_A;
        struct struct_B s_B;
        struct struct_C s_C;

        printf("Size of s_A: %i\n",(int) sizeof(s_A) );
        printf("Size of s_B: %i\n",(int) sizeof(s_B) );
        printf("Size of s_C: %i\n",(int) sizeof(s_C) );
        return(0);
}

Running this program:

heth@heth:~/bin/bmp$ gcc struct.c -o struct 
heth@heth:~/bin/bmp$ ./struct
Size of s_A: 40 
Size of s_B: 24
Size of s_C: 23