If you write C or C++, you probably already know that unions are a good way to store data efficiently. They can also make it much easier to get at the individual pieces of that data. Here’s a Tip o’ the Day to show you how.
A Refresher
If you haven’t used unions before, the description you’ll find online will be something like this: a union is a C/C++ data structure that can store items of different types, but only one of them at a time.
For me, this is both misleading and a little inaccurate. Calling it a ‘data structure’ is confusing because ‘structures’ (that is to say, structs) already exist. To me, unions are quite different from structs, and as we’ll see in a minute, saying that “only one of them can be stored at a time” is misleading.
The description holds for ‘normal’ union use, which looks something like this:
typedef union
{
    char   ch;
    int    i;
    long   l;
    float  f;
    double d;
} my_union_type;
my_union_type my_variable;
my_variable.ch = 1;
my_variable.i = 2;   /* overwrites the memory that ch shares with i */
printf("ch = %d\n", my_variable.ch);
When you declare a variable of this type, all of the members start at the same address and share the same memory. Writing to i therefore overwrites ch, and on a little-endian machine (such as x86) the output of the above code is this:
ch = 2
Not 1, as you might expect.
When you do a sizeof() on a union type, the result is at least the size of its largest member (the compiler may add padding for alignment). For example, with my_union_type, the size is typically 8 because sizeof(double) is 8 on most platforms.
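To check this on your own compiler, here is a minimal sketch. The two helper names are made up for illustration, and the exact sizes depend on the platform; the checks only rely on the union being at least as large as its largest member and on every member starting at the union's own address.

```c
#include <stddef.h>

typedef union
{
    char   ch;
    int    i;
    long   l;
    float  f;
    double d;
} my_union_type;

/* Hypothetical helper: returns 1 if the union is big enough
   to hold its largest candidates (double and long). */
int union_holds_largest_member(void)
{
    return sizeof(my_union_type) >= sizeof(double)
        && sizeof(my_union_type) >= sizeof(long);
}

/* Hypothetical helper: returns 1 if the members all start
   at the same address as the union itself. */
int members_share_address(void)
{
    my_union_type u;
    return (void *)&u.ch == (void *)&u
        && (void *)&u.i  == (void *)&u
        && (void *)&u.d  == (void *)&u;
}
```

Both helpers should return 1 on any conforming C compiler; printing sizeof(my_union_type) alongside them shows the actual size on your platform.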
Exploitation
You can take advantage of this behavior by creating a special ‘merged’ data type: a union whose members are all different views of the same bits. Have a look at the code below to see what I mean:
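typedef union
{
    u32 m_u32;       /* all 32 bits, unsigned */
    s32 m_s32;       /* all 32 bits, signed */
    void * m_ptr;    /* 32 bits wide only on a 32-bit target */
    struct
    {
        u16 m_DC;    /* first 16 bits in memory */
        u16 m_BA;    /* second 16 bits in memory */
    } m_u16;
    struct
    {
        s16 m_DC;
        s16 m_BA;
    } m_s16;
    struct
    {
        u8 m_D;      /* first byte in memory */
        u8 m_C;
        u8 m_B;
        u8 m_A;      /* last byte in memory */
    } m_u8;
    struct
    {
        s8 m_D;
        s8 m_C;
        s8 m_B;
        s8 m_A;
    } m_s8;
} merged32;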
This is a special 32-bit data type that I created to make it very easy for me to access certain parts of those 32 bits without resorting to bitmasking and shifting (which can be quite error-prone). Here’s an example to show you how easy this becomes:
merged32 my_merged_data;
my_merged_data.m_u32 = 0xC0DEF00D;
/* the cast makes the argument match %lX whatever u32 is typedef'd to */
printf("m_u32 = 0x%08lX\n\n", (unsigned long)my_merged_data.m_u32);
printf("m_u8.m_A = 0x%02X\n", my_merged_data.m_u8.m_A);
printf("m_u8.m_B = 0x%02X\n", my_merged_data.m_u8.m_B);
printf("m_u8.m_C = 0x%02X\n", my_merged_data.m_u8.m_C);
printf("m_u8.m_D = 0x%02X\n\n", my_merged_data.m_u8.m_D);
my_merged_data.m_u8.m_C = 0xD0;
printf("m_u32 = 0x%08lX\n", (unsigned long)my_merged_data.m_u32);
On a little-endian machine, this gives the following output:
m_u32 = 0xC0DEF00D

m_u8.m_A = 0xC0
m_u8.m_B = 0xDE
m_u8.m_C = 0xF0
m_u8.m_D = 0x0D

m_u32 = 0xC0DED00D
Because I have defined a data member for every position in the 32 bits, I can access any individual byte very easily.
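For comparison, this is roughly what the same byte access looks like with shifts and masks. It is only a sketch: the function names get_byte and set_byte are invented here, and byte 0 is taken to mean the least significant byte.

```c
#include <stdint.h>

/* Hypothetical helper: read byte n (0 = least significant, 3 = most
   significant) of a 32-bit value using a shift and a mask. */
uint8_t get_byte(uint32_t value, unsigned n)
{
    return (uint8_t)((value >> (8u * n)) & 0xFFu);
}

/* Hypothetical helper: return value with byte n replaced by byte.
   The old byte is cleared with an inverted mask, then the new one is OR'd in. */
uint32_t set_byte(uint32_t value, unsigned n, uint8_t byte)
{
    uint32_t shift = 8u * n;
    return (value & ~((uint32_t)0xFFu << shift)) | ((uint32_t)byte << shift);
}
```

With these, set_byte(0xC0DEF00D, 1, 0xD0) produces 0xC0DED00D, the same result as assigning to m_u8.m_C above. The shift version is more to type and easier to get wrong, but its result does not depend on how the hardware orders bytes in memory.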
A Caveat
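One thing you have to keep in mind when using unions this way is the endianness of the hardware the code is running on. In my example, notice that I have to change m_C in order to get D00D. If the code were running on a big-endian CPU, such as a Motorola 68000, I would have to change m_B because the order of the bytes is reversed. Also note that reading a union member other than the one you last wrote is allowed in C but is formally undefined behavior in C++, although compilers such as GCC document support for it.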
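If the same code has to run on both kinds of hardware, one option is to detect the byte order at run time and pick the member accordingly. The sketch below assumes C (not C++) and only the two common byte orders; the names is_little_endian and high_byte are made up for this example.

```c
#include <stdint.h>

/* Returns 1 on a little-endian machine, 0 on a big-endian one.
   Stores the value 1 in a 32-bit word and checks which byte it landed in. */
int is_little_endian(void)
{
    union
    {
        uint32_t      word;
        unsigned char bytes[4];
    } probe;

    probe.word = 1u;
    return probe.bytes[0] == 1u;
}

/* Hypothetical helper: return the most significant byte of value
   through a union, choosing the index that matches the byte order. */
uint8_t high_byte(uint32_t value)
{
    union
    {
        uint32_t word;
        uint8_t  bytes[4];
    } view;

    view.word = value;
    return is_little_endian() ? view.bytes[3] : view.bytes[0];
}
```

Here high_byte(0xC0DEF00D) returns 0xC0 on either byte order, which is the value the m_A member holds in the little-endian example above.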
The Code
Here is the full code for you to use as you wish. I have made a few other typedefs as well as other merged data types.
#if !defined(MERGEDTYPE_H)
#define MERGEDTYPE_H

#include <stdint.h>   /* fixed-width types, so s32/u32 are 32 bits everywhere */

/* Note: these guards only take effect if the name already exists as a macro. */
#if !defined(s32)
typedef int32_t s32;
#endif
#if !defined(u32)
typedef uint32_t u32;
#endif
#if !defined(s16)
typedef int16_t s16;
#endif
#if !defined(u16)
typedef uint16_t u16;
#endif
#if !defined(s8)
typedef int8_t s8;
#endif
#if !defined(u8)
typedef uint8_t u8;
#endif

typedef union
{
    u32 m_u32;
    s32 m_s32;
    void * m_ptr;   /* 32 bits wide only on a 32-bit target; wider pointers grow the union */
    struct
    {
        u16 m_DC;
        u16 m_BA;
    } m_u16;
    struct
    {
        s16 m_DC;
        s16 m_BA;
    } m_s16;
    struct
    {
        u8 m_D;
        u8 m_C;
        u8 m_B;
        u8 m_A;
    } m_u8;
    struct
    {
        s8 m_D;
        s8 m_C;
        s8 m_B;
        s8 m_A;
    } m_s8;
} merged32;

typedef union
{
    u16 m_u16;
    s16 m_s16;
    struct
    {
        u8 m_B;
        u8 m_A;
    } m_u8;
    struct
    {
        s8 m_B;
        s8 m_A;
    } m_s8;
} merged16;

typedef union
{
    u8 m_u8;
    s8 m_s8;
} merged8;

#endif // MERGEDTYPE_H
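As a small usage sketch for the 16-bit type, here is a byte swap built on a cut-down copy of merged16. The type name merged16_sketch and the function swap_bytes16 are invented for this example, and it uses the stdint.h types directly so that it compiles on its own.

```c
#include <stdint.h>

/* Cut-down copy of merged16 with only the unsigned views. */
typedef union
{
    uint16_t m_u16;
    struct
    {
        uint8_t m_B;
        uint8_t m_A;
    } m_u8;
} merged16_sketch;

/* Hypothetical helper: swap the two bytes of a 16-bit value.
   Exchanging the two byte members reverses their order in memory,
   so the result is the same on little- and big-endian machines. */
uint16_t swap_bytes16(uint16_t value)
{
    merged16_sketch in;
    merged16_sketch out;

    in.m_u16 = value;
    out.m_u8.m_A = in.m_u8.m_B;
    out.m_u8.m_B = in.m_u8.m_A;
    return out.m_u16;
}
```

For example, swap_bytes16(0xF00D) returns 0x0DF0, and swapping twice gives back the original value.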

Great tip! I’ve done C++ for 15 years and have used unions only a tiny amount. I can think of 5 places in my engine code where this would be very elegant.