Website Logo

Abhilash Meesala

The Mathematics of Optimal Struct Field Ordering

If you’ve spent time writing Go (or C/C++), you’ve probably come across a common piece of advice: “Arrange struct fields from largest to smallest to save memory.”. Does this advice actually work? If so, why?

Its easy to test this advice with a simple example

package main  
  
import (  
    "fmt"  
    "unsafe"  
)  
  
type InefficientOrder struct {  
    A bool  // 1 byte  
    B int16 // 2 bytes  
    C int32 // 4 bytes  
    D int64 // 8 bytes  
    E bool  // 1 byte  
}  
  
type EfficientOrder struct {  
    D int64 // 8 bytes  
    C int32 // 4 bytes  
    B int16 // 2 bytes  
    A bool  // 1 byte  
    E bool  // 1 byte  
}  
  
func main() {  
    fmt.Printf("Size of InefficientOrder struct: %d bytes\n", unsafe.Sizeof(InefficientOrder{}))  
    fmt.Printf("Size of EfficientOrder struct: %d bytes\n", unsafe.Sizeof(EfficientOrder{}))  
}

When you run this code, you’ll see

Size of InefficientOrder struct: 24 bytes
Size of EfficientOrder struct: 16 bytes

A 33% saving in memory - just by reordering fields! So the advice definitely works in this case. But how can we be certain that placing fields in decreasing order of their sizes (actually, alignment requirements) is the optimal way to pack fields all the time?

We’ll take the help of Mathematics here. While there are several ways to approach this proof, the following less ‘pure’ solution is the one that clicked for me.

  1. Prove adjacent pair optimality: Show that swapping any two adjacent fields (where the smaller one comes first) reduces or never increases padding.
  2. Apply Adjacent Pair Optimisation Repeatedly : Show that repeatedly applying this swap leads to an optimally packed struct.

Step 1: Prove Adjacent Pair Optimality

We’re trying to prove that:

If any two adjacent fields AA and BB with alignment requirements aAa_A and aBa_B are out of order (aA<aBa_A \lt a_B), then swapping them reduces or at least does not increase the total padding

Proof

First, lets introduce some notations and definitions

  • MM is the arbitrary offset in memory from which the struct is ‘laid out’

  • axa_x is the alignment requirement of field xx. Typically this equals the size sxs_x of the field, but for our proof we’ll consider them as potentially different.

  • XX (and YY) represents the current offset in memory

  • round(X,a)round(X,a) is the function that returns the next offset value (including XX) at which a field with aa alignment requirement can be placed.

    Mathematically, this is round(X,a)=X+(a(Xmoda))modaround(X,a) = X + (a - (X \mod a)) \mod a . In other words, the smallest number greater than or equal to XX that is a multiple of aa

Now let’s consider the two possible orderings [A, B], [B, A] (remember, aA<aBa_A \lt a_B )

Case 1: [A, B]

Field AA is placed at XA=round(M,aA)X_A = round(M, a_A)

Field BB is placed after A, but since its offset needs to satisfy the alignment requirement aBa_B, its actual offset is:

XB=round(XA+sA,aB)=XA+sA+ΔBX_B = round(X_A + s_A, a_B) = X_A + s_A + \Delta_{B}

where

ΔB=(aB((XA+sA)modaB))modaB\Delta_{B} = (a_B - ((X_A + s_A) \mod a_B)) \mod a_B

Note that 0ΔB<aB0 \leq \Delta_{B} \lt a_B

The total end offset is

TAB=XB+sB=XA+sA+ΔB+sBT_{AB} = X_B + s_B = X_A + s_A + \Delta_{B} + s_B

Case 2: [B, A]

Field BB is now at YB=round(M,aB)Y_B = round(M, a_B)

Field AA is placed after BB satisfying its alignment requirement aAa_A, so it should be placed at:

YA=round(YB+sB,aA)=YB+sB+ΔAY_A = round(Y_B + s_B, a_A) = Y_B + s_B + \Delta_{A}

where

ΔA=(aA((YB+sB)modaA))modaA\Delta_{A} = (a_A - ((Y_B + s_B) \mod a_A)) \mod a_A

Here 0ΔA<aA0 \leq \Delta_{A} \lt a_A

The total end offset is

TBA=YA+sA=YB+sB+ΔA+sAT_{BA} = Y_A + s_A = Y_B + s_B + \Delta_{A} + s_A

Comparing the two orders

TABTBA=(XAYB)+(ΔBΔA) T_{AB} - T_{BA} = (X_A - Y_B) + (\Delta{B} - \Delta{A})

At this point, we can’t make a general conclusion about the equation. All we know is that ΔB<aB\Delta_B \lt a_B and ΔA<aA\Delta_A \lt a_A

However, we can leverage a key insight: alignment requirements are powers of 2. This implies that:

  1. If aA<aBa_A < a_B, then aBa_B is actually a multiple of aAa_A. Specifically, aB=kaAa_B = k \cdot a_A for some integer k>1k > 1.
  2. An address aligned to aBa_B is automatically aligned to aAa_A as well. (Because if a number is divisible by 8, it’s also divisible by 4, 2, and 1.)

Using these insights:

  • YBXAY_B \geq X_A :

    Since aBa_B is a multiple of aAa_A, any address that is a multiple of aBa_B is also a multiple of aAa_A. This means YBY_B is also a multiple of aAa_A, but it might be farther from MM than XAX_A is.

  • The worst-case padding in Case 1 is larger than worst-case padding in Case 2

    The maximum possible value of ΔB\Delta_{B} is (aB1)(a_B - 1), which occurs when (XA+sA)(X_A + s_A) is just one byte past a multiple of aBa_B. The maximum possible value of ΔB\Delta_{B} is (aB1)(a_B - 1), which occurs when (XA+sA)(X_A + s_A) is just one byte past a multiple of aBa_B.

Now, for power-of-2 alignments where aB=kaAa_B = k \cdot a_A:

  • The worst-case increase in starting position (YBXA)(Y_B - X_A) is (aBaA)(a_B - a_A)
  • The worst-case reduction in padding is (aB1)(a_B - 1)

Since (aB1)(aBaA)(a_B - 1) \geq (a_B - a_A) for all aA1a_A \geq 1, we can generally expect:

(ΔBΔA)(YBYA)(\Delta_B - \Delta_A) \geq (Y_B - Y_A)

which means

TABTBA=(XAYB)+(ΔBΔA)0 T_{AB} - T_{BA} = (X_A - Y_B) + (\Delta{B} - \Delta{A}) \geq 0

or equivalently, TBATABT_{BA} \leq T_{AB}. In other words, swapping the two fields (when they are out of order) does not worsen—and typically improves—the overall packing.

Step 2: Apply Adjacent Pair Optimisation Repeatedly

Step 1 proves something powerful: swapping any adjacent pair of fields where a smaller alignment comes before a larger alignment will either improve memory usage or leave it unchanged. Never worse, always better or the same.

So what happens if we want to optimally pack all fields in a struct? We simply need to apply this adjacent pair optimization repeatedly across all fields. For each pair that’s out of order (smaller alignment before larger), we swap them.

If that repeated pair comparison and swapping reminds you of bubble sort - well, you are right.

Just as bubble sort guarantees a sorted array, our repeated swapping guarantees we’ll end up with struct fields arranged in strictly descending order of alignment requirements. And since each swap either reduces the memory footprint or keeps it the same (never increases it), this final arrangement must be optimal. This is why placing larger struct fields first is always the most memory-efficient ordering.


Remember, premature optimisation is evil - or in this case, can lead to less readable code, so apply this technique judiciously where memory efficiency genuinely matters for your application’s performance characteristics.