Protocol Buffers 102

In the last post we covered the basic elements and building blocks of Protobuf. In this post we will continue with the next steps: how to compile a proto message for a programming language, and how to use the generated classes in our code. Let’s get started.

The ProtoBuf logo

Installation

There is a plethora of articles and resources on setting up the Protobuf compiler and related tooling. We will be using the Go programming language and VSCode. Use the links below to set up the required tooling. You will also need the protoc-gen-go plugin, which protoc invokes to emit Go code; it can be installed with go install google.golang.org/protobuf/cmd/protoc-gen-go@latest.

Once you finish the installation, use the following commands to validate it:

  • For protoc, use: protoc --version
  • For Go, use: go version

Compilation

Before we start with the compilation process, we need to perform a few more operations:

  1. Create a repo to store the proto file and the associated Go code files. We can name the project learning-protocol-buffers.
  2. Declare the Go module: go mod init ashokdey.com/protos.
  3. Create a folder /protos to store the proto files, and create the todo.proto file inside /protos.
# create the project repo with folders for the proto & generated code files
mkdir -p learning-protocol-buffers/protos learning-protocol-buffers/generated
cd learning-protocol-buffers

# declare the go module. I'm using ashokdey.com/protos
go mod init ashokdey.com/protos

# create the proto file
touch protos/todo.proto

The Todo proto message we created earlier cannot be used directly without compilation. The good thing about the protoc compiler is that it can generate code for the target programming language of our choice. It generates the classes that our program can use to perform various operations on the message, such as encoding it, decoding it, and transporting it over the network.

Note: Don’t forget to copy-paste the content of the todo.proto file:

syntax = "proto3";

package protos.todos;

import "google/protobuf/timestamp.proto";

option go_package = "ashokdey.com/protos/todos";
option java_package = "com.ashokdey.protos.todos";


message Todo {
  int64 id = 1;
  string title = 2;
  optional string description = 3;
  bool done = 4;

  enum Priority {
    PRIORITY_UNSPECIFIED = 0;
    PRIORITY_LOW = 1;
    PRIORITY_MEDIUM = 2;
    PRIORITY_HIGH = 3;
  }
  Priority priority = 5; // <-- using the enum as a property

  google.protobuf.Timestamp created_at = 6;
}

As said earlier, our target programming language is Go. Let me walk you through the compilation process and the different flags required to generate language-specific code at the desired destination.

The todo.proto file for me is located at learning-protocol-buffers/protos/todo.proto. To compile the .proto file using protoc and generate the Go code, the command we will use is:

protoc -I protos --go_out=paths=source_relative:./protos ./protos/todo.proto

If the compilation is successful, you will see a new file created right where the todo.proto file is located. The file tree will look like:

/protos
├── todo.pb.go # <-- the generated go file
└── todo.proto

Let’s dissect the command and understand each flag:

  • -I takes the directory where the proto files are located. For me it is protos.
  • --go_out=paths=source_relative:./protos asks the compiler to generate the Go code file relative to the ./protos directory (which also holds the todo.proto file).
    • ⚠️ Please be cautious: there’s a separator : between --go_out=paths=source_relative and ./protos (the destination where we want to generate the code file). The same option can also be passed as a separate flag: --go_out=./protos --go_opt=paths=source_relative.
  • ./protos/todo.proto is the last argument, giving the location of the proto file to be compiled.

Congratulations on getting everything right! It was overwhelming for me when I did all this for the first time.

Using Makefile

We can all agree that the compilation command for Protobuf is not straightforward, nor is it easy to remember. Typing it again and again can also be a frustrating experience if you miss something or add an extra space. Hence we will automate the proto compilation using a Makefile.

A Makefile is a special file used by the make build tool to automate tasks; one of its most common uses is compiling source code. Instead of typing long commands every time we build a project, we define simple rules in a Makefile, and make handles the rest.

Install make

  • On Debian/Ubuntu: sudo apt install build-essential
  • On macOS: xcode-select --install
  • Verify the installation: make --version

Creating the Makefile

We will add the compilation command to the Makefile and give it an alias. Let’s create the Makefile at the root of the project using the command touch Makefile. Now copy-paste the content below into the newly created file.

# Makefile (at the root of the project)
todo:
	protoc -I protos --go_out=paths=source_relative:./generated ./protos/todo.proto

Note: We are now emitting the generated code into a dedicated generated folder, so don’t forget to delete the previously generated file using the command rm ./protos/todo.pb.go. Also note that make requires the recipe line (the protoc command) to be indented with a tab, not spaces.

Running Makefile

Now we can use make to compile the Protobuf instead of typing the long command. To compile, run make todo from the root folder, i.e. learning-protocol-buffers.

More on Makefile

You may wonder why the Makefile uses the alias todo followed by the compilation command. It’s because if we have to add another proto file, we can add another rule, and then a single combined rule that compiles both protos. For example:

# Makefile (at the root of the project)
.PHONY: all todo user

all: todo user

todo:
	protoc -I protos --go_out=paths=source_relative:./generated ./protos/todo.proto

user:
	protoc -I protos --go_out=paths=source_relative:./generated ./protos/user.proto

Similarly, we can add more rules, such as a clean rule that removes generated files before regenerating them.
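As a sketch (assuming the generated .pb.go files live in ./generated, as in our layout), such a clean rule could look like:

```makefile
# remove previously generated files; rerun `make all` to regenerate them
.PHONY: clean
clean:
	rm -f ./generated/*.pb.go
```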

What is .PHONY?

Normally, make checks whether a target name matches an existing file. If a file does exist with that name, make may skip running the rule because it thinks the target is already up-to-date. .PHONY prevents this problem.

.PHONY: all todo user

This tells make that all, todo, and user are not files; they should always run when invoked, and make should not try to compare timestamps or treat them as build artifacts.

The Generated file

Let’s shift our focus back to the generated file todo.pb.go.

A Go file generated by the proto compiler protoc is named like <name>.pb.go, e.g. todo.pb.go. This file contains Go code representing all the messages, enums, and services defined in our Protobuf schema.

Each message becomes a Go struct with fields that match your proto definitions, along with metadata tags that map field numbers and JSON names. The generated code also includes all the serialization and deserialization logic: the methods that let us marshal our structs into the binary Protobuf format and unmarshal them back into Go values.

If our proto includes enums, which it does, the generated file defines corresponding Go const values. It also contains reflection information used internally by the Protobuf runtime.

The generated Go file gives us all the types and helper functions needed to work with our Protobuf message data, letting us construct messages, encode/decode them, and send them over the network.

Note: ⚠️ Never edit the generated file by hand, and always run go mod tidy to ensure the required packages are installed before using the generated code.

Serialization & Deserialization

Serialization and deserialization are the processes that allow data structures in a program to be converted into a form that can be stored, transmitted, or exchanged, and then reconstructed later.

Serialization

Serialization is the act of converting an in-memory object—such as a Go struct—into a sequence of bytes. This byte sequence can be written to a file, sent over a network, or stored in a database.

In Protobuf, serialization produces a compact binary representation that is efficient in both size and speed.

Deserialization

Deserialization is the reverse: taking that byte sequence and reconstructing the original object in memory.

Code in Action

With Protobuf generated Go code, serialization is done using functions like proto.Marshal(message), which returns the binary form of the message, while deserialization is done using proto.Unmarshal(data, &message), which fills the given struct with the decoded values. Together, these two processes allow programs to exchange structured data reliably and efficiently.

The Plan

We will create a Todo using the class (a struct in Go) from the generated code and then we will serialize it in a binary file. Again from the binary file we will read the contents and deserialize. We will print the deserialized content on the terminal (standard out).

Let’s create main.go in the root of the project and add two functions: one that writes to a file and one that reads from it. The file in question is a binary file.

The Binary Writer

This function provides a simple way to serialize a Protobuf message and store it on disk in its raw binary format. It takes two arguments: a file path and any value that implements proto.Message, such as the Todo struct generated from the todo.proto file.

func writeBinary(path string, m proto.Message) error {
	b, err := proto.Marshal(m)
	if err != nil {
		return err
	}
	return os.WriteFile(path, b, 0644)
}

The proto.Marshal converts the message into Protobuf’s compact binary wire format, returning a byte slice that represents the encoded message exactly as Protobuf defines it for cross-language compatibility.

The Binary Reader

This function reads a Protobuf message from a binary file that contains its raw binary encoding and reconstructs it into a Go struct. It begins by loading the entire file into memory with os.ReadFile, retrieving the exact bytes previously written using Protobuf’s wire format.

func readBinary(path string) (*pb.Todo, error) {
	b, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var t pb.Todo
	if err := proto.Unmarshal(b, &t); err != nil {
		return nil, err
	}
	return &t, nil
}

Using proto.Unmarshal, it decodes the binary data back into the struct, filling in all fields according to the Protobuf specification, including nested messages, enums, and default values. If any step fails, the function returns an error; otherwise, it returns a pointer to the fully reconstructed pb.Todo instance.

The main.go

Here’s how the code looks for the file main.go, located at the root of the project. (Following the idiomatic Go folder structure would be overkill for this demonstration.) Please refrain from using the code directly in production.

package main

import (
	"fmt"
	"log"
	"os"

	"google.golang.org/protobuf/proto"

	pb "ashokdey.com/protos/generated"
)

func main() {
	// create a Todo
	t := &pb.Todo{
		Id:    101,
		Title: "Complete the Protocol Buffer Blog Series",
	}

	if err := writeBinary("todo.bin", t); err != nil {
		log.Fatalf("writeBinary: %v", err)
	}
	tb, err := readBinary("todo.bin")
	if err != nil {
		log.Fatalf("readBinary: %v", err)
	}
	fmt.Printf("From binary: %+v\n", tb)
}

func writeBinary(path string, m proto.Message) error {
	b, err := proto.Marshal(m)
	if err != nil {
		return err
	}
	return os.WriteFile(path, b, 0644)
}

func readBinary(path string) (*pb.Todo, error) {
	b, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var t pb.Todo
	if err := proto.Unmarshal(b, &t); err != nil {
		return nil, err
	}
	return &t, nil
}

Let’s run the code using the command go run main.go, and this will do two things:

  • It will create a new file todo.bin which contains the serialized proto message.
  • It will also read the binary file todo.bin and print its content to stdout.

Here’s the full file tree of the project:

├── generated
│   └── todo.pb.go # <-- the go file generated by `protoc`
├── protos
│   └── todo.proto
├── go.mod
├── go.sum
├── main.go
├── Makefile
└── todo.bin # <-- the binary file generated

And the console output should match:

From binary: id:101  title:"Complete the Protocol Buffer Blog Series"

Let’s examine the binary data stored inside the todo.bin file our Go code generated.

Using hexdump

hexdump is a command line tool used to display the raw bytes of a file in a human-readable hexadecimal format. It’s especially useful for inspecting binary data—such as Protobuf messages, executables, or network captures.

From the root of the project, we can use hexdump to get a readable version of the binary data.

hexdump -C todo.bin

And the output of the hexdump is:

00000000  08 65 12 28 43 6f 6d 70  6c 65 74 65 20 74 68 65  |.e.(Complete the|
00000010  20 50 72 6f 74 6f 63 6f  6c 20 42 75 66 66 65 72  | Protocol Buffer|
00000020  20 42 6c 6f 67 20 53 65  72 69 65 73              | Blog Series|
0000002c

We can also serialize the Protobuf message to JSON or text using the following protobuf packages from Google:

"google.golang.org/protobuf/encoding/protojson"
"google.golang.org/protobuf/encoding/prototext"

Here’s a repo of mine from 2018 that has example code: Protocol Buffers & Golang

I want to pause for a moment to appreciate your patience in making it to the end. You’ve just gone through a lot of ideas, new jargon, and unfamiliar tools, which can be challenging. If you’re still motivated to keep learning, that already says a lot.

Conclusion

When we work on a system that keeps growing, we quickly realize how messy sharing data can get. Passing JSON or quick custom structs around seems fine at first, but things start breaking when different services interpret fields differently or the data gets so large that everything slows down.

Serialization is often where most of the trouble shows up. Formats like JSON or XML can become bulky, unclear, and unpredictable, especially when someone adds a field without warning.

Protobuf handles this much better by turning our data into a small, efficient binary format that travels quickly and can still adapt as our system changes. It may feel like extra effort in the beginning, but the benefits show up fast: the compiler helps catch mistakes early, all our services follow the same structure, and we finally have one reliable source of truth for our data.

Together, these steps make our systems more stable, faster, and easier to grow over time.

Stay healthy, stay blessed!