Robust gRPC communication on Google Cloud Run (but not only!)
Welcome to the third article in the series on building business-oriented applications in Go! In this series, we show you how to build applications that are easy to develop, maintain, and fun to work with in the long term.
In this article, I describe how to build robust internal communication between your services using gRPC. I also cover the extra configuration required to set up authentication and TLS for Cloud Run.
Why gRPC?
Let’s imagine a story that is true for many companies:
Meet Dave. Dave works at a company that spent about two years building their product from scratch. During this time, they found thousands of customers who wanted to use their product. They started developing this application during the biggest microservices “boom.” It was an obvious choice to use that architecture. Now they have more than 50 microservices using HTTP calls to communicate with each other.
Of course, Dave’s company didn’t do everything perfectly. The biggest pain is that all engineers are now afraid to change anything in the HTTP contracts. It’s easy to make incompatible changes or return invalid data. The entire application breaking because of this isn’t rare. “Didn’t we build microservices to avoid that?” is a question the scary voices in Dave’s head ask every day.
Dave already proposed using OpenAPI to generate HTTP server responses and clients. But he quickly found that he could still return invalid data from the API.
Whether this story sounds familiar or not, the solution for Dave’s company is simple and straightforward to implement. You can easily achieve robust contracts between your services by using gRPC.
The servers and clients generated by gRPC are much stricter than their OpenAPI counterparts, which mostly just copy structures.
Note
gRPC doesn’t solve data quality problems. You can still send data that is not empty but doesn’t make sense.
Make sure your data is valid using robust contracts, contract testing, and end-to-end tests.
Another reason to consider gRPC is performance. Some benchmarks report gRPC being up to 10x faster than REST. When your API handles millions of requests per second, this becomes a cost optimization opportunity. For applications like Wild Workouts, where traffic may be less than 10 requests/sec, it doesn’t matter.
To avoid bias, I tried to find reasons not to use gRPC for internal communication. I couldn’t find any:
- The barrier to entry is low.
- Adding a gRPC server requires no extra infrastructure work: it works on top of HTTP/2.
- It works with many languages like Java, C/C++, Python, C#, JS, and more.
- In theory, you can even use gRPC for frontend communication (I haven’t tested that).
- It’s “Goish”: the compiler ensures you don’t return anything invalid.
Sounds promising? Let’s verify this with the implementation in Wild Workouts!
Note
This is not just another article with random code snippets.
This post is part of a bigger series where we show how to build Go applications that are easy to develop, maintain, and fun to work with in the long term. We are doing it by sharing proven techniques based on many experiments we did with teams we lead and scientific research.
You can learn these patterns by building a fully functional example Go web application with us – Wild Workouts.
We did one thing differently – we included some subtle issues in the initial Wild Workouts implementation. Have we lost our minds to do that? Not yet. 😉 These issues are common for many Go projects. In the long term, these small issues become critical and block adding new features.
It’s one of the essential skills of a senior or lead developer; you always need to keep long-term implications in mind.
We will fix them by refactoring Wild Workouts. In that way, you will quickly understand the techniques we share.
Do you know that feeling after reading an article about some technique and trying to implement it, only to be blocked by some issues skipped in the guide? Cutting these details makes articles shorter and increases page views, but this is not our goal. Our goal is to create content that provides enough know-how to apply the presented techniques. If you haven’t read the previous articles from the series yet, we highly recommend doing that.
We believe that in some areas, there are no shortcuts. If you want to build complex applications in a fast and efficient way, you need to spend some time learning how. If it were simple, we wouldn’t have large amounts of scary legacy code.
Here’s the full list of 14 articles released so far.
The full source code of Wild Workouts is available on GitHub. Don’t forget to leave a star for our project! ⭐
Generated server
Currently, Wild Workouts doesn’t have many gRPC endpoints. We can update trainer hours availability and user training balance (credits).

Let’s look at the Trainer gRPC service.
To define our gRPC server, we need to create a trainer.proto file.
syntax = "proto3";

package trainer;

import "google/protobuf/timestamp.proto";

service TrainerService {
  rpc IsHourAvailable(IsHourAvailableRequest) returns (IsHourAvailableResponse) {}
  rpc UpdateHour(UpdateHourRequest) returns (EmptyResponse) {}
}

message IsHourAvailableRequest {
  google.protobuf.Timestamp time = 1;
}

message IsHourAvailableResponse {
  bool is_available = 1;
}

message UpdateHourRequest {
  google.protobuf.Timestamp time = 1;
  bool has_training_scheduled = 2;
  bool available = 3;
}

message EmptyResponse {}
The .proto definition is converted into Go code using the Protocol Buffer Compiler (protoc).
.PHONY: proto
proto:
	protoc --go_out=plugins=grpc:internal/common/genproto/trainer -I api/protobuf api/protobuf/trainer.proto
	protoc --go_out=plugins=grpc:internal/common/genproto/users -I api/protobuf api/protobuf/users.proto
Note
To generate Go code from .proto, you need to install protoc and the protoc Go Plugin.
A list of supported types can be found in the Protocol Buffers Version 3 Language Specification. More complex built-in types like Timestamp can be found in the Well-Known Types list.
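To make the Timestamp well-known type less abstract: it stores a point in time as seconds and nanoseconds since the Unix epoch, without any time zone. Below is a rough sketch of that representation, assuming a hand-written timestamp struct and two hypothetical helpers (toTimestamp and toTime). In real code you would use the generated type and the library helpers instead, such as ptypes.TimestampProto used later in this article, or timestamppb.New and AsTime in newer versions of the protobuf module.

```go
package main

import "time"

// timestamp mirrors the two fields of google.protobuf.Timestamp.
// It is a hand-written stand-in, not the generated type.
type timestamp struct {
	Seconds int64 // whole seconds since the Unix epoch
	Nanos   int32 // nanosecond fraction of the second, 0 to 999,999,999
}

// toTimestamp drops the location of t and keeps only the instant.
func toTimestamp(t time.Time) timestamp {
	return timestamp{Seconds: t.Unix(), Nanos: int32(t.Nanosecond())}
}

// toTime rebuilds the instant; the result is always expressed in UTC
// because the wire format carries no time zone.
func (ts timestamp) toTime() time.Time {
	return time.Unix(ts.Seconds, int64(ts.Nanos)).UTC()
}
```

The practical consequence is that a time sent in a local zone comes back as the same instant in UTC, so any zone-dependent logic has to be handled explicitly by the receiver.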
Here’s an example of a generated model:
type UpdateHourRequest struct {
	Time                 *timestamp.Timestamp `protobuf:"bytes,1,opt,name=time,proto3" json:"time,omitempty"`
	HasTrainingScheduled bool                 `protobuf:"varint,2,opt,name=has_training_scheduled,json=hasTrainingScheduled,proto3" json:"has_training_scheduled,omitempty"`
	Available            bool                 `protobuf:"varint,3,opt,name=available,proto3" json:"available,omitempty"`
	XXX_NoUnkeyedLiteral struct{}             `json:"-"`
	// ... more proto garbage ;)
}
And the server:
type TrainerServiceServer interface {
	IsHourAvailable(context.Context, *IsHourAvailableRequest) (*IsHourAvailableResponse, error)
	UpdateHour(context.Context, *UpdateHourRequest) (*EmptyResponse, error)
}
The difference between HTTP and gRPC is that with gRPC, we don’t need to worry about what to return or how to do it. If I were to compare the level of confidence between HTTP and gRPC, it would be like comparing Python and Go. gRPC is much stricter, and it’s impossible to return or receive values of the wrong type: the compiler will let us know.
Protobuf also has built-in support for field deprecation and backward compatibility. This helps in environments with many independent teams.
Note
Protobuf vs gRPC
Protobuf (Protocol Buffers) is the Interface Definition Language used by default for defining the service interface and payload structure. Protobuf also serializes these models to binary format.
You can find more details about gRPC and Protobuf on the gRPC Concepts page.
Implementing the server works almost the same as HTTP generated by OpenAPI: we need to implement an interface (TrainerServiceServer in this case).
type GrpcServer struct {
	db db
}

func (g GrpcServer) IsHourAvailable(ctx context.Context, req *trainer.IsHourAvailableRequest) (*trainer.IsHourAvailableResponse, error) {
	timeToCheck, err := grpcTimestampToTime(req.Time)
	if err != nil {
		return nil, status.Error(codes.InvalidArgument, "unable to parse time")
	}

	model, err := g.db.DateModel(ctx, timeToCheck)
	if err != nil {
		return nil, status.Error(codes.Internal, fmt.Sprintf("unable to get data model: %s", err))
	}

	if hour, found := model.FindHourInDate(timeToCheck); found {
		return &trainer.IsHourAvailableResponse{IsAvailable: hour.Available && !hour.HasTrainingScheduled}, nil
	}

	return &trainer.IsHourAvailableResponse{IsAvailable: false}, nil
}
As you can see, you cannot return anything other than IsHourAvailableResponse, and you can always be sure you’ll receive IsHourAvailableRequest.
For errors, you can return one of the predefined error codes.
These are more modern than HTTP status codes.
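Choosing the code is easier when it happens in one place. Here is a minimal sketch, assuming hypothetical domain errors (errInvalidTime, errHourNotFound, errStorage) and a local code type that mirrors a few numeric values of the canonical gRPC status codes. In a real service you would return the constants from google.golang.org/grpc/codes through status.Error, as the handler above does.

```go
package main

import (
	"errors"
	"fmt"
)

// code mirrors a few canonical gRPC status code values.
// It is a stand-in for google.golang.org/grpc/codes.Code.
type code uint32

const (
	codeOK              code = 0
	codeInvalidArgument code = 3
	codeNotFound        code = 5
	codeInternal        code = 13
)

// Hypothetical domain errors, not taken from Wild Workouts.
var (
	errInvalidTime  = errors.New("invalid time")
	errHourNotFound = errors.New("hour not found")
	errStorage      = errors.New("storage unavailable")
)

// findHour is a hypothetical lookup that wraps the domain error with context.
func findHour(hours map[int]bool, hour int) error {
	if _, ok := hours[hour]; !ok {
		return fmt.Errorf("hour %d: %w", hour, errHourNotFound)
	}
	return nil
}

// codeFor maps an error to a status code. Errors without an explicit
// mapping fall back to Internal so that no details leak by accident.
func codeFor(err error) code {
	switch {
	case err == nil:
		return codeOK
	case errors.Is(err, errInvalidTime):
		return codeInvalidArgument
	case errors.Is(err, errHourNotFound):
		return codeNotFound
	default:
		return codeInternal
	}
}
```

Because the mapping relies on errors.Is, it keeps working when lower layers wrap the domain error with extra context.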
Starting the gRPC server works the same as an HTTP server:
server.RunGRPCServer(func(server *grpc.Server) {
	svc := GrpcServer{firebaseDB}
	trainer.RegisterTrainerServiceServer(server, svc)
})
Internal gRPC client
After our server is running, it’s time to use it. First, we need to create a client instance.
trainer.NewTrainerServiceClient is generated from .proto.
type TrainerServiceClient interface {
	IsHourAvailable(ctx context.Context, in *IsHourAvailableRequest, opts ...grpc.CallOption) (*IsHourAvailableResponse, error)
	UpdateHour(ctx context.Context, in *UpdateHourRequest, opts ...grpc.CallOption) (*EmptyResponse, error)
}

type trainerServiceClient struct {
	cc grpc.ClientConnInterface
}

func NewTrainerServiceClient(cc grpc.ClientConnInterface) TrainerServiceClient {
	return &trainerServiceClient{cc}
}
To make the generated client work, we need to pass a few extra options that handle the following:
- Authentication.
- TLS encryption.
- “Service discovery” (we use hardcoded names of services provided by Terraform via the TRAINER_GRPC_ADDR env variable).
import (
	// ...
	"github.com/ThreeDotsLabs/wild-workouts-go-ddd-example/pkg/internal/genproto/trainer"
	// ...
)

func NewTrainerClient() (client trainer.TrainerServiceClient, close func() error, err error) {
	grpcAddr := os.Getenv("TRAINER_GRPC_ADDR")
	if grpcAddr == "" {
		return nil, func() error { return nil }, errors.New("empty env TRAINER_GRPC_ADDR")
	}

	opts, err := grpcDialOpts(grpcAddr)
	if err != nil {
		return nil, func() error { return nil }, err
	}

	conn, err := grpc.Dial(grpcAddr, opts...)
	if err != nil {
		return nil, func() error { return nil }, err
	}

	return trainer.NewTrainerServiceClient(conn), conn.Close, nil
}
After creating our client, we can call any of its methods.
In this example, we call UpdateHour when creating a training.
package main

import (
	// ...
	"github.com/pkg/errors"
	"github.com/golang/protobuf/ptypes"
	"github.com/ThreeDotsLabs/wild-workouts-go-ddd-example/pkg/internal/genproto/trainer"
	// ...
)

type HttpServer struct {
	db            db
	trainerClient trainer.TrainerServiceClient
	usersClient   users.UsersServiceClient
}

// ...

func (h HttpServer) CreateTraining(w http.ResponseWriter, r *http.Request) {
	// ... (elided; the return statements below belong to an enclosing
	// function that returns an error, which is not shown here)

	timestamp, err := ptypes.TimestampProto(postTraining.Time)
	if err != nil {
		return errors.Wrap(err, "unable to convert time to proto timestamp")
	}
	_, err = h.trainerClient.UpdateHour(ctx, &trainer.UpdateHourRequest{
		Time:                 timestamp,
		HasTrainingScheduled: true,
		Available:            false,
	})
	if err != nil {
		return errors.Wrap(err, "unable to update trainer hour")
	}

	// ...
}
Cloud Run authentication & TLS
Authentication of the client is handled by Cloud Run out of the box.

You also need to grant the roles/run.invoker role to the service’s service account.
The simpler way, which we recommend, is to use Terraform. Miłosz described it in detail in the previous article.
One thing that doesn’t work out of the box is sending authentication with the request. Did I mention that the standard gRPC transport is HTTP/2? For that reason, we can use good old JWT (JSON Web Tokens) for authentication.
To make it work, we need to implement the google.golang.org/grpc/credentials.PerRPCCredentials interface.
The implementation is based on the official guide from Google Cloud Documentation.
type metadataServerToken struct {
	serviceURL string
}

func newMetadataServerToken(grpcAddr string) credentials.PerRPCCredentials {
	// based on https://cloud.google.com/run/docs/authenticating/service-to-service#go
	// the service URL needs the https prefix and no port
	serviceURL := "https://" + strings.Split(grpcAddr, ":")[0]

	return metadataServerToken{serviceURL}
}

// GetRequestMetadata is called on every request, so we can be sure the token has not expired
func (t metadataServerToken) GetRequestMetadata(ctx context.Context, in ...string) (map[string]string, error) {
	// based on https://cloud.google.com/run/docs/authenticating/service-to-service#go
	tokenURL := fmt.Sprintf("/instance/service-accounts/default/identity?audience=%s", t.serviceURL)
	idToken, err := metadata.Get(tokenURL)
	if err != nil {
		return nil, errors.Wrap(err, "cannot query id token for gRPC")
	}

	return map[string]string{
		"authorization": "Bearer " + idToken,
	}, nil
}

func (metadataServerToken) RequireTransportSecurity() bool {
	return true
}
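The audience passed to the metadata server is the service URL with the https prefix and without the port. That derivation can be pulled out and tested in isolation. Here is a small sketch, assuming a hypothetical audienceFromAddr helper and hostname-based addresses only (IPv6 literals are out of scope). It uses net.SplitHostPort from the standard library instead of splitting on the colon by hand.

```go
package main

import "net"

// audienceFromAddr turns a "host:port" or bare "host" address into the
// https URL used as the token audience.
func audienceFromAddr(grpcAddr string) string {
	host := grpcAddr
	// SplitHostPort fails when there is no port; in that case the whole
	// address is treated as the host.
	if h, _, err := net.SplitHostPort(grpcAddr); err == nil {
		host = h
	}
	return "https://" + host
}
```

For an address such as trainer-example.a.run.app:443 (a made-up hostname), the helper returns https://trainer-example.a.run.app.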
The last step is passing it to the []grpc.DialOption list when creating all gRPC clients.
It’s also a good idea to ensure our server’s certificate is valid with grpc.WithTransportCredentials.
Authentication and TLS encryption are disabled in the local Docker environment.
func grpcDialOpts(grpcAddr string) ([]grpc.DialOption, error) {
	if noTLS, _ := strconv.ParseBool(os.Getenv("GRPC_NO_TLS")); noTLS {
		return []grpc.DialOption{grpc.WithInsecure()}, nil
	}

	systemRoots, err := x509.SystemCertPool()
	if err != nil {
		return nil, errors.Wrap(err, "cannot load root CA cert")
	}
	creds := credentials.NewTLS(&tls.Config{
		RootCAs: systemRoots,
	})

	return []grpc.DialOption{
		grpc.WithTransportCredentials(creds),
		grpc.WithPerRPCCredentials(newMetadataServerToken(grpcAddr)),
	}, nil
}
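One detail of the GRPC_NO_TLS switch is easy to miss: the parse error is ignored, so a value like "yes" silently keeps TLS enabled. That is the safe default, but it can be confusing during local setup. The sketch below isolates the behavior in a hypothetical tlsDisabled helper and adds a stricter parseNoTLS variant that reports invalid values. Both names are assumptions made for this example.

```go
package main

import (
	"fmt"
	"strconv"
)

// tlsDisabled mirrors the check in grpcDialOpts: the parse error is
// dropped, so empty or unparseable values keep TLS enabled.
func tlsDisabled(raw string) bool {
	noTLS, _ := strconv.ParseBool(raw)
	return noTLS
}

// parseNoTLS is a stricter variant: an empty value means "TLS enabled",
// while an unparseable value is reported instead of being ignored.
func parseNoTLS(raw string) (bool, error) {
	if raw == "" {
		return false, nil
	}
	noTLS, err := strconv.ParseBool(raw)
	if err != nil {
		return false, fmt.Errorf("invalid GRPC_NO_TLS value %q: %w", raw, err)
	}
	return noTLS, nil
}
```

strconv.ParseBool accepts values such as 1, t, true, 0, f, and false, so either helper works with GRPC_NO_TLS=true in a local Docker environment.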
Are all the problems of internal communication solved?
A hammer is great for hammering nails but awful for cutting a tree. The same applies to gRPC or any other technique.
gRPC works great for synchronous communication, but not every process is synchronous by nature. Applying synchronous communication everywhere will create a slow, unstable system. Currently, Wild Workouts doesn’t have any flow that should be asynchronous. We will cover this topic in more depth in the next articles by implementing new features. In the meantime, check out the Watermill library, which we also created. 😉 It helps with building asynchronous, event-driven applications the easy way.
What’s next?
Having robust contracts doesn’t mean we aren’t introducing unnecessary internal communication. In some cases, operations can be handled in one service in a simpler, more pragmatic way.
Avoiding these issues isn’t simple. Fortunately, we know techniques that help. We’ll share them with you soon. 😉
Until then, we still have one article left about our “Too modern application.” It will cover Firebase HTTP authentication. After that, we’ll start the refactoring! As I mentioned in the first article, we intentionally introduced some issues in Wild Workouts. Are you curious about what’s wrong with Wild Workouts? Let us know in the comments! 😉
See you next week 👋




