How Thanos Would Program in Go

TL;DR: Recently we introduced extended Go Style Guide for the Thanos project , a high scale open-source distributed metric system where, with our large community, we take extra attention and care for the code quality.
Go + Distributed Systems = ā¤ļø
Modern, scalable backend systems can be incredibly complex. Despite our efforts with other Maintainers, to not add too many features, APIs or inconsistencies to our projects like Prometheus or Thanos, those are still large codebases. On top of it, both projects are in some way stateful (databases!), distributed and used by thousands of small and big companies (with often varying requirements). Despite the extra work on unblocking users with integrations and extensible API, we still have to maintain quite a big system.
Undoubtedly Go suites this job really well. While achieving low-level solid performance is not trivial, itās still close to C++ speed-wise. Whatās more important, the maintenance and development velocity is extremely fast in comparison to other languages. Overall, with Go it is quite easy to quickly write reliable software.
Most of those language benefits can be attributed to the main characteristic of Go: Readability. The language itself has many tools to (sometimes automatically) ensure consistency and simplicity: Only ~one āidiomaticā way of doing things (e.g. error handling, concurrency, encoding), the only one formatting⦠and no generics! (: All of those idiomatic patterns are well described in Effective Go and CodeReviewComments , which were written almost at the beginning of the Go language!
So why one would need another style guide?
Official, mentioned guides are amazing, quite strict and cover a large spectrum of patterns, but they were done for the generic use cases and for ALL the users. This means there is a little bit too much of freedom here and there. Those guides had to be applicable for all sorts of software: backend distributed systems, low-level IoT software with linked C code, GFX applications, CLI tools, in-browser client code or even as a configuration templates! .
Thatās why with the Thanos Team we decided to share our official Thanos Style Guide
.
Since our codebase is focused solely on thr backend infrastructure, we can be more opinionated and consistent. It really takes all the things from official
Effective Go
and CodeReviewComments
docs and adds on top of them some additional rules.
This allowed us, with the help of the community, to produce even more readable and efficient Go on projects we maintain like Thanos, Prometheus, Prometheus-Operator and more.
In this blog post, I will try to quickly go through more interesting improvements to the official guides with some rationals (:
Thanos Coding Style Guide
This is a copy of our official Thanos Style Guide with small commentaries.
- Development / Code Review
- Reliability
+ Defers: Donāt Forget to Check Returned Errors
+ Exhaust Readers
+ Avoid Globals
+ Never Use Panics
+ Avoid Using the
reflect
orunsafe
Packages + Avoid variable shadowing - Performance + Pre-allocating Slices and Maps + Reuse arrays
- Readability
+ Keep the Interface Narrow; Avoid Shallow Functions
+ Use Named Return Parameters Carefully
+ Clean Defer Only if Function Fails
+ Explicitly Handled Returned Errors
+ Avoid Defining Variables Used Only Once.
+ Only Two* Ways of Formatting Functions/Methods
+ Control Structure: Prefer early returns and avoid
else
+ Wrap Errors for More Context; Donāt Repeat āfailed ā¦ā There. + Use the Blank Identifier_
+ Rules for Log Messages + Comment Necessary Surprises - Testing
+ Table Tests
+ Tests for Packages / Structs That Involve
time
package.
- Reliability
+ Defers: Donāt Forget to Check Returned Errors
+ Exhaust Readers
+ Avoid Globals
+ Never Use Panics
+ Avoid Using the
- Ensured by linters
Table of contents generated with markdown-toc
Development / Code Review
In this section, we will go through rules that on top of the standard guides that we apply during development and code reviews.
NOTE: If you know that any of those rules can be enabled by some linter, automatically, let us know! (:
Reliability
The coding style is not purely about what is ugly and what is not. Itās mainly to make sure programs are reliable for running on production 24h per day without causing incidents. The following rules are describing some unhealthy patterns we have seen across the Go community that are often forgotten. Those things can be considered bugs or can significantly increase the chances of introducing a bug.
Defers: Donāt Forget to Check Returned Errors
Itās easy to forget to check the error returned by a Close
method that we deferred.
|
Unchecked errors like this can lead to major bugs. Consider the above example: the *os.File
Close
method can be responsible
for actually flushing to the file, so if an error occurs at that point, the whole write might be aborted! š±
Always check errors! To make it consistent and not distracting, use our runutil helper package, e.g.:
|
|
|
Exhaust Readers
One of the most common bugs is forgetting to close or fully read the bodies of HTTP requests and responses, especially on error. If you read the body of such structures, you can use the runutil helper as well:
|
|
|
Avoid Globals
No globals other than const
are allowed. Period.
This means also, no init
functions.
Never Use Panics
Never use them. If some dependency uses it, use recover . Also, consider avoiding that dependency. š
Avoid Using the reflect
or unsafe
Packages
Use those only for very specific, critical cases. Especially reflect
tend to be very slow. For testing code, itās fine to use reflect.
Avoid variable shadowing
Variable shadowing is when you use the same variable name in a smaller scope that āshadowsā. This is very
dangerous as it leads to many surprises. Itās extremely hard to debug such problems as they might appear in unrelated parts of the code.
And whatās broken is tiny :
or lack of it.
|
|
This is also why we recommend to scope errors if you can:
|
While itās not yet configured, we might think consider not permitting variable shadowing with golang.org/x/tools/go/analysis/passes/shadow/cmd/shadow
in future. There was even Go 2 proposal for disabling this in the language itself, but was rejected
:
Similar to this problem is the package name shadowing. While it is less dangerous, it can cause similar issues, so avoid package shadowing if you can.
Performance
After all, Thanos system is a database that has to perform queries over terabytes of data within human friendly response times. This might require some additional patterns in our code. With those patterns, we try to not sacrifice the readability and apply those only on the critical code paths.
Keep in mind to always measure the results. The Go performance relies on many hidden things and tweaks, so the good micro benchmark, following with the real system load test is in most times required to tell if optimization makes sense.
Pre-allocating Slices and Maps
Try to always preallocate slices and map. If you know the number of elements you want to put apriori, use that knowledge! This significantly improves the latency of such code. Consider this as micro optimization, however, itās a good pattern to do it always, as it does not add much complexity. Performance wise, itās only relevant for critical, code paths with big arrays.
NOTE: This is because, in very simple view, the Go runtime allocates 2 times the current size. So if you expect million of elements, Go will do many allocations
on append
in between instead of just one if you preallocate.
|
|
Reuse arrays
To extend the above point, there are cases where you donāt need to allocate new space in memory all the time. If you repeat the certain operation on slices sequentially and you just release the array on every iteration, itās reasonable to reuse the underlying array for those. This can give quite enormous gains for critical paths. Unfortunately, currently there is no way to reuse the underlying array for maps.
NOTE: Why you cannot just allocate slice and release and in new iteration allocate and release again etc? Go should know it has available space and just reuses that no? (: Well, itās not that easy. TL;DR is that Go Garbage Collection runs periodically or on certain cases (big heap), but definitely not on every iteration of your loop (that would be super slow). Read more in details here .
|
|
Readability
The part that all Gophers love ā¤ļø How to make code more readable?
For the Thanos Team, readability is about programming in a way that does not surprise the reader of the code. All the details and inconsistencies can distract or mislead the reader, so every character or newline might matter. Thatās why we might be spending more time on every Pull Requests' review, especially in the beginning, but for a good reason! To make sure we can quickly understand, extend and fix problems with our system.
Keep the Interface Narrow; Avoid Shallow Functions
This is connected more to the API design than coding, but even during small coding decisions it matter. For example how you define functions or methods. There are two general rules:
- Simpler (usually it means smaller) interfaces are better. This might mean a smaller, simpler function signature as well as fewer methods in the interfaces. Try to group interfaces based on functionality to expose at max 1-3 methods if possible.
|
|
- Itās better if you can hide more unnecessary complexity from the user. This means that having shallow function introduce more cognitive load to understand the function name or navigate to implementation to understand it better. It might be much more readable to inline those few lines directly on the caller side.
|
|
This is a little bit connected to There should be one-- and preferably only one --obvious way to do it
and DRY
rules. If you have more ways of doing something than one, it means you have a wider interface, allowing more opportunities for
errors, ambiguity and maintenance burden.
|
|
Use Named Return Parameters Carefully
Itās OK to name return parameters if the types do not give enough information about what function or method actually returns. Another use case is when you want to define a variable, e.g. a slice.
IMPORTANT: never use naked return
statements with named return parameters. This compiles but it makes returning values
implicit and thus more prone to surprises.
Clean Defer Only if Function Fails
There is a way to sacrifice defer in order to properly close all on each error. Repetitions makes it easier to make error and forget something when changing the code, so on-error deferring is doable:
|
|
Explicitly Handled Returned Errors
Always handle returned errors. It does not mean you cannot āignoreā the error for some reason, e.g. if we know implementation will not return anything meaningful. You can ignore the error, but do so explicitly:
|
|
The exception: well-known cases such as level.Debug|Warn
etc and fmt.Fprint*
Avoid Defining Variables Used Only Once.
Itās tempting to define a variable as an intermittent step to create something bigger. Avoid defining such a variable if itās used only once. When you create a variable the reader expects some other usage of this variable than one, so it can be annoying to every time double check that and realize that itās only used once.
|
|
Only Two* Ways of Formatting Functions/Methods
Prefer function/method definitions with arguments in a single line. If itās too wide, put each argument on a new line.
|
|
This applies for both calling and defining method / function.
NOTE: One exception would be when you expect the variadic (e.g. ...string
) arguments to be filled in pairs, e.g:
|
Control Structure: Prefer early returns and avoid else
In most of the cases, you donāt need else
. You can usually use continue
, break
or return
to end an if
block.
This enables having one less indent and netter consistency so code is more readable.
|
|
Wrap Errors for More Context; Donāt Repeat āfailed ā¦ā There.
We use pkg/errors
package for errors
. We prefer it over standard wrapping with fmt.Errorf
+ %w
,
as errors.Wrap
is explicit. Itās easy to by accident replace %w
with %v
or to add extra inconsistent characters to the string.
Use pkg/errors.Wrap
to wrap errors for future context when errors occur. Itās recommended
to add more interesting variables to add context using errors.Wrapf
, e.g. file names, IDs or things that fail, etc.
NOTE: never prefix wrap messages with wording like failed ...
or error occurred while...
. Just describe what we
wanted to do when the failure occurred. Those prefixes are just noise. We are wrapping error, so itās obvious that some error
occurred, right? (: Improve readability and consider avoiding those.
|
|
Use the Blank Identifier _
Blank identifiers are very useful to mark variables that are not used. Consider the following cases:
|
|
|
Rules for Log Messages
We use go-kit logger
in Thanos. This means that we expect log lines
to have a certain structure. Structure means that instead of adding variables to the message, those should be passed as
separate fields. Keep in mind that all log lines in Thanos should be lowercase
(readability and consistency) and
all struct keys are using camelCase
. Itās suggested to keep key names short and consistent. For example, if
we always use block
for block ID, letās not use in the other single log message id
.
|
|
Additionally, there are certain rules we suggest while using different log levels:
- level.Info: Should always have
msg
field. It should be used only for important events that we expect to happen not too often. - level.Debug: Should always have
msg
field. It can be a bit more spammy, but should not be everywhere as well. Use it only when you want to really dive into some problems in certain areas. - level.Warn: Should have either
msg
orerr
or both fields. They should warn about events that are suspicious and to investigate but the process can gracefully mitigate it. Always try to describe how it was mitigated, what action will be performed e.g.value will be skipped
- level.Error: Should have either
msg
orerr
or both fields. Use it only for a critical event.
Comment Necessary Surprises
Comments are not the best. They age quickly and the compiler does not fail if you will forget to update them. So use comments
only when necessary. And it is necessary to comment on code that can surprise the user. Sometimes, complexity
is necessary, for example for performance. Comment in this case why such optimization was needed. If something
was done temporarily add TODO(<github name>): <something, with GitHub issue link ideally>
.
Testing
Table Tests
Use table-driven tests that use t.Run for readability. They are easy to read and allows to add a clean description of each test case. Adding or adapting test cases is also easier.
|
|
Tests for Packages / Structs That Involve time
package.
Avoid unit testing based on real-time. Always try to mock time that is used within struct by using for example timeNow func() time.Time
field.
For production code, you can initialize the field with time.Now
. For test code, you can set a custom time that will be used by the struct.
|
|
Ensured by linters
This is the list of rules we ensure automatically. This section is for those who are curious why such linting rules were added or want similar ones in their Go project. š¤
Avoid Prints.
Never use print
. Always use a passed go-kit/log.Logger
.
Ensured here .
Ensure Prometheus Metric Registration
Itās very easy to forget to add Prometheus metric (e.g prometheus.Counter
) into registry.MustRegister
function.
To avoid this we ensure all metrics are created via promtest.With(r).New*
and old type of registration is not allowed.
Read more about the problem here
.
Ensured here .
go vet
Standard Go vet is quite strict, but for a good reason. Always vet your Go code!
Ensured here .
golangci-lint
golangci-lint is an amazing tool that allows running set of different custom linters across Go community against your code. Give it a Star and use it. (:
Ensured here with those linters enabled.
misspell
Misspell is amazing, it catches some typos in comments or docs.
No Grammarly plugin for this yet ): (We wish).
Ensured here .
Commentaries Should be a Full Sentence.
All comments should be a full sentence. Should start with Uppercase letter and end with a period.
Ensured here .
Summary
The official Thanos Go Style Guide was created to maintain a high quality of the code in Thanos project. We will refer to this guide while reviewing or pushing new code, so if propose a Pull Request against Thanos or you want to help to review the PRs of others (please!), it would be awesome if you could take a look at our Coding Style first! š¤. Itās also not a definitive list, we might add more items to the list or better: change some of those rules into linters checking those bits for us!
I really hope this style guide can help any infrastructure Go projects in open source! If you disagree with something, please feel free to comment here, on #thanos-dev Slack of via GitHub issue against Thanos repository. We believe this makes sense, but we could miss some important facts or exceptions. We would love your feedback!
Comments