My first exposure to AST instrumentation in Go

6 minute read

Published:

Intro to Go Concurrency Verification

Due to its simplicity and functionality, Go is receiving attention among developers. However, there are not that many tools to focus on the correctness of Go concurrency. Two recent Go static verification papers (1, 2) extract a system of type equations from the Go source code SSA IR 1 equivalent. A calculus mirrors channel-based behaviors of Go programs (goroutines, functions, channel create/communicate/close, etc.) in the form of mutually recursive definitions. Table1 explains the syntax of MiGo. The behavioral typing system derives from MiGo can be checked for liveness and safety in three steps:

  1. Syntactic restrictions on types (Fenced types)
  2. Symbolic semantics definitions for types (LTS)
  3. Decidability proof for liveness and channel safety (Symbolic execution)

Static verification, however, has its own pitfalls. For larger programs, it would suffer from state explosion and un-scalability. That is why I am interested in studying dynamic behavior of Go. I am gradually literature reviewing current instrumenting/tracing tool (industrial and academic). My goal is to find the best way to instrument Go applications to trace events that imply:

  • Channel creation/communication/termination
  • Goroutine (i.e., user-level thread) spawning
  • Select non-deterministic selection

to build a knowledge base so that we can bind the typing system described in work by Lange et al. to captured events and check for liveness and safety conflicts.

Thanks to my advisor, we found a recent paper by Marting Sulzmann that:

  • Instruments the source using AST and Parser package in Go
  • Trace channel, goroutine and select events
  • Assign vector clocks to events to infer Happens-Before relation among them
  • Replay the collected event to dynamically verify (check for liveness and safety) Go concurrency
  • Study the alternating potential schedules

Well, they have implemented the exact idea that I had in mind. Here is the repo of their tool. It looks nicely implemented using ideas that I found well explained in building Go tool video and AST-based instrumentation blog.

Tools to build Go tools

Tracing tools like Jaeger, Prometheus, Zipkin and Datadog provides front-ends on top of general-purpose back-end distributed tracing engines like OpenTracing and Dapper. The provided APIs are rich enough to gather evidence and metrics for application performance and latency measurement. But I did not find any of these APIs useful to gather precise information regarding channels, goroutines and non-determinism of select (which are requirements for formal verification of Go applications).

Now that I did not find any of these (mostly commercial) tools helpful for my purpose (which is automatic instrumentation of Go for runtime verification), I found that Go provides packages that allow developers/researchers make their own customized tools:

  • Go/token: Defines constants representing the lexical tokens of the Go and their basic operations.
  • Go/scanner: Implements a scanner for source text. It takes a []byte as source which can be tokenized through repeated calls to the scan method.
  • Go/ast: Declares the types used to represent syntax trees for Go packages.
  • Go/parser: Implements a parser for Go source files. Input may be provided in a variety of forms; the output is an abstract syntax tree (AST) representing the Go source. In other words, parser uses scanner to iterate over tokens of a source code (e.g., line of code) and generates its equivalent AST.
  • Go/printer: Prints the equivalent Go source of an AST.

The basic idea for automatic instrumentation of Go source code is to:

  • take a Go source and pass it to the parser package
  • generate its AST
  • insert whatever instrumentation is needed to the AST
  • print the new source

Now just for fun of it, I want to write a Go program that analyze the source of another Go program and outputs all the channel transactions.

Analyzing channel communication via AST analysis

I just learned how to parse an AST using powerful Inspect function (doc):

Inspect traverses an AST in depth-first order: It starts by calling f(node); node must not be nil. If f returns true, Inspect invokes f recursively for each of the non-nil children of node, followed by a call of f(nil).

As an exercise, I wanted to check all channel, select and send expressions within a source code. First we read in the AST from the source code and use Inspect to parse all nodes and take out whatever we need using type assertion and switch-case statement.

func channelAnalyis(source string){
  fset := token.NewFileSet() // A fileset to store AST
  node, err := parser.ParseFile(fset, source, nil, parser.ParseComments) // Reads and parses the source, stores the AST root in node
  if err != nil {
      log.Fatal(err)
  }
  ast.Inspect(node, func(n ast.Node) bool {
		switch x := n.(type) { // check the type of each node,
		case *ast.ChanType:
      _ = x // blank identifier
      ct := n.(*ast.ChanType) // extract the structs := type assertion
      fmt.Printf("%s: \tChanTYPE\n%+v\n\n",fset.Position(n.Pos()),ct)
		case *ast.SendStmt:
      _ = x
      ss := n.(*ast.SendStmt)
      fmt.Printf("%s: \tSend\n%+v\n\n",fset.Position(n.Pos()),ss)
    case *ast.CommClause:
      _ = x
      cc := n.(*ast.CommClause)
      fmt.Printf("%s: \tSelect statment\n%+v\n\n",fset.Position(n.Pos()),cc)
		}
		return true
	})
}

Execute above function on “fibonacci.go” (borrowed from A Tour of Go) dumps below information:

fibonacci.go:5:24:      ChanTYPE
&{Begin:52 Arrow:0 Dir:3 Value:int}

fibonacci.go:9:3:       Select statment
&{Case:98 Comm:0xc00009e3c0 Colon:109 Body:[0xc00009a180]}

fibonacci.go:9:8:       Send
&{Chan:c Arrow:105 Value:x}

fibonacci.go:11:3:      Select statment
&{Case:130 Comm:0xc000070390 Colon:141 Body:[0xc0000703d0 0xc00000c440]}

fibonacci.go:19:12:     ChanTYPE
&{Begin:211 Arrow:0 Dir:3 Value:int}

fibonacci.go:20:15:     ChanTYPE
&{Begin:235 Arrow:0 Dir:3 Value:int}

fibonacci.go:25:3:      Send
&{Chan:quit Arrow:317 Value:0xc00000c7c0}

Conclusion

In this post, I have initiated my blog writing habits by summarizing/gathering the resources I have studied for a better guided direction in my research. The goal of my current line of research is to design a system to understand dynamic behavior of Go with respect to concurrency. I have started to get familiar with available packages in Go for automatic instrumentation and I wrote my first AST-Inspect code to study channels in Go. I have been strengthening my background on Go types, methods, interfaces and type assertions to be able to extract from/inject to AST of go source. In my next post, I will lead this thread into more details with a focus on designing a monitor for Runtime Verification system.


  1. Single Static Assignment (SSA) is a form property of Intermediate Representation (IR), which requires that each variable assigned precisely once. Go has a package that generates SSA IR from source.