Professor Spear's Blog: Go

Lehigh is part of the KEEN network, an organization that promotes more entrepreneurial-minded learning in engineering curricula. This summer, as a KEEN project, Corey Caplan and I are designing some fun new courseware for our Software Engineering course.

Our intention is to do everything in Java within the course. But when I need to figure out something about web backends in a hurry, I'd rather use Go. Today was one of those times.

Without going into too much detail, I have a web app that I wanted to stop running via localhost and start running on Heroku. (If you're thinking this means that our Software Engineering students are going to start learning how to deploy their apps on Heroku's PaaS, you're right!) Below is something of a recipe for how I got it to work.

Confession: this turned out to be a lot harder than I expected, and it was probably my fault.

Caveat: the recipe below is possibly a good bit more complex than it needs to be... but it works, and seems to be repeatable.

Background: I had an app that looked like this:

/src/admin/*.go -- a simple admin program
/src/appserver/*.go -- the code for the server
/web/... -- the entire web frontend, designed as a single-page webapp
/env -- a script to set the environment variables when running the app locally
/setgopath.sh -- a script to set the GOPATH to the root of this project

There were a few more things in the folder, like a .gitignore, but they aren't important to this discussion.

Note, too, that I like to have a different GOPATH for each Go project, instead of checking them all into the same place. I organize my work in folders: teaching, research, etc. Using Visual Studio Code, I can just open a bash prompt, source my setgopath.sh script, type "code &", and I've got an IDE, a shell, and everything else I need.

Dependencies: Here's the first reason why this app was interesting: it uses Google's OAuth provider for authentication, and it connects to a MongoDB instance. There are four dependencies that I usually had to 'go get':

go get golang.org/x/oauth2
go get golang.org/x/oauth2/google
go get gopkg.in/mgo.v2
go get gopkg.in/mgo.v2/bson

And my code is in a bitbucket repository. Let's say it's bitbucket.org/me/myapp. When I started, I had a checkout of myapp on the desktop. So there was a folder ~/Desktop/myapp, in which was a .git/ folder and all the stuff mentioned above.

Restructuring: This was probably overkill, but it worked. I started by creating a new folder on the desktop called myapp_heroku. In it, I made a src/bitbucket.org/me folder, and I moved myapp/ from the Desktop to that place. I also changed my setgopath.sh script, so that Desktop/myapp_heroku is the new GOPATH.

Note: now when I'm working on this project, I traverse all the way into the src/bitbucket.org/me/myapp folder, and I work there, but when I do a 'go install' or a 'go get', things are placed a few levels up in the directory tree.

After restructuring, I removed some cruft from the build folder. Previously, there were bin/ and pkg/ folders in myapp... I got rid of them. I also removed any source folders that were fetched via 'go get', because dependent files go elsewhere now.

Using godep: Our goal in this step is to get all the code we depend on, in a manner that will ensure that Heroku grabs the same code when it builds updated versions of the app.

This is where things became unintuitive. Go, of course, doesn't have any built-in mechanism for managing dependencies. Godep essentially just vendors everything into the source tree, which I don't particularly like, but it suffices.

Naturally, we need to get godep first, and add it to our path:

go get github.com/tools/godep
export PATH=$PATH:$GOPATH/bin

With that in order, we should restructure our repository ever so slightly:

git mv src cmd
git mv cmd/appserver cmd/myapp

I don't know why these steps were necessary. But stuff really didn't work until I made both of those changes. The Heroku docs obliquely state the first requirement, without any explanation, and the second requirement (which is that the main program you want to run should have the same name as your repository) was just a fact of how the tutorials I read all were done. None of those tutorials had multiple executables in their projects.

(Update: I might not have needed to rename src/appserver.)

We can use godep to fetch the packages on which we depend:

godep get golang.org/x/oauth2
godep get golang.org/x/oauth2/google
godep get google.golang.org/appengine
godep get gopkg.in/mgo.v2
godep get gopkg.in/mgo.v2/bson

Oddly, when fetching oauth2, we get an error that appengine isn't available. For me, doing a recursive get (godep get golang.org/x/oauth2/...) didn't work. So I manually got one more package: google.golang.org/appengine, the third line above.

Now we can take the 'vendoring' step:

godep save ./...

And voila! There's a folder called 'vendor', with all of the code we depend upon, and there's also a Godeps folder. Too bad it won't work.

The problem is that we're going to push our code to a Heroku "dyno" (think "container") and it's going to build the code. But the mgo.v2 library's optional sasl support will be built when we push to Heroku. That support depends on libsasl-dev being available on the host machine at build time. The image for the Heroku dyno I'm using doesn't have libsasl-dev. So if we were to push this repository to Heroku, it wouldn't build, and the code would be rejected.

The fix is easy: just delete the sasl folder from the vendored mgo.v2:

rm -rf vendor/gopkg.in/mgo.v2/internal/sasl/

Ugly, but it works. And indeed, we're close to having everything work at this point. To test that our vendoring is good, try to locally build the project:

godep go install -v bitbucket.org/me/myapp/cmd/myapp

The code should build... and it should use the vendored versions of the libraries.

Heroku Stuff: Heroku has a few more requirements that we need to satisfy. First, we need a file called Procfile. Its contents will just be "web: myapp". Second, we need an app.json file. Its contents are a bit more complex, though still straightforward:

{
  "name": "myapp",
  "description": "MyApp App Server",
  "keywords": [
    "go",
    "MyApp"
  ],
  "image": "heroku/go:1.6",
  "mount_dir": "src/bitbucket.org/me/myapp",
  "website": "https://bitbucket.org/me/myapp",
  "repository": "https://bitbucket.org/me/myapp"
}
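The post doesn't show the server code itself, so here is a minimal sketch of one detail that matters once the Procfile declares a "web" process: Heroku passes the port a web dyno should listen on in the PORT environment variable. The listenAddr helper and the 8080 fallback for local runs are assumptions of this sketch, not something taken from the app.

```go
package main

import (
	"fmt"
	"os"
)

// listenAddr builds the address the server should bind to. Heroku tells a
// web dyno which port to use through the PORT environment variable; the
// 8080 fallback for local runs is an assumption of this sketch.
func listenAddr(getenv func(string) string) string {
	port := getenv("PORT")
	if port == "" {
		port = "8080"
	}
	return ":" + port
}

func main() {
	// A real server would hand this address to http.ListenAndServe.
	fmt.Println("would listen on", listenAddr(os.Getenv))
}
```

Passing the lookup function in as a parameter, rather than calling os.Getenv inside the helper, makes it easy to exercise both branches without touching the real environment.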

Now we can actually create the heroku app. I was working in Git Bash for Windows, which isn't supported by the Heroku toolbelt. So I had to switch to the command prompt, and log in:

cd \Users\Me\Desktop\myapp_heroku\src\bitbucket.org\me\myapp
heroku login
heroku apps:create myapp

At this point 'heroku local' should work. To push to Heroku, we first 'git add' the vendor folder and all of our other recent additions, and then 'git commit'. Then we can 'git push heroku master'. This takes longer than a usual git push, because it doesn't finish until Heroku is done building and verifying our program.

Are We Done Yet? Not really. If you 'heroku run bash', you can see that bin/admin is present in the dyno, as is bin/myapp. That's a good sign. But our app isn't running yet. One issue I had was that I needed to manually start the app:

heroku ps:scale web=1

The other issue is that we didn't yet set up the environment variables on Heroku. We need to 'heroku config:set DBCONNECTSTRING=...' in order to let our app know how to find our cloud-hosted MongoDB instance, we need to set some OAUTH secrets, and we need to set environment variables for whatever else the app is expecting. But that depends on the app, not on Heroku, so I'm not going to discuss it here.
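For what it's worth, here is a rough sketch of how an app might check its environment at startup, so that a forgotten 'heroku config:set' shows up as one clear message. DBCONNECTSTRING is the name used above; OAUTH_CLIENT_ID and OAUTH_CLIENT_SECRET are hypothetical names I made up for the illustration, as is the Config type.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// Config holds the settings the app server reads from its environment.
// DBCONNECTSTRING comes from the post; the two OAuth names are hypothetical.
type Config struct {
	DBConnectString   string
	OAuthClientID     string
	OAuthClientSecret string
}

// loadConfig reads each variable through getenv and reports every missing
// name in a single error, so a misconfigured dyno fails with one message.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		DBConnectString:   getenv("DBCONNECTSTRING"),
		OAuthClientID:     getenv("OAUTH_CLIENT_ID"),
		OAuthClientSecret: getenv("OAUTH_CLIENT_SECRET"),
	}
	var missing []string
	if cfg.DBConnectString == "" {
		missing = append(missing, "DBCONNECTSTRING")
	}
	if cfg.OAuthClientID == "" {
		missing = append(missing, "OAUTH_CLIENT_ID")
	}
	if cfg.OAuthClientSecret == "" {
		missing = append(missing, "OAUTH_CLIENT_SECRET")
	}
	if len(missing) > 0 {
		return Config{}, fmt.Errorf("missing environment variables: %s", strings.Join(missing, ", "))
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Println("configuration error:", err)
		return
	}
	fmt.Println("configuration loaded; connect string length:", len(cfg.DBConnectString))
}
```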

Wrap-Up: It took longer than I expected to get this to work. Since I'll probably have to do it again, I thought it would be worth writing up the steps I took. If this is helpful to you, too, please leave a comment and let me know.

Mail merge is a funny thing. Once a year, I use "mail merge" in Microsoft Office to produce envelopes that are physically mailed. Mail merge is really good for that... you can make a single word doc that is easy to print, and then you've got all the physical documents you need, ready to be taken to a physical post office.

As a professor, there are many, many times that I need to do a mail merge that results in an email being sent. Partly because I do a lot of work in Linux environments, and partly because of other oddities of how I like to work, I usually have a hybrid Excel-then-text workflow for this task.

The first step is to produce a spreadsheet, where each column corresponds to the content I want placed into an email. Ultimately, I save this as a '.csv' file. Importantly, I make sure that each column corresponds to text that requires no further edits. If I'm sending out grades, I'll store the recipient's address followed by three columns: your sum, the total, and your average. You could imagine something like the following:

bob@phonyemail, 15, 20, 75%

sue@phonyemail, 19, 20, 95%

...
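To show what those lines look like once they're read into a program, here is a small sketch using Go's encoding/csv package. The parseRows helper is just a name for this illustration, and turning on TrimLeadingSpace is my assumption about how to handle the space after each comma in the sample rows.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// parseRows turns csv text into one slice of fields per line.
func parseRows(data string) ([][]string, error) {
	reader := csv.NewReader(strings.NewReader(data))
	// The sample rows put a space after each comma; trimming it keeps
	// the fields free of leading blanks.
	reader.TrimLeadingSpace = true
	return reader.ReadAll()
}

func main() {
	rows, err := parseRows("bob@phonyemail, 15, 20, 75%\nsue@phonyemail, 19, 20, 95%\n")
	if err != nil {
		fmt.Println("csv error:", err)
		return
	}
	// Print each row as a quoted list so the field boundaries are visible.
	for _, row := range rows {
		fmt.Printf("%q\n", row)
	}
}
```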

The third step (yes, I know this sounds like Underpants Gnomes) is that I have one file per email, saved with a name corresponding to the email address, ready to be sent, and I use a quick shell script like this to send the files:

for f in *; do mutt -s "Grade Report" -c myemail@phony.net "$f" < "$f"; done

That is "for each file, where the name happens to be the same as the full email address of the recipient, send an email with the subject 'Grade Report', cc'd to me, to the person, and use the content of the corresponding file as the content of the email".

So far, so good, right? What about phase two? I'm pretty good with recording emacs macros on the fly, so I used to just record a macro of me turning a single line of csv into a file, and then replay that macro for each line of the csv. It worked, it took about 10 minutes, but it was ugly and error-prone.

I recently decided to start learning Google Go (in part because one of the founders of a really cool startup called Loopd pointed out that native code performance can make a huge difference when you're doing real-time analytics on your web server). Since I've simplified my problem tremendously (remember: the csv has text that's ready to dump straight into the final email), the Go code to make this work is pretty simple. Unfortunately, it wasn't as simple to write as I would have hoped, because the documentation for templates is lacking.

Here's the code:

package main

import (
	"encoding/csv"
	"flag"
	"io"
	"os"
	"text/template"
)

// TWrap wraps an array of strings as a struct, so we can pass it to a template
type TWrap struct{ Fields []string }

// writeRow applies the template to one row and writes the result to path.
// Closing the file here, instead of deferring inside main's loop, means we
// don't hold every output file open until the program exits.
func writeRow(tpl *template.Template, path string, row []string) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	return tpl.Execute(f, TWrap{Fields: row})
}

// Parse a CSV so that each line becomes an array of strings, and then use
// the array of strings with a template to generate one file per csv line
func main() {
	// parse command line options
	csvname := flag.String("c", "a.csv", "The csv file to parse")
	tplname := flag.String("t", "a.tpl", "The template to use")
	fnameidx := flag.Int("i", 0, "Column of csv to use as output file basename")
	fnamesuffix := flag.String("s", "out", "Output file suffix")
	flag.Parse()

	// load the text template
	tpl, err := template.ParseFiles(*tplname)
	if err != nil {
		panic(err)
	}

	// load the csv file
	file, err := os.Open(*csvname)
	if err != nil {
		panic(err)
	}
	defer file.Close()

	// parse the csv, one record at a time
	reader := csv.NewReader(file)
	reader.Comma = ','
	for {
		// get next row... exit on EOF
		row, err := reader.Read()
		if err == io.EOF {
			break
		} else if err != nil {
			panic(err)
		}
		// apply template to row, dump to the output file for that row
		if err := writeRow(tpl, "./"+row[*fnameidx]+"."+*fnamesuffix, row); err != nil {
			panic(err)
		}
	}
}

This lets me make a "template" file, merge the csv with it, and output one file per csv row. Furthermore, I can use a specific column of the csv to dictate the filename, and I can provide extensions (which makes the shell script above a tad trickier, but it's worth it).

The code pulls in the csv row as an array of strings. That being the case, I can wrap the array in a struct, and then access any array entry via {{index .Fields X}}, where X is the array index.

To make it a tad more concrete, here's a sample template:

Programming Assignment #1 Grade Report

Student Name:  {{index .Fields 2}} {{index .Fields 1}}
User ID:       {{index .Fields 0}}
Overall Score: {{index .Fields 3}}/100
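If you want to see the template machinery without any files involved, here is a minimal sketch that parses template text from a string and renders one row into memory. The render helper, the rowData type (which plays the role of the wrapper struct in the program), and the sample row are all made up for this illustration.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// rowData plays the role of the wrapper struct in the program above.
type rowData struct{ Fields []string }

// render parses tplText and executes it against one row, returning the text.
func render(tplText string, row []string) (string, error) {
	tpl, err := template.New("report").Parse(tplText)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tpl.Execute(&out, rowData{Fields: row}); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	// Hypothetical row: user ID, last name, first name, score.
	row := []string{"bob", "Smith", "Bob", "87"}
	text, err := render("Student Name:  {{index .Fields 2}} {{index .Fields 1}}\n"+
		"User ID:       {{index .Fields 0}}\n"+
		"Overall Score: {{index .Fields 3}}/100\n", row)
	if err != nil {
		fmt.Println("template error:", err)
		return
	}
	fmt.Print(text)
}
```

Rendering into a strings.Builder is handy for quick experiments; the program itself passes an open file to Execute instead, since a file is also an io.Writer.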

The script uses command line arguments, so it's 100% reusable. Just provide the template, the csv, the column to use as the output file name, and the output file extension.

The code isn't really all that impressive, except that (a) it's short, and (b) it is almost as flexible as code in a scripting language, yet it runs natively. The hardest part was finding good examples online for how to get a template to write to a file. It's possible I'm doing it entirely wrong, but it seems to work. If any Go expert wants to chime in and advise on how to use text templates or the csv reader in a more idiomatic way, please leave a comment.


Tuesday, June 21, 2016

Launching a Go App on Heroku

Thursday, October 29, 2015

E-Mail Merge in Go