Thursday, October 29, 2015

E-Mail Merge in Go

Mail merge is a funny thing.  Once a year, I use "mail merge" in Microsoft Office to produce envelopes that are physically mailed.  Mail merge is really good for that: you can make a single Word doc that is easy to print, and then you've got all the physical documents you need, ready to be taken to a physical post office.

As a professor, there are many, many times that I need to do a mail merge that results in an email being sent.  Partly because I do a lot of work in Linux environments, and partly because of other oddities of how I like to work, I usually have a hybrid Excel-then-text workflow for this task.

The first step is to produce a spreadsheet, where each column corresponds to the content I want placed into an email.  Ultimately, I save this as a '.csv' file.  Importantly, I make sure that each column holds text that requires no further edits.  If I'm sending out grades, I'll store the recipient's email address followed by three columns: your sum, the total, and your average.  You could imagine something like the following:
bob@phonyemail, 15, 20, 75% 
sue@phonyemail, 19, 20, 95%
...
The third step (yes, I know this sounds like Underpants Gnomes) is that I have one file per email, saved with a name corresponding to the email address, ready to be sent, and I use a quick shell script like this to send the files:


for f in *; do mutt -s "Grade Report" -c myemail@phony.net "$f" < "$f"; done


That is "for each file, where the name happens to be the same as the full email address of the recipient, send an email with the subject 'Grade Report', cc'd to me, to the person, and use the content of the corresponding file as the content of the email".

So far, so good, right?  What about phase two?  I'm pretty good with recording emacs macros on the fly, so I used to just record a macro of me turning a single line of csv into a file, and then replay that macro for each line of the csv.  It worked, it took about 10 minutes, but it was ugly and error-prone.

I recently decided to start learning Google Go (in part because one of the founders of a really cool startup called Loopd pointed out that native code performance can make a huge difference when you're doing real-time analytics on your web server).  Since I've simplified my problem tremendously (remember: the csv has text that's ready to dump straight into the final email), the Go code to make this work is pretty simple.  Unfortunately, it wasn't as simple to write as I would have hoped, because the documentation for templates is lacking.

Here's the code:


package main

import ("encoding/csv"; "flag"; "io"; "os"; "text/template")

/// Wrap an array of strings as a struct, so we can pass it to a template
type TWrap struct { Fields *[]string }

/// Parse a CSV so that each line becomes an array of strings, and then use
/// the array of strings with a template to generate one file per csv line
func main() {
 // parse command line options
 csvname := flag.String("c", "a.csv", "The csv file to parse")
 tplname := flag.String("t", "a.tpl", "The template to use")
 fnameidx := flag.Int("i", 0, "Column of csv to use as output file basename")
 fnamesuffix := flag.String("s", "out", "Output file suffix")
 flag.Parse()

 // load the text template
 tpl, err := template.ParseFiles(*tplname)
 if err != nil { panic(err) }

 // load the csv file
 file, err := os.Open(*csvname)
 if err != nil { panic(err) }
 defer file.Close()

 // parse the csv, one record at a time
 reader := csv.NewReader(file)
 reader.Comma = ','
 for {
  // get next row... exit on EOF
  row, err := reader.Read()
  if err == io.EOF { break } else if err != nil { panic(err) }
  // create output file for row
  f, err := os.Create("./" + row[*fnameidx] + "." + *fnamesuffix)
  if err != nil { panic(err) }
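  // apply template to row, dump to file; close the file right away, since a
  // defer inside the loop would keep every file open until main returns
  err = tpl.Execute(f, TWrap{Fields:&row})
  f.Close()
  if err != nil { panic(err) }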
 }
}


This lets me make a "template" file, merge the csv with it, and output one file per csv row.  Furthermore, I can use a specific column of the csv to dictate the filename, and I can provide extensions (which makes the shell script above a tad trickier, but it's worth it).
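
Because the output files now carry a suffix, the file name is no longer exactly the recipient's address.  Here is a minimal Go sketch of the bookkeeping that implies, assuming the default "out" suffix: strip the suffix to recover the address, then build the same mutt arguments the shell loop uses.  The helper names recipientFor and muttArgs are hypothetical, the file name is made up, and the sketch only prints the command rather than running it.

```go
package main

import (
	"fmt"
	"strings"
)

// recipientFor recovers the email address from an output file name by
// removing the "." + suffix that the merge program appended.
// (Hypothetical helper, not part of the program above.)
func recipientFor(filename, suffix string) string {
	return strings.TrimSuffix(filename, "."+suffix)
}

// muttArgs builds the argument list used by the shell loop in this post:
// subject, cc, then the recipient. The message body would go to mutt on stdin.
func muttArgs(subject, cc, recipient string) []string {
	return []string{"-s", subject, "-c", cc, recipient}
}

func main() {
	name := "bob@phonyemail.out" // made-up output file name
	to := recipientFor(name, "out")
	args := muttArgs("Grade Report", "myemail@phony.net", to)
	// Print the command instead of executing it.
	fmt.Println("mutt", strings.Join(args, " "), "<", name)
}
```

The same idea works in the shell loop itself; the point is only that the recipient has to be derived from the file name once a suffix is involved.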

The code pulls in the csv row as an array of strings.  That being the case, I can wrap the array in a struct, and then access any array entry via {{index .Fields X}}, where X is the zero-based array index.

To make it a tad more concrete, here's a sample template:


Programming Assignment #1 Grade Report

Student Name:  {{index .Fields 2}} {{index .Fields 1}}
User ID:       {{index .Fields 0}}
Overall Score: {{index .Fields 3}}/100

The program uses command line arguments, so it's 100% reusable.  Just provide the template, the csv, the column to use as the output file name, and the output file extension.

The code isn't really all that impressive, except that (a) it's short, and (b) it is almost as flexible as code in a scripting language, yet it runs natively.  The hardest part was finding good examples online for how to get a template to write to a file.  It's possible I'm doing it entirely wrong, but it seems to work.  If any Go expert wants to chime in and advise on how to use text templates or the csv reader in a more idiomatic way, please leave a comment.
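
One possible simplification, offered as a sketch rather than as the idiomatic answer: the template's index function can work on the slice itself, so the row can be passed to Execute directly and addressed as {{index . N}}, with no wrapper struct.  The render helper and the sample row below are hypothetical, and templates written this way would use "." where the ones above use ".Fields".

```go
package main

import (
	"os"
	"strings"
	"text/template"
)

// render executes tplText with the row itself as the template's dot,
// so fields are reached with {{index . N}} and no wrapper struct is needed.
func render(tplText string, row []string) (string, error) {
	tpl, err := template.New("row").Parse(tplText)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tpl.Execute(&sb, row); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// Made-up row: user id, last name, first name, score.
	out, err := render("User ID: {{index . 0}}, Score: {{index . 3}}/100\n",
		[]string{"bob", "Smith", "Bob", "75"})
	if err != nil {
		panic(err)
	}
	os.Stdout.WriteString(out)
}
```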
