Computer Science for the Slothful: 2017

Wednesday, July 12, 2017

Interfaces in action with the Go image package

The Go image package, besides being fun to use, is also an excellent example of interfaces in Go. Image formats like JPEG, GIF, and PNG are often very different from each other, and so are the different color formats you can use, like old-school 256 colors, red/green/blue/alpha, grayscale, and Y’CbCr. All these differences for working with effectively the same thing, a bunch of pixels on a rectangle, are where Go’s interfaces shine!

A quick look at interfaces

If you’re learning Go, interfaces can be a confusing kind of type to work with, but once you have them figured out, you get a lot of power for abstraction and object-oriented Go programming. An interface type, rather than telling you what its values are, tells you what its values can do; when you define an interface type, you define which methods there are on an interface, and then an object of any type can be used as that interface type as long as it has all the methods on the interface. So the familiar error interface

type error interface {
    Error() string
}

means that an error can be any concrete type as long as it has an Error method that returns a string; it can be as simple as just a struct with a string

type errorString struct {
    s string
}

func (e *errorString) Error() string {
    return e.s
}

or it could be a more detailed struct to give you information about when an error happened and how severe it is

type detailedError struct {
    message string
    severity string
    tm time.Time
}

func (d *detailedError) Error() string {
    return fmt.Sprintf("%s error '%s' happened at time %v",
        d.severity, d.message, d.tm)
}

That means you can write code that only cares about what an object’s methods are rather than what that object is made of. The sort package uses an interface to give you a generic way to sort anything, and because net/http defines ResponseWriter as an interface, developers can make their own ResponseWriter implementations for doing things like logging HTTP status codes.
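To make the sort example concrete, here is a minimal sketch of implementing sort.Interface. The sloth type, the byNaps type, and the sortedNames helper are made up for this illustration; only sort.Sort and the three methods it requires (Len, Less, Swap) come from the standard library.

```go
package main

import (
	"fmt"
	"sort"
)

// sloth is a hypothetical type used only for this example.
type sloth struct {
	name string
	naps int
}

// byNaps implements sort.Interface (Len, Less, Swap) for a slice of sloths.
type byNaps []sloth

func (s byNaps) Len() int           { return len(s) }
func (s byNaps) Less(i, j int) bool { return s[i].naps < s[j].naps }
func (s byNaps) Swap(i, j int)      { s[i], s[j] = s[j], s[i] }

// sortedNames returns the sloths' names ordered by ascending nap count.
func sortedNames(sloths []sloth) []string {
	sorted := append([]sloth(nil), sloths...) // copy so the caller's slice is untouched
	sort.Sort(byNaps(sorted))                 // sort.Sort only sees the three methods
	names := make([]string, len(sorted))
	for i, s := range sorted {
		names[i] = s.name
	}
	return names
}

func main() {
	fmt.Println(sortedNames([]sloth{{"Flash", 9}, {"Sid", 3}, {"Belt", 6}}))
}
```

sort.Sort never learns what a sloth is; it only calls the three methods, which is the whole point of the interface.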

One interface you’ll see a lot of in Go is the empty interface; an object of any type can be an interface{} as long as it implements all zero of its methods, which means all types are interface{}s. However, while it’s tempting to make a function that takes in any type by having it take in an interface{}, you should steer clear of that without a good reason. This is because to do anything useful with an interface{}, you’ll end up with a lot of complicated type-checking logic, which is tough to maintain and test and can lead to panics if you miss a case. But it does have its uses, like json.Marshal and Unmarshal converting almost any kind of struct to and from JSON.
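Here is a small sketch of the type-checking logic that paragraph warns about. The describe function is a hypothetical helper, not something from the standard library; it shows how every supported type needs its own case, and how anything you forget lands in default.

```go
package main

import (
	"errors"
	"fmt"
)

// describe is a hypothetical helper that accepts any value. Because the
// parameter is an interface{}, it has to inspect the dynamic type by hand.
func describe(v interface{}) string {
	switch x := v.(type) {
	case string:
		return fmt.Sprintf("string of length %d", len(x))
	case int:
		return fmt.Sprintf("int %d", x)
	case error:
		return "error: " + x.Error()
	default:
		// Every type we did not think of ends up here.
		return fmt.Sprintf("unsupported type %T", x)
	}
}

func main() {
	fmt.Println(describe("sloth"))
	fmt.Println(describe(3))
	fmt.Println(describe(errors.New("too sleepy")))
	fmt.Println(describe(2.5))
}
```

A type switch with a default case at least fails gracefully; an unchecked assertion like v.(string) on a non-string value would panic instead.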

The color interface

As I mentioned before, there are a lot of different color schemes to give you a lot of ways to say what color a pixel is, so we need some way of unifying them. Computer screens emit color in terms of red, green, blue, and alpha (alpha = how transparent a pixel is), so if you take any pixel on any kind of image, you should be able to see how red, green, blue, and see-through it is. And if you import the image/color package, you’ll see that that’s exactly what the Color interface does

type Color interface {
    RGBA() (r, g, b, a uint32)
}

BOOM! Whether you have a pixel in CMYK (cyan/magenta/yellow/black, like in comic books), RGBA (8 bits each telling you how red, green, blue, and opaque a pixel is), RGBA64 (that but with 16 bits per channel), or grayscale, those can all be represented with 32-bit integers. What struct fields a color has, how many bits it is, what it does with those bits to put colors onto our eyeballs, whether or not the color is capable of being transparent, we don’t care; if it can tell us those red, green, blue, and alpha values in 32 bits, it’s a Color!

So that means we can have types like

type RGBA64 struct {
    R, G, B, A uint16
}

type Gray struct {
    Y uint8
}

This flexibility also means we can represent pixels in as much or as little detail, and as much or as little memory, as we need. Images have a lot of pixels (which is why we have special super fast GPU hardware to render them all fast, especially in video games), so if you can represent an image in a space-efficient way, you can get enormous performance boosts from having to go to main memory less often.

The image interface

Now that we know what a color is, once we have a rectangle to put it on, we’ve got an image. And the image interface gives you that rectangle

type Image interface {
    ColorModel() color.Model
    Bounds() Rectangle
    At(x, y int) color.Color
}

ColorModel tells you what format the colors of the image’s pixels are in. Bounds tells you the rectangle the image covers, from its top-left corner (Min) to its bottom-right corner (Max); the top-left is not necessarily (0,0). And At tells you the color of the pixel at the (x, y) coordinates.

Note, by the way, that unlike in the Cartesian coordinate planes from math class, the origin of an image is at the top-left and the Y-axis points down. So the pixel at (0, 100) is 100 pixels below, not above, the pixel at (0, 0).

Just like how there are struct implementations of colors, there are struct implementations of images corresponding to colors, like image.RGBA, image.RGBA64, image.Gray, image.CMYK, etc, with different formats for each color. Let’s look at RGBA64, in particular its Pix field:

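type RGBA64 struct {
    // Pix holds every pixel's bytes, row after row.
    Pix []uint8
    // Stride is the number of bytes between vertically adjacent pixels.
    Stride int
    // Rect is the image's bounds.
    Rect Rectangle
}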

As you can see, there’s no two-dimensional array/slice of slices here. Not only that, but there’s no field of type color.RGBA64. We just have a slice of all the bytes that go into each pixel, and we fetch those bytes with RGBA64.At(x, y). This means we get the space efficiency and cache locality of having all the pixels’ data close together in memory, but with the convenience of working with pixels at their X and Y coordinates.

These image-implementing structs also have a Set method for setting one of the pixels, which we can pass any color.Color into, and then they take it from there, converting the color to their own format. No direct messing with the Pix slice to make our code look grotty!

Turn a sloth purple with image/jpeg

We have the Image interface, we have the Color interface, and we know all these image implementations like Gray and RGBA64. But what’s a JPEG in the image package?

The implementations of different image file formats are kept in jpeg, gif, and png subpackages. And once we import one of those subpackages, we can pull a JPEG, GIF, or PNG out of the bytes from an io.Reader (a file, an input stream like your camera, or anything else you can read bytes of a JPEG from) and get an image.Image to work with. Then we can convert that image.Image to an actual file format with jpeg.Encode, gif.Encode, or png.Encode. Let’s try that by making this sloth picture purple!

Download the sloth to a folder as sloth.jpg, and then make a file in the same folder called main.go:

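package main

import (
    "image"
    "image/color"
    "image/jpeg"
    "log"
    "os"
)

func main() {
    // Open the source JPEG read-only.
    f, err := os.OpenFile("./sloth.jpg", os.O_RDONLY, 0666)
    if err != nil {
        log.Fatalf("could not open sloth.jpg - %v", err)
    }
    defer f.Close()

    // image.Decode understands JPEGs because importing image/jpeg registers its decoder.
    img, _, err := image.Decode(f)
    if err != nil {
        log.Fatalf("could not decode sloth.jpg - %v", err)
    }

    // Recolor the image, then write it to standard output as a JPEG.
    img = purple(img)
    if err := jpeg.Encode(os.Stdout, img, nil); err != nil {
        log.Fatalf("error encoding the new JPEG: %v", err)
    }
}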

We open sloth.jpg with os.OpenFile to get an os.File (which we can pass to image.Decode since File implements io.Reader), we turn the image purple with a purple function we will define, and then we convert it to a JPEG by jpeg.Encodeing it to standard output. Now all we need to do is implement purple!

func purple(img image.Image) image.Image {
    dst := image.NewRGBA(img.Bounds())
    b := dst.Bounds()
    for y := b.Min.Y; y < b.Max.Y; y++ {
        for x := b.Min.X; x < b.Max.X; x++ {
            px := color.RGBAModel.Convert(img.At(x, y)).(color.RGBA)
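            // Check before adding: px.R and px.B are uint8s, so px.R+50
            // would wrap around instead of ever exceeding 0xFF.
            if px.R <= 0xFF-50 {
                px.R += 50
            } else {
                px.R = 0
            }

            if px.B <= 0xFF-50 {
                px.B += 50
            } else {
                px.B = 0
            }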
            dst.Set(x, y, px)
        }
    }
    return dst
}

First we make a new RGBA image with NewRGBA, passing in the original image’s Bounds so it has the same top-left and bottom-right coordinates as the original. Then, we loop through each pixel, from the top-left (b.Min.X, b.Min.Y) to bottom-right (b.Max.X, b.Max.Y), and inside the loop is our color conversion code...

We convert each pixel to RGBA with color.RGBAModel, which lets us convert any Color to an RGBA. So we do color.RGBAModel.Convert(img.At(x, y)) to get the pixel at the current coordinates and convert it to an RGBA. Convert hands the result back as a color.Color interface value, so we finish with the type assertion .(color.RGBA) to get the concrete RGBA struct, and now we’ve got an RGBA we can work with whose data come from the original pixel!

Now we just make each pixel 50 values redder and 50 values bluer. However, in an RGBA we only get 8 bits each for red, green, and blue, so if we try to push a color value higher than 255, we get an integer overflow. We could stick to making the reddest and bluest pixels get red/blue values of 255, which does make the picture more purple, but instead let’s give those pixels values of 0 for a fun special effect.
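To compare the options side by side, here is a small sketch with three made-up helper functions: one that lets the uint8 addition wrap around, one that clamps at 255, and one that drops to 0 like our special effect. Note that the overflow check has to happen before the addition, because a uint8 sum can never be larger than 255.

```go
package main

import "fmt"

// wrapAdd shows plain uint8 addition: results past 255 wrap around.
func wrapAdd(v, delta uint8) uint8 {
	return v + delta
}

// clampAdd stops at 255 instead of wrapping.
func clampAdd(v, delta uint8) uint8 {
	if v > 0xFF-delta { // the sum would not fit in 8 bits
		return 0xFF
	}
	return v + delta
}

// zeroAdd resets to 0 when the sum would not fit, like the special effect.
func zeroAdd(v, delta uint8) uint8 {
	if v > 0xFF-delta {
		return 0
	}
	return v + delta
}

func main() {
	for _, v := range []uint8{100, 250} {
		fmt.Println(v, wrapAdd(v, 50), clampAdd(v, 50), zeroAdd(v, 50))
	}
}
```

All three helpers are hypothetical names for illustration; none of them exists in the standard library.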

With the red and blue values incremented, we run dst.Set(x, y, px) to give our new image a new pixel. We then return that image, which in main gets encoded back into a JPEG. To see this in action, in the terminal run go run main.go > purplesloth.jpg, and if you open purplesloth.jpg you should see a purple sloth.

As you can see, the image/color and image packages give us a useful abstraction for treating images as rectangles with colors on them, without having to think as much as you would in a language like C on everything you need to do with the bits representing the colors. Have fun manipulating images, and STAY SLOTHFUL!

Sunday, January 15, 2017

How to crush it in COMP 105

Four years ago, I took COMP 105 (Programming Languages) in my last semester at Tufts, and the first half of that class almost threw my whole final semester into disarray. I was regularly starting the projects for the class late and working on them solo, a lot of the more theoretical questions on homework went unanswered, and any time I caught up in 105, I fell behind in my other classes. All-nighters were both frequent and ineffective.

Despite how the first half went, the second half was an epic rebound to the proudest C I've ever gotten on something, and 105 turned out to be incredibly useful for learning new languages. You really can get a lot out of 105 even if you have no plans to be an academic, so I'm writing this blog post to give some tips on how to get the most out of this class and become an excellent polyglot programmer. The tips all seem pretty obvious, but I saw firsthand that following them was pivotal to changing 105 from a nightmare class to one of the coolest classes at Tufts!

Talk to everyone

Easily the #1 piece of advice I've got for anyone in 105 is to talk to everyone who has something to do with the class. The biggest mistake I made in the beginning of 105 was that I didn't start doing pair programming on 105 projects until after the midterm. As a result, it took forever to understand the material and I couldn't finish most of the assignments, leaving quite a few questions unanswered with no credit. Talking to everyone, however, was the crown jewel of my mid-semester comeback.

While COMP 40 is considered the pair programming class in Tufts CS, I would actually say 105 is more deserving of that title. 40 has long assignments, but not much theory, so a 40er could solo pretty much any of COMP 40's projects with a few extra hours. 105, on the other hand, is more theoretical and has more new concepts to delve into, like operational semantics and the theory behind type systems. Because of that, in addition to a second set of eyes for faster bug-spotting, pair programming adds a second person to think about the concepts with and trade ideas. That'll more than double the speed of you and your partner mastering the material and figuring out how to finish the projects as your two brains throw the wrong ideas at each other until they turn into the right ideas.

Besides pair programming, “talk to everyone” also applies to showing up at Halligan for TA and professor office hours. The TAs are all former 105 students, and unlike in 40, they answer questions from everyone at once, rather than one programming pair at a time. As a result, when I started going to office hours I was not only getting answers to my own questions, but also hearing answers to other people's questions, which were in my blind spots. Not to mention, by going to TA office hours you'll get to know more of your fellow Tufts CS students, so that's another win.

WARNING: While communication was key to doing well in 105, make sure you read up on the academic integrity rules for the class! The big ones I can remember are that you can't show your code to anyone else or read anyone else's code, and you have to cite anyone you shared ideas with when you send your homework in, but obviously, follow all the rules so you have a clean transcript.

Keep up the habits from 40

If you already took 40, basically anything that helped you succeed there also works in 105. As I mentioned, pair programming and working in Halligan consistently save you a lot of time. Besides that, there's a big takeaway from COMP 40's design documents. They're not part of your grade in 105 like they were in 40, but in 105 (and on the job), taking the time to clearly specify what you want your code to do is a “secret sauce” to rival the peanut sauce from Stir Fry Night at Carmichael. That planning helps as you're thinking about how to solve the problems on the project and figuring out whether or not your code works. If you haven't taken 40 yet, you can also get in the habit of specification by getting your feet wet learning test-driven development (TDD). Many popular languages have frameworks for that, like JavaScript (Mocha), Go (standard testing library), and Ruby (Minitest). Knowing TDD, by the way, also happens to be a great skill to have for coding on the job!

Another important habit 40 taught was to document your code well. Documentation is part of your grade on 40 and 105's coding assignments, so if anything it gives you more points. But like with specification, it forces you to think more about the code you're writing during the period of time where you're stuck figuring out how to make it work. Moreover, if you can only get some parts of a programming problem solved, documentation will help the TAs understand what your thought process was as you were writing your code.

Finally, the design documents and pair programming in 40 made it really hard not to start early. As Norman said in the first lecture when I was there, starting early is more useful in 105 than in a lot of other classes. I'm totally with Norman on that; since a lot of 105 questions are theoretical, you'll want those questions to enter your head as soon as possible. That'll get your brain to start working on them, and you'll also figure out what you're stuck on so you can ask good questions at professor and TA office hours!

Read before lecture so you know what the experts are saying

Something else I found when I was in 105 is that any time I did the reading before class, I was able to follow along. If I hadn't done the reading beforehand, though, I would fall off the lecture about halfway through, which meant I had to do a whole lot more reading afterward to pound the ideas into my head. Even if the material doesn't make sense the first time you read it, it'll make more sense hearing it in class if you're seeing it the second time there.

If you're taking 105 as an undergrad, you'll probably be seeing a lot of the concepts in the class for the first time, like the theory and mathematical notation of how programming languages work, and how interpreters in those languages work to give you features like higher-order functions, type systems, pattern matching, and type inference. Your professor will be someone who really knows this material well, so you'll want to be able to follow along so you can learn the material from an expert. While it can be tough to find the time, if you invest in doing the reading ahead of time, you will have already wrestled with some of the theory and had it in your head going into lecture. Then even if you don't completely have the lay of the land on the material, you'll still be able to pick up where you left off.

Where to go after 105

Like I said at the beginning of this blog post, my favorite thing about 105 is that it sets you up to quickly learn any new programming language you want, and with so many languages popping up, it's a great time to do just that. I highly recommend learning more functional programming with the knowledge you will have gotten in 105 since it's getting popular even in imperative languages. On my job at Diffeo, which uses Go, our utility libraries include toolbelts of functions that take in other functions as parameters for actions like database interactions and caching. Functional features are also being added in newer versions of Java and C++, so functional programming is here to stay.

Haskell is the main language I recommend learning straight out of 105 because it has all the best features of Standard ML like pattern matching and type inference! Also, while Haskell is stereotyped as something only academics use, that's rapidly changing; startups like Helium, an Internet of Things startup, use Haskell in production, so if you thought ML's type system was cool, don't be discouraged by Haskell's reputation. You can check out the video below to hear about how Helium uses Haskell in production.

If you were new to functional programming before 105, this class also teaches you how to tackle new programming paradigms, so you can use that to learn concurrent programming, where you write code to do things simultaneously. Functional languages like Erlang and Elixir are natural fits for concurrency since they offer the safety of working with immutable data. But these days concurrency is becoming easier in imperative languages too. While pthread.h is a nightmare to work with in C, Go uses a “share memory by communicating” style for intuitive concurrency, and Rust's ownership system guarantees safety when working with your concurrent code, plus a sweet inferred, generic type system.

By the way, if you're gonna be at Tufts or close to Boston after 105, you're in a city with AWESOME programming communities. You can find a Boston group on Meetup for just about any language you want to learn, so you can meet professionals in your favorite languages, learn how to code in them like a pro, get your own projects off the ground, and even give tech talks. If you're learning a language, going to these meetups is a great way to really bring the language to life!

If you're in 105, while it'll be a hard class, the challenge will be worth it for becoming a polyglot programmer. Make yourself a regular at office hours, pair program, do the reading in advance, and do what you did to succeed in 40, and the class will get a whole lot easier, and you'll get more out of it. Best of luck, Jumbos, and stay slothful!
