Thursday, September 18, 2014

Andy's Node Tutorial Part 2: npm, express.js, and the Node community Part 1

Now that you've tried making some pure-Node servers and seen the basics of asynchronous code, it's time to get into the fun stuff. Most of this tutorial will be about programming with Express. The other really important thing in this tutorial is that we are now starting to use modules that are not part of the core of Node.js but were instead made by other people in the programming community, and we will be getting those modules with an awesome tool called the npm package manager.

Getting Express with npm

One really awesome thing about using the npm package manager is that it comes bundled with Node, so if you didn't have any issues installing Node, you shouldn't need to jump through any more "install stuff" hoops to get it. Just type npm in the command line to run this software and you should get a message like this:

Now to get Express. First make a directory called express-1 and in it create a file called package.json and put this in the file:

{
  "name": "first-express-app",
  "version": "0.0.0",
  "dependencies": {
    "express": "4.8.7"
  }
}

package.json is a JSON file where you can describe things like the name and version of your project, what modules it uses as dependencies, who is contributing to the project, what license your project is under, and a bunch more stuff. The only fields that are required in package.json are the name and the version.

 
Besides describing your project, though, package.json is also used for communicating with the npm package manager. Take a look at the lines:

"dependencies": {
  "express" : "4.8.7"
}

This specifies Express as its one dependency. The part : "4.8.7" specifies that we specifically want to use Express 4.8.7, which is a fairly recent version of Express as of when I wrote this tutorial.

Note: To get the latest version, you would put "express" : "latest" in your dependencies instead of "express" : "4.8.7", but I chose Version 4.8.7 so anyone using this tutorial is using the same version as me.
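Since package.json is plain JSON, any Node script can read it. Here is a rough sketch of how the "dependencies" field maps package names to versions. The manifestText string just stands in for the file's contents, and listDependencies is a helper name I made up for illustration; it is not part of npm.

```javascript
// Stand-in for the contents of package.json; in a real script you would
// read the file with fs.readFileSync instead.
var manifestText = '{"name": "first-express-app", "version": "0.0.0", ' +
                   '"dependencies": {"express": "4.8.7"}}';

// Hypothetical helper: turn the "dependencies" object into name@version strings.
function listDependencies(manifest) {
  var deps = manifest.dependencies || {};
  return Object.keys(deps).map(function (name) {
    return name + '@' + deps[name];
  });
}

var manifest = JSON.parse(manifestText);
console.log(listDependencies(manifest)); // [ 'express@4.8.7' ]
```

A project with no "dependencies" field at all just gives back an empty list here.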


Now in the command line, run npm install and you should see this:

There are some things to note here.
  • First of all, the stuff below express@4.8.7 node_modules\express is Express's dependencies. Your project is using Express as a dependency, and Express in turn has dependencies of its own.
  • Also, note that in the express-1 folder a new folder, node_modules, appeared. If you look inside this folder you will see that that's where Express is, and if you look in Express's folder, you will see that it too has a node_modules folder. Look inside there and you will see all of Express's dependencies.

Now that Express is in your node_modules folder, if you have var express = require('express'); in a JavaScript file in the express-1 directory, you will be able to use Express in your code.

From here on out, you're no longer a solitary Node developer using only your own code and Node's built-in modules. Welcome to the Node community!

While the code in this part of the tutorial only uses Express as a dependency, your package.json file can specify a lot of packages as dependencies. And as you can see if you look in Express's node_modules directory, your dependencies often have dependencies themselves.
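To give a feel for why require('express') finds the copy in express-1/node_modules, and why Express finds its own dependencies in its own node_modules folder, here is a simplified sketch of the folders Node checks when you require a package by name. candidateModuleDirs is a made-up helper, and the real lookup has a few more rules than this, so treat it as an illustration only.

```javascript
var path = require('path');

// Simplified sketch of the lookup: list the node_modules folders Node would
// check, starting at startDir and walking up to the filesystem root.
// path.posix is used so the output looks the same on every platform.
function candidateModuleDirs(startDir) {
  var dirs = [];
  var current = startDir;
  while (true) {
    dirs.push(path.posix.join(current, 'node_modules'));
    var parent = path.posix.dirname(current);
    if (parent === current) break; // reached the root
    current = parent;
  }
  return dirs;
}

console.log(candidateModuleDirs('/path/to/express-1'));
```

The closest folder comes first in the list, which is why a package installed in your project's own node_modules folder is the one that gets loaded.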

So what exactly is this npm package manager we're using?

The npm package manager is the package manager for Node, and because so many developers put Node modules they created onto the npm public registry, you can use the code on there to add functionality to your project without reinventing the wheel. And they have everything. Security code, web frameworks, testing tools, you name it. If you feel like you're reinventing the wheel when you're writing some code for a Node project, check npm and see if there's already a Node module for what you want to do. Or if you have an idea for something you want to add as a Node module, you can write it and put it online for other developers to use. The npm package manager can do a lot of other useful stuff like running scripts and testing code, but for this tutorial we will only focus on using it for getting packages.

As an example of a package from the npm package manager, let's go to the page for Express on npmjs.org:


As you can see, it shows information like its version, a description of what it is, and who is working on the project. If you scroll further down you will see more information, such as the module's dependencies and what modules use Express as a dependency. Also, note its license. Express.js uses the MIT License, which is very permissive, and also very short.

Important note on licenses:
Since Node developers use a lot of modules made by other people, if you are going to be using someone else's module, it is very important that you know what license you are using it under. Generally I see a lot of permissive licenses out there, but some licenses require that you don't use the module for commercial purposes, and some require you to give credit to whoever made the module, so make sure you know what licenses your project's dependencies are under so you know how you can and can't use a module in your project. Luckily, in addition to the licenses being generally permissive, a lot of the licenses I see are also very short, so you won't have to worry too much about spending hours reading long licenses.

 
A few other package.json fields

Before we go on, let's just add a few more fields to package.json:

{
  "name": "first-express-app",
  "version": "0.0.0",
  "license": "MIT",
  "description": "Express tutorial project",
  "author": "(insert your name here)",
  "contributors": ["(One of your friends)", "(Another one of your friends)"],
  "dependencies": {
    "express": "4.8.7"
  }
}

The fields we added are "license", which I put under the MIT license, "description", which is a description of the project, and "author", which is the name of the author of the project. You can also add more contributors to your project by adding an array of contributors in the "contributors" field; if you are doing this tutorial with any friends, add them there.

For a complete list of fields you can add to your package.json file, check out


Now let's try out Express!

As I mentioned at the end of the first tutorial, you can make a web server using only the core modules of Node, but it can get disorganized quickly. Having to manually process the parts of the HTTP requests and responses can lead to redundant code when you are serving very similar files, it can get in the way of the big picture of what your web app does, and it leaves a lot of potential for bugs. So to get rid of some redundant HTTP processing code, let's try serving some pages with Express!

Make a new directory in express-1 called public and in it:

-Create a new directory in public called pages

-Save this code to pages/hello.html

<html>
  <head>
    <title>Home page</title>
  </head>
  <body>
    <h1>Hello world!</h1>
    <p>This page was served with Express!</p>
  </body>
</html>

Save this code to pages/index.html

<html>
  <head>
    <title>First 400</title>
  </head>
  <body>
    <h1><img src="http://localhost:34313/images/track-shoe.jpg" />First 400</h1>
  </body>
</html>

Then make a new directory in public called images and save this picture to the images directory as track-shoe.jpg.

Now in the express-1 directory, make a file called app.js and add in the code for the server:

var express = require('express'); //1
var app = express(); //1
app.use(express.static(__dirname+"/public/pages")); //2
app.listen(34313, function(){console.log("Now listening on Port 34313!");}); //3

Wow, just four lines of code and we've got a server. Run node app.js in the express-1 directory, go to localhost:34313/hello.html, and we will get our Hello world page from Express:




So how does that work?

1. The first line requires Express as a module for Node to use, and the second line calls the express function to create your Express server.
2. This line makes it so the app uses static, which is used to serve static pages from a directory. In our case, express.static takes in __dirname+"/public/pages", so...
  • __dirname in Node means the path for the current directory, so in our case, __dirname is "/path/to/the/directory/express-1"
  • So __dirname+"/public/pages" would translate to:
    •  "/path/to/the/directory/express-1/public/pages"
So express.static is being told to serve pages from that directory.
  • The whole line, app.use(express.static(__dirname+"/public/pages"));, is making it so when there is a request to the server (like
    localhost:34313/some-webpage.html), express.static looks in the public/pages directory for that page, serving it if the page exists.
3. This line starts the server in the same way you would in a regular Node app.
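The lookup described in step 2 boils down to joining the request path onto the directory you gave to express.static. Here is a minimal sketch of that idea using Node's path module. resolveStaticPath is a hypothetical helper, not Express's actual code, and the real middleware does more than this (for example, serving index.html for a directory).

```javascript
var path = require('path');

// Hypothetical helper: map a request path onto a file under the static root.
// path.posix is used so the output looks the same on every platform.
function resolveStaticPath(root, requestPath) {
  return path.posix.join(root, requestPath);
}

var pagesRoot = '/path/to/the/directory/express-1/public/pages';

console.log(resolveStaticPath(pagesRoot, '/hello.html'));
// /path/to/the/directory/express-1/public/pages/hello.html

console.log(resolveStaticPath(pagesRoot, '/images/track-shoe.jpg'));
// /path/to/the/directory/express-1/public/pages/images/track-shoe.jpg
```

In your own server, path.join(__dirname, 'public', 'pages') is a common alternative to building the directory with string concatenation, since it takes care of the slashes for you.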

Express sees we are requesting hello.html, so
express.static(__dirname+"/public/pages") is applied, looking for hello.html in the public/pages directory. It finds public/pages/hello.html and serves it.

And if we request a page that doesn't exist, like localhost:34313/doesnotexist.html, Express has got us covered there:

 
Express doesn't find any file named public/pages/doesnotexist.html, so it serves an error message.

But what if we try just localhost:34313?

The page is served since index.html is there but we get a broken image because we are requesting track-shoe.jpg with:

<img src="http://localhost:34313/images/track-shoe.jpg" />

The request to localhost:34313/images/track-shoe.jpg is handled by express.static(__dirname+"/public/pages"). Express looks in public/pages for an image with the path images/track-shoe.jpg, but there is no directory named public/pages/images so the request fails and we get a broken image.

Serving index.html itself looks like this:





 

But the attempt to serve track-shoe.jpg looks like this:


 
What we wanted was to serve the image at public/images/track-shoe.jpg, not public/pages/images/track-shoe.jpg, so to do that we need to add another app.use line to our server:

var express = require('express');
var app = express();
app.use(express.static(__dirname+'/public/pages')); //1
app.use('/images', express.static(__dirname+'/public/images')); //2
app.listen(34313, function(){console.log('Now listening on Port 34313!');});

And then the page should be served properly:





 

Here's how that works:

  • The request for index.html is processed by app.use(express.static(__dirname+'/public/pages'));. public/pages/index.html is found so it's served.
  • The request for images/track-shoe.jpg is made.
1. app.use(express.static(__dirname+'/public/pages')); tries to find public/pages/images/track-shoe.jpg, but it doesn't exist.
2. The request is in '/images', so app.use('/images', express.static(__dirname+'/public/images')); applies to the request for the image, since we are requesting something in localhost:34313/images. So it looks in public/images for the picture, and the file public/images/track-shoe.jpg exists, so it is served.

So serving the home page looks like:

 
where the black arrows are the request to index.html itself, the red arrows are the image request's failed attempt at finding images/track-shoe.jpg in public/pages, and the purple arrows are the image request's successful attempt at finding track-shoe.jpg in public/images using the second app.use rule we added.


If you pass in '/' or you don't pass in any path parameter, the rule will be applied to all requests to your Express server. So the rule

app.use(express.static(__dirname+'/public/pages')); 

applies to all requests, so all requests to the server check for a file in public/pages.

But in app.use('/images', express.static(__dirname+'/public/images')); the path parameter is '/images' so the rule only applies to requests to /images. That means only requests to localhost:34313/images/[image-path-name] are processed with a rule to serve images in public/images.
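Here is a small sketch of how a path parameter like '/images' decides whether a rule applies. When a rule does apply, the function behind it sees the request path with the '/images' part taken off the front, which is why express.static(__dirname+'/public/images') ends up looking for public/images/track-shoe.jpg rather than public/images/images/track-shoe.jpg. matchMount is a made-up helper for illustration, not how Express is implemented internally.

```javascript
// Hypothetical helper: if the rule mounted at mountPath applies to url,
// return the rest of the path that the rule's function would see.
// Otherwise return null.
function matchMount(mountPath, url) {
  if (mountPath === '/') return url;      // no path parameter: applies to everything
  if (url === mountPath) return '/';      // exact match on the mount path
  if (url.indexOf(mountPath + '/') === 0) {
    return url.slice(mountPath.length);   // strip the mount path off the front
  }
  return null;                            // rule does not apply
}

console.log(matchMount('/', '/hello.html'));                  // /hello.html
console.log(matchMount('/images', '/images/track-shoe.jpg')); // /track-shoe.jpg
console.log(matchMount('/images', '/hello.html'));            // null
```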

 
Why not use just one app.use rule for everything?

Now, if we just put everything into the same directory as app.js and used this code (don't copy this example):

var express = require('express');
var app = express();
app.use(express.static(__dirname));
app.listen(34313, function(){console.log('Now listening on Port 34313!');});

and changed the image HTML tag in index.html from

<img src="http://localhost:34313/images/track-shoe.jpg" />

to just:

<img src="http://localhost:34313/track-shoe.jpg" />

We would be able to serve every request from the express-1 directory with just the one rule app.use(express.static(__dirname));, which would take a request and just look for a matching file. So why aren't we doing that?

Well, if we are serving a lot of webpages, it can be really easy to end up with one directory crowded with pages, stylesheets, scripts, and images all mixed together. Because of that, we want to keep the directories for our project more structured. If we have a rule for serving requests to HTML pages, a rule for serving requests to CSS stylesheets, a rule for serving requests to front-end JavaScript code, and a rule for serving requests for images, we can put the folders for the pages, stylesheets, front-end JavaScript, and images wherever we see fit. And in our case, the static HTML pages and images are in the public directory.

But there is also an alarming security hole in the idea of storing all the static HTML, CSS, JavaScript, and image files for the web app in the same directory as the server where they can be served statically. If you are running this server and instead of requesting an HTML page someone requests localhost:34313/app.js...





 

THEY CAN SEE YOUR SERVER-SIDE SCRIPTS!

To prevent this, the code we were using made it so requests for static pages and images were handled by looking in their sub-directories in the public directory.

In general, it's a really good practice in web development to keep anything you're serving statically, like static HTML pages, CSS, client-side JavaScript files, and images in a public directory. This keeps it away from everything else, like server-side scripts, templates, and database data.
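To make the difference concrete, here is a toy sketch comparing what a static rule can hand out when it is rooted at express-1 itself versus at the public directory. The file list and the reachableFiles helper are made up for illustration; nothing here touches the real filesystem or Express.

```javascript
// Hypothetical listing of the project's files, relative to express-1.
var projectFiles = [
  'app.js',
  'package.json',
  'public/pages/hello.html',
  'public/pages/index.html',
  'public/images/track-shoe.jpg'
];

// Which files could a static rule rooted at staticRoot hand out?
// An empty staticRoot stands for the express-1 directory itself.
function reachableFiles(files, staticRoot) {
  var prefix = staticRoot === '' ? '' : staticRoot + '/';
  return files.filter(function (file) {
    return file.indexOf(prefix) === 0;
  });
}

console.log(reachableFiles(projectFiles, ''));       // every file, app.js included
console.log(reachableFiles(projectFiles, 'public')); // only the three files in public
```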

Now let's serve something that isn't static

As I mentioned in the first tutorial, in this tutorial series we are going to make a social networking site for distance runners to organize group runs, start and find track meets, and start and join running clubs and track teams.

This site will be called First 400 since there are 400 meters in a track and this site is supposed to be open to new runners looking for a track team to join to go to meets and end up on that first lap of a race.

So for the first thing we will be doing for the site in this tutorial is to make it so we can serve webpages to advertise group runs.

We could just make an HTML page for each one, but that would mean we'd be storing a lot of HTML pages. So instead, we are going to store the data about each group run, including the run's name, a description of the run, and the filename of a picture, and then we will use that data to generate the HTML for the webpage we want to render.

First, save this picture to the public/images directory as dog-days.jpg.



Then, at the top of app.js, add this:

/*Note: This object is not how we would get data in a real project, and it would
*not be sustainable to put your data in your server file. On a real project
*you would store your data in a file, or better yet, a database.
*/
var runs = {
  'dog-days': {
    'name': 'Dog Days of Summer 8K',
    'desc': 'An 8000-meter race through Belmont that finishes with a lap '+
            'around Cambridge\'s Fresh Pond. Bring your dogs for the '+
            'cheering section at the finish line!',
    'picture': 'dog-days.jpg'
  },
  'bu-and-back': {
    'name': 'BU and Back 9-mile run',
    'desc': 'A run from Tufts to and from BU with a side order of '+
            'awesome wind, courtesy of the Charles River!',
    'picture': ''
  }
};

And after all the other rules in the server, add this:

app.use('/runs/:name', function(req, res, next){          //1, 2
  var runName = req.param('name');                        //3
  var runData = runs[runName];
  if (!runData) {
    // No run with that name, so pass the request along instead of crashing.
    return next();
  }
  var runPicture = runData.picture !== '' ? runData.picture :
                                            'track-shoe.jpg';

  var runHTML =
    '<html>'+
      '<head>'+
        '<title>'+runData.name+'</title>'+                //4
      '</head>'+
      '<body>'+
        '<img src = "/images/'+ runPicture +'" />'+       //4
        '<h1>'+runData.name+'</h1>'+                      //4
        '<p>'+runData.desc+'</p>'+                        //4
      '</body>'+
    '</html>';

  // The response is sent here, so there is no call to next() afterwards.
  res.send(runHTML);                                      //5
});

If you request localhost:34313/runs/dog-days, you should get:



 
And if you request localhost:34313/runs/bu-and-back, you should get:



So how does this work?

1. For this rule we are adding to the server, we are processing requests to /runs/:name. :name is a URL parameter; for Express rules where you want to catch multiple similar URL requests, you can specify parameters in your URL.

So in this example the one parameter is :name. If you request localhost:34313/runs/dog-days, the :name parameter is 'dog-days', and in localhost:34313/runs/bu-and-back, the :name parameter is 'bu-and-back'.
2. The second parameter to this call to app.use is a function that tells the server how to handle the request and response when the user is requesting a page in /runs/:name.
3. This function processes requests to /runs/:name, so to get what :name is, we use req.param('name'), which we then use to get the data for the run with that name.
4. As you can see, we are making the HTML as a string, adding the data we got in runData into different parts of the HTML.
5. res.send(runHTML); sends the HTML we made in the HTTP response, rendering our webpage.
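If you are curious how a pattern like /runs/:name can turn into a parameter value, here is a rough sketch that compares the pattern and the URL one segment at a time. extractParams is a made-up helper for illustration; Express does its own, more capable matching internally.

```javascript
// Hypothetical helper: match url against a pattern such as '/runs/:name'.
// Returns an object of parameter values, or null if the url does not match.
function extractParams(pattern, url) {
  var patternParts = pattern.split('/');
  var urlParts = url.split('/');
  if (urlParts.length < patternParts.length) return null;

  var params = {};
  for (var i = 0; i < patternParts.length; i++) {
    var part = patternParts[i];
    if (part.charAt(0) === ':') {
      if (urlParts[i] === '') return null;    // a parameter can't be empty
      params[part.slice(1)] = urlParts[i];    // ':name' becomes params.name
    } else if (part !== urlParts[i]) {
      return null;                            // a fixed segment did not match
    }
  }
  return params;
}

console.log(extractParams('/runs/:name', '/runs/dog-days'));         // { name: 'dog-days' }
console.log(extractParams('/runs/:name', '/runs/bu-and-back'));      // { name: 'bu-and-back' }
console.log(extractParams('/runs/:name', '/images/track-shoe.jpg')); // null
```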

Here is a diagram of what happens:



 
So now we route requests to /runs/:name to serve a bunch of pages in the same format with a new app.use rule in the server. But why is app.use taking in a function, and what's that next parameter in the function?

You might have guessed before, but in the previous example, with express.static(__dirname+"/public/pages"), what gets returned when you call the express.static function is itself a function; express.static is a function that returns another function. So when we are calling

app.use('/images', express.static(__dirname+'/public/images'));

What we are getting for the second parameter of app.use is the function returned from express.static(__dirname+'/public/images'). So both that app.use rule and the app.use rule we just created ultimately take in a path as the first argument and then a function that processes the HTTP request and response as the second argument.
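Here is a tiny example of a function that returns another function, in the same spirit as express.static. makeFileFinder is a name I made up for this sketch; the point is just that each returned function remembers the directory it was created with.

```javascript
// Hypothetical factory in the spirit of express.static: it takes a setting
// and returns a new function that remembers that setting.
function makeFileFinder(root) {
  return function (requestPath) {
    return root + requestPath;
  };
}

var findPage = makeFileFinder('/public/pages');
var findImage = makeFileFinder('/public/images');

console.log(typeof findPage);              // function
console.log(findPage('/hello.html'));      // /public/pages/hello.html
console.log(findImage('/track-shoe.jpg')); // /public/images/track-shoe.jpg
```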

NOTE:
The idea of a function taking in another function or returning another function is part of what's called functional programming. It is a very powerful and expressive style of programming, and it is used all over the place in JavaScript and Node. If you are new to the concept of functional programming, confused about functional programming, or just need to brush up on it, I recommend reading the functional programming chapter of Eloquent JavaScript because functional programming is everywhere.

Express Middleware

Now, since we are talking about those functions we give to app.use to make what I have been referring to as "rules" for the server, let's hear their real name: Express middleware. Express middleware is what Express uses for processing the HTTP requests and responses.

In Express, you can define many middleware functions that the request and response go through before the response is finally served, so in

app.use(express.static(__dirname+'/public/pages'));
app.use('/images', express.static(__dirname+'/public/images'));
app.use('/runs/:name', function(req, res, next){ ... });

those three functions are Express middleware that handle your requests.

  • The first one has no path parameter so all requests go through that middleware
  • Only requests to /images go through the second middleware
  • Only requests to /runs/:name go through the third middleware.

Because of that, with that request to localhost:34313/runs/dog-days, before the function we saw earlier gave us the webpage we got, the first middleware also attempted to process our request, but couldn't find a file called public/pages/runs/dog-days, so that middleware passed the request along and we moved on to the '/runs/:name' middleware.

Why did two middleware functions try to process the request? When your request goes through the Express middleware, it goes through every middleware that applies to that request. So a request to /runs/dog-days is processed by the first and third middleware, with the server response ultimately being sent in the third middleware.

For another example, on the server test-middleware.js:

 
var express = require('express');
var app = express();
app.use(function(req, res, next){req.m = 1; next()});
app.use(function(req, res, next){req.m++; next()});
app.use('/three', function(req, res, next){req.m++; next()});
app.use(function(req, res, next){
  req.m++;
  res.send(req.m + " middleware were used.");
});
app.listen(34313, function(){console.log('Now listening on Port 34313!');});

A request to localhost:34313 would go through the first two middleware and the last one and send the response "3 middleware were used."

A request to localhost:34313/three would go through all four middleware and send the response "4 middleware were used."




 
It is common to have a series of middleware that can do things like logging HTTP requests and responses in the console, giving more data to the request, making sure a user is logged in, processing cookies, and many other things, so several middleware functions can process a request before a response is finally served. The way I think about it is that middleware functions in an Express server are like an assembly line processing a request to make the final response.
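To see the assembly line idea without Express in the way, here is a bare-bones stand-in for a middleware stack that you can run with plain Node. createApp, handle, and res.body are all made up for this sketch and are not Express's real internals; the four middleware functions mirror the ones in test-middleware.js.

```javascript
// Minimal stand-in for Express's middleware stack (not Express's real code).
// Each entry has a mount path and a function taking (req, res, next).
function createApp() {
  var stack = [];
  return {
    use: function (path, fn) {
      if (typeof path === 'function') { fn = path; path = '/'; }
      stack.push({ path: path, fn: fn });
    },
    handle: function (req, res) {
      var index = 0;
      function next() {
        // Skip ahead to the next middleware whose path applies to this request.
        while (index < stack.length) {
          var layer = stack[index++];
          var applies = layer.path === '/' ||
                        req.url === layer.path ||
                        req.url.indexOf(layer.path + '/') === 0;
          if (applies) { return layer.fn(req, res, next); }
        }
      }
      next();
    }
  };
}

var app = createApp();
app.use(function (req, res, next) { req.m = 1; next(); });
app.use(function (req, res, next) { req.m++; next(); });
app.use('/three', function (req, res, next) { req.m++; next(); });
app.use(function (req, res, next) {
  req.m++;
  res.body = req.m + ' middleware were used.'; // stands in for res.send
});

var plain = {}, three = {};
app.handle({ url: '/' }, plain);
app.handle({ url: '/three' }, three);
console.log(plain.body); // 3 middleware were used.
console.log(three.body); // 4 middleware were used.
```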


The next() function

Something else worth mentioning: what does next() do? To find out, let's get rid of all the calls to next() in that code and then run the code again.

Add this to no-next.js and run node no-next.js:

var express = require('express');
var app = express();
app.use(function(req, res, next){req.m = 1;});
app.use(function(req, res, next){req.m++;});
app.use('/three', function(req, res, next){req.m++;});
app.use(function(req, res, next){
  req.m++;
  res.send(req.m + " middleware were used.");
});
app.listen(34313, function(){console.log('Now listening on Port 34313!');});

Make a request to localhost:34313 and you should get...

Nothing! The browser just keeps waiting for a response that never comes. But why? Notice that only the very last middleware actually sends a response, so if that last middleware isn't called, no response is served. So what next() does is call the next middleware function for our Express server. However, next() is not needed if an HTTP response is served within the middleware.
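Here is a stripped-down illustration of that behavior. runChain is a made-up runner, not Express itself; it only moves on to the following function when the current one calls next(), and it reports which functions actually ran.

```javascript
// Tiny runner (an illustration, not Express itself): calls each function in
// order, but only moves on when the current one calls next().
function runChain(middleware) {
  var visited = [];
  var index = 0;
  function next() {
    var fn = middleware[index++];
    if (fn) {
      visited.push(fn.name);
      fn(next);
    }
  }
  next();
  return visited;
}

function first(next) { next(); }
function forgetful(next) { /* never calls next() */ }
function sendResponse(next) { /* would send the response here */ }

console.log(runChain([first, sendResponse]));            // [ 'first', 'sendResponse' ]
console.log(runChain([first, forgetful, sendResponse])); // [ 'first', 'forgetful' ]
```

In the second run, sendResponse is never reached, which is the same reason the no-next.js server never answers.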

Where to next?

You might have noticed the resemblance between the way the middleware process the request and serve the response and the way a plain Node HTTP server handles this with the function you passed into http.createServer. Both of them work with HTTP requests and responses. req.param looks like req.url from the http module and res.send looks a lot like res.write.

req and res are still HTTP requests and responses, but with more built-in functionality. Express also does a lot of stuff for you as far as req and res go; notice that even though you still sent the HTML you didn't need to do res.writeHead() to write a response header, and you didn't need to do res.end().
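As a rough idea of the work res.send saves you, here is approximately what sending an HTML string looks like when written against the plain http response API. sendHtml is a made-up helper, and the exact headers Express sets are an assumption on my part, so treat this as a sketch rather than a copy of what Express does. The fake response object is only there so the example runs without starting a server.

```javascript
// Rough sketch of the work res.send saves you, written against the plain
// http response API. sendHtml is a made-up helper, not an Express function.
function sendHtml(res, html) {
  res.writeHead(200, {
    'Content-Type': 'text/html; charset=utf-8',
    'Content-Length': Buffer.byteLength(html)
  });
  res.end(html);
}

// Fake response object that records the calls made on it.
var fakeRes = {
  calls: [],
  writeHead: function (status, headers) {
    this.calls.push(['writeHead', status, headers]);
  },
  end: function (body) {
    this.calls.push(['end', body]);
  }
};

sendHtml(fakeRes, '<h1>Hello world!</h1>');
console.log(fakeRes.calls);
```

In a real pure-Node server you would pass in the res object from the function you give to http.createServer instead of fakeRes.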

There are still plenty of great uses for Node's regular http module, but for serving pages, Express abstracts away a lot of the details of HTTP. Though it's a thin layer of abstraction, so there isn't a lot of new material and syntax to learn before you can get going with writing server-side scripts with Express. In fact, I highly recommend reading through the entire API reference of Express at http://expressjs.com/api.html. It's very short and readable, much like Express code itself.

So far for Express we have taken a look at how to serve static HTML with the serve-static middleware, we saw the npm package manager, we re-organized our static HTML and images to their own directory, we used routing and some rudimentary templating to render webpages, and we've taken a closer look at what Express middleware are.

There's still a lot to cover in Express, but between routing and using Express middleware, we have two fundamental building blocks for understanding the rest of Express. The next tutorial will be Part 2 of this one, where we will be learning about templating, then afterwards we will talk about how to use Express to handle HTTP POST requests. And we're also going to learn how to use some other middleware in Express.

Before I finish this tutorial, though, I am also going to talk about a few other websites you should know about for JavaScript and for programming in general.

GitHub:

GitHub is an awesome website where users can go to put their projects online and collaboratively develop them with others. A LOT of the programming community is using GitHub right now, from individual developers to science labs to even huge companies like Google and Twitter. That means there is a lot of software you can download and use for free, and it also means that GitHub is the place to be if you want to network as a programmer and collaborate with other developers. And since so much of the programming community is on there, GitHub is also a great place to stay up on what's new in programming, which is very important to do as a developer.

Git and GitHub take some figuring out since there is some Git vocabulary you need to pick up to get going, but Lauren Orsini from ReadWrite has two awesome tutorials on getting started on that at:


StackOverflow:

StackOverflow is a very popular Q&A website on programming where, if you post a question about programming, other programmers will help you out, often in minutes and sometimes in seconds! When I am learning new programming languages, technologies, and concepts and get confused on something, other people's StackOverflow questions often top the Google search results. That makes it easy to get a well-written explanation of whatever I am having trouble with, so people's StackOverflow questions are a great way to supplement the tutorials I read.

That being said, if you are asking questions on there, please make sure to read http://stackoverflow.com/help/how-to-ask before posting a new question. People on there are volunteering their time to answer programming questions for free, so it's important to be respectful of that by following those guidelines for asking new questions.

Twitter:

We've all heard of Twitter. Whether it's for hearing the news, telling jokes, sharing pictures of your cat, or having yet another social networking site for posting food pictures, Twitter is huge. And for programmers, Twitter is also really useful for staying up on what's going on in programming and for networking with other programmers. If you follow Node developers on there, it's easy to get news about what's going on in the Node.js community, and you can also often find links to tutorials for something you are trying to figure out in programming. For example, I've found following @AngularJS on there really useful for finding new Angular tutorials they tweet or retweet for navigating that framework's big ecosystem. I'm @AndyHaskell2013 by the way.

Javascriptissexy and Eloquent JavaScript:

As I mentioned before, more advanced JavaScript concepts like functional programming, callbacks, closures, and prototypal inheritance show up all over the MEAN Stack, so knowing them is a must. Luckily, both Javascriptissexy and Eloquent JavaScript give great explanations of these concepts, and this year the second edition of Eloquent JavaScript came out, so I highly recommend those tutorials.