19 Jan 2023 · Software Engineering

    Blue Green Deployment for Node.js Without Kubernetes


    The Blue Green Deployment strategy allows new deployments to be tested with production data before they are made available to users. It also has the added benefit of zero downtime during the deployment process.

    Using blue green deployments requires a series of manual tasks. Cloud providers and Kubernetes can aid in this process, but sometimes you do not have access to these tools.

    We (my firm) recently found ourselves in such a situation. We had a Node.js application running behind Nginx, which was set up as a reverse proxy: it received all public requests and forwarded them to the running Node.js application. All of this ran on a single Linux VPS.

    We wanted to set up a blue green deployment for this application with minimal human intervention. The obvious solution was to use a CI/CD platform. But first, we needed to create the deployment manually to understand how it could work.

    What we wanted to achieve was the following:

    • A push to the main branch should trigger a deployment to the currently non-live blue green environment.
    • After that, we should be able to press a Promote to Live button to promote the non-live environment to the live environment.
    • Once an environment is live, we should be able to roll back to the previous live environment by pressing the Rollback button.


    In this article, we will look at how we created this custom CI/CD-driven blue green deployment pipeline in depth. We will also discuss how we broke down the problem and the challenges that we faced and solved along the way.

    Prerequisites

    We will be referencing this express application on GitHub throughout this article. This repository contains a minimal Node.js application that accepts incoming HTTP requests. It also contains the deployment scripts and the Semaphore CI/CD configuration.

    How do blue green deployments work?

    By now, we know what blue green deployments allow us to do. Next, let’s understand how blue green deployments work. This will allow us to break down the problem.

    Let’s take an example. Our application is currently running in the green environment. It is currently on Version 1.1.2 and is accessible at myapp.com.

    We need to update our application to Version 1.2.0. To do this, we first create or update the blue environment with the code for Version 1.2.0. Then, we use the load balancer to direct traffic on blue.myapp.com to the blue environment.

    After this point, we can do any sort of testing that we want to do on blue.myapp.com, be it manual or automated. After we are sure that blue is working as expected, we configure the load balancer to direct traffic on myapp.com to the blue environment. We can also access the green environment from green.myapp.com in case we need it for some reason.

    After this point, the new version of the application is up. We can safely shut down green. We also have the choice to keep green around for some time in case something goes wrong with blue. In such a situation, we can quickly switch back to green.

    Manual blue green deployment

    To perform blue green deployment using the current setup on the server, we need to have the following configuration:

    1. We need to have two copies of the repository, one for blue and the other for green.
    2. We need to run the Node.js application for both blue and green. Both of these should listen to different HTTP ports.
    3. We need to configure Nginx to forward requests for blue.myapp.com and green.myapp.com to their respective HTTP ports.
    4. We need to configure Nginx to forward requests for myapp.com to whichever environment is currently considered live.

    Let us assume that blue is currently live. Then, we need to perform the following steps during deployment to green:

    1. Pull the new code for green on the server.
    2. Start or restart the Node.js application for green.
    3. Change the Nginx configuration for myapp.com to forward requests to the HTTP port that green is using.
    4. Reload Nginx to apply the new configuration.
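
    The manual steps above could be captured in a script along the following lines. This is a hypothetical sketch, not the project's actual tooling: the paths, the process name, the hard-coded port, and the use of a process manager like PM2 (which the article adopts later) are all assumptions.

```shell
# Hypothetical sketch of the four manual steps, assuming blue is live,
# green listens on port 3000, and PM2 manages the application.
# All paths and names are illustrative.
deploy_to_green() {
  # 1. Pull the new code for green on the server
  cd "$HOME/environment/green" && git pull origin main

  # 2. Restart the Node.js application for green
  pm2 restart green

  # 3. Point myapp.com at green's HTTP port in the Nginx configuration
  sudo sed -i 's|proxy_pass http://127.0.0.1:[0-9]*|proxy_pass http://127.0.0.1:3000|' \
    /etc/nginx/sites-available/myapp.com

  # 4. Apply the new Nginx configuration
  sudo systemctl restart nginx
}
```

    Even wrapped in a script, each of these steps can fail independently, which is part of what motivated the automation described next.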

    There are enough steps involved that a human error will creep in once in a while. This is why we found so much utility in automating the process.

    Blue green deployment automation

    Before we could even think about setting up a CI/CD pipeline, we needed to figure out a way to automate the deployment process.

    We took some time to break down the automated deployment process into smaller steps. We ended up with the following list of tasks to automate:

    • Get the code to deploy for the blue or green environment on the server.
    • Start and stop the application for an environment.
    • Set up the Nginx configuration for the environment, so that it is accessible from blue.myapp.com and green.myapp.com.
    • Set up the Nginx configuration for the live environment. We can reuse this for both live deployments and rollbacks.
    • Delete the Nginx configuration for an environment. This is useful when shutting down an environment.

    Getting the code on the server

    The first strategy that we tried was to install git on the server. We planned to pull the code from GitHub during the deployment process.

    This strategy required setting up a deploy key on GitHub and then configuring SSH keys on the server. We implemented it and it worked well. Yet, we decided to scrap it for the following reasons:

    • The ultimate goal was to run this in a CI/CD pipeline. CI/CD platforms already set up deploy keys on GitHub to pull code, so setting up GitHub access on the server again would be redundant.
    • Setting up GitHub access on the server adds one more step during server setup. If we ever needed to migrate servers, we would need to set up GitHub access all over again.
    • Managing the Deploy Key on the server is extra overhead and a bit of a security risk.

    Our new strategy was to write a script that could check out the target code on its own.

    A local dev environment would already have GitHub’s SSH-based authentication set up.

    CI/CD environments already provide a way to check out code for the branch or tag that the pipeline is running for. For this reason, the script would also accept an optional path to a directory. If this option is set, the script deploys the code in that directory without pulling anything from GitHub.

    After we had the code that we wanted to deploy checked out, we would need to build the application and copy it to a specific directory on the server. We opted for the $HOME/environment directory, which would contain one directory per environment:

    /home/username/environment/blue
    /home/username/environment/green

    We decided to use the rsync utility to copy the code to the server. We experimented with scp first, but rsync was much faster because it only transfers files that have changed.
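
    The push step can be sketched as follows. The user, host, and exclude list here are illustrative assumptions, not the actual values from the repository's script.

```shell
# Illustrative sketch of pushing a built application to an environment
# directory on the server. The user, host, and excludes are assumptions.
push_code() {
  local environment="$1"   # "blue" or "green"
  local source_dir="$2"    # directory containing the checked-out, built code

  # rsync only transfers files that changed; --delete removes remote files
  # that no longer exist locally, so the environment mirrors the source.
  rsync --archive --delete --exclude "node_modules" \
    "$source_dir/" \
    "deployuser@myapp.com:environment/$environment/"
}
```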

    You’ll find the completed script that pushes the code at deployment/push-code.sh.

    Starting and stopping the application for a specific environment

    Starting a Node.js application is pretty straightforward. However, the plan was to run multiple versions of the application at once, so some special considerations applied when starting the application for an environment.

    • Each environment would need to listen to different ports for HTTP
    • Each environment would need to be able to be stopped and started easily

    Using different ports for each environment is pretty easy. All that is required is to configure the HTTP port as an environment variable. This can be accessed in code using Node.js’s process.env.

    Starting the application for an environment is also a breeze.

    Stopping the application is a different story. We would need to keep track of the process ids for the application for each environment.

    A process id is a number that operating systems use to uniquely identify each process running on the system. We can then tell the operating system to end a process by referring to it using the process id.

    To stop an application, we would first need to store the process ids for every environment that we would start. We would also need to keep this list updated as different environments are started and stopped. This would require writing more code so we decided to use a readily-available solution instead.

    PM2 proved to be the perfect solution to this problem. It is a process manager which can start and stop Node.js applications.

    The main utility of PM2 for our use case is that it allows assigning names to Node.js applications. We later used these assigned names to stop applications.

    If we were to name the Node.js applications after their environments (which is what we did), we could start an environment by using this command:

    HTTP_PORT=$HTTP_PORT pm2 start index.js --name "$ENVIRONMENT_NAME"

    This would also allow us to stop an environment by using the following command:

    pm2 stop $ENVIRONMENT_NAME

    PM2 also brings other advantages to the table. It supports the following useful flags when starting an application:

    • -i <instances>: The number of instances to create. Using -i max (or -i 0) creates one instance per available CPU core.
    • --wait-ready: This instructs PM2 to wait for a ready signal from the process.
    • --listen-timeout 10000: This instructs PM2 to wait for the ready signal for 10,000 milliseconds.
    • --kill-timeout 10000: We use this for graceful shutdowns. PM2 will wait for 10,000 milliseconds for the process to exit by itself after sending the SIGINT signal. After that, it will kill the process.

    So, running the application with these options will give us an application with support for running many instances and graceful shutdowns:

    HTTP_PORT=$HTTP_PORT pm2 start index.js --name "$ENVIRONMENT_NAME" -i max --wait-ready --listen-timeout 10000 --kill-timeout 10000

    We decided to use <environment>.myapp.com as the name for each environment.

    Now, all we needed were two scripts — one for starting an environment and the other for stopping an environment.

    You’ll find the completed script for starting an environment at deployment/remote/environments/start-environment.sh and for stopping an environment at deployment/remote/environments/stop-environment.sh.

    Setting up the Nginx configuration for an environment

    The next problem on the list was creating Nginx configurations dynamically. The configuration would need to have at least the following information:

    • The domain associated with the environment
    • The HTTP ports to proxy the requests to

    However, we did not want to generate the configuration entirely in code.

    If we did, future changes to the Nginx config would require changes to the deployment script, which could introduce bugs. It would also be difficult to view the history of the Nginx configuration in git. We wanted the configuration to be easy to maintain.

    In the end, we took inspiration from Mustache, a template syntax that supports placeholder replacement, conditionals, loops, and more. Using the full Mustache syntax would be overkill, though; we only needed basic string replacement.

    So, we decided to use the sed utility, which finds and replaces text in files.

    To do this, we created a text file with the required Nginx configuration, using {{DOMAIN}} and {{HTTP_PORT}} as placeholders.

    server {
      listen 80;
      server_name {{DOMAIN}};
    
      location / {
        proxy_pass http://127.0.0.1:{{HTTP_PORT}};
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
      }
    }

    The next step was to update the start-environment.sh script. We just needed to add a few lines of code to generate the Nginx configuration.

    The code copies the template file to /etc/nginx/sites-available/environment-name.domain.com. Then it replaces the values of DOMAIN and HTTP_PORT in the file using sed. We use <environment>.myapp.com as DOMAIN and the HTTP port of the environment's PM2 application as HTTP_PORT.

    The last thing it does is create a symlink at /etc/nginx/sites-enabled/environment-name.domain.com pointing to the file in sites-available. Placing a configuration in the sites-enabled directory is what tells Nginx to enable it.
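
    The templating step itself reduces to a couple of sed invocations. Here is a minimal sketch with illustrative paths; the actual script in the repository may differ:

```shell
# Render an Nginx config from the template by substituting the
# {{DOMAIN}} and {{HTTP_PORT}} placeholders. Paths are passed in by
# the caller, so this works for any environment.
render_nginx_config() {
  local template="$1" output="$2" domain="$3" http_port="$4"

  cp "$template" "$output"
  sed -i "s/{{DOMAIN}}/$domain/g" "$output"
  sed -i "s/{{HTTP_PORT}}/$http_port/g" "$output"
}

# Example usage (illustrative paths):
#   render_nginx_config nginx.conf.template \
#     /etc/nginx/sites-available/green.myapp.com green.myapp.com 3000
#   ln -s /etc/nginx/sites-available/green.myapp.com \
#     /etc/nginx/sites-enabled/green.myapp.com
```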

    Setting up the Nginx configuration for the live environment

    Whichever of the environments we created earlier is accessible from the main domain, myapp.com, is the live environment.

    We can either reuse the earlier configuration template or create a new one for the live environment.

    For the live environment, we would need to copy the template to /etc/nginx/sites-available/myapp.com and create a symlink at /etc/nginx/sites-enabled/myapp.com.

    For the value of DOMAIN, we use myapp.com. The HTTP_PORT is the port for the PM2 application for the same environment that we are promoting to the live environment.

    The only problem here was finding the port where the target environment is running. We first scoured the PM2 documentation to find a command that would give us the ports used by an application. However, we did not find any such information. So, we decided to use another strategy.

    We added the port information as a comment to the Nginx configuration itself!

    #environment:{{ENV_NAME}};{{HTTP_PORT}}

    We also decided to include the name of the environment for good measure. So, the Nginx configuration now looked like this:

    #environment:{{ENV_NAME}};{{HTTP_PORT}}
    server {
      listen 80;
      server_name {{DOMAIN}};
    
      location / {
        proxy_pass http://127.0.0.1:{{HTTP_PORT}};
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
      }
    }

    At this point, we can do the following to promote a running environment to live:

    • Find the Nginx configuration for the target environment in /etc/nginx/sites-available/<environment>.myapp.com.
    • Extract the first line from the configuration; this is the comment that we added. Extract HTTP_PORT from the line using a regular expression.
    • Create a live Nginx configuration using the extracted HTTP_PORT and ENV_NAME.
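
    Extracting the environment name and port from that comment only takes a line of sed. A sketch, assuming the comment format shown above:

```shell
# Print "<env name> <http port>" from the #environment comment on the
# first line of an Nginx configuration file. For example,
# "#environment:green.myapp.com;3000" becomes "green.myapp.com 3000".
get_environment_info() {
  local config_file="$1"
  head -n 1 "$config_file" | sed -n 's/^#environment:\(.*\);\(.*\)$/\1 \2/p'
}
```

    The two fields can then be split with read and fed into the live configuration template.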

    You’ll find the completed script at deployment/remote/environments/set-live-environment.sh.

    Deleting the Nginx configuration for an environment

    For this, we would need to delete the environment's symlink from /etc/nginx/sites-enabled and its configuration file from /etc/nginx/sites-available. That’s it!

    We updated the stop environment script with code to achieve this.
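
    The cleanup amounts to something like the sketch below; the paths follow the standard Debian/Ubuntu Nginx layout, and the function name is illustrative.

```shell
# Remove an environment's Nginx configuration. Deleting the symlink in
# sites-enabled disables the site, and deleting the file in
# sites-available cleans it up completely. Nginx then needs a restart
# to pick up the change.
remove_nginx_config() {
  local domain="$1"   # e.g. green.myapp.com
  rm -f "/etc/nginx/sites-enabled/$domain" "/etc/nginx/sites-available/$domain"
  sudo systemctl restart nginx
}
```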

    Test running the scripts

    We already had the scripts necessary to perform the deployment. However, we hit a couple of roadblocks when we tried to run the scripts over SSH.

    The first problem was that our script was not allowed to write to Nginx’s sites-enabled and sites-available directories. We resolved this by changing the ownership of these directories to the Linux user that we were using to run the deployment scripts. We used the chown command to achieve this.

    The second problem was that we were calling systemctl restart nginx after changing the Nginx configurations. However, this command also required superuser privileges. This was a bit more challenging, but then we discovered the sudoers file.

    We just needed to add the following line to the /etc/sudoers file (ideally through visudo, which validates the syntax before saving).

    deployuser ALL=NOPASSWD: /bin/systemctl restart nginx

    Here, deployuser is the name of the Linux user we were using for running the deployment scripts.

    The addition of the above line allows the user with the specified username to run sudo systemctl restart nginx without the need to enter a super user password. Perfect!

    Tighter scripts

    At this point, we could already do blue green deployment, but we could also do all sorts of other cool things.

    For example, we could deploy a specific branch to a subdomain to test it out on production data! This was powerful but opened the door to mistakes during deployment. So, we decided to create more scripts that would restrict us to working with only the blue and green environments.

    So, we created the following scripts:

    • push-code-blue-green-non-live: This script figures out which of blue and green is currently non-live. Then, it runs our push-code script with the correct environment variables.
    • start-blue-green-non-live: This script figures out the non-live environment and then runs the start-environment script with the correct environment name and HTTP port. We hard-coded the target HTTP ports for blue and green in this script: 3001 for blue and 3000 for green.
    • stop-blue-green-non-live: This script figures out the non-live environment and runs the stop-environment script with the correct environment name.
    • switch-blue-green: This script figures out the non-live environment and runs the set-live-environment script with the correct environment name.

    All these scripts except push-code-blue-green-non-live run on the server itself. push-code-blue-green-non-live runs in the CI/CD pipeline or locally.

    The scripts that run on the server can figure out the current live environment by parsing the comment that we placed on top of the Nginx configurations.

    However, push-code-blue-green-non-live needs a way to figure out the current non-live environment. So, we decided to add one final script that echoes the non-live environment’s name. You’ll find this script in get-non-live-environment. We run this script through SSH from within the push-code-blue-green-non-live script to get the current non-live environment.

    In the end, we decided to also run this new script from the other blue green scripts. This allows us to have a single source of truth for the name of the non-live environment.
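
    The logic of such a script might look like the following sketch. The default config path and comment format match the examples earlier in the article, but the details are assumptions.

```shell
# Echo the name of the current non-live environment ("blue" or "green")
# by reading the environment comment from the live Nginx configuration.
get_non_live_environment() {
  local live_config="${1:-/etc/nginx/sites-available/myapp.com}"
  local live_env

  # The first line looks like "#environment:blue.myapp.com;3001";
  # capture everything between the colon and the first dot.
  live_env=$(head -n 1 "$live_config" | sed -n 's/^#environment:\([^.]*\)\..*$/\1/p')

  if [ "$live_env" = "blue" ]; then
    echo "green"
  else
    echo "blue"
  fi
}
```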

    Integrating the CI/CD pipeline

    With all of the scripts ready, the final step was to build the CI/CD pipeline around them.

    We decided to use Semaphore for this. We created three Semaphore workflows and connected them through Semaphore’s promotions feature, which allows one workflow to trigger and branch off into another. We configured the entry workflow to run every time we push code to the main branch. The entry workflow is a YAML file located at .semaphore/semaphore.yml in the root of the repository.

    This workflow performs the following steps:

    • Runs tests.
    • Uses scp to copy the deployment scripts from the repository to ~/deployment on the server.
    • Runs push-code-blue-green-non-live. This pushes code to the currently non-live environment.
    • Runs start-blue-green-non-live on the server over SSH. This starts the non-live environment to which we pushed the code.
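
    As an illustration, a minimal entry workflow along these lines might look as follows. The machine type, script paths, and server address are assumptions, not the repository’s actual configuration:

```yaml
version: v1.0
name: Deploy to non-live environment
agent:
  machine:
    type: e1-standard-2

blocks:
  - name: Test and deploy
    task:
      jobs:
        - name: Deploy to non-live
          commands:
            - checkout
            - npm install && npm test
            - scp -r deployment "deployuser@myapp.com:~/deployment"
            - bash deployment/push-code-blue-green-non-live.sh
            - ssh deployuser@myapp.com "bash ~/deployment/start-blue-green-non-live.sh"

promotions:
  - name: Promote to Production
    pipeline_file: promote-to-production.yml
```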

    This workflow also defines a promotion named Promote to Production. Defining a promotion displays a button in the Semaphore UI to trigger it manually. This is exactly what we wanted! The promotion points to another workflow, defined in .semaphore/promote-to-production.yml.

    This workflow runs the switch-blue-green script on the server over SSH. This is all that we need to promote the current non-live environment to production.

    The Promote to Production workflow defines a promotion named Rollback. This is a workflow defined in .semaphore/rollback-production.yml.

    A rollback on a blue green deployment requires only switching the live environment. Thus, the Rollback workflow does the same thing as Promote to Production. Together, the three workflows and their promotions make up our whole CI/CD pipeline.

    It would be fairly trivial to automate these promotions too. However, blue green deployments are meant to involve human judgment. The great thing about using promotions for our use case is that they display buttons we can use to drive the deployment process. This eliminates the need to run scripts manually!

    Final thoughts

    This blue green deployment mechanism completely replaced our manual deployment process. Because it is automated, deploying requires no mental overhead, and our confidence in our deployments has gone up. We can always roll back a deployment with the click of a button, which means we also deploy more frequently. Another major win is the ability to run tests and benchmarks on production data.

    During the implementation phase, we also learnt more about CI/CD integration, Nginx, bash scripting, and the Linux permissions system. It was also a lot of fun to solve the challenges that we encountered along the way.


    Written by: Bikash is a Software Engineer who revels in experimenting with and writing about all things Software. He is based in Nepal and is the Tech lead at Butterfly.ai.
    Reviewed by: I picked up most of my soft/hardware troubleshooting skills in the US Army. A decade of Java development drove me to operations, scaling infrastructure to cope with the thundering herd. Engineering coach and CTO of Teleclinic.