How to Containerize a Ruby on Rails 6 App


Ruby on Rails gives you superpowers. However, it's not the easiest thing to deploy the traditional way (directly on a Linux virtual machine).

Docker to the rescue! If you already know what Docker is, you can just copy and paste the Dockerfile below. If you don't, here it is in a nutshell:

What is Docker?

Docker is a virtualization tool that does not need a separate operating system for each application. It does not emulate an entire operating system; instead it uses the concept of "containers" to run your applications.

A container acts like a small, isolated world that your application lives in. The magic of Docker comes from the Docker Engine: it translates the system calls from the container to the host operating system's kernel.

Long story short, you will never have to worry about dependency management again, or have a case of "it works on my machine" 😁

Now on to the Dockerfile:

FROM ruby:2.6.6

RUN apt-get update && apt-get install -y postgresql-client

# Install Node (the setup script adds the NodeSource apt repository)
RUN curl -sL https://deb.nodesource.com/setup_12.x | bash -

RUN curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | apt-key add -

RUN echo "deb https://dl.yarnpkg.com/debian/ stable main" | tee /etc/apt/sources.list.d/yarn.list

RUN apt-get update && apt-get install -y nodejs yarn

RUN apt-get install -y imagemagick

# Serve our compiled JavaScript and CSS straight from Rails
ENV RAILS_SERVE_STATIC_FILES=true

WORKDIR /application

COPY Gemfile Gemfile.lock ./

RUN bundle install

COPY package.json yarn.lock ./

RUN yarn install

COPY . .

EXPOSE 8080

RUN RAILS_ENV=production bundle exec rake assets:precompile

ENV RAILS_ENV=production

# Start the application server
CMD ["rails", "server", "-p", "8080", "-b", "0.0.0.0"]

What are we doing?

  1. First we use the ruby:2.6.6 base image (the set of dependencies our app builds on).
  2. The "RUN" commands install the rest of the dependencies. We're going to be adding:
    1. postgresql-client
    2. Node
    3. Yarn
    4. imagemagick (included because I want to use Action Text)
  3. We set the environment variable RAILS_SERVE_STATIC_FILES, because we want our Rails app to serve our JavaScript and CSS.
  4. After that we create a directory inside our container where our project files will live.
  5. We copy over our Gemfile and install our Ruby dependencies.
  6. Then we do the same with our package.json file.
  7. Now we expose port 8080 so our app can communicate with the outside world.
  8. We run the assets:precompile command to compile any SCSS into CSS and put the results in the right folders.
  9. And we set RAILS_ENV to production so our Ruby on Rails app runs in high-performance mode.
  10. Finally we specify the command that runs when we start our container:
    1. "rails server": starts our server
    2. "-p 8080": tells our app which port to listen on
    3. "-b 0.0.0.0": tells our app to listen on all of the container's IPs

Hope this helps you deploy your Ruby apps a lot more easily 😀

WTF is Async & Await in C# ?

Simply put, they let you write asynchronous programs without ever having to reorganize your code, which can lead to massive performance increases.

The "async" & "await" keywords mark asynchronous code: "async" goes right before a method's signature, and "await" goes right before a call whose result you want to wait for. If a method is marked async, it should return a Task object (or Task<T> if it produces a value).

Now you can use the different parts of the Task asynchronous programming (TAP) model, such as starting a bunch of tasks and waiting for them all to finish, or chaining a new task onto the completion of another, all while your main application keeps running.

How is this possible? Does it start a bunch of new threads? Yes and no. If you explicitly start a bunch of tasks and wait for them to complete, work can run on other threads. But when you await a heavy task, your method is split at the await keyword, and the rest of it is scheduled to run once the result is ready, using time available on the current thread. So you can't tell that your program is waiting around for something.
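The post is about C#, but the same async/await pattern reads almost identically in TypeScript, so here is a minimal sketch of the idea (the delay helper, the 50 ms timing, and the function names are invented for illustration):

```typescript
// A Promise in TypeScript plays the role of C#'s Task: a handle to
// work that will finish later.
function delay(ms: number, value: number): Promise<number> {
  return new Promise((resolve) => setTimeout(() => resolve(value), ms));
}

// "async" marks the function; "await" yields control back to the caller
// until the promise resolves -- nothing here blocks a thread.
async function fetchScore(): Promise<number> {
  const score = await delay(50, 42); // execution is split here
  return score + 1;
}

// Start several tasks at once, then wait for all of them to finish,
// much like Task.WhenAll in the TAP model.
async function main(): Promise<number> {
  const [a, b] = await Promise.all([fetchScore(), fetchScore()]);
  return a + b;
}
```

Both calls to fetchScore run concurrently, and the rest of the program keeps executing while they wait.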

Questions People have Asked Me – Part 2

What is the root object in the base class library ?

For Java and C# that would be “Object”.

What methods does “Object” have ?

For C#:

    • Equals – supports comparisons between two objects
    • Finalize – performs cleanup operations on unmanaged resources held by the object, before the object is destroyed
    • GetHashCode – generates a number based on the value of the object, used to support hash tables
    • ToString – creates a human-readable piece of text that describes the object

Is “String” mutable ?

For C# & Java: Strings are always IMMUTABLE

What is Boxing and Unboxing?

For C#:

Boxing is the process of converting a value type to the type object (or to any interface type implemented by that value type). Storing an int in an object is "boxing", which happens implicitly, without you thinking about it. Taking that object and converting it back is "unboxing", which you do explicitly; an example would be "int i = (int)x;" where x is of type object. Why would you ever want to do this? Value types get stored on the stack, whereas reference types get stored on the heap, so boxing a value type moves it onto the heap, for example so it can live in a collection of objects. Keep in mind that boxing and unboxing carry an allocation and casting cost, so doing them a lot tends to hurt performance rather than help it.


WTF is S.O.L.I.D – WTF is I ?

I = Interface Segregation Principle

This is all about separating your interfaces and keeping them small. We don't want to end up with mega-interfaces full of properties and methods that may or may not be used by the classes that implement them. If you don't follow this principle, you will probably end up with a hairball of OOP design that makes refactoring harder down the line. For example, let's say you have an "Entity" interface with the properties "attackDamage" and "health", and the methods "move", "attack", and "takeDamage". Now let's say the classes "Player", "Enemy", "Chair", and "MagicChest" implement it. Does this make sense? Should the "Chair" class have to implement the "attack" method? Most likely not, though it may still share something common like a "health" property. So we can factor out the common pieces: instead of one "Entity" interface, we can have "BaseEntity", "ActorEntity", and "StaticObject" interfaces. This way no class is forced into unnecessary implementations.
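A rough TypeScript sketch of that split (the property names follow the example above; the method bodies are placeholders):

```typescript
// Instead of one mega-interface that forces Chair to implement attack(),
// we split the contract so each implementer only promises what it can do.
interface BaseEntity {
  health: number;
}

interface ActorEntity extends BaseEntity {
  attackDamage: number;
  move(): void;
  attack(target: BaseEntity): void;
  takeDamage(amount: number): void;
}

interface StaticObject extends BaseEntity {
  takeDamage(amount: number): void;
}

class Chair implements StaticObject {
  health = 10;
  takeDamage(amount: number): void {
    this.health -= amount; // no unused attack()/move() stubs needed
  }
}
```

A "Player" or "Enemy" class would implement ActorEntity instead, picking up the combat methods that Chair never needed.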

WTF is S.O.L.I.D – WTF is O ?

O = Open/Closed Principle

This is all about being "open for extension but closed for modification". Your code should be extendable to the point where you don't need to constantly change it. This can come in many forms. For example, instead of overloading a class with a pile of unrelated methods, such as a "Person" class needing a "write a book" method, a "fight a fire" method, or even a "cook a five-star meal" method, we could separate these into classes that all inherit from "Person". That allows us to extend existing functionality without going in and changing the core logic of the "Person" class.

Another example is a large "if-else" or "switch" statement that does different things based on the input passed in. Instead of that large "if" statement, we can refactor the logic back into the input itself. If we get a generic account object as input and have to calculate its net worth based on the account type, we should create a class for each account type and store the calculation logic there, rather than in the "if-else" clauses.
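A hedged TypeScript sketch of that account refactor (the account types and their net-worth formulas are invented): each class carries its own calculation, so the caller never needs a switch on the type.

```typescript
abstract class Account {
  constructor(protected balance: number) {}
  // New account types extend this class; existing callers never change.
  abstract netWorth(): number;
}

class SavingsAccount extends Account {
  netWorth(): number {
    return this.balance * 1.02; // pretend accrued interest
  }
}

class MarginAccount extends Account {
  constructor(balance: number, private debt: number) {
    super(balance);
  }
  netWorth(): number {
    return this.balance - this.debt;
  }
}

// Closed for modification: this function never inspects account types,
// so adding a new account class requires no edits here.
function totalNetWorth(accounts: Account[]): number {
  return accounts.reduce((sum, a) => sum + a.netWorth(), 0);
}
```

Adding a "CheckingAccount" later means writing one new subclass, not touching totalNetWorth.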

How to get rid of duplicates in a table

How does one even know they have duplicates, or to what extent? You can use this SQL statement right here:

SELECT title, url, COUNT(*)
FROM jobtransparncyprod.JobPostings
GROUP BY title, url
HAVING COUNT(title) > 1;


The key here is that we are using the "GROUP BY" clause to aggregate the data, and then the "HAVING" filter clause to add the condition "count(title) > 1", which just says "title found more than once".

Now that we have made sure that we actually do have duplicates, lets get down to deleting them.

DELETE dupes
FROM jobtransparncyprod.JobPostings dupes, jobtransparncyprod.JobPostings fullTable
WHERE dupes.url = fullTable.url
AND dupes.title = fullTable.title
AND dupes.id > fullTable.id;


The key here is that we are using table aliases rather than explicit joins, so you can think of "dupes" and "fullTable" as variables. First we bind both to the table, then we use the WHERE clause to match rows with the same url and title. If we stopped there and ran the query, we would delete the whole table. Therefore we have a final "AND" specifying that we only delete the row whose id is the greater of each matching pair, so one copy of every duplicate survives.

Docker – Cheat Sheet

The basic commands you need to be productive with Docker:

How do I get a list of all running docker containers ?

  • docker ps

How do I just get all the containers ?

  • docker ps -a

How do I remove a container ?

  • docker rm <container id or name>

How do I see all my images ?

  • docker images

How do I remove an image ?

  • docker rmi <name of image here>

How do I get an image on to my local machine ?

  • docker pull <name of image here>

How do I make a container and run it ?

  • docker run <image name>

How do I run & start an interactive shell with my container ?

  • docker run -it <image name>

How do I map a port from my container to the outside ?

  • docker run -p <outside port>:<port inside docker container> <image name>

How do I get details about an image ?

  • docker inspect <image name>

How do I look at the logs coming out of a container ?

  • docker logs <container name>

How do I start up a container and leave it running, without it consuming a session ?

  • docker run -d <image name>

How do I build my application, and tag it ?

  • docker build -t <user-name>/<app-name> .


How I went about choosing a Deep Learning Framework

The following is an excerpt from my final capstone project.


The hardware and software section will primarily explore the two key parts of neural-network development. Currently the two competing software libraries for developing neural networks are PyTorch and TensorFlow, and the two competing hardware platforms for training models are AMD and Nvidia [6]. In this section I will explore the benefits and disadvantages of each.

Deep Learning Software & Hardware Selection

When looking into developing our model, I identified two key choices: software selection and hardware selection. Framework selection was key since it would act as the building block for constructing the model and affect how fast I could train it. Hardware selection was important since it would be the primary limiting factor in how fast I could train the model and how complex I could make it.

Software Selection

Due to the exponential expansion of machine learning (ML) research and computing power over the last decade, there has been an explosion of new software infrastructure to harness it, coming from both academic and commercial sources. The need for this infrastructure arises from the fact that there needs to be a bridge between theory and application. When I looked at the most popular frameworks, I found a mix of strictly academic and commercially driven software. The four main frameworks were Caffe, Theano, Caffe2 + PyTorch, and TensorFlow (TF).

When choosing a framework, I considered three factors: community, language, and performance. Community was one of the biggest factors, since I had no real production experience with large-scale ML modeling and deployment. The only framework that fulfilled this need was Google's TensorFlow. It was released in 2015 and made available to the open-source community, leading many academic researchers to contribute to and influence its development, which in turn led many other companies to use it in their production deep-learning pipelines. The combination of software developers and scientists using it has driven a lot of community development, making it easier to use and deploy. A side effect of this broad adoption is detailed documentation written by the community, along with a large number of personal and company blogs detailing how they used TF to accomplish their goals. The only real competitor at the time of writing is Facebook's Caffe2 + PyTorch pair of libraries, which was only open-sourced early this year.

The other factor was the language interface the framework would use: I wanted an easy-to-use interface with which to build out the model. When I looked at what was available, I found that all of the popular frameworks are written in C++ and CUDA but have an easy-to-use Python interface. The only framework of the four mentioned above with a C++-only interface was Caffe.

The most important part of framework selection was performance. Most, if not all, ML research and production use cases run on Nvidia GPU hardware, thanks to Nvidia's development of the CUDA programming framework for its GPUs, which makes parallel programming for them remarkably easy. This parallelization is what lets the complex matrix operations be computed with incredible speed. Only two of the four frameworks mentioned used the latest version of CUDA in their code bases: TF and Caffe2 + PyTorch. However, Caffe2 + PyTorch was not as robust as TensorFlow in supporting the different versions of CUDA.

In the end I chose TF since it had the better community and CUDA support. I did not choose its nearest competitor, since it was not as well documented and its community was just starting to grow, whereas TF has been thoroughly documented and has had large deployments outside of Google (at places like LinkedIn, Intel, IBM, and Uber). Another major selling point for TF is that it is free, continually getting new releases, and has become an industry-standard tool.

Deep Learning Software Frameworks

| | Caffe | Theano | Caffe2 + PyTorch | TensorFlow |
| --- | --- | --- | --- | --- |
| Computational graph representation | No | Yes | Yes | Yes |
| Release date | 2013 | 2009 | 2017 + 2016 | 2015 |
| Implementation language | C++ | Python & C | C++ | C++, JS, Swift |
| Wrapper languages | N/A | Python | Python, C++ | C, C++, Java, Go, Rust, Haskell, C#, Python |
| Mobile enabled | No | No | Yes | Yes |
| Corporate backing | UC Berkeley | University of Montreal | Facebook | Google |
| Multi-GPU support | No | No | Yes | Yes |
| Exportable model | Yes | No | Yes & No | Yes |
| Library of pretrained models | Yes | No | Yes | Yes |
| Unique features | No code needed to define a network | First to use CUDA and an in-memory computational graph | Built by the original developers of the Caffe and Theano frameworks; Visdom error-function visualization tool powers Facebook's ML | TensorBoard network visualization and optimization tool, developed by Google Brain, powers Google's ML |
| Under active development | No | No | Yes | Yes |



The reason PyTorch and Caffe2 are always mentioned together is that they are meant to be used together. PyTorch is much more focused on research and flexibility, whereas Caffe2 is more focused on production deployment and inference speed. Facebook's researchers prototype models in PyTorch, then translate them into Caffe2 using the model-transfer format known as ONNX.

Table 1 A summary of all information of note that I collected during my research

Map vs Reduce vs Filter in JavaScript

These methods are part of the "functional" side of JavaScript, which is a strange language in a good way. Before we get into them, know that all of these methods try to replace the "for" loop, or any other type of loop you can think of.

Some of you might be saying, "I like my loops, thank you very much; they're in every language out there!" And that is very true, but these methods make writing a whole loop for something trivial a thing of the past. So let's get on with it; it's not as hard as you probably think it is.

All of these methods work only on arrays, and they never modify the array you apply them to; they return a new array instead.


Map

As in mapping "X" to "Y", or transforming X into Y. In the case of JavaScript you are making a map of your values: you give it a function (which can be anonymous) and it applies that function to each array element, forming a new array.

// Array you wanna do the operation on
const peopleArray = [ {Name: "Tim", BankBalance: 10},
{Name: "Leo", BankBalance: 20},
{Name: "Sam", BankBalance: 30} ];
// Calling the map function and having it return its value into
// reformattedArray, where "obj" represents each object in the
// peopleArray
let reformattedArray = peopleArray.map((obj) => {
let newObj = {};
newObj.Name = "Mr." + obj.Name;
newObj.BankBalance = obj.BankBalance * 1000;
return newObj;
});
// "reformattedArray" is now: [ { Name: 'Mr.Tim', BankBalance: 10000 },
// { Name: 'Mr.Leo', BankBalance: 20000 },
// { Name: 'Mr.Sam', BankBalance: 30000 } ]
// Your callback can also take a second parameter: the index
// of the object in the old array (peopleArray in this case)
reformattedArray = peopleArray.map((obj, index) => {
let newObj = {};
newObj.MyIndex = index;
newObj.Name = "Mr." + obj.Name;
newObj.BankBalance = obj.BankBalance * 1000;
return newObj;
});
// "reformattedArray" is now: [ { MyIndex: 0, Name: 'Mr.Tim', BankBalance: 10000 },
// { MyIndex: 1, Name: 'Mr.Leo', BankBalance: 20000 },
// { MyIndex: 2, Name: 'Mr.Sam', BankBalance: 30000 } ]



Reduce

As in reducing something to its essence, or compressing it down. In the case of JavaScript you are taking all the values in an array and compressing them into something useful.

// Here we have a list of all the transactions made by Mike
const transActionsForMike = [ {Name: "Bike", Cost: 5000},
{Name: "Apple Music", Cost: 20},
{Name: "Cook Book", Cost: 30},
{Name: "Azure Hosting", Cost: 60},
{Name: "Phone Bill", Cost: 70}];
// And we try to figure out how much Mike spent.
// "reduce" takes an anonymous function with two main arguments, the
// accumulator (acc) and the currentValue (curr), plus an initial
// value for the accumulator (in our example it's 0). The accumulator acts
// like a bucket that you keep adding things to, or keep modifying, on every
// iteration of the reduce function.
// Here we take each object in the "transActionsForMike" array, access
// its Cost property, and continuously add it to the accumulator.
// At the very end the accumulator is returned and dumped into the
// MikesSpending variable.
let MikesSpending = transActionsForMike.reduce((acc, curr) => {
return acc + curr.Cost;
}, 0);
console.log(MikesSpending); // MikesSpending is: 5180
// Note you can make the accumulator anything you want. That is, you can set
// the initial value of the accumulator to an object or array and continuously
// add things to it until you have the desired output.
// Other things "reduce" can pass into your function:
// accumulator, currentValue, currentIndex, array (yes, the whole array)



Filter

As in finding things that match a certain set of conditions. I think you kinda already know what this particular method does.

const transActionsForMike = [ {Name: "Bike", Cost: 5000},
{Name: "Apple Music", Cost: 20},
{Name: "Cook Book", Cost: 30},
{Name: "Azure Hosting", Cost: 60},
{Name: "Phone Bill", Cost: 70}];
// As with reduce and map, the anonymous function can take more
// parameters, such as the index and the whole array.
// filter gives us each element in the array; we return
// "true" to keep it, or "false" to drop it.
// Below we keep an element in the new array we are returning
// if its Cost property is less than 100 and more than 20.
let MikesSmallTransactions = transActionsForMike.filter((ele) => {
return ele.Cost < 100 && ele.Cost > 20;
});
// MikesSmallTransactions is now: [ { Name: 'Cook Book', Cost: 30 },
// { Name: 'Azure Hosting', Cost: 60 },
// { Name: 'Phone Bill', Cost: 70 } ]


I hope this helped you better understand Reduce, Map, and Filter 🙂

Questions People have Asked Me – Part 1

Below are some questions I was recently asked, with my answers. 

Please let me know if any of them are wrong; it's a learning opportunity for me 🙂

What is the binary sort algorithm and how does it work? 

The algorithm described here is usually called binary search. It is used to efficiently find a value in data that is already sorted, working on the principle of continuously cutting the data set in half until it finds what it is searching for.

First, the search checks the middle of the data set and compares that value with the one it is searching for. If they are equal, it stops. If not, it checks whether the middle value is bigger or smaller than the target. If the middle value is bigger, it repeats the process on the left half; if it is smaller, it repeats on the right half. This continues until the target value is found, or the remaining range is empty.
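The steps above can be sketched in TypeScript (an iterative version over a sorted array of numbers):

```typescript
// Returns the index of target in sorted, or -1 if it is absent.
// Requires the input array to already be sorted ascending.
function binarySearch(sorted: number[], target: number): number {
  let lo = 0;
  let hi = sorted.length - 1;
  while (lo <= hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (sorted[mid] === target) return mid;
    if (sorted[mid] > target) {
      hi = mid - 1; // middle too big: keep searching the left half
    } else {
      lo = mid + 1; // middle too small: keep searching the right half
    }
  }
  return -1; // range emptied without a match
}
```

Each iteration halves the remaining range, which is why the search runs in O(log n) comparisons.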

What is recursion and how is it used? 

Recursion in programming is when a function calls itself from inside itself. This technique is usually used when a large problem can be solved in terms of smaller versions of itself and has a valid base case, such as computing the Fibonacci sequence or traversing a binary search tree.
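The Fibonacci example mentioned above, sketched in TypeScript:

```typescript
// The base case (n <= 1) stops the self-calls; everything else is
// defined in terms of smaller versions of the same problem.
function fib(n: number): number {
  if (n <= 1) return n; // base case
  return fib(n - 1) + fib(n - 2); // recursive case
}
```

Without the base case the function would call itself forever, which is why a valid base case is part of the definition.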

What is polymorphism and what is its purpose? 

Polymorphism is an aspect of object-oriented programming where an "object" can take on many different forms, provided all those forms are its children. Its purpose is to let the same code work with every child type without knowing exactly which one it has.
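A small TypeScript illustration of the idea (the Shape classes are invented for this sketch): one parent-typed variable holds different children, and the right method runs for each.

```typescript
class Shape {
  area(): number {
    return 0; // default; children override this
  }
}

class Circle extends Shape {
  constructor(private r: number) { super(); }
  area(): number {
    return Math.PI * this.r * this.r;
  }
}

class Square extends Shape {
  constructor(private side: number) { super(); }
  area(): number {
    return this.side * this.side;
  }
}

// One array of the parent type can hold any of its children,
// and calling area() dispatches to each child's own version.
const shapes: Shape[] = [new Circle(1), new Square(2)];
const areas = shapes.map((s) => s.area());
```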

Explain when you should use interfaces and when you should use abstract base classes. 

Both interfaces and abstract classes are a type of contract within the class structure of a software application. An interface is a contract between two entities where you want to separate the functions from their implementations, so that your existing application does not need to change much when one part of it is swapped out. This can be seen in the commonly used repository pattern, which separates data-access logic from business logic. Abstract classes, on the other hand, are a less extreme version of interfaces: certain methods defined in them can have real implementations, which any child class inheriting from the abstract class gets for free. The difference also shows in how child classes are derived: interfaces are "implemented", whereas abstract classes are "extended".
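Both contracts, sketched side by side in TypeScript (the repository and greeter names are invented for illustration):

```typescript
// An interface is a pure contract: callers depend on the shape,
// not on any concrete data store (the repository pattern above).
interface UserRepository {
  findName(id: number): string | undefined;
}

class InMemoryUserRepository implements UserRepository {
  private users = new Map<number, string>([[1, "Ada"]]);
  findName(id: number): string | undefined {
    return this.users.get(id);
  }
}

// An abstract class can ship real implementations to its children,
// while still forcing them to fill in the abstract parts.
abstract class Greeter {
  abstract salutation(): string; // children must provide this
  greet(name: string): string {
    return `${this.salutation()}, ${name}!`; // shared, inherited logic
  }
}

class FormalGreeter extends Greeter {
  salutation(): string {
    return "Good day";
  }
}
```

Swapping InMemoryUserRepository for a SQL-backed one would not change any code written against UserRepository.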

When should you use static methods and static variables? And when shouldn’t you use them. 

Static methods and variables can be used from a class without having to instantiate it. They are usually used when you want to group a set of related functionality or utility functions together. An example of this is the "Math" class in Java, which gives the user all the math-related functions they need. Most of the time, you wouldn't want to use them when you are building inheritance-based class structures.
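A small TypeScript sketch in the spirit of Java's Math class (the temperature utilities here are invented):

```typescript
// A grab-bag of related utilities, used without ever calling "new".
class TemperatureUtils {
  static readonly ABSOLUTE_ZERO_C = -273.15; // static variable

  static celsiusToFahrenheit(c: number): number {
    return c * 9 / 5 + 32; // static method: no instance state needed
  }
}

const boiling = TemperatureUtils.celsiusToFahrenheit(100);
```

Because nothing here depends on per-instance state, statics fit; the moment the class needed its own data or subclasses, instances would be the better choice.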


Write a SQL statement to create a table called “author” with the columns “id”, “name”, “age” (for MySQL or SQL Server). 

CREATE TABLE author (

id int NOT NULL AUTO_INCREMENT,

name varchar(255) NOT NULL,

age int,

PRIMARY KEY (`id`));

Write a SQL statement to create a table called “book” with the columns “id”, “title”, “genre”, “author_id” (for MySQL or SQL Server). 



CREATE TABLE book (

id int NOT NULL AUTO_INCREMENT,

title varchar(255) NOT NULL,

genre varchar(255),

author_id int,

PRIMARY KEY (`id`),

FOREIGN KEY (`author_id`) REFERENCES `author` (`id`) ON DELETE CASCADE);


For the “author” and “book” tables created above, write a SQL statement to tell you the number of books each author has written, but only for authors who have written 2 or more books. The output should not show authors that have written only 1 book. The output should have the author’s name and the number of books they have written. 

SELECT author.name, COUNT(*) AS '# books' FROM author, book WHERE author.id = book.author_id GROUP BY author.name HAVING COUNT(*) > 1;


In databases, what are indexes used for and how do you decide how to use them effectively?

Indexes in databases are used to speed up data retrieval. However, they come at the cost of additional space and added complexity in database maintenance. They should only be used when the same type or group of data is constantly being read. If the number of reads grows even larger, there should also be a caching layer that the application queries first, so it doesn't need to hit the SQL database directly.


What is the value of unit testing and what are some of your strategies for writing good unit tests? 

Unit testing is used to test the functionality of the different parts of an application. Its value lies in the fact that it makes the programmer test their code in a systematic way, and it feeds into a workflow where tests are run before anything gets committed to the master branch. I think the best way to write a test is to write it before the application logic, since that gets you thinking about which edge and special cases to consider. This is also known as the test-driven development approach.
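A tiny test-first sketch of that idea in TypeScript (the slugify function and its cases are invented): the edge cases were chosen before the implementation, and plain assertions stand in for a test framework.

```typescript
// The function under test, written to satisfy the cases below.
function slugify(title: string): string {
  return title
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse runs of spaces/punctuation
    .replace(/^-|-$/g, ""); // strip leading/trailing dashes
}

// Edge cases first: normal input, messy whitespace, empty string.
const cases: Array<[string, string]> = [
  ["Hello World", "hello-world"],
  ["  spaced  out  ", "spaced-out"],
  ["", ""],
];

for (const [input, expected] of cases) {
  if (slugify(input) !== expected) {
    throw new Error(`slugify(${JSON.stringify(input)}) should be "${expected}"`);
  }
}
```

In a real project the same cases would live in a test runner and run before every merge to the master branch.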