Machine learning is fun, but it can also be very cumbersome. Any tool that takes the manual work off your hands therefore deserves our full attention. Machinebox promises exactly that, so we gladly put it to the test and came away even happier than expected. Read our impressions and conclusions below.

1.1 Introduction to Machinebox

According to their own sales pitch, Machinebox is a suite of prebuilt machine learning services that require virtually no learning effort and deliver great results.

Their services are provided as a number of Docker images, each with specific ML capabilities:

  • Face recognition
  • Media categorization
  • Nudity detection
  • Text analysis

All interaction with Machinebox services is done through a standard HTTP REST API, which means there’s no custom SDK to install and learn.

1.2 Deployment of Machinebox for Machine Learning

Machinebox is run on-premises as a Docker image, which reduces the dependency on a third party and makes GDPR compliance much easier. As an added benefit, this deployment model also means Machinebox services can easily be put in place on a PaaS such as Heroku.


1.3 Scaling 

Machinebox services can easily be scaled horizontally simply by spinning up additional Docker containers. Note: instances currently have to be synchronized manually by uploading the state of a ‘processor’ instance to all read boxes (see below for further information).

1.4 Let’s get cracking

For this tutorial we’ll be using the Tagbox image classification service as it’s fairly easy to get started with and has many exciting use cases.

2 Tagbox walkthrough

2.1 Prerequisites

  • Docker

2.2 Tagbox Setup

  1. Create an account and copy your API key: Sign Up
  2. Run the Docker image as described in the Tagbox tutorial:
  • Wait until the image has spun up and [INFO] box ready is printed to the console
  • A GUI and a REST endpoint are now available on localhost

2.3 Usage

Interaction with Tagbox is done primarily via two endpoints: the classification (/check) and learning (/teach) routes.

2.3.1 Classification (/check)

Classification is the statistical method of identifying which categories a piece of information belongs to.

  • Send a classification request with an image URL
  • Send an image binary using the API
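As a sketch in Python (assuming the box listens on localhost:8080 from the setup step; the two helpers correspond to submission options from the list below):

```python
import json
from urllib import request

TAGBOX = "http://localhost:8080/tagbox"  # assumed default from the setup step


def build_check_payload(image_url: str) -> bytes:
    """JSON body for classifying an image by URL."""
    return json.dumps({"url": image_url}).encode("utf-8")


def check_by_url(image_url: str, base: str = TAGBOX) -> dict:
    """POST a URL payload to /check and decode the JSON response."""
    req = request.Request(
        base + "/check",
        data=build_check_payload(image_url),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)


def check_binary(image_path: str, base: str = TAGBOX) -> dict:
    """POST the raw image bytes directly (the 'direct HTTP post' option)."""
    with open(image_path, "rb") as f:
        req = request.Request(base + "/check", data=f.read())
    with request.urlopen(req) as resp:
        return json.load(resp)
```

With a box running, check_by_url("https://example.com/dog.jpg") returns the decoded tag payload.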

Tagbox provides many ways to submit an image for classification:

  • Direct HTTP post
  • URL via JSON payload
  • URL via query parameters
  • URL via form post
  • Base64 encoded string via POST body
  • Base64 encoded string via JSONRequest

The service responds with a JSON payload describing the detected tags.
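For illustration, a response might look something like this (field names based on the tags and custom_tags arrays discussed below; check the Tagbox docs for the exact shape):

```json
{
  "success": true,
  "tags": [
    { "tag": "beach", "confidence": 0.97 },
    { "tag": "sea", "confidence": 0.85 }
  ],
  "custom_tags": []
}
```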

The tags array is the meat of the payload: a list of tags sorted by how confident the ML algorithm is that each tag is correct. Note that positional data is not returned for the image tags, which limits its usefulness for some GUI applications; it would have been interesting to create a clickable image with live tag links, for example.

Latency and performance

When measured with the time command, most responses took about one second. (I did not perform rigorous testing using a standard statistical model, so YMMV.)

  • This response time remains constant for subsequent requests for an already-processed image, which suggests that Tagbox does not use caching at all
  • By using a perceptual hash, similar images would not have to be processed in order to be classified. This would be simple to implement using pHash and a caching service, for example
  • Caching would also drastically reduce response times, so I highly recommend it for user-facing applications
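The perceptual-hash idea in the bullets above can be sketched like this; everything here (the toy average hash standing in for a real pHash, the in-memory cache standing in for a caching service) is illustrative and not part of Tagbox:

```python
# Toy sketch: a perceptual-style hash plus a cache, so visually similar
# images can reuse a previous classification instead of hitting /check.


def average_hash(pixels: list[int]) -> int:
    """Hash a grayscale image (flat list of 0-255 values) into a bit pattern:
    each bit records whether a pixel is brighter than the mean."""
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


class TagCache:
    """Return cached tags for images whose hash is within `threshold` bits."""

    def __init__(self, threshold: int = 2):
        self.threshold = threshold
        self._entries: dict[int, list[str]] = {}

    def get(self, h: int):
        for known, tags in self._entries.items():
            if hamming(known, h) <= self.threshold:
                return tags
        return None

    def put(self, h: int, tags: list[str]):
        self._entries[h] = tags
```

Before calling /check, hash the image and look it up; on a hit, skip the request entirely.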

2.3.2 Teaching and similarity (/teach, /similar)

By teaching Tagbox about custom tags, we can make it perform much better on a specific dataset, tailor-made for a given business objective:

  • Low-quality CCTV footage
  • A specific category of entities (items from a shop, a certain type of animal, …)

Teaching Request

  • Initiate a teaching process by posting a training image to the /teach route.

All relevant options are supplied as query parameters:

  • tag: the tag to teach
  • id: a unique ID of the provided image (important for retrieving similar images)

The accuracy of the provided tag depends on how many examples are provided for training. Examples cannot be posted in batch; additional samples must be provided by making a new request with the same tag and a different image.
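A hedged Python sketch of a teaching call; the tag and id query parameters are from the list above, while posting the raw image bytes as the body is my simplification (the curl example later in the article uses a multipart file field instead):

```python
import json
from urllib import request
from urllib.parse import urlencode

TAGBOX = "http://localhost:8080/tagbox"  # assumed default from the setup step


def build_teach_url(tag: str, image_id: str, base: str = TAGBOX) -> str:
    """Attach the documented query parameters (tag, id) to the /teach route."""
    return base + "/teach?" + urlencode({"tag": tag, "id": image_id})


def teach(image_bytes: bytes, tag: str, image_id: str) -> dict:
    """POST one training image; call again with new images for more examples."""
    req = request.Request(build_teach_url(tag, image_id), data=image_bytes)
    with request.urlopen(req) as resp:
        return json.load(resp)
```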

Teaching Response

Upon successful processing, a success response is returned (surprisingly enough).
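As an illustrative sketch (exact shape per the Tagbox docs), the acknowledgement is minimal:

```json
{
  "success": true
}
```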


Once a new tag has been taught to Tagbox, that tag will be returned in the custom_tags array of /check responses for matching images.

Similar images

Once an image has been uploaded to Tagbox, the ID you provided will be returned by the /similar endpoint when a similar image is submitted.

Enterprise-grade hot dog detection


2.3.3 State backup and instantiation

Tagbox uses an online algorithm, as opposed to a batch-trained one, which means learning is continuous and the model is updated dynamically as new examples arrive. This is generally preferable when the expected results change regularly. The method has its downsides as well, however: the model may become skewed or biased over time, decreasing its accuracy and usefulness.

To prevent such an event from being irreversible, the current model state can be downloaded and later re-uploaded to Tagbox.
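A Python sketch of a backup/restore helper; the /state route is my assumption, so verify the exact path in the Tagbox docs before relying on it:

```python
from urllib import request

TAGBOX = "http://localhost:8080/tagbox"  # assumed default from the setup step


def state_url(base: str = TAGBOX) -> str:
    """Assumed state route; check the Tagbox docs for the exact path."""
    return base.rstrip("/") + "/state"


def download_state(path: str = "tagbox.state", base: str = TAGBOX) -> None:
    """Save the current model state to a local file (do this regularly)."""
    with request.urlopen(state_url(base)) as resp, open(path, "wb") as f:
        f.write(resp.read())


def restore_state(path: str = "tagbox.state", base: str = TAGBOX) -> None:
    """Upload a saved state, e.g. to a freshly spun-up read box."""
    with open(path, "rb") as f:
        request.urlopen(request.Request(state_url(base), data=f.read()))
```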

This type of state upload is also useful when load balancing with multiple boxes. These backups should be done regularly to ensure the model can always be restored to a useful state.

2.3.4 Conclusion 


Setting up Tagbox was a walk in the park for me, as it would be for anyone who’s used Docker before.


The Tagbox API has a narrow, well-defined focus, which makes it very pleasant to work with and integrate into an app. Clearly a lot of attention has been given to developer-friendliness. In less than an hour, I was able to build a fully functional and accurate prototype model with a set of custom image tags. I cannot emphasize enough how important this is, as it allows developers to rapidly iterate and test several competing setups.


Several other options exist for image classification; one I would like to point out as an example is YOLO:

  • The YOLO model provides a similar set of features, yet is entirely free/open source
  • It provides positional information
  • When run on a GPU it’s fast enough for real-time classification

A downside to most of these ML models is that classification takes a while unless task processing is offloaded to an NVIDIA GPU (using CUDA).

This is an issue when the intent is to deploy on PaaS environments which usually don’t have dedicated GPU compute instances.

These alternatives also tend to have a much steeper learning curve, expecting expert-level machine learning knowledge from their users.

Use cases

A great use case for image classification would be visual image search: see this link for a Tagbox demo app with such a purpose. According to their own blog post, they uploaded a teaching set of 6500 images and let the algorithm run its course without any custom parameter configuration.

The results are stunningly accurate in my opinion.

Some other example uses according to the Tagbox documentation are:

  • Automatically categorize images
  • Improve SEO by adding keywords to web pages containing the images
  • Make search smarter by allowing users to find images based on their content
  • Use your own custom tags, to classify your images in categories that you choose
  • Calculate image similarity and find images which are related


3 Facebox walkthrough

In this tutorial we’ll create a model which can (hopefully) identify Obama.

This walkthrough assumes you’ve finished the Tagbox walkthrough and understand it completely.

3.1 Setup

  1. Run the Facebox Docker image

3.2 Training Data

We should get some training data for our model! The google-images-download Python script comes in handy here.

  1. Download it by entering pip install google_images_download into the console.
  2. Now run the scraping command: googleimagesdownload -k Obama -o ..
  3. After around a minute you should have 100 Obama images ready to train the model with (Tip: You might want to manually remove the non-Obama-ish ones if any are present – I had about 75 left after a manual pass).
  4. Facebox is also unable to train on images in which multiple people are present (they’ll simply be skipped).
  5. Upload all the images to Facebox: ls -d -1 $PWD/obama/**/* | xargs -I {} curl -X POST -F 'file=@{}' http://localhost:8080/facebox/teach\?name=Obama\&id=`basename {} | tr -cd '[:alnum:]._-'` Don’t worry about what this one-liner does (if you don’t want to). Simply put, it iterates over the Obama directory and posts each file to Facebox. (Note that if you’re running the free version you’ll only be able to train 20 images, which is not a problem for our purposes.)

3.3 Checking

Our model should now have a reasonably good understanding of what Obama looks like. Give it a try using the localhost demo (on an image you didn’t use for training, obviously).
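If you prefer to script the check instead of using the GUI, here is a hedged Python sketch. The file form field and the /facebox/check endpoint mirror the teaching one-liner above; the minimal multipart encoder is hand-rolled for illustration:

```python
import json
import uuid
from urllib import request

FACEBOX = "http://localhost:8080/facebox"  # assumed default from the setup step


def build_multipart(field: str, filename: str, payload: bytes):
    """Minimal multipart/form-data encoder for a single file field."""
    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + payload + f"\r\n--{boundary}--\r\n".encode()
    return body, f"multipart/form-data; boundary={boundary}"


def check_face(image_path: str, base: str = FACEBOX) -> dict:
    """POST an image to /check and decode the recognition response."""
    with open(image_path, "rb") as f:
        body, ctype = build_multipart("file", image_path, f.read())
    req = request.Request(base + "/check", data=body, headers={"Content-Type": ctype})
    with request.urlopen(req) as resp:
        return json.load(resp)
```

With the trained box running, check_face("not-in-training-set.jpg") returns the decoded JSON verdict.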

A trained Obama facial detection model

3.4 Conclusion

Once again we had a blast readying a simple model for production use. If you take a look at the Facebox docs you’ll notice that it follows the same basic usage pattern as the previous Tagbox example. This is done on purpose as it makes knowledge about one service transferable to another.