Introduction to Uber’s Ludwig

Uber’s AI Lab continues open-sourcing deep learning frameworks with its newest release, Ludwig, a toolbox built on top of TensorFlow that allows users to create and train models without writing code.

Finding the right model architecture and hyperparameters is one of the harder parts of the deep learning pipeline. As a data scientist, you can spend hours experimenting with different hyperparameters and architectures to find the best fit for your specific problem. This process is not only time-consuming and code-intensive, it also requires knowledge of the algorithms involved and of the state-of-the-art techniques used to squeeze out the last percent of performance.

Ludwig aims to provide a toolbox that allows you to train and test your deep learning model without writing code. This helps domain experts without much deep learning knowledge build their own high-performing models.

Ludwig

Uber has developed Ludwig internally over the past two years to streamline and simplify the use of deep learning models. They have seen its value on several of their own projects, such as information extraction from driver's licenses, identification of points of interest during conversations between driver-partners and riders, and many more. For this reason, they decided to release it as open source, so everybody can benefit from the flexibility and ease of use Ludwig provides.

Ludwig was built with the following core principles:

  • No coding required: no coding skills are required to train a model and use it for obtaining predictions.
  • Generality: a new data type-based approach to deep learning model design that makes the tool usable across many different use cases.
  • Flexibility: experienced users have extensive control over model building and training, while newcomers will find it easy to use.
  • Extensibility: easy to add new model architectures and new feature data types.
  • Understandability: deep learning model internals are often considered black boxes, but Ludwig provides standard visualizations to understand their performance and compare their predictions.

Ludwig lets us train a deep learning model by providing just two things: a file containing the data, such as a CSV, and a YAML configuration file that describes the features in that data file, for example whether each one is an input (independent variable) or an output (dependent variable). If more than one output variable is specified, Ludwig learns to predict all of the outputs simultaneously.

The main new idea behind Ludwig is the notion of data-type-specific encoders and decoders. These type-specific encoders and decoders can be set in the configuration file, giving us a highly modular and extensible architecture with preprocessing steps specific to each type of data.

Figure 2: Different input and output features

This design gives users access to many functions and options that allow them to build cutting-edge models for their specific domain without requiring a lot of deep learning knowledge.
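
To make the idea concrete, here is a toy sketch in plain Python of what type-specific encoding means: each feature's declared type selects its own preprocessing function, so supporting a new type only requires registering one more function. The encoder functions, the `ENCODERS` registry, and `encode_row` are hypothetical illustrations and not Ludwig's internals.

```python
# Toy illustration only: these encoders and the registry are hypothetical,
# not Ludwig's actual implementation.

def encode_text(value, vocab):
    # Map each whitespace-separated token to an integer id,
    # growing the vocabulary as new tokens appear.
    return [vocab.setdefault(token, len(vocab)) for token in value.lower().split()]

def encode_category(value, vocab):
    # Map the whole string to a single class id.
    return vocab.setdefault(value, len(vocab))

def encode_numerical(value, vocab):
    # Numbers need no vocabulary; just parse them.
    return float(value)

# One entry per supported data type.
ENCODERS = {
    "text": encode_text,
    "category": encode_category,
    "numerical": encode_numerical,
}

def encode_row(row, features, vocabs):
    """Encode one data row using the encoder registered for each feature's type."""
    encoded = {}
    for feature in features:
        name, ftype = feature["name"], feature["type"]
        vocab = vocabs.setdefault(name, {})  # one vocabulary per feature
        encoded[name] = ENCODERS[ftype](row[name], vocab)
    return encoded

features = [
    {"name": "text", "type": "text"},
    {"name": "airline_sentiment", "type": "category"},
]
row = {"text": "flight was late", "airline_sentiment": "negative"}
print(encode_row(row, features, {}))
```

The point of the design is that the loop in `encode_row` never changes: only the registry grows when a new data type is added.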

Using Ludwig

To use Ludwig, we first need to install it, which can be done with the following commands:

pip install git+https://github.com/uber/ludwig
python -m spacy download en

The next step is to create our model definition YAML file, which specifies our input and output features as well as any additional information about the preprocessing steps we want to apply.

But before we can create this file, we need to decide which dataset to use. For this article I decided to use the Twitter US Airline Sentiment dataset, which is freely available for download.

Now that we have our dataset we can start writing our model definition.

input_features:
 -
  name: text
  type: text

output_features:
 -
  name: airline_sentiment
  type: category
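
Before training, it can help to confirm that every feature named in the definition actually exists as a column in the CSV. The sketch below writes the same definition as a Python dictionary and checks it against a CSV header using only the standard library. The `missing_columns` helper and the sample CSV contents are hypothetical stand-ins, not part of Ludwig or of the real dataset.

```python
import csv
import io

# Same definition as the YAML above, written as a Python dictionary.
model_definition = {
    "input_features": [{"name": "text", "type": "text"}],
    "output_features": [{"name": "airline_sentiment", "type": "category"}],
}

def missing_columns(definition, header):
    """Return feature names from the definition that are absent from the CSV header."""
    features = definition["input_features"] + definition["output_features"]
    return [f["name"] for f in features if f["name"] not in header]

# Stand-in for the first lines of Tweets.csv (made-up sample row).
sample_csv = io.StringIO(
    "tweet_id,airline_sentiment,text\n"
    "1,negative,my flight was delayed again\n"
)
header = next(csv.reader(sample_csv))

# An empty list means every feature name was found in the header.
print(missing_columns(model_definition, header))
```

With the real file, `sample_csv` would be replaced by `open('Tweets.csv', newline='')`.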

With our YAML configuration file ready, we can start training our model using the following command:

ludwig train --data_csv Tweets.csv --model_definition_file model_definition.yaml

Ludwig now randomly splits the data into training, validation, and test sets, preprocesses them, and then builds a model with the specified encoders and decoders.
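
What a random split does can be sketched in a few lines. The code below is a standard-library illustration, not Ludwig's implementation; the `random_split` helper, the 70/10/20 proportions, and the fixed seed are assumptions chosen for the example.

```python
import random

def random_split(rows, probabilities=(0.7, 0.1, 0.2), seed=42):
    """Randomly assign each row to the training, validation, or test set."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    splits = {"training": [], "validation": [], "test": []}
    names = list(splits)
    for row in rows:
        # Draw one split name per row, weighted by the given probabilities.
        name = rng.choices(names, weights=probabilities, k=1)[0]
        splits[name].append(row)
    return splits

rows = list(range(1000))  # stand-in for 1000 data rows
splits = random_split(rows)
print({name: len(part) for name, part in splits.items()})
```

Because each row is assigned independently, the resulting set sizes only approximate the requested proportions, which is usually fine for datasets of this size.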

It also displays the training progress in the console and provides TensorBoard support.

After training, Ludwig creates a results directory containing the trained model with its hyperparameters, as well as summary statistics that can be used to visualize the training process. One of these visualizations can be generated with the following command:

ludwig visualize --visualization learning_curves --training_stats results/training_stats.json

This will display a graph that shows the loss and accuracy as functions of the number of epochs.

Figure 3: Loss and accuracy plots

After training, we can use the model to make predictions by typing:

ludwig predict --data_csv path/to/data.csv --model_path /path/to/model

Ludwig’s programmatic API

Ludwig also provides a Python programmatic API that allows us to train or load a model using Python. The problem above can be implemented using the programmatic API as shown below.

from ludwig import LudwigModel
import pandas as pd

df = pd.read_csv('Tweets.csv')
print(df.head())

model_definition = {
    'input_features':[
        {'name':'text', 'type':'text'},
    ],
    'output_features': [
        {'name': 'airline_sentiment', 'type': 'category'}
    ]
}

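# Build the model from the definition
print('creating model')
model = LudwigModel(model_definition)
# Train on the DataFrame loaded above
print('training model')
train_stats = model.train(data_df=df)
# Release the resources held by the model
model.close()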

Conclusion

Ludwig is a toolbox built on top of TensorFlow that allows users to create and train models without writing code.

It provides many functions and options, such as data-type-specific encoders and decoders, that allow us to build cutting-edge deep learning models.

The code covered in this article is available as a GitHub repository.