Advanced Machine Learning: Deep Learning with Keras and TensorFlow in R - Practical Examples

published on 16 February 2024

Developing deep learning models can seem daunting to many R users.

Introduction to Advanced Machine Learning with Keras and TensorFlow in R

Advanced machine learning techniques like deep learning allow computers to learn and improve from experience without being explicitly programmed. Deep learning models can process vast amounts of data to uncover complex patterns and insights.

Understanding Advanced Deep Learning Concepts

Deep learning is a subset of machine learning based on artificial neural networks with multiple layers. As opposed to traditional machine learning, deep learning algorithms can process raw, unstructured data and scale better with more data. This makes deep learning ideal for working with images, text, speech, and more.

Some key concepts in deep learning include:

  • Neural networks - Inspired by biological neurons, neural networks have interconnected nodes called artificial neurons that transmit signals.
  • Convolutional neural networks - Used for image recognition and processing.
  • Recurrent neural networks - Specialized for sequence data like text or speech.
  • Activation functions - Mathematical formulas that determine neuron output. Common options are ReLU, sigmoid, and tanh.
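
For intuition, these activations are simple elementwise functions. A quick base R sketch (no Keras required) shows what each one does to a vector of inputs:

relu    <- function(x) pmax(0, x)           # ReLU: zero out negative values
sigmoid <- function(x) 1 / (1 + exp(-x))    # sigmoid: squash to (0, 1)

x <- c(-2, -0.5, 0, 0.5, 2)
rbind(relu = relu(x), sigmoid = sigmoid(x), tanh = tanh(x))  # tanh is built into base R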

The Role of Keras and TensorFlow in R Deep Learning Packages

Keras and TensorFlow are popular Python libraries for building and training deep learning models. The R ecosystem has packages like keras, kerasR, and tensorflow that provide interfaces to leverage Keras and TensorFlow directly within R.

This is impactful because R users can now access state-of-the-art deep learning functionalities without needing to code full models from scratch or switch to Python. Packages like keras also have R-native datasets and models tailored for R workflows.

How to Install Keras and TensorFlow in R

To set up a deep learning environment in RStudio:

  1. Install R and RStudio first if you haven't already.
  2. Open RStudio and install the keras and tensorflow packages from CRAN via:
install.packages("keras")
install.packages("tensorflow")
  3. Load the libraries and install the TensorFlow backend (the Python-side dependencies) with install_keras():
library(keras)
library(tensorflow)
install_keras()

Once install_keras() finishes, all the dependencies needed to run Keras and TensorFlow in R are in place. Now you can build deep learning models in RStudio using the Keras and TensorFlow APIs.
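
To verify the backend is working, evaluate a trivial TensorFlow expression; if the setup succeeded this returns a tensor (the exact printed output varies by version):

library(tensorflow)
tf$constant("Hello, TensorFlow!")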

R Keras Tutorial: Building Your First Neural Network

Keras is a powerful deep learning library for Python that provides a high-level neural network API to build and train deep learning models easily and quickly. The keras R package allows you to access Keras directly from R. In this tutorial, we will walk through a simple example of building and training a neural network for classification using Keras in R.

Importing RStudio’s Keras Package and Dependencies

To get started, we first need to install and load the keras R package, then run install_keras() to install the TensorFlow backend and its Python dependencies:

install.packages("keras")
library(keras)
install_keras()

We also load any other R packages we may need:

library(tidyverse)

Now we are ready to use Keras in R!

Defining Your Keras Model Architecture

Next, we need to define our neural network model architecture in Keras using the keras_model_sequential() function. For example, a simple model with 2 dense layers for a classification task may look like:

model <- keras_model_sequential() %>% 
  layer_dense(units = 16, activation = "relu", input_shape = c(10)) %>%
  layer_dense(units = 1, activation = "sigmoid")

This creates a sequential model that takes 10 input features, feeds them into a 16-unit dense hidden layer with ReLU activation, and ends with a 1-unit output layer with sigmoid activation for binary classification.

We can also summarize the model architecture:

summary(model)  

Activation Functions and Layer Configurations

Choosing appropriate activation functions like ReLU and sigmoid is important for neural network performance. We typically use ReLU for hidden layers and sigmoid for a binary classification output.

We configure layers by setting the number of units, the activation function, and regularization mechanisms such as dropout or batch normalization. Finding the best architecture takes some experimentation; see the sketch below.
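
As a starting point, a slightly deeper variant of the model above with dropout between the hidden layers might look like this (the layer sizes and the 0.3 rate are illustrative, not tuned values):

model <- keras_model_sequential() %>%
  layer_dense(units = 32, activation = "relu", input_shape = c(10)) %>%
  layer_dropout(rate = 0.3) %>%          # randomly zero 30% of activations during training
  layer_dense(units = 16, activation = "relu") %>%
  layer_dense(units = 1, activation = "sigmoid")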

Training and Validating Your Model

Finally, we can compile our model to configure the loss function and optimizer:

model %>% compile(
  loss = "binary_crossentropy",
  optimizer = "adam"
)

And train the model by fitting the training data:

model %>% fit(
  x_train, y_train, 
  epochs = 10, 
  validation_data = list(x_valid, y_valid)  
)

This trains for 10 epochs, evaluating on our validation set after each epoch so we can watch for overfitting. And that's it! We've now trained our first neural network in R with Keras.
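
If you want to run this example end to end without a real dataset, here is a minimal sketch that simulates data under the names used in fit() above (x_train, y_train, x_valid, y_valid):

set.seed(42)
x_train <- matrix(rnorm(1000 * 10), ncol = 10)   # 1,000 samples with 10 features
y_train <- rbinom(1000, 1, 0.5)                  # random binary labels
x_valid <- matrix(rnorm(200 * 10), ncol = 10)
y_valid <- rbinom(200, 1, 0.5)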

Advanced Deep Learning Techniques with Convolutional Neural Networks (CNNs)

Convolutional neural networks (CNNs) are a specialized type of neural network well-suited for image recognition and advanced deep learning applications. We will explore building CNN models in R using Keras and TensorFlow to tackle complex machine learning problems.

Building a Simple Convolutional Neural Network in R

To get started with CNNs in R, we can build a very simple convolutional neural network using the Keras package. Here is an example:

First, we load the Keras library and packages:

library(keras)
library(tensorflow)

Next, we load our image data and preprocess it by scaling the pixel values into the [0, 1] range and reshaping the array to 28x28x1 (height, width, channels). Here dataset stands in for whatever image dataset you are working with:

images <- dataset$train$images / 255        # rescale pixel intensities to [0, 1]
dim(images) <- c(nrow(images), 28, 28, 1)   # add a channel dimension

Then, we define our Keras model using the Sequential API and adding convolutional and dense layers:

model <- keras_model_sequential() 
model %>% 
  layer_conv_2d(filters = 32, kernel_size = c(3,3), activation = "relu", input_shape = c(28, 28, 1)) %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_flatten() %>%
  layer_dense(units = 128, activation = "relu") %>%
  layer_dense(units = 10, activation = "softmax")

We finish by compiling the model and training it on our image data (here labels holds the one-hot encoded class labels):

model %>% compile(
  loss = "categorical_crossentropy",
  optimizer = optimizer_rmsprop(),
  metrics = c("accuracy")
)

model %>% fit(images, labels, epochs = 5, batch_size = 128)

And we have built a simple CNN for image classification in R! We can build on this foundation to construct more complex and accurate CNN architectures.

MNIST Handwritten Digit Classification with CNNs

A common benchmark dataset for testing CNN image classification is the MNIST database of handwritten digits. We can demonstrate how CNNs can successfully categorize these images into the 10 digit classes (0-9).

We load and preprocess the MNIST dataset:

mnist <- dataset_mnist()
x_train <- mnist$train$x / 255                                   # scale pixels to [0, 1]
x_train <- array_reshape(x_train, c(nrow(x_train), 28, 28, 1))   # add channel dimension
y_train <- to_categorical(mnist$train$y, 10)                     # one-hot encode the 10 classes

Then construct a CNN with convolutional layers, ReLU activations, pooling layers, dropout for regularization, and dense layers for classification:

model <- keras_model_sequential()
model %>%
  layer_conv_2d(filters = 32, kernel_size = c(3,3), activation = "relu", input_shape = c(28, 28, 1)) %>% 
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_conv_2d(filters = 64, kernel_size = c(3,3), activation = "relu") %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_dropout(rate = 0.25) %>%
  layer_flatten() %>%  
  layer_dense(units = 128, activation = "relu") %>%
  layer_dropout(rate = 0.5) %>%
  layer_dense(units = 10, activation = "softmax")
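
To train this network, we compile and fit it just as before; the optimizer, epoch count, and batch size below are common choices for MNIST rather than tuned values:

model %>% compile(
  loss = "categorical_crossentropy",
  optimizer = optimizer_adam(),
  metrics = c("accuracy")
)

model %>% fit(x_train, y_train, epochs = 10, batch_size = 128, validation_split = 0.1)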

We can achieve over 99% accuracy on the test set with this CNN architecture, successfully categorizing the diverse handwritten digits.

Fine-tuning CNNs with Pooling Layers and Hyperparameter Tuning

To further improve CNN performance, we can fine-tune the model architecture and hyperparameters.

Strategically placing pooling layers in between conv layers helps to downsample feature maps and extract only the most salient features. We can also tune hyperparameters like batch size, number of epochs, and layer sizes.

Here is an example CNN fine-tuned on MNIST:

model <- keras_model_sequential()
model %>%
  layer_conv_2d(filters = 32, kernel_size = c(5,5), activation = "relu", input_shape = c(28, 28, 1)) %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>% 
  layer_conv_2d(filters = 64, kernel_size = c(5,5), activation = "relu") %>% 
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  # ...

model %>% compile(
  optimizer = optimizer_adam(learning_rate = 0.001),
  loss = "categorical_crossentropy",
  metrics = c("accuracy")  
)

model %>% fit(x_train, y_train, epochs = 30, batch_size = 512, validation_split = 0.2)

Proper tuning can provide significant accuracy gains: pooling keeps only the most salient features, while well-chosen hyperparameters like batch size and learning rate control how quickly and reliably the network converges.

Utilizing Advanced Features: Dropout and BatchNormalization Layers

Finally, we can leverage Keras layers like layer_dropout() and layer_batch_normalization() to reduce overfitting and improve model generalization.

Dropout randomly sets input units to 0 during training, preventing complex co-adaptations that lead to overfitting. BatchNormalization layers normalize activations throughout the network, allowing higher learning rates and faster convergence.

Here is an example CNN architecture for MNIST utilizing these layers:

model <- keras_model_sequential()
model %>%
  layer_conv_2d(filters = 32, kernel_size = c(3,3), activation = "relu", input_shape = c(28, 28, 1)) %>%
  layer_batch_normalization() %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_conv_2d(filters = 64, kernel_size = c(3,3), activation = "relu") %>%
  layer_batch_normalization() %>%
  layer_dropout(rate = 0.2) %>%
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%  
  # ...  

These techniques lead to state-of-the-art results on MNIST and other image datasets. Advanced deep CNN architectures enable incredible breakthroughs in computer vision and beyond.

Leveraging TensorFlow Core: Keras Functional API and Callbacks

The Keras Functional API in TensorFlow: Beyond Sequential Models

The Keras functional API provides more flexibility for building neural network models beyond the simplicity of Sequential models. With the functional API, you can define models with shared layers, multiple inputs or outputs, directed acyclic graphs (DAGs), and more complex architectures.

To use the functional API, you start by defining an input layer and then connect it through other layers to one or more outputs. Because every intermediate tensor is a named R object, you can reuse it and branch the graph to attach several outputs. For example:

main_input <- layer_input(shape = c(28, 28, 1))

x <- main_input %>%
  layer_conv_2d(filters = 32, kernel_size = c(3, 3), activation = "relu") %>%
  layer_max_pooling_2d(pool_size = c(2, 2))

# Auxiliary classification head branching off the early features
auxiliary_output <- x %>%
  layer_flatten() %>%
  layer_dense(units = 10, activation = "softmax")

x <- x %>%
  layer_conv_2d(filters = 64, kernel_size = c(3, 3), activation = "relu") %>%
  layer_max_pooling_2d(pool_size = c(2, 2))

main_output <- x %>%
  layer_flatten() %>%
  layer_dense(units = 10, activation = "softmax")

model <- keras_model(inputs = main_input, outputs = list(main_output, auxiliary_output))

This allows you to create complex neural networks beyond stacking layers sequentially.
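
Because this model has two outputs, compile() takes one loss per output. The loss_weights below, which down-weight the auxiliary head, are an illustrative choice rather than a requirement:

model %>% compile(
  optimizer = "adam",
  loss = list("categorical_crossentropy", "categorical_crossentropy"),
  loss_weights = list(1.0, 0.3),
  metrics = c("accuracy")
)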

tf.keras Callbacks: EarlyStopping and Model Checkpoints

Keras callbacks provide ways to customize and enhance model training. Two useful callbacks are:

Early stopping: stops training when a monitored metric stops improving, which helps prevent overfitting.

early_stopping <- callback_early_stopping(monitor = "val_loss", patience = 5)

Model checkpoints: save snapshots of the model at intervals during training, so you can resume interrupted sessions and keep the best-performing weights.

checkpoint_path <- "training/cp.ckpt"
checkpoint <- callback_model_checkpoint(
  filepath = checkpoint_path,
  save_weights_only = TRUE,
  save_best_only = TRUE
)

Pass these callbacks to fit() through its callbacks argument to use them, as shown below.
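
For example, with the training data from earlier (epochs = 50 is deliberately generous; early stopping will usually halt training sooner):

model %>% fit(
  x_train, y_train,
  epochs = 50,
  validation_split = 0.2,
  callbacks = list(early_stopping, checkpoint)
)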

Run Keras and TensorFlow in RStudio: Distributed Training

To speed up model training by leveraging multiple GPUs with distributed training:

  1. Configure a TensorFlow cluster in RStudio across nodes with GPUs using tf$distribute$Server().

  2. Create a distribution strategy to distribute training, such as tf$distribute$MirroredStrategy(), which mirrors the model across multiple GPUs.

  3. Wrap the model-building code in the scope of the distribution strategy:

strategy <- tf$distribute$MirroredStrategy()

with(strategy$scope(), {
  model <- keras_model_sequential()   # create and compile the Keras model inside the scope
  # ... add layers ...
  model %>% compile(loss = "categorical_crossentropy", optimizer = "adam")
})

model %>% fit(...)  # training is distributed across the available devices

This allows scaling up Keras model training in RStudio by distributing the work across multiple GPUs or machines.

Save and Load Models for Deployment

To save a Keras model in R:

model %>% save_model_tf("path/to/location")

To load:

model <- load_model_tf("path/to/location")

This persists the model architecture and weights to disk so it can be deployed for predictions in production environments.
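
A minimal deployment sketch (x_new stands in for new observations shaped like the training input):

loaded <- load_model_tf("path/to/location")
predictions <- loaded %>% predict(x_new)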

Conclusion: Harnessing the Power of Deep Learning in R

Recap of Deep Learning with Keras and TensorFlow in R

This tutorial demonstrated how easy it is to build and train deep learning models in R using Keras and TensorFlow. We saw how to quickly create neural networks, add layers, compile the models, fit them on data, and make predictions. The practical examples showed how powerful yet simple these tools are for tasks like image classification. Key takeaways include:

  • Keras provides a high-level API that makes building neural networks very fast and simple in R.
  • TensorFlow handles the computation and scaling under the hood.
  • Using RStudio's integrated tools like the Keras package, we can seamlessly run Keras and TensorFlow without leaving the R environment.
  • We walked through hands-on examples for computer vision using the real MNIST dataset.
  • Using techniques like convolution and pooling layers, dropout regularization, and hyperparameter tuning, we were able to achieve great performance.

Overall, this tutorial showed how R users can harness the power of deep learning using Keras and TensorFlow to solve real-world problems.

Practical Examples and Real-World Impact

The practical computer vision examples in this tutorial can serve as templates for real-world applications. For example, the image classification model could be retrained to identify defects in manufacturing or scan medical images for abnormalities. A text classification model built with the same tools could be adapted to analyze customer feedback or process legal contracts. The skills acquired can be applied across industries like autonomous vehicles, finance, healthcare, and education. The simplicity yet customizability of Keras and TensorFlow in R enables both rapid prototyping and large-scale deployment. As these libraries continue to evolve, deep learning will empower R users to create intelligent systems that transform how we work and live.

Further Learning and Exploration in Advanced Machine Learning

For those inspired to take their R machine learning skills to the next level, the Keras documentation and TensorFlow guides provide extensive resources. There are also many online courses and tutorials covering both foundations and cutting-edge techniques like GANs and reinforcement learning. Participating in forums like RStudio Community enables collaboration and insight sharing with peers. As real-world applications drive innovation, R will continue playing a pivotal role in democratizing state-of-the-art deep learning. This is just the beginning of an exciting new era in advanced analytics.
