Glow: Better Reversible Generative Models

6 minute read

We introduce Glow, a reversible generative model which uses invertible 1x1 convolutions. It extends previous work on reversible generative models and simplifies the architecture. Our model can generate realistic high-resolution images, supports efficient sampling, and discovers features that can be used to manipulate attributes of data. We're releasing code for the model and an online visualization tool so people can explore and build on these results.

Read Paper

An interactive demo of our model to manipulate attributes of your face, and blend with other faces

Motivation

Manipulating attributes of images of researchers Prafulla Dhariwal and Durk Kingma. The model isn't given attribute labels at training time, yet it learns a latent space where certain directions correspond to changes in attributes like beard density, age, hair color, and so on.

Generative modeling is about observing data, like a set of pictures of faces, then learning a model of how this data was generated. Learning to approximate the data-generating process requires learning all structure present in the data, and successful models should be able to synthesize outputs that look similar to the data. Accurate generative models have broad applications, including speech synthesis, text analysis and synthesis, semi-supervised learning and model-based control. The technique we propose can be applied to those problems as well.

Glow is a type of reversible generative model, also called flow-based generative model, and is an extension of the NICE and RealNVP techniques. Flow-based generative models have so far gained little attention in the research community compared to GANs and VAEs.

Some of the merits of flow-based generative models include:

  • Exact latent-variable inference and log-likelihood evaluation. In VAEs, one is able to infer only approximately the value of the latent variables that correspond to a datapoint. GANs have no encoder at all to infer the latents. In reversible generative models, this can be done exactly without approximation. Not only does this lead to accurate inference, it also enables optimization of the exact log-likelihood of the data, instead of a lower bound of it (see the short sketch after this list).
  • Efficient inference and efficient synthesis. Autoregressive models, such as the PixelCNN, are also reversible; however, synthesis from such models is difficult to parallelize and is typically inefficient on parallel hardware. Flow-based generative models like Glow (and RealNVP) are efficient to parallelize for both inference and synthesis.
  • Useful latent space for downstream tasks. The hidden layers of autoregressive models have unknown marginal distributions, making it much more difficult to perform valid manipulation of data. In GANs, datapoints usually cannot be directly represented in a latent space, as GANs have no encoder and might not have full support over the data distribution. This is not the case for reversible generative models and VAEs, which allow for various applications such as interpolations between datapoints and meaningful modifications of existing datapoints.
  • Significant potential for memory savings. Computing gradients in reversible neural networks requires an amount of memory that is constant instead of linear in their depth, as explained in the RevNet paper.
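
To make the exact-likelihood point concrete, here is a toy sketch with a one-dimensional affine flow; the change-of-variables formula gives the exact log-likelihood, with no variational lower bound involved (the functions below are illustrative, not our released code):

import numpy as np

# Toy 1-D flow: z = f(x) = a * x + b is invertible, so the exact log-likelihood is
# log p(x) = log N(z; 0, 1) + log |dz/dx|  (change of variables)
a, b = 2.0, -0.5

def encode(x):              # exact inference: x -> z
    return a * x + b

def decode(z):              # exact synthesis: z -> x
    return (z - b) / a

def log_likelihood(x):      # exact log p(x), no lower bound needed
    z = encode(x)
    log_prior = -0.5 * (z ** 2 + np.log(2 * np.pi))   # standard-normal log density
    log_det = np.log(abs(a))                          # |dz/dx| = |a|
    return log_prior + log_det

x = 0.3
assert np.isclose(decode(encode(x)), x)               # inversion is exact
print(log_likelihood(x))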

Results

Using our techniques we achieve significant improvements on standard benchmarks compared to RealNVP, the previous best published result with flow-based generative models.

Dataset                 RealNVP   Glow
CIFAR-10                3.49      3.35
ImageNet 32x32          4.28      4.09
ImageNet 64x64          3.98      3.81
LSUN (bedroom)          2.72      2.38
LSUN (tower)            2.81      2.46
LSUN (church outdoor)   3.08      2.67

Quantitative performance in bits per dimension (lower is better), evaluated on the test set of each dataset, for the RealNVP model versus our Glow model.

(Video) Glow: Better Reversible Generative Models

Samples from our model after training on a dataset of 30,000 high-resolution faces

Glow models can generate realistic-looking high-resolution images, and can do so efficiently. Our model takes about 130ms to generate a 256 x 256 sample on an NVIDIA 1080 Ti GPU. Like previous work, we found that sampling from a reduced-temperature model often results in higher-quality samples. The samples above were obtained by scaling the standard deviation of the latents by a temperature of 0.7.
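
As a rough sketch of reduced-temperature sampling (the decode function passed in below is a toy stand-in for the inverse flow, not our released model):

import numpy as np

def sample(decode, latent_shape, temperature=0.7):
    # Scale the standard deviation of the Gaussian latents by the temperature,
    # then map the latents through the inverse flow to obtain an image.
    z = temperature * np.random.randn(*latent_shape)
    return decode(z)

# Toy stand-in for the inverse flow, just to make the sketch executable
x_sample = sample(decode=lambda z: 2.0 * z - 0.5, latent_shape=(4, 4, 3))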

Interpolation in latent space

We can also interpolate between arbitrary faces, by using the encoder to encode the two images and then decoding points along the line between their latents. Note that the inputs are arbitrary faces and not samples from the model, thus providing evidence that the model has support over the full target distribution.

Interpolating between Prafulla's face and celebrity faces.
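
A minimal sketch of this kind of latent interpolation, assuming generic encode and decode functions for a flow (the names are placeholders, not our released API):

import numpy as np

def interpolate(encode, decode, x1, x2, num_steps=5):
    # Encode both inputs, walk along the straight line between their latents,
    # and decode each intermediate point back to image space.
    z1, z2 = encode(x1), encode(x2)
    return [decode((1 - t) * z1 + t * z2) for t in np.linspace(0.0, 1.0, num_steps)]

# Toy invertible map (z = 2x) so the sketch runs end to end
frames = interpolate(encode=lambda x: 2.0 * x,
                     decode=lambda z: z / 2.0,
                     x1=np.zeros((4, 4, 3)),
                     x2=np.ones((4, 4, 3)))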

Manipulation in latent space

We can train a flow-based model without labels, and then use the learned latent representation for downstream tasks like manipulating attributes of an input. These semantic attributes could be the color of hair in a face, the style of an image, the pitch of a musical sound, or the emotion of a text sentence. Since flow-based models have a perfect encoder, you can encode inputs and compute the average latent vector of inputs with and without the attribute. The vector direction between the two can then be used to manipulate an arbitrary input towards that attribute.

The above process requires a relatively small amount of labeled data, and can be done after the model has been trained (no labels are needed while training). Previous work using GANs requires training an encoder separately. Approaches using VAEs only guarantee that the decoder and encoder are compatible for in-distribution data. Other approaches involve directly learning the function representing the transformation, like CycleGAN, but these require retraining for every transformation.

# Train flow model on large, unlabelled dataset X
m = train(X_unlabelled)

# Split labelled dataset based on attribute, say blonde hair
X_positive, X_negative = split(X_labelled)

# Obtain average encodings of positive and negative inputs
z_positive = average([m.encode(x) for x in X_positive])
z_negative = average([m.encode(x) for x in X_negative])

# Get manipulation vector by taking difference
z_manipulate = z_positive - z_negative

# Manipulate new x_input along z_manipulate, by a scalar alpha in [-1, 1]
z_input = m.encode(x_input)
x_manipulated = m.decode(z_input + alpha * z_manipulate)

Simple code snippet for using a flow-based model for manipulating attributes

Contribution

Our main contribution, and also our departure from the earlier RealNVP work, is the addition of an invertible 1x1 convolution, as well as the removal of other components, simplifying the architecture overall.

The RealNVP architecture consists of sequences of two types of layers: layers with checkerboard masking, and layers with channel-wise masking. We remove the layers with checkerboard masking, simplifying the architecture. The layers with channel-wise masking perform the equivalent of a repetition of the following steps (a small sketch follows the list):


  1. Permute the inputs by reversing their ordering across the channel dimension.
  2. Split the input into two parts, A and B, down the middle of the feature dimension.
  3. Feed A into a shallow convolutional neural network. Linearly transform B according to the output of the neural network.
  4. Concatenate A and B.
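
A minimal NumPy sketch of one such step; the small network nn passed in below is a placeholder that returns a per-element log-scale and shift computed from A:

import numpy as np

def coupling_step(x, nn):
    # 1. Permute by reversing the ordering across the channel (last) dimension
    x = x[..., ::-1]
    # 2. Split into two halves, A and B, down the middle of the feature dimension
    A, B = np.split(x, 2, axis=-1)
    # 3. Feed A into the network; linearly transform B according to its output
    log_scale, shift = nn(A)
    B = B * np.exp(log_scale) + shift
    # 4. Concatenate A and the updated B
    return np.concatenate([A, B], axis=-1)

# Toy placeholder network: unit scale, shift by the mean of A
x = np.random.randn(4, 4, 8)
y = coupling_step(x, nn=lambda A: (np.zeros_like(A), A.mean() * np.ones_like(A)))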

By chaining these layers, A updates B, then B updates A, then A updates B, etc. This bipartite flow of information is clearly quite rigid. We found that model performance improves by changing the reverse permutation of step (1) to a (fixed) shuffling permutation.

Taking this a step further, we can also learn the optimal permutation. Learning a permutation matrix is a discrete optimization that is not amenable to gradient ascent. But because the permutation operation is just a special case of a linear transformation with a square matrix, we can make this work with convolutional neural networks, as permuting the channels is equivalent to a 1x1 convolution operation with an equal number of input and output channels. So we replace the fixed permutation with learned 1x1 convolution operations. The weights of the 1x1 convolution are initialized as a random rotation matrix. As we show in the figure below, this operation leads to significant modeling improvements. We've also shown that the computations involved in optimizing the objective function can be done efficiently through a LU decomposition of the weights.
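
A rough NumPy sketch of the idea: a 1x1 convolution over C channels is just a C x C matrix multiply applied at every spatial position, here initialized as a random rotation, with each position contributing log |det W| to the objective (an illustration, not our optimized implementation):

import numpy as np

C = 8
# Initialize the 1x1 convolution weights as a random rotation (orthogonal) matrix
W, _ = np.linalg.qr(np.random.randn(C, C))

def invertible_1x1_conv(x, W):
    # x has shape (height, width, C); the convolution is a matrix multiply over channels
    z = x @ W.T
    # Every spatial position contributes log |det W| to the log-determinant of the Jacobian
    sign, logabsdet = np.linalg.slogdet(W)
    return z, x.shape[0] * x.shape[1] * logabsdet

x = np.random.randn(16, 16, C)
z, logdet = invertible_1x1_conv(x, W)
x_rec = z @ np.linalg.inv(W).T        # the operation is exactly invertible
assert np.allclose(x_rec, x)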


Our main contribution, invertible 1x1 convolutions, leads to significant modeling improvements.

In addition, we remove batch normalization and replace it with an activation normalization layer. This layer simply shifts and scales the activations, with data-dependent initialization that normalizes the activations given an initial minibatch of data. This allows scaling down the minibatch size to 1 (for large images) and scaling up the size of the model.
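
A small sketch of such data-dependent initialization for an activation normalization layer; the shapes and helper names are illustrative:

import numpy as np

def actnorm_init(batch):
    # batch: (N, H, W, C); choose per-channel scale and bias so that the
    # activations of this initial minibatch have zero mean and unit variance
    mean = batch.mean(axis=(0, 1, 2))
    std = batch.std(axis=(0, 1, 2)) + 1e-6
    return 1.0 / std, -mean / std

def actnorm(x, scale, bias):
    # After initialization, scale and bias are trained as ordinary parameters
    return x * scale + bias

batch = np.random.randn(16, 8, 8, 4) * 3.0 + 1.0
scale, bias = actnorm_init(batch)
normalized = actnorm(batch, scale, bias)   # approx. zero mean, unit variance per channel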

Scale

Our architecture, combined with various optimizations such as gradient checkpointing, allows us to train flow-based generative models on a larger scale than usual. We used Horovod to easily train our model on a cluster of multiple machines; the model used in our demo was trained on 5 machines, each with 8 GPUs. Using this setup we train models with over a hundred million parameters.

Directions for research

Our work suggests that it's possible to train flow-based models to generate realistic high-resolution images, and to learn latent representations that can easily be used for downstream tasks like manipulation of data. We suggest a few directions for future work:

  1. Be competitive with other model classes on likelihood. Autoregressive models and VAEs perform better than flow-based models on log-likelihood, however they have the drawbacks of inefficient sampling and inexact inference respectively. Combining flow-based models, VAEs and autoregressive models to trade off their strengths would be an interesting direction for future work.
  2. Improve the architecture to be more compute- and parameter-efficient. To generate realistic high-resolution images, the face generation model uses ~200M parameters and ~600 convolution layers, which makes it expensive to train. Models with smaller depth performed worse on learning long-range dependencies. Using self-attention architectures, or performing progressive training to scale to high resolutions, could make it computationally cheaper to train Glow models.

Finally, if you'd like to use Glow in your research, we encourage you to check out our paper for more details, or look at our code in this GitHub repo.



