What is Optimal Transport?


The Basics of Optimal Transport

Optimal transport is a mathematical concept that has seen increasing attention in recent years due to its applications in fields such as computer vision, machine learning, and economics. In simple terms, optimal transport is concerned with finding the cheapest way to move mass (objects, goods, or probability weight) from one distribution to another.

Optimal transport, also known as transport theory or Monge-Kantorovich theory, was first introduced by Gaspard Monge in the late 18th century and later developed and extended by Leonid Kantorovich in the mid-20th century. However, it was not until the late 1990s that optimal transport became widely recognized as a powerful tool across many branches of applied mathematics.

The theory of optimal transport provides a way to quantitatively measure the distance between two probability distributions. This distance is known as the Wasserstein distance, and it is computed by finding an optimal way to transport the mass of one distribution onto the other. The main idea is to minimize the total cost of moving the mass, where the cost of moving a unit of mass can be defined in various ways, such as the Euclidean distance, the squared Euclidean distance, or any other non-negative cost on the underlying space.
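
As a concrete illustration, the short sketch below computes the Wasserstein distance between two small one-dimensional distributions with SciPy's built-in routine, which uses the absolute difference |x - y| as the cost; the support points and weights are made-up toy values.

```python
# A minimal sketch: the 1-D Wasserstein distance between two toy distributions,
# computed with SciPy's built-in routine (support points and weights are made up).
import numpy as np
from scipy.stats import wasserstein_distance

# Two small distributions supported on a handful of points on the real line.
p_values = np.array([0.0, 1.0, 2.0])
p_weights = np.array([0.5, 0.3, 0.2])

q_values = np.array([0.5, 1.5, 3.0])
q_weights = np.array([0.2, 0.5, 0.3])

# SciPy solves the 1-D optimal transport problem with |x - y| as the ground cost.
w1 = wasserstein_distance(p_values, q_values, p_weights, q_weights)
print(f"W(P, Q) = {w1:.4f}")
```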

Applications of Optimal Transport

Optimal Transport has a wide range of applications in various fields, including:

  • Computer Vision: Optimal transport has been used to compare two images or point clouds and detect the differences between them. It has also been used to match two sets of features in computer vision problems.
  • Machine Learning: Optimal transport has been used in various machine learning tasks such as clustering, classification, and regression. It has been used to compare the distributions of features in different classes and measure the similarity between them.
  • Economics: Optimal transport has been applied to economic problems such as matching markets, auction theory, and transportation economics. It has been used to determine the optimal transportation of goods from one location to another, minimizing transportation costs and improving supply chain efficiency.

The Mathematics of Optimal Transport

The math behind optimal transport can be quite complex, but the basic idea is relatively straightforward. Given two probability distributions P and Q, the goal is to find the optimal way to transport the mass from P to Q. The mass can be thought of as a set of particles, and the goal is to move each particle from its starting point in P to its corresponding point in Q.

However, there are constraints that must be respected when transporting the mass. Mass cannot be created or destroyed during the transportation, so the total mass must remain the same throughout. In Monge's original formulation each particle must be sent to a single destination, while the Kantorovich relaxation used below also allows the mass at a point to be split across several destinations.

The cost of transportation is typically defined in terms of a cost function c(x,y), which measures the cost of transporting a particle from point x in P to point y in Q. The cost function can be arbitrary, but it must satisfy certain conditions such as being continuous and non-negative.

A transport plan between P and Q is a measure that describes how the mass of P is distributed over Q. In the discrete case it can be represented as a matrix T, where T(i,j) is the amount of mass transported from point i in P to point j in Q. A valid plan must satisfy the mass conservation constraints mentioned earlier: each row of T sums to the mass of the corresponding point in P, and each column sums to the mass of the corresponding point in Q. The optimal transport plan is the valid plan with the lowest total cost.
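
To make these constraints concrete, here is a minimal NumPy sketch of one valid (not necessarily optimal) transport plan between a two-point distribution and a three-point distribution. The weights, the plan, and the cost matrix are illustrative assumptions; the assertions check the mass conservation constraints.

```python
# A minimal sketch of a (not necessarily optimal) transport plan between two
# discrete distributions; all numbers below are illustrative assumptions.
import numpy as np

p = np.array([0.4, 0.6])          # masses of the two source points
q = np.array([0.5, 0.3, 0.2])     # masses of the three target points

# T[i, j] = amount of mass moved from source point i to target point j.
T = np.array([
    [0.3, 0.1, 0.0],
    [0.2, 0.2, 0.2],
])

# Mass conservation: rows must sum to p, columns must sum to q.
assert np.allclose(T.sum(axis=1), p)
assert np.allclose(T.sum(axis=0), q)

# Cost of this particular plan for an arbitrary cost matrix c(x_i, y_j).
C = np.array([
    [1.0, 2.0, 4.0],
    [3.0, 1.0, 2.0],
])
print("plan cost:", np.sum(T * C))
```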

The Wasserstein distance between P and Q is defined as the minimum cost of transporting the mass from P to Q:

W(P,Q) = inf_T Σ_{i,j} T(i,j) · c(x_i, y_j),

where the infimum is taken over all transport plans T that satisfy the mass conservation constraints.
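
In the discrete case this minimization is a linear program, so for tiny problems it can be solved with a generic LP solver. The sketch below encodes the row and column constraints explicitly and calls SciPy's linprog; the weights and cost matrix are the same toy values as above, and dedicated solvers are far more efficient for realistic problem sizes.

```python
# A minimal sketch of the discrete Wasserstein problem as a linear program,
# solved with SciPy's generic LP solver (toy weights and cost matrix).
import numpy as np
from scipy.optimize import linprog

p = np.array([0.4, 0.6])                       # source weights
q = np.array([0.5, 0.3, 0.2])                  # target weights
C = np.array([[1.0, 2.0, 4.0],
              [3.0, 1.0, 2.0]])                # cost matrix c(x_i, y_j)

n, m = C.shape

# Equality constraints encode mass conservation on rows and columns of T,
# with T flattened row-major into a vector of length n * m.
A_eq = np.zeros((n + m, n * m))
for i in range(n):                             # row i of T sums to p[i]
    A_eq[i, i * m:(i + 1) * m] = 1.0
for j in range(m):                             # column j of T sums to q[j]
    A_eq[n + j, j::m] = 1.0
b_eq = np.concatenate([p, q])

res = linprog(C.ravel(), A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
T_opt = res.x.reshape(n, m)
print("optimal plan:\n", T_opt)
print("W(P, Q) =", res.fun)
```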

Discrete Optimal Transport

Optimal transport is often computationally expensive, especially when dealing with large datasets or continuous distributions. Therefore, researchers have developed several algorithms for discrete optimal transport, where the distributions are defined on a finite set of points.

Discrete optimal transport involves finding an optimal transport plan between two discrete distributions P and Q. The optimal transport plan can be found using algorithms such as the Hungarian algorithm, the Sinkhorn-Knopp algorithm, or entropic regularization.

The Hungarian algorithm is a classic algorithm for the assignment problem, which is a special case of the optimal transport problem. The algorithm finds the optimal one-to-one assignment between two sets of points that minimizes the total cost of assignments.
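
SciPy provides an exact solver for this assignment problem, so a minimal sketch only needs a cost matrix (the one below is an illustrative assumption): each of the three source points is matched to exactly one of the three target points so that the total cost is minimized.

```python
# A minimal sketch of the assignment special case: equal numbers of points on
# each side, each carrying the same mass; the cost matrix is made up.
import numpy as np
from scipy.optimize import linear_sum_assignment

C = np.array([[4.0, 1.0, 3.0],
              [2.0, 0.0, 5.0],
              [3.0, 2.0, 2.0]])        # C[i, j] = cost of assigning i to j

row_ind, col_ind = linear_sum_assignment(C)
print("assignment:", list(zip(row_ind, col_ind)))
print("total cost:", C[row_ind, col_ind].sum())
```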

The Sinkhorn-Knopp algorithm is an iterative algorithm that efficiently computes an approximation of the optimal transport plan. It is based on matrix scaling: the rows and columns of a kernel matrix derived from the cost are alternately rescaled until the marginal (mass conservation) constraints are approximately satisfied.

Entropic regularization is a more recent approach that has gained popularity in the machine learning community. It adds an entropy term to the transport cost, which makes the problem strictly convex and smooth; the regularized problem can then be solved efficiently with the Sinkhorn-Knopp iterations described above, at the price of a slightly blurred transport plan.
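
The sketch below implements the basic Sinkhorn iterations for the entropically regularized problem in plain NumPy, alternating the row and column scalings described above. The weights, cost matrix, and regularization strength eps are illustrative assumptions, and no numerical safeguards (such as log-domain updates) are included.

```python
# A minimal sketch of Sinkhorn iterations for entropically regularized optimal
# transport in plain NumPy (toy weights, cost matrix, and eps; no stabilization).
import numpy as np

def sinkhorn(p, q, C, eps=0.1, n_iters=500):
    """Return an approximate transport plan between weight vectors p and q."""
    K = np.exp(-C / eps)              # Gibbs kernel derived from the cost
    u = np.ones_like(p)
    v = np.ones_like(q)
    for _ in range(n_iters):          # alternate row / column scaling updates
        u = p / (K @ v)
        v = q / (K.T @ u)
    return u[:, None] * K * v[None, :]

p = np.array([0.4, 0.6])
q = np.array([0.5, 0.3, 0.2])
C = np.array([[1.0, 2.0, 4.0],
              [3.0, 1.0, 2.0]])

T = sinkhorn(p, q, C)
print("marginals:", T.sum(axis=1), T.sum(axis=0))   # approximately p and q
print("regularized cost:", np.sum(T * C))
```

As eps decreases, the regularized plan approaches an exact optimal plan, but more iterations are needed and the kernel entries can underflow unless the updates are carried out in the log domain.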

Conclusion

Optimal transport is a powerful mathematical framework with applications across computer vision, machine learning, and economics. It provides a way to quantitatively measure the distance between two distributions and to find an optimal way to move the mass of one distribution onto the other.

The math behind optimal transport can be quite involved, but researchers have developed efficient algorithms for the discrete setting, such as the Hungarian and Sinkhorn-Knopp algorithms, that compute optimal or near-optimal transport plans. These algorithms are what make optimal transport practical in applications such as computer vision and machine learning.