Part 2: Kernel Density Estimation using Density Matrices

Implementation in TensorFlow

December 23, 2021 · 8 mins read

In Part 1 we checked at the key mathematical and statistical concepts that could allow us to talk about Kernel Density Estimation using Density Matrices. In Part 2, we would like to describe how to use the algorithm already implemented in Python by Professor Fabio Gonzalez and his research group MindLab at Universidad Nacional de Colombia using custom layers and models in TensorFlow 2.

Install qmc

Let´s install the module qmc which contains

  1. Custom models inherited from the super class tf.keras.Model, in our case we will use:
    • QMDensity()
  2. Custom layers inherited from the super class tf.keras.layers.Layer, we will take:
    • QFeatureMapRFF() layer of the quantum feature map.
    • QMeasureDensity() layer that actually does the measurement.
    !pip install git+https://github.com/fagonzalezo/qmc.git

For more information about qmc you can check this repository which contains examples in estimation, classification and regression.

Additionally, to recreate this experiment we will use other modules.

Importing libraries

Now we can call the necessary libraries and dependencies to run this experiment.

import numpy as np
import tensorflow as tf
import qmc.tf.layers as layers
import qmc.tf.models as models
from scipy.stats import norm, bernoulli
import matplotlib.pyplot as plt

Kernel Density Estimation using Density Matrices

In order to measure how accurate the algorithm is let’s generate some random data from a two component Gaussian Mixture defined by the following density:

\[f (x) = \alpha\left(\frac{1}{\sqrt{2\pi\sigma^2}}exp\left(\frac{(x-\mu_1)^2}{2\sigma_1^2}\right)\right) + \left(1-\alpha \right) \left(\frac{1}{\sqrt{2\pi\sigma^2}}exp\left(\frac{(x-\mu_2)^2}{2\sigma_2^2}\right)\right).\]

For this we create the Mixture() class.

class Mixture(object):

    def __init__(self, loc1=0, scale1=1, loc2=1, scale2=1, alpha=0.5):
        self.var1 = norm(loc=loc1, scale=scale1)
        self.var2 = norm(loc=loc2, scale=scale2)
        self.alpha = alpha
        
    def pdf(self, x):
        return (self.alpha * self.var1.pdf(x) + 
                (1 - self.alpha) * self.var2.pdf(x))

    def rvs(self, size=1):
        vals = np.stack([self.var1.rvs(size), self.var2.rvs(size)],  axis=-1)
        print(vals.shape)
        idx = bernoulli.rvs(1. - self.alpha, size=size)
        return np.array([vals[i, idx[i]] for i in range(size)])

And instanciate it as an object mixt. Then we generate a random sample using the method rvs

mixt = Mixture(loc1=0, scale1=0.5, loc2=3, scale2=1, alpha=0.5)
sample = mixt.rvs(100)

Notice that for this example we are working on a \(1D\) space, so each element of sample belong to \(\mathbb{R}\) and the way the algorithm will recieve the whole sample is in a form of an array of shape (n_samples , n_dimensions) where the n_dimensions then will be 1. So, let’s reshape our data:

X = sample.reshape((-1, 1))

Now we are ready to use the algorithm just by specifying:

  • dim : Number of Random Fourier Features.
  • gamma: Bandwidth related parameter.

and then just using the custom models and layers in the following way.

# Parameters
n_rffs = 300
gamma = 1
# Training
rffmap_x = layers.QFeatureMapRFF(1, dim = n_rffs, gamma = gamma, random_state = 2021)
kde_dm = models.QMDensity(rffmap_x, n_rffs)
kde_dm.compile()
kde_dm.fit(X, epochs=1)
# Estimation
x = np.linspace(-10., 10., 100)
densities = kde_dm.predict(x.reshape((-1, 1)))

Note: There’s no optimization step on this methodology, when we use the method fit() the density matrix is calculated once and with this matrix we can estimate our densities using the method predict().

Plotting

We graph the results on the new data x where densities where calculated.

plt.plot(x, mixt.pdf(x), 'r-',  alpha=0.6, label='Mixture')
plt.plot(x, densities, 'g-',  alpha=0.6, label='DMKDE')
plt.legend()
plt.show()

Further reading

This paper describes on detail how density matrices can be used as a building block for machine learning models because of their ability to mix linear algebra and probability in a simple way. Novel methods such as

  • Density matrix kernel density classification
  • Quantum measurement classification
  • Quantum measurement regression
  • Quantum measurement ordinal regression

are introduced and are the basis for new methods that might include complex valued density matrices as their authors claim.

More information on customization when using tf this can be found here.