=======================================
Gaussian-Process Factor Analysis (GPFA)
=======================================

Gaussian-process factor analysis (GPFA) is a dimensionality reduction method [Yu2009]_ that extracts smooth, low-dimensional latent trajectories from noisy, high-dimensional time series data.
GPFA applies `factor analysis (FA) <https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.FactorAnalysis.html>`_ to observed data to simultaneously reduce their dimensionality and smooth the resulting low-dimensional trajectories by fitting a `Gaussian process (GP) <https://scikit-learn.org/stable/modules/generated/sklearn.gaussian_process.kernels.Kernel.html#sklearn.gaussian_process.kernels.Kernel>`_ model to them.
See the below :ref:`notes` for its main differences to the popular PCA dimensionality redunction method.

This documentation describes `BlockInvGPFA`, a Python implementation of the GPFA model that incorporates several performance and usability improvements over existing implementations (e.g., from the Elephant toolbox [Denker2018]_).

The essential ingredients of the generative model underlying GPFA are *observations* :math:`\boldsymbol{X}` and *latent variables* :math:`\boldsymbol{Z}`.
For each high-dimensional time series in a dataset, the :math:`x_\textrm{dim}`-dimensional time series is split into :math:`T` time bins of size :math:`\delta t`, leading to the :math:`x_\textrm{dim} \times T`  matrix :math:`\boldsymbol{X}` that contains these data.
The observations in each time bin are assumed to be stochastically generated by a :math:`z_\textrm{dim}`-dimensional latent variable (:math:`z_\textrm{dim} < x_\textrm{dim}`) through some noisy linear mapping.
Across time, these latent variables are collected in the :math:`z_\textrm{dim} \times T` matrix :math:`\boldsymbol{Z}`.
A dataset usually contains a set of observations :math:`\boldsymbol{X}`'s that each have their associated latent variabes :math:`\boldsymbol{Z}`, and that all have the same observation and latent dimensions, :math:`x_\textrm{dim}` and :math:`z_\textrm{dim}`, but might differ in their number of time bins :math:`T`. 

.. _gpfa_prob_model:

GPFA's generative model is defined by the emission and the latent time-course models:

1. **Emission model**, :math:`p(\boldsymbol{X} | \boldsymbol{Z})`: GPFA's emission model assumes that, in each time bin :math:`t`, the observations :math:`\boldsymbol{x}_{:,t}` are generated independently through a stochastic linear mapping from the corresponding latent variables :math:`\boldsymbol{z}_{:,t}`,

   .. math:: \boldsymbol{x}_{:,t} | \boldsymbol{z}_{:,t} \sim \mathcal{N} \left( \boldsymbol{C} \boldsymbol{z}_{:,t} + \boldsymbol{d}, \boldsymbol{R} \right) ,

   where :math:`\boldsymbol{x}_{:,t}` and :math:`\boldsymbol{z}_{:,t}` are the :math:`t` th column (i.e., the :math:`t` th time bin) of :math:`\boldsymbol{X}` and :math:`\boldsymbol{Z}`, respectively, :math:`\mathcal{N}\left( \boldsymbol{\mu}, \boldsymbol{\Sigma}\right)` denotes the multivariate Gaussian with mean :math:`\boldsymbol{\mu}` and covariance :math:`\boldsymbol{\Sigma}`, and the loading matrix :math:`\boldsymbol{C}` (:math:`x_\textrm{dim} \times z_\textrm{dim}`), bias vector :math:`\boldsymbol{d}` (:math:`x_\textrm{dim}`), and diagonal (as in FA) emission covariance matrix :math:`\boldsymbol{R}` (:math:`x_\textrm{dim} \times x_\textrm{dim}`) are model parameters.

2. **Latent time-course model**, :math:`p(\boldsymbol{Z})`: GPFA assumes that the latent variables evolve in each latent dimension :math:`i` (or *factor*) independently according to a Gaussian process,

   .. math:: \boldsymbol{z}_{i,:} \sim \mathcal{GP} \left( \boldsymbol{0}, \boldsymbol{K} \right) ,

   where :math:`\boldsymbol{z}_{i,:}` is the :math:`i` th row (i.e., :math:`i` th latent dimension/factor) of :math:`\boldsymbol{Z}`, and :math:`\mathcal{GP} \left( \boldsymbol{0}, \boldsymbol{K} \right)` denotes a zero-mean Gaussian process with covariance kernel :math:`\boldsymbol{K}`. 

Overall, the model has emission model parameters :math:`\boldsymbol{C}`, :math:`\boldsymbol{d}`, and :math:`\boldsymbol{R}` and time-course model parameters that are the parameters of the chosen GP kernel :math:`\boldsymbol{K}`. When GPFA is fit to a set of observations, :math:`\boldsymbol{X}_1, \boldsymbol{X}_2, \dots`, GPFA adjusts these parameters and associated latent variables to best explain these observations.


.. _notes:

Notes
=====

GPFA differs from the popular `PCA <https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html>`_ in two significant ways:

1. It performs dimensionality reduction by factor analysis, which supports different residual variances in different data dimensions, whereas PCA assumed this residual variance to be the same across all dimensions.
2. PCA does not assume any temperal dependencies between different latent variables.
   GPFA, in contrast assumes latent trajecories to be smooth across time, and models this assumption by a Gaussian process.

.. _examples:

Examples
========

.. _example_1:

(i) Applying BlockInvGPFA to synthetic data
-------------------------------------------

This example illustrates the application of BlockInvGPFA to data analysis by encompassing synthetic data generation, BlockInvGPFA model fitting, and latent variable extraction. It emphasizes a step-by-step explanation of utilizing BlockInvGPFA for data analysis, highlighting key functionalities of the library.

.. code-block:: python

   import numpy as np
   from blockinvgpfa import BlockInvGPFA
   from sklearn.gaussian_process import GaussianProcessRegressor
   from sklearn.gaussian_process.kernels import ConstantKernel, RBF, WhiteKernel

   # Simulation parameters
   rng_seeds = [0, 42]
   z_dim = 2
   x_dim = 10
   tau_f = 0.6
   sigma_n = 0.001
   sigma_f = 1 - sigma_n
   bin_size = 0.05  # [s]
   num_obs = len(rng_seeds)
   T_per_obs = 400 
   kernel = ConstantKernel(sigma_f, constant_value_bounds='fixed') * RBF(length_scale=tau_f) + \
         ConstantKernel(sigma_n, constant_value_bounds='fixed') * WhiteKernel(noise_level=1, noise_level_bounds='fixed')

   tsdt = np.arange(0, T_per_obs) * bin_size

   np.random.seed(2)
   C = np.random.uniform(0, 2, (x_dim, z_dim))     # loading matrix
   sqrtR = np.random.uniform(0, 0.5, x_dim)

   # Generate latent and observation sequences in line with GPFA model
   X = []
   Z = []
   for n in range(num_obs):
      np.random.seed(rng_seeds[n])

      # Sample latent sequence according to GPFA latent time-course model
      gp_model = GaussianProcessRegressor(kernel=kernel)
      
      z = gp_model.sample_y(tsdt[:, np.newaxis], n_samples=z_dim, random_state=rng_seeds[n]).T
      Z.append(z)
      # Sample observations according to GPFA emission model
      x = C @ z + np.random.normal(0, sqrtR[:, np.newaxis] , (x_dim, T_per_obs))
      X.append(x)

   blockinv_gpfa = BlockInvGPFA(bin_size=bin_size, z_dim=z_dim, em_tol=1e-3, verbose=True)
   blockinv_gpfa.fit(X)  # this will take a while

   results = blockinv_gpfa.predict(returned_data=['Z_mu_orth', 'Z_mu'])

   print(blockinv_gpfa.variance_explained())

.. code-block:: console

   Initializing parameters using factor analysis...

   Fitting GP parameters by EM...
   Fitting has converged after 135 EM iterations.
   (0.9834339959622203, array([0.80069798, 0.18273602]))

Note that, due to differences in the linear algebra packages used on different machine architectures and operating systems, the above results might differ across machines.

Let us now visualize the true orthogonalized latents against the inferred orthogonalized latents.

.. code-block:: python

   import matplotlib as mpl
   import matplotlib.pyplot as plt
   mpl.rcParams.update({'font.size': 20})
   mpl.rcParams.update({'lines.linewidth': 3})

   # orthogonalize true latents using inferred C
   true_Z = Z
   true_Z_orth = [blockinv_gpfa.OrthTrans_ @ Zn for Zn in true_Z]

   # Extract Z_orth and pZ_mu from results
   Z_orth = results[0]['Z_mu_orth']
   Z_mu = results[0]['Z_mu']

   fig, axs = plt.subplots(2, 1, figsize=(13, 13))

   # plot first (orthonormalized) latent from first sequence (flipping the sign
   # of the inferred latent, as this is arbitrary and happens to be wrongly inferred)
   axs[0].plot(tsdt, true_Z_orth[0][0,:], label='true $Z_{1,:}$ (orth)', c='blue')
   axs[0].plot(tsdt, -Z_orth[0][0,:], label='inf $Z_{1,:}$ (orth)', c='orange')
   axs[0].legend()
   axs[0].set_title('Comparing inferred to true latents')
   axs[0].set_ylabel('$Z_{1,:}$ from first sequence')

   # plot first latent from second sequence
   axs[1].plot(tsdt, true_Z_orth[1][0,:], label='true $Z_{1,:}$ (orth)', c='blue')
   axs[1].plot(tsdt, -Z_orth[1][0,:], label='inf $Z_{1,:}$ (orth)', c='orange')
   axs[1].legend()
   axs[1].set_xlabel('time $t$')
   axs[1].set_ylabel('$Z_{1,:}$ from second sequence')

   plt.tight_layout()
   plt.show()


.. image:: /_static/example-2.png
   :width: 600
   :height: 600
   :alt: Results Image
   :align: center

.. _examples_2:

(ii) Applying BlockInvGPFA to real neural Data
----------------------------------------------
You can find a more complex example that applies BlockInvGPFA to neural data used by [Even2019]_  (available at [Even2022]_) in an :doc:`neural_example`. To run this example yourself, download the :download:`Jupyter notebook containing the code <neural_example.ipynb>`.

Original code
-------------
The code was adjusted and extended from the `Elephant ephys analysis toolkit <http://neuralensemble.org/elephant/>`_ which is a direct Python translation of `Byron Yu's MATLAB implementation <https://users.ece.cmu.edu/~byronyu/software.shtml>`_.

References
----------
.. [Yu2009] `Yu, Byron M and Cunningham, John P and Santhanam, Gopal and
    Ryu, Stephen and Shenoy, Krishna V and Sahani, Maneesh
    "Gaussian-process factor analysis for low-dimensional single-trial
    analysis of neural population activity"
    In Journal of Neurophysiology, Vol. 102, Issue 1. pp. 614-635.
    <https://doi.org/10.1152/jn.90941.2008>`_

.. [Denker2018] `Denker, M., Yegenoglu, A., and Gr\"un, S.
    "Collaborative HPC-enabled workflows on the HBP Collaboratory using the Elephant framework."
    In Neuroinformatics 2018, 2018, p. P19.
    <https://abstracts.g-node.org/conference/NI2018/abstracts#/uuid/023bec4e-0c35-4563-81ce-2c6fac282abd>`_


.. [Even2022] `Even-Chen, Nir and Sheffer, Blue and Vyas, Saurabh and Ryu, Stephen
    and Shenoy, Krishna (2022)
    "Structure and variability of delay activity in premotor cortex"
    (Version 0.220124.2156) [Data set]. DANDI archive.
    <https://doi.org/10.48324/dandi.000121/0.220124.2156>`_

.. [Even2019] `Even-Chen, Nir and Sheffer, Blue and Vyas, Saurabh and Ryu, Stephen
    and Shenoy, Krishna (2022)
    "Structure and variability of delay activity in premotor cortex"
    PLoS computational biology, 15(2), e1006808. <https://doi.org/10.1371/journal.pcbi.1006808>`_

.. toctree::
    :maxdepth: 2
    :hidden:

    blockinv_gpfa_class
    preprocessing_class
    neural_example
    installation
    contribute
    acknowledgments
    authors
    citation