Generated Files
===============

This section reviews the files generated by the algorithm, grouped by the step that produces them. Throughout, we assume that the data file is ``path/mydata.extension``; all outputs are written to the directory ``path/mydata/``. To know more about what is performed during each step of the algorithm, please see :doc:`details on the algorithm <../advanced/algorithm>`, or wait for the publication.

Whitening
---------

At the end of this step, a single HDF5_ file, ``mydata.basis.hdf5``, is produced. It contains several objects:

    * ``/thresholds`` The *N* thresholds, one per electrode. Note that the values are positive, and should be multiplied by the threshold parameter in the configuration file (see :doc:`documentation on parameters <../code/config>`)
    * ``/spatial`` The spatial matrix used for whitening the data (size *N* x *N*)
    * ``/temporal`` The temporal filter used for whitening the data (size *Nt*, where *Nt* is the temporal width of the template)
    * ``/proj`` and ``/rec`` The projection matrix obtained by PCA, and its inverse, used to represent a single waveform (size *Nt* x *F*, where *F* is the number of features kept, 5 by default)
    * ``/waveforms`` 1000 randomly chosen waveforms over all channels
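
As an illustration of how these objects can be read back, here is a minimal sketch using ``h5py``. The function name ``load_basis`` and its ``threshold_factor`` argument are hypothetical: ``threshold_factor`` stands for the threshold parameter of the configuration file, which this sketch expects the caller to supply rather than reading it from the configuration.

.. code-block:: python

    import h5py


    def load_basis(path, threshold_factor):
        """Read the whitening outputs and scale the stored thresholds.

        ``threshold_factor`` is a hypothetical argument standing for the
        threshold parameter of the configuration file.
        """
        names = ("thresholds", "spatial", "temporal", "proj", "rec", "waveforms")
        with h5py.File(path, "r") as f:
            # Copy every dataset into memory as a NumPy array.
            basis = {name: f[name][...] for name in names}
        # The stored thresholds are positive and unscaled, so multiply them
        # by the configured factor to get the values used for detection.
        basis["scaled_thresholds"] = threshold_factor * basis["thresholds"]
        return basis

For example, ``load_basis("path/mydata/mydata.basis.hdf5", 6.0)`` would return a dictionary whose ``scaled_thresholds`` entry holds one value per electrode (the factor ``6.0`` is only a placeholder).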

Clustering
----------

At the end of this step, several files are produced:

    * ``mydata.clusters.hdf5`` An HDF5_ file that stores information about the clusters for every electrode, namely which points were selected, the spike times of those points, the labels assigned by the clustering, and the rho and delta values produced by the clustering algorithm `[Rodriguez and Laio, 2014] <http://www.sciencemag.org/content/344/6191/1492.short>`_. To be more precise, the file has the following fields

        * ``/data_i``: the data points collected on electrode *i*, after PCA
        * ``/clusters_i``: the labels of those points after clustering
        * ``/times_i``: the times at which those spikes occur
        * ``/debug_i``: a 2D array with rhos and deltas for those points (see clustering algorithm)
        * ``/electrodes``: an array with the preferred electrodes of all *K* templates

    * ``mydata.templates.hdf5`` An HDF5_ file storing all the templates, together with their orthogonal projections. The template matrix therefore holds *2K* entries, twice the number of templates *K*, and only the first *K* entries are the real templates. Note also that every template has a range of allowed amplitudes ``limits``, and that the norms ``norms`` are saved for internal purposes. To be more precise, the file has the following fields

        * ``/temp_shape``: the dimensions of the template matrix, *N* x *Nt* x *2K*, where *N* is the number of electrodes, *Nt* the temporal width of the templates, and *K* the number of templates. Only the first *K* components are real templates
        * ``/temp_x``: the x values to reconstruct the sparse matrix
        * ``/temp_y``: the y values to reconstruct the sparse matrix
        * ``/temp_data``: the values to reconstruct the sparse matrix
        * ``/norms``: the *2K* norms of all templates
        * ``/limits``: the *K* limits [amin, amax] of the real templates
        * ``/maxoverlap``: a *K* x *K* matrix holding, for each pair of templates, the maximum value of their overlap across the temporal dimension
        * ``/maxlag``: a *K* x *K* matrix with the indices at which the ``maxoverlap`` values are reached. In a nutshell, for every pair of templates, these are the temporal shifts that maximize the cross-correlation between the two templates

    * ``mydata.overlap.hdf5`` An HDF5_ file used internally during the fitting procedure. This file can be quite large, so it is also saved using a sparse structure. To be more precise, the file has the following fields

        * ``/over_shape``: the dimensions of the overlap matrix, *2K* x *2K* x *2Nt - 1*, where *K* is the number of templates and *Nt* the temporal width of the templates
        * ``/over_x``: the x values to reconstruct the sparse matrix
        * ``/over_y``: the y values to reconstruct the sparse matrix
        * ``/over_data``: the values to reconstruct the sparse matrix
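
The fields above can be read back along the following lines. This is only a sketch under stated assumptions, not the code used internally: the function names are hypothetical, the per-electrode datasets are assumed to be named with a plain integer index (``data_0``, ``clusters_0``, ...), and ``temp_x`` / ``temp_y`` are assumed to be the row and column indices of a 2D sparse matrix of shape (*N* * *Nt*) x *2K*, with one flattened template per column.

.. code-block:: python

    import h5py
    import scipy.sparse


    def load_electrode_clusters(path, i):
        """Return (points after PCA, cluster labels, spike times) for electrode i."""
        with h5py.File(path, "r") as f:
            # ASSUMPTION: datasets are named with a plain integer index.
            return (f["data_%d" % i][...],
                    f["clusters_%d" % i][...],
                    f["times_%d" % i][...])


    def load_templates(path):
        """Rebuild the sparse template matrix and keep only the real templates."""
        with h5py.File(path, "r") as f:
            n_elec, n_t, n_total = (int(v) for v in f["temp_shape"][...].ravel())
            rows = f["temp_x"][...].ravel()
            cols = f["temp_y"][...].ravel()
            vals = f["temp_data"][...].ravel()
        # ASSUMPTION: x/y are row/column indices, and each column is one
        # template flattened to length N * Nt.
        full = scipy.sparse.csc_matrix((vals, (rows, cols)),
                                       shape=(n_elec * n_t, n_total))
        # Only the first K of the 2K columns are real templates; the
        # remaining K are their orthogonal projections.
        n_real = n_total // 2
        return full[:, :n_real]

Under the same layout assumption, a single template could then be viewed as an *N* x *Nt* array with ``templates[:, k].toarray().reshape(n_elec, n_t)``.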

Fitting
-------

At the end of this step, a single HDF5_ file, ``mydata.result.hdf5``, is produced. It contains several objects:

    * ``/spiketimes/temp_i`` for a template *i*, the times at which this particular template has been fitted.
    * ``/amplitudes/temp_i`` for a template *i*, the amplitudes used at the given spike times. Note that those amplitudes have two components, but only the first one is relevant. The second one is the amplitude used for the orthogonal template, and does not need to be analyzed.
    * ``/gspikes/elec_i`` if the ``collect_all`` mode was activated, then for electrode *i*, the times at which spikes peaking there have not been fitted.

.. note:: Spike times are saved in time steps, not in seconds.
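
To make the layout of the result file concrete, the sketch below reads the spike times and amplitudes of one template and converts the times to seconds. The function name and the ``sampling_rate`` argument (in Hz) are hypothetical. The sketch also assumes that one time step corresponds to one sample, that the datasets are named with a plain integer index (``temp_0``, ``temp_1``, ...), and that the two amplitude components are stored as two columns.

.. code-block:: python

    import h5py


    def load_template_spikes(path, i, sampling_rate):
        """Return (spike times in seconds, amplitudes) for template i."""
        with h5py.File(path, "r") as f:
            steps = f["spiketimes/temp_%d" % i][...].ravel()
            amps = f["amplitudes/temp_%d" % i][...]
        # ASSUMPTION: amplitudes are stored with two columns; only the first
        # one is relevant, the second belongs to the orthogonal template.
        amps = amps.reshape(-1, 2)[:, 0]
        # ASSUMPTION: one time step is one sample, so dividing by the
        # sampling rate (in Hz) gives seconds.
        return steps / float(sampling_rate), amps

For a recording sampled at 20 kHz (a placeholder value), a spike stored at time step 100 would be returned as 0.005 s.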


Converting
----------

At the end of this step, several numpy_ files are produced in the directory ``path/mydata.GUI``. They are all related to phy_, so see its dedicated documentation.
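
If you only want to inspect these arrays outside of phy_, a generic sketch such as the following loads every ``.npy`` file of the directory into a dictionary. The function name is hypothetical, and no particular file names are assumed, since they are defined by phy_.

.. code-block:: python

    import glob
    import os

    import numpy as np


    def load_gui_arrays(gui_dir):
        """Load every .npy file found in gui_dir, keyed by file name without extension."""
        arrays = {}
        for fname in sorted(glob.glob(os.path.join(gui_dir, "*.npy"))):
            key = os.path.splitext(os.path.basename(fname))[0]
            arrays[key] = np.load(fname)
        return arrays

Calling ``load_gui_arrays("path/mydata.GUI")`` then gives direct access to each array by name.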


.. _MATLAB: http://fr.mathworks.com/products/matlab/
.. _phy: https://github.com/cortex-lab/phy
.. _numpy: http://www.numpy.org/
.. _HDF5: https://www.hdfgroup.org/HDF5/