From CrystalFp Wiki

CrystalFpLib: Scatterplot

How to obtain a scatterplot with CrystalFp

First, the CrystalFp driver program (cfp for short) must be given the options that load the data files, select entries based on energy, and then compute fingerprints and distances.

To enable scatterplot computation, add the "--scatterplot" or "-sc" option.

Then set the output file with "--scatterplot-file=<your file>" or "-sf <your file>". The output file <your file> will be in CSV format (comma-separated values) with a header line. An example follows:

x,y,value
3.025685e-001,4.755944e-001,-2.358848e+004
-3.927093e-001,-4.379296e-001,-2.353936e+004
-5.891228e-001,4.140906e-001,-2.367398e+004
...
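
The output file can be read with any CSV reader. A minimal Python sketch follows; the function name read_scatterplot is only illustrative, and the sample rows are the ones shown above, held in memory instead of a real output file.

```python
import csv
import io

def read_scatterplot(stream):
    """Parse scatterplot CSV text into a list of (x, y, value) float tuples."""
    reader = csv.DictReader(stream)  # uses the "x,y,value" header line
    return [(float(r["x"]), float(r["y"]), float(r["value"])) for r in reader]

# Rows copied from the example above. A real run would instead open the
# file that was passed to --scatterplot-file.
sample = io.StringIO(
    "x,y,value\n"
    "3.025685e-001,4.755944e-001,-2.358848e+004\n"
    "-3.927093e-001,-4.379296e-001,-2.353936e+004\n"
    "-5.891228e-001,4.140906e-001,-2.367398e+004\n"
)
points = read_scatterplot(sample)
print(len(points), points[0])
```

The three-digit exponents written by cfp (for example e-001) are accepted by Python's float() without any preprocessing.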

To select which variable is used as the value, pass the "kind" scatterplot parameter as "--scatterplot-param kind <n>", where <n> takes one of the following values:

 0 The total energy associated with the structure
 1 The per-atom energy associated with the structure
 2 The stress computed by the multidimensional scaling algorithm
 3 The group to which the point belongs
 4 The step value
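
When cfp is launched from a script, the scatterplot options described so far can be assembled programmatically. The sketch below is only an illustration: the helper name scatterplot_args is hypothetical, and it assumes that "--scatterplot-param", the parameter name and the value are passed as three separate command-line tokens. The options that load data and compute distances are not listed on this page, so they are left out.

```python
def scatterplot_args(output_file, kind=0):
    """Return the scatterplot-related cfp options as an argument list.

    Hypothetical helper: only the option names come from this page.
    """
    if kind not in (0, 1, 2, 3, 4):
        raise ValueError("kind must be an integer between 0 and 4")
    return [
        "--scatterplot",
        f"--scatterplot-file={output_file}",
        # Assumed to be three separate tokens on the command line.
        "--scatterplot-param", "kind", str(kind),
    ]

# Example: use the per-atom energy (kind 1) as the value column.
args = scatterplot_args("scatter.csv", kind=1)
print(" ".join(["cfp"] + args))
```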

Other parameters that influence the scatterplot production can be passed by adding more "--scatterplot-param <param> <value>" lines. The defaults, listed below in parentheses, are usually fine.

 retry       Number of retries of the scatterplot relaxation after position perturbation (1)
 mass        Ball mass (10)
 stiffness   Spring stiffness (1)
 damping     Damping factor for the movement (0.7)
 perturb     Perturbation scale for the retries; it perturbs the initial positions of the masses (0.2)
 energy      Max kinetic energy to end iterations (1e-6)
 iterations  Max number of iterations (600)
 timestep    Timestep for the iterations (0.02)
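
The parameter names suggest a damped mass-spring relaxation, in which every pair of points is joined by a spring whose rest length is their fingerprint distance. The sketch below shows that general technique in plain Python so the role of each parameter is easier to see. It is not the CrystalFp implementation: the force law, the viscous form of the damping and the integration scheme are all assumptions, and the retry and perturb steps are omitted.

```python
import math

def relax(target, start, mass=10.0, stiffness=1.0, damping=0.7,
          energy=1e-6, iterations=600, timestep=0.02):
    """Move 2-D points until spring lengths approach the target distances."""
    n = len(start)
    pos = [list(p) for p in start]
    vel = [[0.0, 0.0] for _ in range(n)]
    for _ in range(iterations):
        force = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                dx = pos[j][0] - pos[i][0]
                dy = pos[j][1] - pos[i][1]
                dist = math.hypot(dx, dy)
                if dist == 0.0:
                    continue  # coincident points: direction undefined
                # Hooke's law: attract if too far apart, repel if too close.
                f = stiffness * (dist - target[i][j]) / dist
                force[i][0] += f * dx
                force[i][1] += f * dy
                force[j][0] -= f * dx
                force[j][1] -= f * dy
        kinetic = 0.0
        for i in range(n):
            for k in range(2):
                # Viscous drag opposing the velocity (assumed damping form).
                acc = (force[i][k] - damping * vel[i][k]) / mass
                vel[i][k] += acc * timestep
                pos[i][k] += vel[i][k] * timestep
                kinetic += 0.5 * mass * vel[i][k] ** 2
        if kinetic < energy:
            break  # the system has (almost) come to rest
    return pos

# Three points whose target distances are all 1 (an equilateral triangle).
target = [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
start = [(0.0, 0.0), (0.5, 0.0), (0.2, 0.3)]
# Values chosen so that this toy model settles; they are not the cfp defaults.
final = relax(target, start, mass=1.0, damping=4.0, iterations=5000)
errors = [abs(math.hypot(final[j][0] - final[i][0],
                         final[j][1] - final[i][1]) - target[i][j])
          for i in range(3) for j in range(i + 1, 3)]
print(max(errors))
```

In a model of this kind the timestep must stay small compared with the fastest oscillation of the system, and more springs acting on each ball make that oscillation faster.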

The only parameter that needs adjustment is timestep, which should be reduced if there are more than 1000 points. Unfortunately, there is no visual feedback to help judge the correct value.

To check whether the scatterplot correctly maps distances in fingerprint space to distances in scatterplot space, a diagnostic chart can be produced by adding the option "--diagnostic-file=<diagnostic file>". The file is again in CSV format, as below:

x,y,value
1.544209e-001,5.634693e-003,1.000000e+000
2.052964e-001,5.676321e-003,1.000000e+000
...

The chart shows projected distances vs. real distances, both normalized between 0 and 1. Chart details are set with the option "--scatterplot-param diagnostic <n>", where <n> is:

 0 Return all inter-point distances (point value is the distance from the projected == real line)
 1 Bin the inter-point distances (value is the number of points in the bin)
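
As an illustration of chart 0, the sketch below builds one row per pair of points from two lists of pairwise distances. Several details are assumptions, not taken from CrystalFp: normalization divides by the largest distance, the real distance goes on x and the projected one on y, and the value is the unsigned perpendicular distance from the line y = x. The function name diagnostic_rows is hypothetical.

```python
import math

def diagnostic_rows(real, projected):
    """Return (x, y, value) rows, one per pair of points.

    real and projected hold the distance of the same pair at the same index.
    """
    real_max = max(real)
    proj_max = max(projected)
    if real_max <= 0 or proj_max <= 0:
        raise ValueError("distances must not all be zero")
    rows = []
    for r, p in zip(real, projected):
        x = r / real_max   # normalized real (fingerprint) distance
        y = p / proj_max   # normalized projected (scatterplot) distance
        # Perpendicular distance of (x, y) from the line y = x.
        rows.append((x, y, abs(y - x) / math.sqrt(2)))
    return rows

# A projection that only rescales distances lies exactly on the diagonal.
perfect = diagnostic_rows([1.0, 2.0, 4.0], [0.5, 1.0, 2.0])
# A projection that makes both pairs equally distant does not.
distorted = diagnostic_rows([1.0, 2.0], [2.0, 2.0])
print(perfect[0], distorted[0])
```

Points far from the diagonal mark pairs whose separation in the scatterplot misrepresents their separation in fingerprint space.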

For chart number 1, the following additional parameters can be changed with "--scatterplot-param <param> <value>" lines. The defaults, listed below in parentheses, are usually fine.

 bins        Number of bins for the binned distances diagnostics (100)
 wobble      Wobble scale for the position of the binned points (0.001)
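
A rough sketch of how such a binned chart could be computed is given below. It assumes a square grid of bins over the unit square, reports each non-empty bin at its centre, and treats wobble as a small uniform random offset added to that centre. The sample diagnostic rows above are consistent with this reading, but it is an inference, not documented behaviour, and the function name bin_pairs is hypothetical.

```python
import random
from collections import Counter

def bin_pairs(pairs, bins=100, wobble=0.001, rng=None):
    """Count (x, y) pairs, both in [0, 1], per cell of a bins x bins grid."""
    rng = rng or random.Random()
    counts = Counter()
    for x, y in pairs:
        # min() keeps the value 1.0 inside the last bin.
        ix = min(int(x * bins), bins - 1)
        iy = min(int(y * bins), bins - 1)
        counts[(ix, iy)] += 1
    rows = []
    for (ix, iy), n in sorted(counts.items()):
        # Bin centre plus a random offset of at most +/- wobble (assumed).
        cx = (ix + 0.5) / bins + rng.uniform(-wobble, wobble)
        cy = (iy + 0.5) / bins + rng.uniform(-wobble, wobble)
        rows.append((cx, cy, n))
    return rows

# With wobble set to 0 the rows sit exactly on the bin centres.
rows = bin_pairs([(0.151, 0.002), (0.159, 0.009), (1.0, 1.0)], wobble=0.0)
print(rows)
```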

Retrieved from http://mariovalle.name/CrystalFp/index.php/CrystalFpLib/Scatterplot
Page last modified on July 05, 2011, at 06:47 AM