Test Script #4 Simulated Annealing with large Synthetic Data
Name: SECTORS_TEST_ANNEALING_001.M
Authors: Bongani Mbambiso, Tim Gebbie
This test script is part of the "Sector and State" toolbox SECTOR.ZIP. The data set considered has correlated Assets with four clusters. The test script is developed to show that the algorithm picks up those clusters.
References:
1. J. D. Noh, Phys. Rev. E 61, 5981 (2000) 1. L. Giada & M. Marsili (2005) ... 2. Marsili (2002) ...
See Also: ANNEALING, CHANGE, SWEEP, MERGE, SPLIT, GROUND, RECALCULATE
Contents
Clear the workspace
clear all; clc;
The algorithm used to control the entire algorithm
help annealing
ANNEALING Find the maximum likelihood cluster configuration [A] = ANNEALING(VARARGIN) Find the Maximum likelihood configuration, using the Metropolis Scheme or Simulated annealing method and the J.D. Noh's Model. The algorithm finds the ground state,which is the best configuration at the end of the algorithm simulation. [A] = ANNEALING(DATA,IB,FB,T_STEPS,N_CYCLES,CF,CORR_M ) Uses the data to compute the correlation matrix, which is used to find the maximum likelihood configuration of the data. CORR_M specifies the method used to compute correlation matrix. It takes values 1, 2, and 3, which represent 'Pearson', 'Kendal' and 'Spearman', respectively. All the other variables are annealing variables and they are described below. [A] = ANNEALING(C,IB,FB,T_STEPS,N_CYCLES,CF) Takes the correlation matrix, and use it to find the maximum likelihood configuration of the data. Other variables are Annealing input varaibles (see below). A is the configuration data structure and has structure described below. A(i)=j gives that the i-th object is in the j-th cluster. The configuration index I, where I(i,j)=k gives k object as the j-th element of the i-th cluster. E(i) is the likelihood per element of the i-th cluster. C is the similarity the correlation matrix. Here it can be taken as the Pearson matrix or Kendal's Coefficient. This is taken to be a square matrix of the size of the number of objects. ANNEALING VARIABLES: IB - initial beta FB - final beta T_STEPS - Steps of Temperature changes CF - factor for change in temperature DATA STRUTURE OF THE CONFIGURATION [A] Status of the system during the annealing process: These variables keep track of the current solution of the maximum likelihood problem. A.N - Original data size A.a - the current solution configuration A.nc - number of (non-empty) clusters in the current configuration A.nec - Index array of Non-empty clusters A.b - current inverse temperature (beta). A.c - internal correlations A.C - Correlation matrix A.e - energy (minus log of likelihood), which is being minimized. A.n - number of elements per cluster A.I - configuration index A.t - time, counts the number of steps. A.cf - cooling factor for delta_temperature A.updates - no. of spin updates per sweep A.merges - number of merges per sweep A.splits - number of splits per sweep A.t_steps - number of sweeps between changes in temperature A.cycle - number of annealing/ temperature clycles A.n_cycles- maximum number of annealing/ temperature clycles Ground state parameters: These paramters represent the final solution, the best configuration that gives the best maximum likelihood. A.gs.a - the best configuration A.gs.nc - number of clusters in the final solution configuration A.gs.nec - Index array of Non-empty clusters A.gs.b - Ground state energy (per spin). A.gs.c - internal correlations A.gs.C - Correlation matrix A.gs.e - energy (minus log of likelihood),which is being minimized. A.gs.I - configuration index A.gs.n - number of ground state elements per cluster cluster A.gs.t - time, counts the number of steps. A.gs.updates - no. of spin updates, per spin, per sweep, or the no. of times a better configuration was found in the same temperature or cooling schedule A.gs.merges - number of mergings, per sweep A.gs.splits - number of splitting, per sweep A.gs.cycle - number of annealing / temperature clycles See Also : PEARSON, KENDALL, SPEARMAN, LIKELIHOOD, AVERAGING CHANGE, ENERGY. References: 1) J. D. Noh (2000) 2) Marsili M. (2002) 3) L. Giada & Marsili M (2005) 4) Metropolis, N., Rosenbluth. A. W. (1953)
Rearrange the configuration several times for each cooling schedule
help change
CHANGE Find a new configuration. [A] = CHANGE(A) Change the configuration A by cycling through the current cluster configuration, with a constant temperature, find a new configuration A, with a better clustering solution. This function rearranges a lattice configuration A, and finds a configuration which increased energy, at a constant temperation or cooling schedule. Configuration membership index, INDEX, initial internal energies, E, at a given inverse temperature BETA. NR is the number of configuration changes. At time T. First a random lattice configuration change is attempted by swapping two elements between different clusters via a lattice sweep. A random cluster merge is attempted. A random cluster split is attempted. See Also: CFSWEEP, CFMERGE, CFSPLIT
Input Data and Variables
Initial beta
ib = 1; % Final beta fb = 5; % Number of sweeps between changes in temperature t_steps = 15; % temperature changes every t_steps % Max. number of annealing cycles n_cycles= 1; % Cooling Factor for delta_temperature cf = 0.997;
Test 4.1: Correlated data with Four Clusters
C41 = [[ 1, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0.45]; ... [0.4, 1, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0]; ... [0.4, 0.4, 1, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0]; ... [0.4, 0.4, 0.4, 1, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0]; ... [0.4, 0.4, 0.4, 0.4, 1, 0.4, 0.4, 0.4, 0.4, 0.4, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0]; ... [0.4, 0.4, 0.4, 0.4, 0.4, 1, 0.4, 0.4, 0.4, 0.4, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0]; ... [0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 1, 0.4, 0.4, 0.4, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0]; ... [0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 1, 0.4, 0.4, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0]; ... [0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 1, 0.4, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0,-.3,0,0,0,0,0,0,0,0]; ... [0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 1, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, -.4,0,0,0,0,0,0,0,0,0]; ... [0,0,0,0,0,0,0,0,0,0, 1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, -0.1,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0]; ... [0,0,0,0,0,0,0,0,0,0, 0.5, 1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, -0.2,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0]; ... [0,0,0,0,0,0,0,0,0,0, 0.5, 0.5, 1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, -0.3,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0]; ... [0,0,0,0,0,0,0,0,0,0, 0.5, 0.5, 0.5, 1, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, -0.4,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0]; ... [0,0,0,0,0,0,0,0,0,0, 0.5, 0.5, 0.5, 0.5, 1, 0.5, 0.5, 0.5, 0.5, 0.5, -0.6,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0]; ... [0,0,0,0,0,0,0,0,0,0, 0.5, 0.5, 0.5, 0.5, 0.5, 1, 0.5, 0.5, 0.5, 0.5, -0.5,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0]; ... [0,0,0,0,0,0,0,0,0,0, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 1, 0.5, 0.5, 0.5, -0.4,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0]; ... [0,0,0,0,0,0,0,0,0,0, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 1, 0.5, 0.5, -0.3,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0]; ... [0,0,0,0,0,0,0,0,0,0, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 1, 0.5, -0.2,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0]; ... [0,0,0,0,0,0,0,0,0,0, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 1, -0.1,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0]; ... [0,0,0,0,0,0,0,0,0,0, -.1,-.2,-.3,-.4,-.6,-.5,-.4,-.3,-.2,-0.1, 1, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0,0,0,0,0,0,0,0,0,0]; ... [0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0.3, 1, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0,0,0,0,0,0,0,0,0,0]; ... [0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0.3, 0.3, 1, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0,0,0,0,0,0,0,0,0,0]; ... [0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0.3, 0.3, 0.3, 1, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0,0,0,0,0,0,0,0,0,0]; ... [0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0.3, 0.3, 0.3, 0.3, 1, 0.3, 0.3, 0.3, 0.3, 0.3, 0,0,0,0,0,0,0,0,0,0]; ... [0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0.3, 0.3, 0.3, 0.3, 0.3, 1, 0.3, 0.3, 0.3, 0.3, 0,0,0,0,0,0,0,0,0,0]; ... [0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 1, 0.3, 0.3, 0.3, 0,0,0,0,0,0,0,0,0,0]; ... [0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 1, 0.3, 0.3, 0,0,0,0,0,0,0,0,0,0]; ... [0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 1, 0.3, 0,0,0,0,0,0,0,0,0,0]; ... [0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 1, 0,0,0,0,0,0,0,0,0,0]; ... [0,0,0,0,0,0,0,0,0,-0.4, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 1, 0.1, 0.1, 0.1,-0.1,-0.2, 0.1, 0.3, 0.4, 0.5]; ... [0,0,0,0,0,0,0,0,-.3,0, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0.1, 1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.4]; ... [0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0.1, 0.1, 1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.3]; ... [0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0.1, 0.1, 0.1, 1, 0.1, 0.1, 0.1, 0.1, 0.1,-0.1]; ... [0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0.1, 0.1, 0.1, 0.1, 1, 0.1, 0.1, 0.1, 0.1,-0.2]; ... [0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, -0.2, 0.1, 0.1, 0.1, 0.1, 1, 0.1, 0.1, 0.1, 0.1]; ... [0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, -0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 1, 0.1, 0.1, 0.1]; ... [0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0.3, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 1, 0.1, 0.1]; ... [0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0.4, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 1, 0.1]; ... [0.45,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0, 0.5, 0.4, 0.3,-0.1,-0.2, 0.1, 0.1, 0.1, 0.1, 1]]; % Generating Correlated Data data41 = mvnrnd(zeros(40,1),C41,100); % Test 4.1.1: Use ANNEALING to generate clusters from the correlations [A] = annealing(C41,ib,fb,t_steps,n_cycles,cf); % Membership Matrix: Rows-Clusters, Columns-Objects A.gs.I
Error using ==> mvnrnd at 118 SIGMA must be a symmetric positive semi-definite matrix. Error in ==> sectors_test_annealing_002 at 91 data41 = mvnrnd(zeros(40,1),C41,100);