While implementing the complete correlation matrix memory, I had to write the following scaling formula for the matrix's values, where $\lVert x \rVert$ denotes the Euclidean norm of a vector $x$:
$$c = \lVert x \rVert^{-2}$$
After running the tests, I noticed a scaling error when certain keys were used. Keys whose elements consisted of 0 or 2 worked correctly. Keys that contained values other than 0 and 2 could not be used to store values in the matrix memories correctly: given such a key, the recalled vector was scaled by an unknown scalar factor. In one such case, the value recalled for the key differed from the stored value by a factor of 4.
During my first attempt at fixing this error, on the next weekend, I had little time. I skimmed through the code and could not find any error. Then I looked at the creation of the memory matrix and at the recall in greater detail, without taking a second look at the scaling factor. I thought that the scaling was simpler to implement than the rest. Therefore, I concluded that an error was more likely to be in the rest of the code than in the code defining the scaling factor. The following weekend I checked the calculated scaling factors and noticed that the numbers were off. The faulty scaling factor was directly proportional to the length of the vectors. So I looked at the scaling factor implementation:
    def c(self, vector):
        rBase = 0.
        for i in vector:
            rBase += i ** 2
        return (rBase ** 1/2) ** -2
First, the dot product of the given vector with itself is calculated. The result is then raised to the power of 1, divided by 2, and finally raised to the power of -2. For a vector $x$, the left-hand side below shows the desired formula and the right-hand side shows the implemented one:
$$\left(\sqrt{x^\top x}\right)^{-2} \neq \left(\frac{(x^\top x)^{1}}{2}\right)^{-2}$$
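The precedence issue can be checked in isolation. The following is a minimal sketch, assuming Python 3 (where `1/2` evaluates to `0.5`); the function names `scaling_faulty`, `scaling_intended`, and `error_factor` are made up for this illustration and are not part of the original code.

```python
def scaling_faulty(vector):
    # Mirrors the buggy expression: ** binds tighter than /,
    # so this computes ((r ** 1) / 2) ** -2.
    r = sum(i ** 2 for i in vector)
    return (r ** 1/2) ** -2


def scaling_intended(vector):
    # Parenthesised exponent: (r ** 0.5) ** -2.
    r = sum(i ** 2 for i in vector)
    return (r ** (1/2)) ** -2


def error_factor(vector):
    # Ratio of faulty to intended scaling; algebraically 4 / r,
    # where r is the squared Euclidean norm of the vector.
    return scaling_faulty(vector) / scaling_intended(vector)


for key in ([2.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0]):
    print(key, error_factor(key))
```

Under these assumptions, a key whose squared norm happens to be 4, such as `[2.0, 0.0, 0.0]`, gives an error factor of 1 and therefore hides the bug, while a unit-norm key such as `[1.0, 0.0, 0.0]` gives an error factor of 4.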
Two factors encouraged the faulty implementation. First of all, I used white space to group sections of the formula. While this is a visually appealing tactic, it does not work as intended, because the presence or absence of white space does not influence the precedence of operators. While the white space suggests that rBase is raised to the power of 1/2, this is not actually the case, as the power operator has a higher precedence than the division operator. The correct way to implement the formula exactly as stated in Kohonen's paper is:
    def c(self, vector):
        rBase = 0.
        for i in vector:
            rBase += i ** 2
        return (rBase ** (1/2)) ** -2
Although this is the correct way to implement the scaling factor, it is not a good implementation. It makes the computation more complicated than it needs to be and is therefore not very efficient. The formula can be simplified in order to reduce the risk of implementing it incorrectly:
$$\left(\sqrt{x^\top x}\right)^{-2} = \frac{1}{x^\top x}$$
If this is done the implementation becomes a bit easier:
    def c(self, vector):
        rBase = 0.
        for i in vector:
            rBase += i ** 2
        return 1. / rBase
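To see why this factor matters for recall, here is a rough sketch of a memory holding a single key/value pair. It assumes the memory is built as the scaled outer product of value and key and recalled with a matrix-vector product; `scaling`, `store`, and `recall` are hypothetical names for this illustration, not the original implementation, and the key must be non-zero.

```python
def scaling(vector):
    # Simplified scaling factor: 1 / (vector . vector).
    return 1.0 / sum(i ** 2 for i in vector)


def store(key, value):
    # Assumed single-pair memory: M = c(key) * value * key^T,
    # represented as a list of rows.
    c = scaling(key)
    return [[c * v * k for k in key] for v in value]


def recall(memory, key):
    # Matrix-vector product M * key.
    return [sum(m * k for m, k in zip(row, key)) for row in memory]


key = [1.0, 3.0, 0.0]
value = [5.0, -2.0]
print(recall(store(key, value), key))
```

With the correct factor, recalling with the stored key gives back the stored value up to floating-point rounding, because the key's dot product with itself cancels against the scaling. A test of this kind on a key with elements other than 0 and 2 would have exposed the faulty factor directly.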
So why did I overlook the error in the scaling factor on the first examination? While I was skimming through the code, I was searching for the algorithm described by Kohonen. I did not read the source code, determine the implemented algorithm, and compare the result with the algorithm in Kohonen's paper. I took this shortcut to save a few seconds and to complete the task with minimal mental effort. This tactic ultimately did not work as intended and even wasted resources (time is money and money is time).