Friday this week was the deadline for delivering what was specified in the project plan:
- Graphical visualization of learning and decision making.
- Postmortem and presentation document delivered.
These statements don’t say much about what I was supposed to deliver, and they also miss one document: the project report. “Graphical visualization” in this case referred to the application, delivered either as a binary or as a video of it.
After week 8 I got much better results from the AI testing; it was clearly learning. I spent both Monday and Tuesday testing the AI and making sure the application wouldn’t crash, even though I had planned to work on the report and postmortem.
From Wednesday I worked only on the project report and postmortem until delivery day, which meant some late nights. On Friday, the Milestone 3 delivery day, everything was completed 1 hour before the presentation and delivered on SVN.
From my point of view the presentation went really well: I had more to talk about than at Milestone 2, and this time it was less technical.
In conclusion: even though it was a hectic week, I still had everything delivered on time.
At the start of the week I added functionality to speed up the testing and changed the input values to the neural network into fractions between 0.0 and 1.0. The behavior didn’t change with the smaller values.
I spent the rest of the week tweaking and testing.
To see the impact on implementation time of using reinforcement learning (self-learning), I implemented two other decision-making methods: one using only if statements and one using an already trained neural network. Getting both of these implementations to behave the way I wanted took:
- “If” statements: ~2 minutes.
- Pre-trained neural network: ~20 minutes.
On Thursday I came to the conclusion that the result of the reinforcement learning isn’t even close to the behavior I want. I do have the results I promised to deliver in the project plan, and I still have a few days left to continue testing.
Due to sickness, no progress was made this week.
This week has been all about AI logic; it has been a slow week with not much progress.
Testing of the AI agent shows two patterns:
- Stuck at repeated detection. The Q-learning values stop fluctuating and become stable; the agent seems to stop learning.
- Instantly changing nodes. The agent never exploits the nodes.
I started by looking at the connections inside the neural network, but the weights of the connections are incomprehensible.
I then moved on to the activation functions and noticed that I was using an activation function in FANN that can’t be used when training the neural network. I also looked into the activation cost of nodes. I found a post on Stack Overflow saying that, depending on the type of ANN, you have to normalize the input values. I will compare scaled and non-scaled inputs over a longer time period during the weekend.
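To make the scaled case concrete, here is a minimal sketch of input scaling in Python. It assumes plain min-max scaling into the range 0.0 to 1.0; the post does not say which method the project uses, and the function name `scale_to_unit` and the example values are hypothetical.

```python
def scale_to_unit(value, minimum, maximum):
    """Linearly map value from [minimum, maximum] to [0.0, 1.0].

    Values outside the range are clamped. This is an assumed scheme,
    not necessarily the one used in the project.
    """
    if maximum <= minimum:
        raise ValueError("maximum must be greater than minimum")
    fraction = (value - minimum) / (maximum - minimum)
    return min(1.0, max(0.0, fraction))


# Hypothetical example: a detection value of 25 on a node whose maximum is 100.
scaled_detection = scale_to_unit(25.0, 0.0, 100.0)
```

Clamping keeps an out-of-range reading from pushing the network input outside the interval it was trained on.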
The project plan from MS1 has been updated: the time plan is corrected, and the Unity engine has been replaced by my old graphics assignment. These decisions were made during Milestone 2 but weren’t written into the plan until now.
This also means I have been developing in Linux instead of Windows. There hasn’t been much trouble developing in Linux until last Friday. I had been using the NetBeans IDE, but on Friday it started showing errors where there were none. During the weekend I switched from NetBeans to using only CMake, make and a text editor.
I created a bash script to simplify the process from building the source to running the application. The script uses the bash ‘select’ construct to choose an action.
Figure: Script main menu
Figure: Script action menu
The virtual network is loaded from a text file; see the other post regarding network parsing. Each node is represented by a white hexagon with a white outline, and links are drawn as white lines.
When the AI agent enters a node, the white hexagon scales down and reveals a hidden red hexagon. When the white hexagon has scaled down to nothing, the AI is detected and moved to its spawn position.
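The relation between detection and hexagon size could be sketched roughly like this. This is an illustration only: the field names are borrowed from the columns of the network file (increase, decrease, detection), but their exact meaning, the per-tick update and the linear scaling are all my assumptions here, and `Node`, `tick` and `white_scale` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Node:
    detection_max: float  # assumed: detection level at which the agent is caught
    increase: float       # assumed: detection gained per tick while occupied
    decrease: float       # assumed: detection lost per tick while empty
    current: float = 0.0

    def tick(self, occupied: bool) -> bool:
        """Advance one step and return True when the agent is detected."""
        if occupied:
            self.current = min(self.detection_max, self.current + self.increase)
        else:
            self.current = max(0.0, self.current - self.decrease)
        return occupied and self.current >= self.detection_max

    def white_scale(self) -> float:
        """Scale factor of the white hexagon: 1.0 when idle, 0.0 when detected."""
        return 1.0 - self.current / self.detection_max
```

With this mapping the red hexagon underneath becomes fully visible at exactly the moment the agent is detected.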
The AI agent was at first a blue pyramid rotated upside down, but during the weekend it was changed to a blue sphere.
In the beginning, detection was represented by the white node changing color between red and white. This was changed because it was hard to tell when the AI was detected on the node.
Since I use my graphics assignment, I tried using its scene graph structure to render everything. Due to the structure of the AI and nodes, I noticed that changing it would take a lot more time than just letting each network entity implement its own render function.
Update: With some feedback from my classmates I updated the node visuals.
The parser reads a network from a text file. The file consists of 3 parts: number of nodes, node data and linking data.
Number of nodes
This value is located on the first row and gives the number of nodes in the network.
< N >
The node data starts on the second row and continues for N rows, one per node.
< x0 > < y0 > < z0 > < difficulty0 > < increase0 > < decrease0 > < detection0 >
< xN > < yN > < zN > < difficultyN > < increaseN > < decreaseN > < detectionN >
The links are located last in the file, with each row corresponding to a node index. A row starts with the number of links the node has, followed by the index of each linked node. If a node has no links, the row must start with the value 0.
< links0 > < linkIndex0 > < … > < linkIndexN >
< linksN > < linkIndex0 > < … > < linkIndexN >
-1.0 30.0 2.0 1.0 1.0 1.0 100.0
0.0 30.0 -10.0 1.0 1.0 1.0 100.0
-10.0 30.0 -5.0 1.0 1.0 1.0 100.0
-16.0 30.0 7.0 1.0 1.0 1.0 100.0
3 1 2 3
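A parser for this format could look roughly like the following Python sketch. It is not the project's parser: the names `Node` and `parse_network` are hypothetical, and the validation rules (exactly seven values per node row, link indices within range) are my own assumptions based on the format description above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    x: float
    y: float
    z: float
    difficulty: float
    increase: float
    decrease: float
    detection: float
    links: list = field(default_factory=list)


def parse_network(text):
    """Parse the three-part format: node count, node rows, link rows."""
    rows = [line.split() for line in text.splitlines() if line.strip()]
    count = int(rows[0][0])
    if len(rows) != 1 + 2 * count:
        raise ValueError("expected one node row and one link row per node")

    nodes = []
    for row in rows[1:1 + count]:
        if len(row) != 7:
            raise ValueError("a node row must have 7 values")
        nodes.append(Node(*(float(value) for value in row)))

    for node, row in zip(nodes, rows[1 + count:]):
        indices = [int(value) for value in row[1:]]
        if len(indices) != int(row[0]):
            raise ValueError("link count does not match the listed indices")
        if any(not 0 <= index < count for index in indices):
            raise ValueError("link index out of range")
        node.links = indices
    return nodes
```

Calling `parse_network(open(path).read())` returns a list where position `i` holds node `i`, so the link indices can be used directly as list indices.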
On Tuesday this week I started merging the prototype with my old assignment. The merge hasn’t caused many problems; it has just been time consuming, because a lot of methods have been created, renamed or redesigned. The neural network and states now live inside the merged project, and the prototype GUI window has been thrown away.
At the moment only the AI agent has a representation inside the environment: a red triangle moving between two hidden nodes. Next week I will start implementing functionality to load an entire virtual network from a text file, along with giving each node its visual representation.
Everything I promised in the project plan was delivered on time: I delivered my terminal output, which shows the decision making and learning. The presentation had a lot of formulas and floating-point values, so it felt like I was maybe too detailed for the listeners.
During the last week I decided that I’m not going to use Unity, but will instead use my previous graphics assignment for this project.
Time for the update on the prototype. Since the last blog post was added, I have been implementing the learning part of the AI agent. In my project plan, my deliverable for MS2 is terminal output showing that the AI agent chooses actions and learns.
From left to right, the columns are: generation, current detection, detection rate, detected value, reinforcement reward, exploit Qvalue, leave Qvalue, action taken, target Qvalue, output Qvalue and error.
The AI agent uses a technique called Q-learning, which iteratively estimates a quality value (Qvalue) for each action. The difference between the Qvalues of two consecutive updates gives an error value, and that error value is used to train the neural network.
target Qvalue = reinforcement + discount * max(exploit Qvalue, leave Qvalue)
output Qvalue = ((1-learning rate) * previous output Qvalue) + (learning rate * target Qvalue)
error = 0.5 * (target Qvalue – previous Qvalue)^2
The state values used as input are:
- current suspicious value
- detection rate
- detected value
The ANN has 2 hidden layers, each consisting of 4 nodes.
The ANN is used in the function that calculates the Qvalues for the actions:
Qvalue = Qfunction(current, rate, max, action)
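As a worked illustration, the three formulas above can be written as one small Python function. This is a sketch under assumptions: the name `q_update` and the default `discount` and `learning_rate` values are hypothetical, and I read "previous Qvalue" in the error formula as the previous output Qvalue. The exploit and leave Qvalues would come from the Qfunction (the ANN), which is not modelled here.

```python
def q_update(reinforcement, exploit_q, leave_q, previous_output_q,
             discount=0.9, learning_rate=0.5):
    """Return (target Qvalue, output Qvalue, error) for one update step.

    discount and learning_rate defaults are placeholder values,
    not the ones used in the project.
    """
    # target Qvalue = reinforcement + discount * max(exploit Qvalue, leave Qvalue)
    target_q = reinforcement + discount * max(exploit_q, leave_q)
    # output Qvalue blends the previous output with the new target
    output_q = ((1.0 - learning_rate) * previous_output_q
                + learning_rate * target_q)
    # error = 0.5 * (target Qvalue - previous Qvalue)^2
    error = 0.5 * (target_q - previous_output_q) ** 2
    return target_q, output_q, error


# Hypothetical numbers: reward 1.0, exploit looks better than leave.
target_q, output_q, error = q_update(1.0, 2.0, 1.0, 1.0,
                                     discount=0.5, learning_rate=0.5)
```

With these numbers the target Qvalue is 2.0, the output Qvalue is 1.5 and the error is 0.5.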
update: Added missing content to this post.