Updated readme to use it effectively as a lab journal

2024-06-18 14:09:31 +01:00
parent b113c21862
commit cb74559222
7 changed files with 81 additions and 2 deletions

@@ -6,13 +6,32 @@ This is a testing sandbox for developing various methods of injecting symbolic k
## Experiment 1 - Semantically weighted loss
### Planning
- Simple semantic loss based on intuitively noticeable properties of numeric characters
- Makes use of manually made reward matrices to weight rewards depending on how "close" the model was
- Very rudimentary example of semantic loss
- Appeared to work perfectly
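
A minimal sketch of what such a reward-matrix-weighted cross entropy could look like in PyTorch is below. The 10×10 matrix here is a made-up distance kernel standing in for the manually made reward matrices, and the function name is purely illustrative:

```python
import torch
import torch.nn.functional as F

# Hypothetical stand-in for a manually made reward matrix: reward[i, j] scores
# how "close" predicted digit j is to true digit i (1.0 on the diagonal).
digits = torch.arange(10, dtype=torch.float32)
reward_matrix = torch.exp(-torch.abs(digits[:, None] - digits[None, :]) / 3.0)

def similarity_weighted_ce(logits, targets, reward=reward_matrix):
    """Cross entropy discounted by the expected reward of the prediction,
    so near-miss digits are penalised less than wildly wrong ones."""
    log_probs = F.log_softmax(logits, dim=-1)             # (batch, 10)
    probs = log_probs.exp()
    expected_reward = (probs * reward[targets]).sum(-1)   # (batch,)
    ce = F.nll_loss(log_probs, targets, reduction="none")
    return (ce * (1.0 - expected_reward)).mean()

# Usage with random logits and digit labels.
loss = similarity_weighted_ce(torch.randn(32, 10), torch.randint(0, 10, (32,)))
```
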
### Results
- Training loss
![Training loss plot for experiment 1](./results/Experiment1/train_loss.png)
- Validation loss
![Validation loss plot for experiment 1](./results/Experiment1/val_loss.png)
- Test loss
![Test loss plot for experiment 1](./results/Experiment1/test_loss.png)
### Conclusions
- Seems to have worked
- Clear improvement in training rate with semantics added
- The similarity cross entropy in particular shows clear signs in the validation loss of following a similar complementary-CDF-shaped curve to the normal cross-entropy loss, but descending faster
- Interestingly, the "garbage" cross entropy also seems to have produced a very good result! This is likely because it hasn't been normalized to 1, so it may simply be amplifying the gradient by random amounts at all times, essentially acting as a fuzzy gradient booster
- I would consider this experiment a success, with some interesting open questions remaining worth further examination
## Experiment 2 - Dataset qualitative characteristic derived semantic loss functions
### Planning
- Makes use of known physics equations that partially describe the problem to guide the model
- Reduces the need for the model to learn known physics, allowing it to focus on learning the unknown physics
- Should accelerate training
@@ -25,3 +44,63 @@ This is a testing sandbox for developing various methods of injecting symbolic k
- [Molecular Properties](https://www.kaggle.com/datasets/burakhmmtgl/predict-molecular-properties)
- [Nuclear Binding Energy](https://www.kaggle.com/datasets/iitm21f1003401/nuclear-binding-energy)
- [Body Fat Prediction](https://www.kaggle.com/datasets/fedesoriano/body-fat-prediction-dataset)
- Decided to use the Molecular Properties dataset as it is quite familiar to me
- Training with semantics added to the relationship between molecular energy and differential electronegativity
- Semantics being injected are:
- These values should be positively correlated
- These values should be weighted towards a high r^2 with an adaptive penalty
- Multiple attempts carried out (a PyTorch sketch of two of these variants is given after this list):
- Simple penalties. Variations tested include:
```math
\mathrm{Loss} = \left( \mathrm{Softplus}(-m) + 1 \right) \cdot \mathrm{SmoothL1Loss}
```
```math
\mathrm{Loss} = \left( \mathrm{ReLU}(-m) + 1 \right) \cdot \mathrm{SmoothL1Loss}
```
```math
\mathrm{Loss} = \left( \frac{1}{\mathrm{sech}(|r|)} + 1 \right) \cdot \mathrm{SmoothL1Loss}
```
```math
\mathrm{Loss} = \left( r^2 + 1 \right) \cdot \mathrm{SmoothL1Loss}
```
- Adaptive, self-training penalties tuned by various methods. The best method found was optimisation by a random forest regressor. These tunable variants include:
```math
\mathrm{Loss} = \left( \mathrm{Softplus}(-\alpha m) + 1 \right) \cdot \mathrm{SmoothL1Loss}
```
```math
\mathrm{Loss} = \left( \mathrm{ReLU}(-\alpha m) + 1 \right) \cdot \mathrm{SmoothL1Loss}
```
```math
\mathrm{Loss} = \left( \frac{1}{\mathrm{sech}(\alpha |r|)} + 1 \right) \cdot \mathrm{SmoothL1Loss}
```
```math
\mathrm{Loss} = \left( \alpha r^2 + 1 \right) \cdot \mathrm{SmoothL1Loss}
```
- The final adaptive semantic loss function tested was the following:
```math
\mathrm{Loss} = \left( \alpha r^2 + 1 \right) \cdot \left( \frac{1}{\beta} \log\left( 1 + e^{\beta \gamma (-m)} \right) + 1 \right) \cdot \mathrm{SmoothL1Loss}
```
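
A rough PyTorch sketch of the simple Softplus variant and the final adaptive variant is given below, assuming m is the in-batch least-squares slope and r the Pearson correlation between the predicted molecular energies and the differential electronegativity feature. The helper names and the default values of α, β, γ are placeholders for illustration, not the actual implementation:

```python
import torch
import torch.nn.functional as F

def slope_and_corr(x, y, eps=1e-12):
    """In-batch least-squares slope m and Pearson correlation r of y against x."""
    x_c, y_c = x - x.mean(), y - y.mean()
    cov = (x_c * y_c).sum()
    m = cov / (x_c * x_c).sum().clamp_min(eps)
    r = cov / (x_c.norm() * y_c.norm()).clamp_min(eps)
    return m, r

def softplus_penalty_loss(pred_energy, true_energy, delta_en):
    """Simple variant: Loss = (Softplus(-m) + 1) * SmoothL1Loss.
    Penalises a negative in-batch slope between predicted energy and
    differential electronegativity, i.e. pushes towards positive correlation."""
    m, _ = slope_and_corr(delta_en, pred_energy)
    return (F.softplus(-m) + 1.0) * F.smooth_l1_loss(pred_energy, true_energy)

def combined_adaptive_loss(pred_energy, true_energy, delta_en,
                           alpha=1.0, beta=1.0, gamma=1.0):
    """Final variant: (alpha*r^2 + 1) * ((1/beta)*log(1 + exp(beta*gamma*(-m))) + 1) * SmoothL1Loss.
    alpha, beta and gamma are the externally tuned penalty parameters."""
    m, r = slope_and_corr(delta_en, pred_energy)
    slope_term = torch.log1p(torch.exp(beta * gamma * (-m))) / beta + 1.0
    return (alpha * r ** 2 + 1.0) * slope_term * F.smooth_l1_loss(pred_energy, true_energy)
```

In the runs above, α, β and γ were tuned externally rather than left at fixed defaults, with the random forest regressor giving the best results.
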
### Results
- Training loss
![Training loss plot for experiment 2](./results/Experiment2/train_loss.png)
- Validation loss
![Validation loss plot for experiment 2](./results/Experiment2/val_loss.png)
- Test loss
![Test loss plot for experiment 2](./results/Experiment2/test_loss.png)
### Conclusions
- The method did not appear to work well, because:
- The simple loss functions tested were likely suboptimal for effectively influencing the model
- The guessed parameters in the simple functions need to be optimised, which essentially turns this into a hyperparameter optimisation problem and defeats the purpose of semantic loss
- The adaptive, ML-based loss functions do not appear to converge quickly enough to train the model faster than the normal loss functions
- For these reasons, I would consider this experiment a failure
## Experiment 3 - Physics informed semantic loss functions
### Planning
- Attempt to use a more mathematically rigorous, formalised, and literature-based approach to semantic loss functions
- Although understudied, semantic loss has a substantial body of theoretical maths exploring the concept; the challenge is figuring out how to put this into code (see the sketch below)
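
As one possible starting point from that literature, the semantic loss of Xu et al. (ICML 2018) scores a propositional constraint by the negative log of its weighted model count under the network's output probabilities. Below is a minimal sketch for the "exactly one output is true" constraint, purely to illustrate how such a formula becomes code; nothing here is implemented in this repo yet:

```python
import torch

def exactly_one_semantic_loss(probs, eps=1e-12):
    """Semantic loss for the constraint 'exactly one of n outputs is true':
    -log( sum_i p_i * prod_{j != i} (1 - p_j) ), averaged over the batch."""
    # probs: (batch, n) independent Bernoulli probabilities in (0, 1).
    log_not = torch.log(1.0 - probs + eps)                # log(1 - p_j)
    log_all_not = log_not.sum(dim=-1, keepdim=True)       # sum_j log(1 - p_j)
    # log( p_i * prod_{j != i} (1 - p_j) ) for each i, kept in log space for stability.
    log_terms = torch.log(probs + eps) + log_all_not - log_not
    log_wmc = torch.logsumexp(log_terms, dim=-1)          # log weighted model count
    return -log_wmc.mean()

# Usage: sigmoid outputs for a batch of 4 examples with 10 candidate labels.
loss = exactly_one_semantic_loss(torch.sigmoid(torch.randn(4, 10)))
```
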

Six binary image files added (not shown): the training, validation and test loss plots for Experiments 1 and 2 referenced above.