mirror of
https://github.com/Cian-H/symbolic_nn_tests.git
synced 2025-12-22 14:11:59 +00:00
Updated readme to use it effectively as a lab journal
README.md
@@ -6,13 +6,32 @@ This is a testing sandbox for developing various methods of injecting symbolic k
## Experiment 1 - Semantically weighted loss

### Planning

- Simple semantic loss based on intuitively noticeable properties of numeric characters
- Makes use of manually made reward matrices to weight the loss depending on how "close" the model's prediction was (a minimal sketch of one way to do this follows this list)
- Very rudimentary example of semantic loss
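
The reward-matrix weighting is the core mechanism here, so a minimal PyTorch sketch of one way it could look is given below. The 10x10 matrix, its similarity values, and the `semantically_weighted_loss` helper are illustrative assumptions for a digit-classification setup, not the code actually used in the experiment.

```python
import torch
import torch.nn.functional as F

# reward[i][j]: hand-set "closeness" of predicting digit j when the true digit is i.
# 1.0 on the diagonal; the off-diagonal values are illustrative guesses at visual similarity.
reward = torch.zeros(10, 10)
reward.fill_diagonal_(1.0)
reward[1, 7] = reward[7, 1] = 0.5
reward[3, 8] = reward[8, 3] = 0.5
reward[6, 9] = reward[9, 6] = 0.4


def semantically_weighted_loss(logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Cross entropy scaled per sample by how 'close' the model's top guess was."""
    per_sample = F.cross_entropy(logits, target, reduction="none")
    predicted = logits.argmax(dim=1)
    # Weight is 1.0 for a correct top guess, rising towards 2.0 for a distant mistake.
    weight = 2.0 - reward.to(logits.device)[target, predicted]
    return (weight * per_sample).mean()
```

Scaling the per-sample cross entropy keeps the loss differentiable, since the argmax only affects the weighting factor, not the term the gradient flows through.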

### Results

- Training loss

![](results/Experiment1/train_loss.png)

- Validation loss

![](results/Experiment1/val_loss.png)

- Test loss

![](results/Experiment1/test_loss.png)

### Conclusions

- Seems to have worked
- Clear improvement in training rate with semantics added
- The similarity cross-entropy in particular shows clear signs in the validation loss of following a complementary CDF similar to that of the normal cross-entropy loss, but training faster
- Interestingly, the "garbage" cross-entropy also seems to have produced a very good result! This is likely because it has not been normalized to 1, so it may simply be amplifying the gradient by a random amount at all times, essentially acting as a fuzzy gradient booster.
- I would consider this experiment a success, with some interesting open questions remaining that are worth further examination

## Experiment 2 - Semantic loss functions derived from qualitative dataset characteristics

### Planning

- Makes use of known physics equations that partially describe the problem to guide the model
- Reduces the need for the model to learn known physics, allowing it to focus on learning the unknown physics
- Should accelerate training
@@ -25,3 +44,63 @@ This is a testing sandbox for developing various methods of injecting symbolic k
- [Molecular Properties](https://www.kaggle.com/datasets/burakhmmtgl/predict-molecular-properties)
- [Nuclear Binding Energy](https://www.kaggle.com/datasets/iitm21f1003401/nuclear-binding-energy)
- [Body Fat Prediction](https://www.kaggle.com/datasets/fedesoriano/body-fat-prediction-dataset)

- Decided to use the Molecular Properties dataset as it is quite familiar to me
- Training with semantics added to the relationship between molecular energy and differential electronegativity
- Semantics being injected are:
  - These values should be positively correlated
  - These values should be weighted towards a high r^2 with an adaptive penalty
- Multiple attempts carried out:
- Simple penalties (a minimal sketch of these follows the equations below). Variations tested include:

```math
Loss = ( Softplus( -m ) + 1 ) * Smooth_L1_Loss
```

```math
Loss = ( Relu( -m ) + 1 ) * Smooth_L1_Loss
```

```math
Loss = ( \frac{1}{Sech(|r|)} + 1 ) * Smooth_L1_Loss
```

```math
Loss = ( {r}^2 + 1 ) * Smooth_L1_Loss
```
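
A minimal PyTorch sketch of these four variants is below, assuming m is the least-squares slope and r the Pearson correlation between the model's predictions and the feature they should track (e.g. differential electronegativity); the function names and signatures are placeholders rather than the repo's actual API.

```python
import torch
import torch.nn.functional as F


def slope_and_correlation(pred: torch.Tensor, feature: torch.Tensor):
    """Least-squares slope m and Pearson correlation r of pred against feature (both 1D)."""
    pred_c = pred - pred.mean()
    feat_c = feature - feature.mean()
    m = (pred_c * feat_c).sum() / (feat_c.pow(2).sum() + 1e-12)
    r = (pred_c * feat_c).sum() / (pred_c.norm() * feat_c.norm() + 1e-12)
    return m, r


def softplus_penalty_loss(pred, target, feature):
    # Loss = (Softplus(-m) + 1) * SmoothL1: penalises a negative slope smoothly.
    m, _ = slope_and_correlation(pred, feature)
    return (F.softplus(-m) + 1) * F.smooth_l1_loss(pred, target)


def relu_penalty_loss(pred, target, feature):
    # Loss = (ReLU(-m) + 1) * SmoothL1: same idea, but zero extra penalty once m >= 0.
    m, _ = slope_and_correlation(pred, feature)
    return (F.relu(-m) + 1) * F.smooth_l1_loss(pred, target)


def sech_penalty_loss(pred, target, feature):
    # Loss = (1 / sech(|r|) + 1) * SmoothL1, i.e. (cosh(|r|) + 1) * SmoothL1.
    _, r = slope_and_correlation(pred, feature)
    return (torch.cosh(r.abs()) + 1) * F.smooth_l1_loss(pred, target)


def r_squared_penalty_loss(pred, target, feature):
    # Loss = (r^2 + 1) * SmoothL1.
    _, r = slope_and_correlation(pred, feature)
    return (r.pow(2) + 1) * F.smooth_l1_loss(pred, target)
```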

- Adaptive, self-training penalties, tuned by various methods. The best method found was optimisation by a random forest regressor (a rough sketch of such a tuning loop follows the equations below). These tunable variants include:

```math
Loss = ( Softplus( \alpha * -m ) + 1 ) * Smooth_L1_Loss
```

```math
Loss = ( Relu( \alpha * -m ) + 1 ) * Smooth_L1_Loss
```

```math
Loss = ( \frac{ 1 }{ Sech( \alpha * |r| ) } + 1 ) * Smooth_L1_Loss
```

```math
Loss = ( \alpha * { r }^2 + 1 ) * Smooth_L1_Loss
```
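
"Optimisation by a random forest regressor" can be read as a surrogate-model tuning loop: log how the loss behaves under each α tried, fit a forest to that history, and move to the α the forest predicts to be best. The sketch below is a hedged guess at such a loop; the candidate grid, exploration rate, and refit cadence are all assumptions rather than the tuner actually used.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


class RandomForestAlphaTuner:
    def __init__(self, candidates=np.linspace(0.0, 10.0, 101), explore=0.1, seed=0):
        self.candidates = candidates
        self.explore = explore           # probability of trying a random alpha instead
        self.rng = np.random.default_rng(seed)
        self.history_alpha = []          # alphas tried so far
        self.history_loss = []           # mean loss observed while each alpha was active
        self.alpha = float(candidates[0])

    def record(self, mean_loss: float) -> None:
        """Log the loss observed under the currently active alpha."""
        self.history_alpha.append(self.alpha)
        self.history_loss.append(mean_loss)

    def step(self) -> float:
        """Refit the surrogate on the history and pick the next alpha to use."""
        if len(self.history_alpha) < 5 or self.rng.random() < self.explore:
            self.alpha = float(self.rng.choice(self.candidates))
            return self.alpha
        forest = RandomForestRegressor(n_estimators=100, random_state=0)
        forest.fit(np.array(self.history_alpha).reshape(-1, 1), np.array(self.history_loss))
        predicted = forest.predict(self.candidates.reshape(-1, 1))
        self.alpha = float(self.candidates[np.argmin(predicted)])
        return self.alpha
```

In practice this would be driven from the training loop: call `record()` with each epoch's mean loss and `step()` whenever a new α is wanted.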

- The final adaptive semantic loss function tested was the following (a sketch appears after the equation):

```math
Loss = ( \alpha * { r }^2 + 1 ) * ( \frac{ 1 }{ \beta } * log( 1 + exp( \beta * \gamma * -m ) ) + 1 ) * Smooth_L1_Loss
```
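
Written out with PyTorch's softplus (whose `beta` argument is exactly the 1/β · log(1 + exp(β·x)) sharpness above), the combined form might look like the hedged sketch below; m and r are again assumed to be the batch-wise slope and Pearson correlation, and the signature is illustrative.

```python
import torch.nn.functional as F


def combined_adaptive_loss(pred, target, feature, alpha, beta, gamma):
    # m: least-squares slope, r: Pearson correlation of pred against feature.
    pred_c, feat_c = pred - pred.mean(), feature - feature.mean()
    m = (pred_c * feat_c).sum() / (feat_c.pow(2).sum() + 1e-12)
    r = (pred_c * feat_c).sum() / (pred_c.norm() * feat_c.norm() + 1e-12)
    corr_term = alpha * r.pow(2) + 1                      # ( alpha * r^2 + 1 )
    slope_term = F.softplus(gamma * -m, beta=beta) + 1    # (1/beta) * log(1 + exp(beta * gamma * -m)) + 1
    return corr_term * slope_term * F.smooth_l1_loss(pred, target)
```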

### Results

- Training loss

![](results/Experiment2/train_loss.png)

- Validation loss

![](results/Experiment2/val_loss.png)

- Test loss

![](results/Experiment2/test_loss.png)

### Conclusions

- The method didn't appear to work very well because:
  - The simple loss functions tested were likely suboptimal for effectively influencing the model
  - The guessed parameters in the simple functions would need to be optimised, essentially turning this into a hyperparameter optimisation problem, which defeats the purpose of semantic loss
  - The adaptive, ML-based loss functions do not appear to converge quickly enough to train the model faster than the normal loss functions
- For these reasons, I would conclude that this experiment was a failure

## Experiment 3 - Physics-informed semantic loss functions

### Planning

- Attempt to use a more mathematically rigorous, formalised, and literature-based approach to semantic loss functions
- Although understudied, semantic loss has had a lot of theoretical maths exploring the concept, and we need to figure out how to put this into code (a possible starting point is sketched below)
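
One concrete, literature-based starting point is the semantic loss of Xu et al. (2018), defined as the negative log of the probability mass a network assigns to assignments that satisfy a logical constraint. A minimal sketch for the simplest useful constraint, "exactly one label is true", is below; picking that constraint as a first test case is my assumption, not a decision recorded here.

```python
import torch


def exactly_one_semantic_loss(probs: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """Semantic loss for the 'exactly one label true' constraint.

    probs: [batch, classes] independent probabilities (e.g. sigmoid outputs).
    Loss = -log( sum_i p_i * prod_{j != i} (1 - p_j) ), computed in log-space.
    """
    log_not = torch.log(1.0 - probs + eps)            # log(1 - p_j)
    log_all_not = log_not.sum(dim=1, keepdim=True)    # log prod_j (1 - p_j)
    # log( p_i * prod_{j != i}(1 - p_j) ) = log p_i + log_all_not - log(1 - p_i)
    log_terms = torch.log(probs + eps) + log_all_not - log_not
    return -torch.logsumexp(log_terms, dim=1).mean()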

Binary files added in this commit: results/Experiment1/train_loss.png, results/Experiment1/val_loss.png, results/Experiment1/test_loss.png, results/Experiment2/train_loss.png, results/Experiment2/val_loss.png, results/Experiment2/test_loss.png