mirror of
https://github.com/Cian-H/symbolic_nn_tests.git
synced 2025-12-22 14:11:59 +00:00
Updated readme to use it effectively as a lab journal
README.md
@@ -6,13 +6,32 @@ This is a testing sandbox for developing various methods of injecting symbolic k
## Experiment 1 - Semantically weighted loss

### Planning

- Simple semantic loss based on intuitively noticeable properties of numeric characters
- Makes use of manually made reward matrices to weight the loss depending on how "close" the model's prediction was (a minimal sketch of one way to do this follows this list)
- Very rudimentary example of semantic loss
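
The reward-matrix weighting is the core mechanism here, so a minimal PyTorch sketch of one way it could look is given below. The 10x10 matrix, its similarity values, and the `semantically_weighted_loss` helper are illustrative assumptions for a digit-classification setup, not the code actually used in the experiment.

```python
import torch
import torch.nn.functional as F

# reward[i][j]: hand-set "closeness" of predicting digit j when the true digit is i.
# 1.0 on the diagonal; the off-diagonal values are illustrative guesses at visual similarity.
reward = torch.zeros(10, 10)
reward.fill_diagonal_(1.0)
reward[1, 7] = reward[7, 1] = 0.5
reward[3, 8] = reward[8, 3] = 0.5
reward[6, 9] = reward[9, 6] = 0.4


def semantically_weighted_loss(logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Cross entropy scaled per sample by how 'close' the model's top guess was."""
    per_sample = F.cross_entropy(logits, target, reduction="none")
    predicted = logits.argmax(dim=1)
    # Weight is 1.0 for a correct top guess, rising towards 2.0 for a distant mistake.
    weight = 2.0 - reward.to(logits.device)[target, predicted]
    return (weight * per_sample).mean()
```

Scaling the per-sample cross entropy keeps the loss differentiable, since the argmax only affects the weighting factor, not the term the gradient flows through.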

### Results

- Training loss

![](results/Experiment1/train_loss.png)

- Validation loss

![](results/Experiment1/val_loss.png)

- Test loss

![](results/Experiment1/test_loss.png)

### Conclusions

- Seems to have worked
- Clear improvement in training rate with semantics added
- The similarity cross-entropy in particular shows clear signs in the validation loss of following a complementary CDF similar to that of the normal cross-entropy loss, but training faster
- Interestingly, the "garbage" cross-entropy also seems to have produced a very good result! This is likely because it has not been normalized to 1, so it may simply be amplifying the gradient by a random amount at all times, essentially acting as a fuzzy gradient booster.
- I would consider this experiment a success, with some interesting open questions remaining that are worth further examination

## Experiment 2 - Semantic loss functions derived from qualitative dataset characteristics

### Planning

- Makes use of known physics equations that partially describe the problem to guide the model
- Reduces the need for the model to learn known physics, allowing it to focus on learning the unknown physics
- Should accelerate training
@@ -25,3 +44,63 @@ This is a testing sandbox for developing various methods of injecting symbolic k
- [Molecular Properties](https://www.kaggle.com/datasets/burakhmmtgl/predict-molecular-properties)
- [Nuclear Binding Energy](https://www.kaggle.com/datasets/iitm21f1003401/nuclear-binding-energy)
- [Body Fat Prediction](https://www.kaggle.com/datasets/fedesoriano/body-fat-prediction-dataset)

- Decided to use the Molecular Properties dataset as it is quite familiar to me
- Training with semantics added to the relationship between molecular energy and differential electronegativity
- Semantics being injected are:
  - These values should be positively correlated
  - These values should be weighted towards a high r^2 with an adaptive penalty
- Multiple attempts carried out:
- Simple penalties (a minimal sketch of these follows the equations below). Variations tested include:

```math
Loss = ( Softplus( -m ) + 1 ) * Smooth_L1_Loss
```

```math
Loss = ( Relu( -m ) + 1 ) * Smooth_L1_Loss
```

```math
Loss = ( \frac{1}{Sech(|r|)} + 1 ) * Smooth_L1_Loss
```

```math
Loss = ( {r}^2 + 1 ) * Smooth_L1_Loss
```
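
A minimal PyTorch sketch of these four variants is below, assuming m is the least-squares slope and r the Pearson correlation between the model's predictions and the feature they should track (e.g. differential electronegativity); the function names and signatures are placeholders rather than the repo's actual API.

```python
import torch
import torch.nn.functional as F


def slope_and_correlation(pred: torch.Tensor, feature: torch.Tensor):
    """Least-squares slope m and Pearson correlation r of pred against feature (both 1D)."""
    pred_c = pred - pred.mean()
    feat_c = feature - feature.mean()
    m = (pred_c * feat_c).sum() / (feat_c.pow(2).sum() + 1e-12)
    r = (pred_c * feat_c).sum() / (pred_c.norm() * feat_c.norm() + 1e-12)
    return m, r


def softplus_penalty_loss(pred, target, feature):
    # Loss = (Softplus(-m) + 1) * SmoothL1: penalises a negative slope smoothly.
    m, _ = slope_and_correlation(pred, feature)
    return (F.softplus(-m) + 1) * F.smooth_l1_loss(pred, target)


def relu_penalty_loss(pred, target, feature):
    # Loss = (ReLU(-m) + 1) * SmoothL1: same idea, but zero extra penalty once m >= 0.
    m, _ = slope_and_correlation(pred, feature)
    return (F.relu(-m) + 1) * F.smooth_l1_loss(pred, target)


def sech_penalty_loss(pred, target, feature):
    # Loss = (1 / sech(|r|) + 1) * SmoothL1, i.e. (cosh(|r|) + 1) * SmoothL1.
    _, r = slope_and_correlation(pred, feature)
    return (torch.cosh(r.abs()) + 1) * F.smooth_l1_loss(pred, target)


def r_squared_penalty_loss(pred, target, feature):
    # Loss = (r^2 + 1) * SmoothL1.
    _, r = slope_and_correlation(pred, feature)
    return (r.pow(2) + 1) * F.smooth_l1_loss(pred, target)
```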

- Adaptive, self-training penalties, tuned by various methods. The best method found was optimisation by a random forest regressor (a rough sketch of such a tuning loop follows the equations below). These tunable variants include:

```math
Loss = ( Softplus( \alpha * -m ) + 1 ) * Smooth_L1_Loss
```

```math
Loss = ( Relu( \alpha * -m ) + 1 ) * Smooth_L1_Loss
```

```math
Loss = ( \frac{ 1 }{ Sech( \alpha * |r| ) } + 1 ) * Smooth_L1_Loss
```

```math
Loss = ( \alpha * { r }^2 + 1 ) * Smooth_L1_Loss
```
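
"Optimisation by a random forest regressor" can be read as a surrogate-model tuning loop: log how the loss behaves under each α tried, fit a forest to that history, and move to the α the forest predicts to be best. The sketch below is a hedged guess at such a loop; the candidate grid, exploration rate, and refit cadence are all assumptions rather than the tuner actually used.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


class RandomForestAlphaTuner:
    def __init__(self, candidates=np.linspace(0.0, 10.0, 101), explore=0.1, seed=0):
        self.candidates = candidates
        self.explore = explore           # probability of trying a random alpha instead
        self.rng = np.random.default_rng(seed)
        self.history_alpha = []          # alphas tried so far
        self.history_loss = []           # mean loss observed while each alpha was active
        self.alpha = float(candidates[0])

    def record(self, mean_loss: float) -> None:
        """Log the loss observed under the currently active alpha."""
        self.history_alpha.append(self.alpha)
        self.history_loss.append(mean_loss)

    def step(self) -> float:
        """Refit the surrogate on the history and pick the next alpha to use."""
        if len(self.history_alpha) < 5 or self.rng.random() < self.explore:
            self.alpha = float(self.rng.choice(self.candidates))
            return self.alpha
        forest = RandomForestRegressor(n_estimators=100, random_state=0)
        forest.fit(np.array(self.history_alpha).reshape(-1, 1), np.array(self.history_loss))
        predicted = forest.predict(self.candidates.reshape(-1, 1))
        self.alpha = float(self.candidates[np.argmin(predicted)])
        return self.alpha
```

In practice this would be driven from the training loop: call `record()` with each epoch's mean loss and `step()` whenever a new α is wanted.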

- The final adaptive semantic loss function tested was the following (a sketch appears after the equation):

```math
Loss = ( \alpha * { r }^2 + 1 ) * ( \frac{ 1 }{ \beta } * log( 1 + exp( \beta * \gamma * -m ) ) + 1 ) * Smooth_L1_Loss
```
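
Written out with PyTorch's softplus (whose `beta` argument is exactly the 1/β · log(1 + exp(β·x)) sharpness above), the combined form might look like the hedged sketch below; m and r are again assumed to be the batch-wise slope and Pearson correlation, and the signature is illustrative.

```python
import torch.nn.functional as F


def combined_adaptive_loss(pred, target, feature, alpha, beta, gamma):
    # m: least-squares slope, r: Pearson correlation of pred against feature.
    pred_c, feat_c = pred - pred.mean(), feature - feature.mean()
    m = (pred_c * feat_c).sum() / (feat_c.pow(2).sum() + 1e-12)
    r = (pred_c * feat_c).sum() / (pred_c.norm() * feat_c.norm() + 1e-12)
    corr_term = alpha * r.pow(2) + 1                      # ( alpha * r^2 + 1 )
    slope_term = F.softplus(gamma * -m, beta=beta) + 1    # (1/beta) * log(1 + exp(beta * gamma * -m)) + 1
    return corr_term * slope_term * F.smooth_l1_loss(pred, target)
```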

### Results

- Training loss

![](results/Experiment2/train_loss.png)

- Validation loss

![](results/Experiment2/val_loss.png)

- Test loss

![](results/Experiment2/test_loss.png)

### Conclusions

- The method didn't appear to work very well because:
  - The simple loss functions tested were likely suboptimal for effectively influencing the model
  - The guessed parameters in the simple functions would need to be optimised, essentially turning this into a hyperparameter optimisation problem, which defeats the purpose of semantic loss
  - The adaptive, ML-based loss functions do not appear to converge quickly enough to train the model faster than the normal loss functions
- For these reasons, I would conclude that this experiment was a failure

## Experiment 3 - Physics-informed semantic loss functions

### Planning

- Attempt to use a more mathematically rigorous, formalised, and literature-based approach to semantic loss functions
- Although understudied, semantic loss has had a lot of theoretical maths exploring the concept, and we need to figure out how to put this into code (a possible starting point is sketched below)
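
One concrete, literature-based starting point is the semantic loss of Xu et al. (2018), defined as the negative log of the probability mass a network assigns to assignments that satisfy a logical constraint. A minimal sketch for the simplest useful constraint, "exactly one label is true", is below; picking that constraint as a first test case is my assumption, not a decision recorded here.

```python
import torch


def exactly_one_semantic_loss(probs: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """Semantic loss for the 'exactly one label true' constraint.

    probs: [batch, classes] independent probabilities (e.g. sigmoid outputs).
    Loss = -log( sum_i p_i * prod_{j != i} (1 - p_j) ), computed in log-space.
    """
    log_not = torch.log(1.0 - probs + eps)            # log(1 - p_j)
    log_all_not = log_not.sum(dim=1, keepdim=True)    # log prod_j (1 - p_j)
    # log( p_i * prod_{j != i}(1 - p_j) ) = log p_i + log_all_not - log(1 - p_i)
    log_terms = torch.log(probs + eps) + log_all_not - log_not
    return -torch.logsumexp(log_terms, dim=1).mean()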

Binary files added in this commit: results/Experiment1/train_loss.png, results/Experiment1/val_loss.png, results/Experiment1/test_loss.png, results/Experiment2/train_loss.png, results/Experiment2/val_loss.png, results/Experiment2/test_loss.png