Falko Victor Habel
|
3e78a595c9
|
removed base aiia class and replaced it with transformer support
|
2025-04-13 22:18:59 +02:00 |
Falko Victor Habel
|
040ac478b9
|
removed loading and saving functions since tf will take over
Gitea Actions For AIIA / Explore-Gitea-Actions (push) Failing after 0s
|
2025-04-11 22:37:44 +02:00 |
Falko Victor Habel
|
0852ddb109
|
removed the chunked and recursive models, since they are now deprecated. For the transformer switch I decided to focus on my idea of a Sparse Mixture of Experts with shared params.
Gitea Actions For AIIA / Explore-Gitea-Actions (push) Failing after 0s
|
2025-04-11 22:36:41 +02:00 |
Falko Victor Habel
|
10967ea880
|
added sparse moe
|
2025-03-26 21:26:00 +01:00 |
Falko Victor Habel
|
19bffa99d6
|
fixed model loading to support all models
|
2025-03-15 18:53:26 +01:00 |
Falko Victor Habel
|
e5181c3066
|
corrected model loading to also accept kwargs, e.g. when using base models in combination with expert models
|
2025-03-11 22:13:17 +01:00 |
Falko Victor Habel
|
35edadf727
|
gates were not saved as well
|
2025-03-03 19:01:29 +01:00 |
Falko Victor Habel
|
cf41d0f6f6
|
correctly init moe class
|
2025-03-03 17:45:36 +01:00 |
Falko Victor Habel
|
81c9ae9d9d
|
updated model for moe
|
2025-03-03 17:39:04 +01:00 |
Falko Victor Habel
|
899f714554
|
fixed model output
|
2025-03-03 17:05:35 +01:00 |
Falko Victor Habel
|
66762775e9
|
bugfix
|
2025-03-02 14:27:26 +01:00 |
Falko Victor Habel
|
50e91b10e8
|
fixed model loading due to a bug
|
2025-02-24 14:13:10 +01:00 |
Falko Victor Habel
|
b1c486afee
|
added fp16 and bf16 support when loading model
|
2025-02-24 13:41:11 +01:00 |
Falko Victor Habel
|
f8e59c5896
|
added cpu support when loading the model
|
2025-01-31 09:13:58 +01:00 |
Falko Victor Habel
|
1e665c4604
|
added first pip install version 0.1
|
2025-01-28 10:58:33 +01:00 |
Falko Victor Habel
|
3749ba9c5f
|
updated base models MaxPool2D
|
2025-01-27 08:39:42 +01:00 |
Falko Victor Habel
|
59b2784e92
|
fixed spelling error
|
2025-01-26 13:10:24 +01:00 |
Falko Victor Habel
|
e5a5618160
|
correct copying
|
2025-01-26 13:09:23 +01:00 |
Falko Victor Habel
|
de3d58f6db
|
improved cnn
|
2025-01-24 18:49:25 +01:00 |
Falko Victor Habel
|
8ac31c5bf1
|
improved shared model to have ~10% of params
|
2025-01-24 18:23:54 +01:00 |
Falko Victor Habel
|
599b8c4835
|
working shared model (with way too few params)
|
2025-01-24 18:04:44 +01:00 |
Falko Victor Habel
|
6e6f4c4a21
|
updated models and config to improve parameter handling and added a copy function to use the same base config for multiple models
|
2025-01-22 14:23:03 +01:00 |
Falko Victor Habel
|
ab58d352c4
|
updated saving and added a first implementation of the new additional parameter handling
|
2025-01-22 14:16:56 +01:00 |
Falko Victor Habel
|
74973a325b
|
updated models for improved config
|
2025-01-22 11:19:55 +01:00 |
Falko Victor Habel
|
4c19838dab
|
converted to cnn models
|
2025-01-20 13:25:36 +01:00 |
Falko Victor Habel
|
b371d747fd
|
models for training
|
2025-01-12 20:49:22 +01:00 |