Falko Victor Habel
|
3e78a595c9
|
removed base aiia class and replaced it with transformer support
|
2025-04-13 22:18:59 +02:00 |
Falko Victor Habel
|
a8ddf2b559
|
updated config to not handle custom model_name; removed ln rate and made it transformer compatible
|
2025-04-13 22:18:26 +02:00 |
Falko Victor Habel
|
30695154b4
|
removed deprecated models from init files
Gitea Actions For AIIA / Explore-Gitea-Actions (push) Failing after 0s
|
2025-04-12 21:44:33 +02:00 |
Falko Victor Habel
|
040ac478b9
|
removed loading and saving functions since tf will take over
Gitea Actions For AIIA / Explore-Gitea-Actions (push) Failing after 0s
|
2025-04-11 22:37:44 +02:00 |
Falko Victor Habel
|
0852ddb109
|
removed the chunked and recursive models, since they are now deprecated. For the transformer switch I decided to focus on my idea of a Sparse Mixture of Experts with shared params.
Gitea Actions For AIIA / Explore-Gitea-Actions (push) Failing after 0s
|
2025-04-11 22:36:41 +02:00 |
Falko Victor Habel
|
5987a130f6
|
updated init to remove expert model and move the smoe higher
Gitea Actions For AIIA / Explore-Gitea-Actions (push) Failing after 31s
|
2025-03-26 21:57:17 +01:00 |
Falko Victor Habel
|
10967ea880
|
added sparse moe
|
2025-03-26 21:26:00 +01:00 |
Falko Victor Habel
|
1fcb31b044
|
removed aiia from being able to get called
|
2025-03-26 21:25:52 +01:00 |
Falko Victor Habel
|
19bffa99d6
|
fixed model loading to support all models
|
2025-03-15 18:53:26 +01:00 |
Falko Victor Habel
|
e5181c3066
|
corrected model loading to also accept kwargs when, e.g., using base models in combination with expert models
|
2025-03-11 22:13:17 +01:00 |
Falko Victor Habel
|
35edadf727
|
gates were not saved as well
|
2025-03-03 19:01:29 +01:00 |
Falko Victor Habel
|
cf41d0f6f6
|
correctly init moe class
|
2025-03-03 17:45:36 +01:00 |
Falko Victor Habel
|
81c9ae9d9d
|
updated model for moe
|
2025-03-03 17:39:04 +01:00 |
Falko Victor Habel
|
899f714554
|
fixed model output
|
2025-03-03 17:05:35 +01:00 |
Falko Victor Habel
|
66762775e9
|
bugfix
|
2025-03-02 14:27:26 +01:00 |
Falko Victor Habel
|
1faf34749a
|
added serialization
|
2025-02-24 15:16:46 +01:00 |
Falko Victor Habel
|
50e91b10e8
|
fixed model loading due to a bug
|
2025-02-24 14:13:10 +01:00 |
Falko Victor Habel
|
b1c486afee
|
added fp16 and bf16 support when loading model
|
2025-02-24 13:41:11 +01:00 |
Falko Victor Habel
|
f8e59c5896
|
added cpu support when loading the model
|
2025-01-31 09:13:58 +01:00 |
Falko Victor Habel
|
1e665c4604
|
added first pip install version 0.1
|
2025-01-28 10:58:33 +01:00 |
Falko Victor Habel
|
3749ba9c5f
|
updated base models MaxPool2D
|
2025-01-27 08:39:42 +01:00 |
Falko Victor Habel
|
29f0d86ff7
|
kernel size as large as channel size
|
2025-01-26 23:26:42 +01:00 |
Falko Victor Habel
|
338ac5dee5
|
corrected imports
|
2025-01-26 13:26:01 +01:00 |
Falko Victor Habel
|
59b2784e92
|
fixed spelling error
|
2025-01-26 13:10:24 +01:00 |
Falko Victor Habel
|
e5a5618160
|
correct copying
|
2025-01-26 13:09:23 +01:00 |
Falko Victor Habel
|
de3d58f6db
|
improved cnn
|
2025-01-24 18:49:25 +01:00 |
Falko Victor Habel
|
8ac31c5bf1
|
improved shared model to have ~10% of params
|
2025-01-24 18:23:54 +01:00 |
Falko Victor Habel
|
599b8c4835
|
working shared model (with way too few params)
|
2025-01-24 18:04:44 +01:00 |
Falko Victor Habel
|
6e6f4c4a21
|
updated models and config to improve parameter handling and add a copy function to use the same base config for multiple models
|
2025-01-22 14:23:03 +01:00 |
Falko Victor Habel
|
ab58d352c4
|
updated saving and first implementation of new additional parameter handling
|
2025-01-22 14:16:56 +01:00 |
Falko Victor Habel
|
26b701fd77
|
corrected activation function
|
2025-01-22 11:20:05 +01:00 |
Falko Victor Habel
|
74973a325b
|
updated models for improved config
|
2025-01-22 11:19:55 +01:00 |
Falko Victor Habel
|
b87ce68c82
|
updated config to add kwargs to support future changes to the config for different models
|
2025-01-21 21:44:16 +01:00 |
Falko Victor Habel
|
99c3ec38c7
|
updated config for cnn
|
2025-01-21 20:06:53 +01:00 |
Falko Victor Habel
|
4c19838dab
|
converted to cnn models
|
2025-01-20 13:25:36 +01:00 |
Falko Victor Habel
|
b371d747fd
|
models for training
|
2025-01-12 20:49:22 +01:00 |
Falko Victor Habel
|
7287ba543f
|
first code commit. First prototype for AIIA Config
|
2025-01-09 21:53:44 +01:00 |