Commit Graph

26 Commits

Author SHA1 Message Date
Falko Victor Habel 3e78a595c9 removed base aiia class and replaced it with transformer support 2025-04-13 22:18:59 +02:00
Falko Victor Habel 040ac478b9 removed loading and saving functions since tf will take over 2025-04-11 22:37:44 +02:00
Falko Victor Habel 0852ddb109 removed the chunked and recursive models, since they are now deprecated. For the transformer switch I decided to focus on my idea of a Sparse Mixture of Experts with shared params. 2025-04-11 22:36:41 +02:00
Falko Victor Habel 10967ea880 added sparse moe 2025-03-26 21:26:00 +01:00
Falko Victor Habel 19bffa99d6 fixed model loading to support all models 2025-03-15 18:53:26 +01:00
Falko Victor Habel e5181c3066 corrected model loading to also accept kwargs when e.g. using base models in combination with expert models 2025-03-11 22:13:17 +01:00
Falko Victor Habel 35edadf727 gates were not saved as well 2025-03-03 19:01:29 +01:00
Falko Victor Habel cf41d0f6f6 correctly init moe class 2025-03-03 17:45:36 +01:00
Falko Victor Habel 81c9ae9d9d updated model for moe 2025-03-03 17:39:04 +01:00
Falko Victor Habel 899f714554 fixed model output 2025-03-03 17:05:35 +01:00
Falko Victor Habel 66762775e9 bugfix 2025-03-02 14:27:26 +01:00
Falko Victor Habel 50e91b10e8 fixed model loading due to a bug 2025-02-24 14:13:10 +01:00
Falko Victor Habel b1c486afee added fp16 and bf16 support when loading model 2025-02-24 13:41:11 +01:00
Falko Victor Habel f8e59c5896 added cpu support when loading the model 2025-01-31 09:13:58 +01:00
Falko Victor Habel 1e665c4604 added first pip install version 0.1 2025-01-28 10:58:33 +01:00
Falko Victor Habel 3749ba9c5f updated base models MaxPool2D 2025-01-27 08:39:42 +01:00
Falko Victor Habel 59b2784e92 fixed spelling error 2025-01-26 13:10:24 +01:00
Falko Victor Habel e5a5618160 correct copying 2025-01-26 13:09:23 +01:00
Falko Victor Habel de3d58f6db improved cnn 2025-01-24 18:49:25 +01:00
Falko Victor Habel 8ac31c5bf1 improved shared model to have ~10% of params 2025-01-24 18:23:54 +01:00
Falko Victor Habel 599b8c4835 working shared model (with way too few params) 2025-01-24 18:04:44 +01:00
Falko Victor Habel 6e6f4c4a21 updated models and config to improve parameter handling and add a copy function to use the same base config for multiple models 2025-01-22 14:23:03 +01:00
Falko Victor Habel ab58d352c4 updated saving and first implementation of new additional parameter handling 2025-01-22 14:16:56 +01:00
Falko Victor Habel 74973a325b updated models for improved config 2025-01-22 11:19:55 +01:00
Falko Victor Habel 4c19838dab converted to cnn models 2025-01-20 13:25:36 +01:00
Falko Victor Habel b371d747fd models for training 2025-01-12 20:49:22 +01:00