Commit Graph

37 Commits

Author SHA1 Message Date
Falko Victor Habel 3e78a595c9 removed base aiia class and replaced it with transformer support 2025-04-13 22:18:59 +02:00
Falko Victor Habel a8ddf2b559 updated config to not handle custom model_name; removed ln rate and made it transformer compatible 2025-04-13 22:18:26 +02:00
Falko Victor Habel 30695154b4 removed deprecated models from init files (CI: Gitea Actions For AIIA / Explore-Gitea-Actions, push, failing after 0s) 2025-04-12 21:44:33 +02:00
Falko Victor Habel 040ac478b9 removed loading and saving functions since tf will take over (CI: Gitea Actions For AIIA / Explore-Gitea-Actions, push, failing after 0s) 2025-04-11 22:37:44 +02:00
Falko Victor Habel 0852ddb109 removed the chunked and recursive models, since they are now deprecated. For the transformer switch I decided to focus on my idea of a Sparse Mixture of Experts with shared params. (CI: Gitea Actions For AIIA / Explore-Gitea-Actions, push, failing after 0s) 2025-04-11 22:36:41 +02:00
Falko Victor Habel 5987a130f6 updated init to remove expeort model and move the smoe higher (CI: Gitea Actions For AIIA / Explore-Gitea-Actions, push, failing after 31s) 2025-03-26 21:57:17 +01:00
Falko Victor Habel 10967ea880 added sparse moe 2025-03-26 21:26:00 +01:00
Falko Victor Habel 1fcb31b044 removed aiia from being able to get called 2025-03-26 21:25:52 +01:00
Falko Victor Habel 19bffa99d6 fixed model loading to support all models 2025-03-15 18:53:26 +01:00
Falko Victor Habel e5181c3066 corrected model loading to also accept kwargs when e.g. using base models in combination with expert models 2025-03-11 22:13:17 +01:00
Falko Victor Habel 35edadf727 gates were not saved as well 2025-03-03 19:01:29 +01:00
Falko Victor Habel cf41d0f6f6 correctly init moe class 2025-03-03 17:45:36 +01:00
Falko Victor Habel 81c9ae9d9d updated model for moe 2025-03-03 17:39:04 +01:00
Falko Victor Habel 899f714554 fixed model output 2025-03-03 17:05:35 +01:00
Falko Victor Habel 66762775e9 bugfix 2025-03-02 14:27:26 +01:00
Falko Victor Habel 1faf34749a added serialization 2025-02-24 15:16:46 +01:00
Falko Victor Habel 50e91b10e8 fixed model loading due to a bug 2025-02-24 14:13:10 +01:00
Falko Victor Habel b1c486afee added fp16 and bf16 support when loading model 2025-02-24 13:41:11 +01:00
Falko Victor Habel f8e59c5896 added cpu support when loading the model 2025-01-31 09:13:58 +01:00
Falko Victor Habel 1e665c4604 added first pip install version 0.1 2025-01-28 10:58:33 +01:00
Falko Victor Habel 3749ba9c5f updated base models MaxPool2D 2025-01-27 08:39:42 +01:00
Falko Victor Habel 29f0d86ff7 kernel size as large as channel size 2025-01-26 23:26:42 +01:00
Falko Victor Habel 338ac5dee5 corrected imports 2025-01-26 13:26:01 +01:00
Falko Victor Habel 59b2784e92 fixed spelling error 2025-01-26 13:10:24 +01:00
Falko Victor Habel e5a5618160 correct copying 2025-01-26 13:09:23 +01:00
Falko Victor Habel de3d58f6db improved cnn 2025-01-24 18:49:25 +01:00
Falko Victor Habel 8ac31c5bf1 improved shared model to have ~10% of params 2025-01-24 18:23:54 +01:00
Falko Victor Habel 599b8c4835 working shared model (with way too few params) 2025-01-24 18:04:44 +01:00
Falko Victor Habel 6e6f4c4a21 updated models and config to improve parameter handling and add a copy function to use the same base config for multiple models 2025-01-22 14:23:03 +01:00
Falko Victor Habel ab58d352c4 updated saving and first implementation of new additional parameter handling 2025-01-22 14:16:56 +01:00
Falko Victor Habel 26b701fd77 corrected activation function 2025-01-22 11:20:05 +01:00
Falko Victor Habel 74973a325b updated models for improved config 2025-01-22 11:19:55 +01:00
Falko Victor Habel b87ce68c82 updated config to add kwargs to support future changes to the config for different models 2025-01-21 21:44:16 +01:00
Falko Victor Habel 99c3ec38c7 updated config for cnn 2025-01-21 20:06:53 +01:00
Falko Victor Habel 4c19838dab converted to cnn models 2025-01-20 13:25:36 +01:00
Falko Victor Habel b371d747fd models for training 2025-01-12 20:49:22 +01:00
Falko Victor Habel 7287ba543f first code commit. First prototype for AIIA Config 2025-01-09 21:53:44 +01:00