Merge pull request 'develop' (#1) from develop into main

Reviewed-on: #1
Falko Victor Habel 2025-02-11 20:13:37 +00:00
commit 9795a748d5
4 changed files with 77 additions and 2 deletions

.gitignore vendored (4 changes)

@@ -153,6 +153,10 @@ dmypy.json
# Cython debug symbols
cython_debug/
#model
Godola
Godola-moe
# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore

README.md

@@ -1,3 +1,58 @@
# Godala-moe: A Mixture of Experts LLM Model

Godala is a custom AI model based on SmolLM2, designed specifically for single-turn conversations. It has undergone several rounds of continued pretraining and fine-tuning to enhance its capabilities. Godala specializes in answering questions about the Godot Game Engine and providing guidance on coding in GDScript.
## Overview
Godala-moe is an edge-ready Large Language Model (LLM) designed to run efficiently on a variety of platforms. It is based on the [HuggingFaceTB/SmolLM2-1.7B](https://huggingface.co/HuggingFaceTB/SmolLM2-1.7B) base model and has undergone continued pretraining, multiple finetuning steps, and merging to create a mixture of experts.
## Features
- **Edge-Ready**: Designed to run on edge devices with minimal resource requirements.
- **Mixture of Experts**: Combines multiple expert models for enhanced performance (see the illustrative sketch after this list).
- **Continued Pretraining**: Underwent continued pretraining stages to improve model quality.
- **Finetuning**: Multiple finetuning steps to specialize the model for specific tasks.
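
The routing idea behind a mixture of experts is that a lightweight gate scores several expert networks per input and blends the top-scoring ones. A minimal toy sketch of that idea for intuition only (`TinyMoE` is made up for illustration and is not Godala-moe's actual merging code):

```python
# Toy illustration of mixture-of-experts routing (NOT Godala-moe's code):
# a gate scores each expert per input and the top-k experts' outputs are blended.
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    def __init__(self, dim: int, num_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))
        self.gate = nn.Linear(dim, num_experts)  # produces one score per expert
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scores = self.gate(x).softmax(dim=-1)           # (batch, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # route each input to its top-k experts
        out = torch.zeros_like(x)
        for b in range(x.size(0)):
            for slot in range(self.top_k):
                expert = self.experts[int(idx[b, slot])]
                out[b] += weights[b, slot] * expert(x[b])
        return out

moe = TinyMoE(dim=16)
print(moe(torch.randn(2, 16)).shape)  # torch.Size([2, 16])
```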
## Current Status
The model has so far been trained on approximately 400 million tokens. While it shows promise, quality is not yet optimal because of the limited training data. Additionally, the code generation feature does not currently produce Markdown-formatted output.
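
Until Markdown output is implemented, a caller can fence code replies manually for display. A minimal sketch (the `fence_gdscript` helper is hypothetical, not part of the model or this repository):

```python
FENCE = "`" * 3  # three backticks, built programmatically to keep this example readable

def fence_gdscript(reply: str) -> str:
    # Hypothetical display workaround while Godala-moe does not emit Markdown itself:
    # wrap a raw code reply in a fenced block so it renders as code.
    return f"{FENCE}gdscript\n{reply.strip()}\n{FENCE}"

print(fence_gdscript('func _ready():\n\tprint("hello")'))
```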
## Future Improvements
- **Increase Training Tokens**: Plan to increase the number of training tokens to improve model performance.
- **Markdown Highlighting**: Implement correct markdown highlighting for better readability.
- **Increased Parameters**: Aim to increase the model parameters for enhanced capabilities.
## Usage
To use Godala-moe, follow these steps:
1. **Install Dependencies**:
Ensure you have the necessary dependencies installed (note that the bundled `requirements.txt` lists only `transformers`, while `torch` is also required). You can install both using pip:
```bash
pip install transformers torch
```
2. **Run the Model**:
Use the provided script to run the model. Here is an example script:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

location = "Godala-moe"  # path to the downloaded model directory
device = "cuda"          # use "cpu" when not using a GPU

tokenizer = AutoTokenizer.from_pretrained(location)
model = AutoModelForCausalLM.from_pretrained(location).to(device)

messages = [{"role": "user", "content": "What can you tell me about Godot?"}]

# Render the chat with the model's template; add_generation_prompt=True appends
# the assistant prefix so the model responds instead of continuing the user turn.
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer.encode(input_text, return_tensors="pt").to(device)

outputs = model.generate(inputs, max_new_tokens=510, temperature=0.2, top_p=0.9, do_sample=True)
print(tokenizer.decode(outputs[0]))
```
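
The final `decode` call prints the prompt together with the reply. To print only the newly generated text, slice off the prompt tokens first (a small variant using the same variables as above):

```python
# Decode only the tokens generated after the prompt, dropping special tokens.
reply = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(reply)
```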
## Download
You can download the Godala-moe model from [this link](https://gitea.fabelous.app/Maschine-Learning/Godola/releases/download/latest/Godola-moe.zip).
## License
This project is licensed under the [Creative Commons Attribution 4.0 International License](https://creativecommons.org/licenses/by/4.0/). The original model was licensed under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).

example.py (new file, 15 lines)

@@ -0,0 +1,15 @@
from transformers import AutoModelForCausalLM, AutoTokenizer

location = "Godala-moe"  # path to the downloaded model directory
device = "cuda"          # "gpu" is not a valid torch device string; use "cuda" or "cpu"

tokenizer = AutoTokenizer.from_pretrained(location)
model = AutoModelForCausalLM.from_pretrained(location).to(device)

messages = [{"role": "user", "content": "What can you tell me about Godot?"}]
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer.encode(input_text, return_tensors="pt").to(device)
outputs = model.generate(inputs, max_new_tokens=510, temperature=0.2, top_p=0.9, do_sample=True)
print(tokenizer.decode(outputs[0]))

requirements.txt (new file, 1 line)

@@ -0,0 +1 @@
transformers