# Fabelous Embedder Mini
This repository contains a custom sentence transformer model named **fabelous-mini-embedder**, trained on 13 different programming languages and English.
## Description
The model was built using the Sentence Transformers library, which provides an easy-to-use interface for working with sentence embeddings, including pre-trained models.

In addition to **fabelous-mini-embedder**, there is also a proprietary model called **fabelous-embedder-base**. Both models are trained on a large dataset covering various programming languages, including:
- Python
- Java
- Go
- C++
- TypeScript
## Example Usage
Here’s how to use the model to generate sentence embeddings:
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("fabelous-mini-embedder")

instruction = "This is an example sentence"
embeddings = model.encode(instruction)

print(embeddings)  # Output: (array of numerical embeddings)
```
The generated embeddings can be used for tasks like semantic search or classification.
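
As a rough illustration of the semantic search case, the sketch below ranks candidate vectors by cosine similarity to a query vector. The helper names `cosine_similarity` and `rank_documents` are hypothetical and not part of this repository, and the toy 3-dimensional vectors stand in for real embeddings, which would normally come from `model.encode(...)`.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors used only for illustration (assumption: real embeddings
# from model.encode(...) would replace these).
query = [1.0, 0.0, 0.0]
documents = [
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [0.9, 0.1, 0.0],   # nearly parallel to the query
    [-1.0, 0.0, 0.0],  # opposite direction
]

ranking = rank_documents(query, documents)
best_index, best_score = ranking[0]
print(best_index, round(best_score, 3))
```

In practice, the Sentence Transformers library ships its own helper for this, `sentence_transformers.util.cos_sim`, which is likely the more convenient choice when working with real embeddings.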
## Future Improvements
A second-generation model is in development, focusing on:
- **Enhanced Multilingual Support:** Adding support for German in addition to English.
- **Expanded Dataset:** Increasing the code dataset significantly to improve model performance and accuracy.
## Installation
To install the required libraries, run:
```bash
pip install sentence-transformers
```