Yam-peleg

Models by this creator


Experiment26-7B

yam-peleg

Total Score: 77

Experiment26-7B is an experiment by maintainer yam-peleg to test and refine a specific training and evaluation pipeline for large language models (LLMs). The goal is to identify potential optimizations in areas like data engineering, architecture efficiency, and evaluation performance. The experiment explores adjustments to data preprocessing, training algorithms, and evaluation metrics to improve the effectiveness of the LLM training and evaluation pipeline. Similar models like llama-7b-hf, llava-13b, llama-7b-se-rl-peft, and llama-7b-hf-transformers-4.29 are also focused on exploring and optimizing large language models, though with different goals and approaches.

Model inputs and outputs

The model takes text prompts and returns generated text (a minimal prompting sketch appears at the end of this entry).

Inputs
- Text prompts for the model to generate output from

Outputs
- Generated text responses based on the provided prompts

Capabilities

The Experiment26-7B model is designed to evaluate and improve the training and evaluation pipeline for large language models. While the specific capabilities of this experimental model are not detailed, the overall goal is to explore techniques for enhancing the effectiveness and performance of LLMs in areas like natural language understanding, question answering, and generation.

What can I use it for?

As an experimental model, the primary intended use of Experiment26-7B is research and development on large language models. Researchers in natural language processing, machine learning, and AI could use it to test new approaches for training, evaluating, and improving the performance of LLMs. The insights gained could then be applied to develop more capable and reliable language models for a variety of applications.

Things to try

Since Experiment26-7B is an experimental model, some interesting things to try include:
- Providing the model with a diverse set of prompts to evaluate its flexibility and generalization capabilities
- Analyzing the model's outputs to identify areas for improvement in coherence, factual accuracy, and alignment with human preferences
- Experimenting with different data preprocessing techniques, training algorithms, and evaluation metrics to see their impact on performance
- Comparing the results of this experiment to those of similar models to gain insights into effective strategies for LLM development and optimization
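The description above only specifies text prompts in and generated text out, so the following is a minimal sketch of how such a model is typically prompted with the Hugging Face transformers library. The repo id "yam-peleg/Experiment26-7B", the prompt, and the generation settings are assumptions for illustration, not details taken from the model card.

```python
# Minimal sketch, assuming the model is published on Hugging Face as
# "yam-peleg/Experiment26-7B" and loads with the standard transformers API.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yam-peleg/Experiment26-7B"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Text prompt in, generated text out
prompt = "Explain the difference between pretraining and fine-tuning in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Swapping in a diverse set of prompts to this loop is one way to run the "things to try" evaluations listed above.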


Updated 5/17/2024


Hebrew-Mistral-7B

yam-peleg

Total Score: 51

Hebrew-Mistral-7B is an open-source large language model (LLM) with 7 billion parameters, pretrained in Hebrew and English. It is based on the Mistral-7B-v0.1 model from Mistral AI. The model has an extended Hebrew tokenizer with 64,000 tokens and is continuously pretrained on both English and Hebrew tokens, making it a powerful general-purpose language model suited to a wide range of natural language processing tasks, with a focus on Hebrew language understanding and generation.

Model inputs and outputs

Hebrew-Mistral-7B is a text-to-text model that can be used for a variety of natural language processing tasks: it takes textual inputs and generates textual outputs (a minimal prompting sketch appears at the end of this entry).

Inputs
- Arbitrary text in Hebrew or English

Outputs
- Generated text in Hebrew or English, depending on the input

Capabilities

Hebrew-Mistral-7B is a capable language model that can be used for tasks such as text generation, translation, summarization, and more. It performs strongly on Hebrew language tasks thanks to its specialized pretraining.

What can I use it for?

You can use Hebrew-Mistral-7B for a wide range of natural language processing applications, such as:
- Generating Hebrew text for creative writing, conversational agents, or other applications
- Translating between Hebrew and English
- Summarizing Hebrew text
- Answering questions about Hebrew language and culture

Things to try

One interesting thing to try with Hebrew-Mistral-7B is using it for multilingual applications that involve both Hebrew and English. The model's strong performance in both languages makes it a good choice for tasks that require understanding and generation across multiple languages.
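Since the entry describes the model as a text-to-text Hebrew/English generator, here is a minimal prompting sketch using the Hugging Face transformers library. The repo id "yam-peleg/Hebrew-Mistral-7B", the Hebrew prompt, and the generation settings are assumptions for illustration; consult the actual model card for recommended usage.

```python
# Minimal sketch, assuming the model is published on Hugging Face as
# "yam-peleg/Hebrew-Mistral-7B" and works with the standard transformers API.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yam-peleg/Hebrew-Mistral-7B"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Hebrew prompt in, Hebrew (or English) continuation out
prompt = "ספר לי בקצרה על העיר ירושלים."  # "Tell me briefly about the city of Jerusalem."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

The same loop works with English prompts, or with mixed Hebrew-English instructions for the bilingual use cases mentioned above.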


Updated 5/17/2024