This repository contains the first model in the OpenHathi series that Sarvam AI will release. It is a 7B-parameter model based on Llama2 and trained on Hindi, English, and Hinglish. More details about the model, its training procedure, and evaluations can be found [here](https://www.sarvam.ai/blog/announcing-openhathi-series).

Note: this is a base model and not meant to be used as is. We recommend first finetuning it on task(s) you are interested in.

## Usage

    import torch
    from transformers import LlamaTokenizer, LlamaForCausalLM

    # Load the tokenizer and the model weights in bfloat16
    model_id = 'sarvamai/OpenHathi-7B-Hi-v0.1-Base'
    tokenizer = LlamaTokenizer.from_pretrained(model_id)
    model = LlamaForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

    # Replace with your own Hindi, English, or Hinglish prompt
    prompt = "    "
    inputs = tokenizer(prompt, return_tensors="pt")

    # Generate a continuation; max_length counts prompt tokens plus new tokens
    generate_ids = model.generate(inputs.input_ids, max_length=30)
    print(tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0])
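
As a rough guide to the hardware the snippet above needs, the weight footprint can be estimated from the parameter count and the dtype. The sketch below assumes a nominal 7 billion parameters (the exact count is not stated here) and counts weights only; activations and the generation cache add to this. The function name `estimate_weight_memory_gb` is illustrative and not part of any library.

```python
# Bytes used per parameter for a few common dtypes.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def estimate_weight_memory_gb(num_params: float, dtype: str = "bfloat16") -> float:
    """Return the approximate weight memory in GB (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    nominal_params = 7e9  # assumption: "7B" taken at face value
    for dtype in ("float32", "bfloat16"):
        print(dtype, estimate_weight_memory_gb(nominal_params, dtype), "GB")
```

Under these assumptions, loading in `torch.bfloat16` roughly halves the weight memory compared with 32-bit floats.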

## Model overview

`OpenHathi-7B-Hi-v0.1-Base` is a large language model developed by [Sarvam AI](https://aimodels.fyi/creators/huggingFace/sarvamai) that is based on Llama2 and trained on Hindi, English, and Hinglish data. It has 7 billion parameters, which makes it smaller than [alpaca-30b](https://aimodels.fyi/models/huggingFace/alpaca-30b-baseten) and the same size class as [PMC_LLAMA_7B](https://aimodels.fyi/models/huggingFace/pmcllama7b-chaoyi-wu). It is a base model, designed to be fine-tuned on specific tasks rather than used directly.

## Model inputs and outputs

`OpenHathi-7B-Hi-v0.1-Base` is a text-to-text model: it takes a text prompt and generates a continuation of it. It accepts input in Hindi, English, and Hinglish (mixed Hindi and English).

### Inputs
- Text prompts in Hindi, English, or Hinglish

### Outputs
- Generated text in response to the input prompt

## Capabilities

As a base model, `OpenHathi-7B-Hi-v0.1-Base` continues text rather than following instructions, so it is not meant to be used as is. Once fine-tuned, it can serve tasks such as text summarization, question answering, open-ended conversation, and creative writing. It can also be fine-tuned for more specialized use cases, such as code generation or domain-specific language modeling.

## What can I use it for?

The `OpenHathi-7B-Hi-v0.1-Base` model could be useful for a variety of applications that require language understanding and generation in Hindi, English, or a mix of the two. Some potential use cases include:

- Building virtual assistants or chatbots that can communicate in Hindi and English
- Generating content like news articles, product descriptions, or creative writing in multiple languages
- Translating between Hindi and English
- Providing language support for applications targeting Indian users
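
Because this is a base model, a use case such as Hindi-English translation starts with preparing fine-tuning data. The sketch below formats sentence pairs into plain-text training records and serializes them as JSON Lines. The prompt template, the function names, and the example pair are all hypothetical; they are not the format used by Sarvam AI, and any consistent template could be substituted.

```python
import json
from typing import Dict, Iterable, List, Tuple

# Hypothetical template for illustration only; not taken from the model's training setup.
TEMPLATE = "Translate from {src} to {tgt}.\n{src}: {source}\n{tgt}: {target}"


def build_translation_records(
    pairs: Iterable[Tuple[str, str]], src: str = "Hindi", tgt: str = "English"
) -> List[Dict[str, str]]:
    """Turn (source, target) sentence pairs into {"text": ...} training records."""
    records = []
    for source, target in pairs:
        source, target = source.strip(), target.strip()
        if not source or not target:
            continue  # skip pairs with an empty side
        records.append(
            {"text": TEMPLATE.format(src=src, tgt=tgt, source=source, target=target)}
        )
    return records


def to_jsonl(records: List[Dict[str, str]]) -> str:
    """Serialize records as JSON Lines, keeping Devanagari text unescaped."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


if __name__ == "__main__":
    example_pairs = [("नमस्ते", "Hello")]  # illustrative pair
    print(to_jsonl(build_translation_records(example_pairs)))
```

The resulting file can then be fed to whichever fine-tuning pipeline you use; the `"text"` field name is a common convention, not a requirement.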

## Things to try

One interesting thing to try with `OpenHathi-7B-Hi-v0.1-Base` would be to fine-tune it on a specific domain or task, such as customer service, technical writing, or programming. This could help the model learn the nuances and specialized vocabulary of that area, allowing it to generate more relevant and useful text. Additionally, exploring the model's performance on code-switching between Hindi and English could yield insights into its language understanding capabilities.
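
When exploring code-switching, it can help to quantify how much of a prompt or a generated continuation is written in each script. The following is a minimal sketch, assuming a simple character-level measure: it counts code points in the Devanagari Unicode block against ASCII letters. The function name `script_mix` is hypothetical. Note that it measures script, not language, so Hinglish written in Latin script counts as Latin.

```python
from typing import Dict


def script_mix(text: str) -> Dict[str, float]:
    """Return the share of Devanagari vs Latin characters in text.

    Devanagari: any code point in the Unicode block U+0900..U+097F
    (letters, vowel signs, digits, and punctuation in that block).
    Latin: ASCII letters only. All other characters are ignored.
    """
    devanagari = 0
    latin = 0
    for ch in text:
        if "\u0900" <= ch <= "\u097f":
            devanagari += 1
        elif ch.isascii() and ch.isalpha():
            latin += 1
    total = devanagari + latin
    if total == 0:
        return {"devanagari": 0.0, "latin": 0.0}
    return {"devanagari": devanagari / total, "latin": latin / total}
```

Comparing `script_mix(prompt)` with `script_mix(generated_text)` over a set of prompts gives a coarse picture of whether the model's output follows the script of its input.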