Mamba-Chat is the first chat language model based on a state-space model architecture, not a transformer.

The model is a fine-tune of [Mamba-2.8B](https://github.com/state-spaces/mamba), the model by Albert Gu and Tri Dao from their paper _Mamba: Linear-Time Sequence Modeling with Selective State Spaces_.

Check out our [GitHub repository](https://github.com/havenhq/mamba-chat/tree/main) for training and inference code.

The prompt format is the zephyr format:

    <|user|> {user_message}
    <|assistant|> {assistant_message}
    <|user|> {user_message}
    <|assistant|> {assistant_message}
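
As an illustration of the layout above, the following minimal sketch assembles a prompt string from a list of messages. The helper name `build_zephyr_prompt`, the `role`/`content` dictionary keys, and the exact whitespace between turns are assumptions made for this example; the inference code in the repository remains the authoritative reference for how prompts are tokenized, including any special end-of-turn tokens.

```python
from typing import Dict, List

# Roles that appear in the prompt format shown above.
ALLOWED_ROLES = ("user", "assistant")


def build_zephyr_prompt(messages: List[Dict[str, str]],
                        add_generation_prompt: bool = True) -> str:
    """Render a conversation as '<|role|> message' lines.

    Hypothetical helper: one line per turn, joined with newlines.
    The separator choice is an assumption, not taken from the model's code.
    """
    lines = []
    for message in messages:
        role = message["role"]
        if role not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {role!r}")
        lines.append(f"<|{role}|> {message['content']}")
    if add_generation_prompt:
        # A trailing assistant tag signals that the model should answer next.
        lines.append("<|assistant|>")
    return "\n".join(lines)


if __name__ == "__main__":
    conversation = [
        {"role": "user", "content": "What is a state-space model?"},
    ]
    print(build_zephyr_prompt(conversation))
```

Keeping the conversation as a list of role-tagged messages and rendering it only at the last step makes it easy to append each new turn before the next generation call.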

## Model overview

`mamba-chat` is a chat language model developed by [havenhq](https://aimodels.fyi/creators/huggingFace/havenhq) that is based on a state-space model architecture rather than a transformer. It is a fine-tune of the [Mamba-2.8B](https://github.com/state-spaces/mamba) model from the paper "Mamba: Linear-Time Sequence Modeling with Selective State Spaces". Unlike transformer-based chat models, it builds on the selective state-space architecture described in that paper, which aims to make language modeling more efficient and scalable.

## Model inputs and outputs

`mamba-chat` is a text-to-text model, taking user messages as input and generating assistant responses. The model uses a specific prompt format called the "zephyr format", which structures the conversation as a sequence of user and assistant messages.

### Inputs
- User messages in the zephyr format: `<|user|> {user_message}`

### Outputs
- Assistant responses in the zephyr format: `<|assistant|> {assistant_message}`
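
Because the output carries the `<|assistant|>` tag, an application usually needs to strip the tags before showing a reply to the user. The sketch below shows one way to do that with plain string handling. The function name `extract_assistant_reply` is hypothetical, and the sketch assumes the decoded text contains the tags literally, which may not hold for every decoding setup.

```python
ASSISTANT_TAG = "<|assistant|>"
USER_TAG = "<|user|>"


def extract_assistant_reply(generated_text: str) -> str:
    """Return the text of the last assistant turn in a decoded generation.

    Hypothetical helper: assumes the role tags appear literally in the text.
    """
    # Keep only what follows the final assistant tag.
    _, tag, tail = generated_text.rpartition(ASSISTANT_TAG)
    if not tag:
        # No assistant tag found; fall back to the whole text.
        return generated_text.strip()
    # If the model kept generating and opened a new user turn, cut it off.
    reply = tail.split(USER_TAG, 1)[0]
    return reply.strip()


if __name__ == "__main__":
    decoded = "<|user|> Hi\n<|assistant|> Hello! How can I help?"
    print(extract_assistant_reply(decoded))
```

Cutting at the next `<|user|>` tag is a defensive choice: it guards against a generation that runs past the end of the assistant turn.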

## Capabilities

`mamba-chat` can hold open-ended conversations, answer questions, and generate coherent, relevant responses. Its state-space architecture aims to make language modeling more efficient and scalable than with transformer-based models, particularly for information-dense tasks.

## What can I use it for?

`mamba-chat` can be used in conversational applications such as chatbots and personal assistants, as well as for general text generation. Because its architecture aims at efficient and scalable language modeling, it may be well suited to large-scale conversational systems or high-volume text generation.

## Things to try

Send the model a variety of prompts in the zephyr format and observe the quality and coherence of its responses. You can also compare `mamba-chat` with other chat models, such as [Llama-2-7b-chat](https://aimodels.fyi/models/huggingFace/llama-2-7b-chat-meta), to better understand its strengths and limitations.