![DeepSeek Chat](https://github.com/deepseek-ai/DeepSeek-LLM/blob/main/images/logo.png?raw=true)

[\[Homepage\]](https://www.deepseek.com/) | [\[Chat with DeepSeek LLM\]](https://chat.deepseek.com/) | [\[Discord\]](https://discord.gg/Tc7c45Zzu5) | [\[WeChat\]](https://github.com/deepseek-ai/DeepSeek-LLM/blob/main/images/qr.jpeg)

* * *

### [](#1-introduction-of-deepseek-llm)1\. Introduction to DeepSeek LLM

Introducing DeepSeek LLM, an advanced language model comprising 67 billion parameters. It has been trained from scratch on a vast dataset of 2 trillion tokens in both English and Chinese. In order to foster research, we have made DeepSeek LLM 7B/67B Base and DeepSeek LLM 7B/67B Chat open source for the research community.

### [](#2-model-summary)2\. Model Summary

`deepseek-llm-67b-chat` is a 67B parameter model initialized from `deepseek-llm-67b-base` and fine-tuned on extra instruction data.

*   **Home Page:** [DeepSeek](https://deepseek.com/)
*   **Repository:** [deepseek-ai/deepseek-LLM](https://github.com/deepseek-ai/deepseek-LLM)
*   **Chat With DeepSeek LLM:** [DeepSeek-LLM](https://chat.deepseek.com/)

### [](#3-how-to-use)3\. How to Use

Here are some examples of how to use our model.

#### [](#chat-completion)Chat Completion

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
    
    model_name = "deepseek-ai/deepseek-llm-67b-chat"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")
    model.generation_config = GenerationConfig.from_pretrained(model_name)
    model.generation_config.pad_token_id = model.generation_config.eos_token_id
    
    messages = [
        {"role": "user", "content": "Who are you?"}
    ]
    input_tensor = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
    outputs = model.generate(input_tensor.to(model.device), max_new_tokens=100)
    
    result = tokenizer.decode(outputs[0][input_tensor.shape[1]:], skip_special_tokens=True)
    print(result)
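
The example above loads the weights in `bfloat16` and lets `device_map="auto"` spread them across the available devices. As a rough sizing aid, the sketch below estimates the memory needed for the weights alone. It assumes the nominal figure of 67 billion parameters and 2 bytes per parameter; the helper name `weight_memory_gib` is hypothetical, and activations and the KV cache need additional memory on top of this.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# Assumption: nominal 67e9 parameters stored as bfloat16 (2 bytes each).
print(f"{weight_memory_gib(67e9):.1f} GiB")  # prints "124.8 GiB"
```

The estimate suggests the weights are likely to need several accelerators or some form of offloading, which is the situation `device_map="auto"` is meant to handle.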
    

If you prefer not to use the provided `apply_chat_template` function, you can also interact with our model by following the sample template below. Note that `messages` should be replaced by your input.

    User: {messages[0]['content']}
    
    Assistant: {messages[1]['content']}<｜end▁of▁sentence｜>User: {messages[2]['content']}
    
    Assistant:
    

**Note:** By default (`add_special_tokens=True`), our tokenizer automatically adds a `bos_token` (`<｜begin▁of▁sentence｜>`) before the input text. Additionally, because system prompts are not compatible with this version of our models, we DO NOT RECOMMEND including one in your input.
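
The sample template above can also be rendered by hand. The following is a minimal sketch, not the tokenizer's own implementation: `build_prompt` is a hypothetical helper, and the end-of-sentence string is passed in as an argument (in practice you would likely pass `tokenizer.eos_token`). It accepts only `user` and `assistant` roles, in line with the recommendation against system prompts, and it leaves out the `bos_token` because the tokenizer adds that by default.

```python
from typing import Dict, List


def build_prompt(messages: List[Dict[str, str]], eos_token: str) -> str:
    """Render user/assistant turns following the sample template."""
    if not messages or messages[-1]["role"] != "user":
        raise ValueError("the last message must come from the user")
    parts = []
    for message in messages:
        role, content = message["role"], message["content"]
        if role == "user":
            parts.append(f"User: {content}\n\n")
        elif role == "assistant":
            # Each completed assistant turn is closed by the end-of-sentence token.
            parts.append(f"Assistant: {content}{eos_token}")
        else:
            raise ValueError(f"unsupported role: {role!r}")
    # Leave the final assistant turn open so the model continues from here.
    parts.append("Assistant:")
    return "".join(parts)


# "<EOS>" is a placeholder for illustration only.
prompt = build_prompt([{"role": "user", "content": "Who are you?"}], "<EOS>")
print(repr(prompt))  # 'User: Who are you?\n\nAssistant:'
```

The resulting string can then be tokenized with `tokenizer(prompt, return_tensors="pt")` and passed to `model.generate` in the same way as in the Chat Completion example.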

### [](#4-license)4\. License

This code repository is licensed under the MIT License. The use of DeepSeek LLM models is subject to the Model License. DeepSeek LLM supports commercial use.

See the [LICENSE-MODEL](https://github.com/deepseek-ai/deepseek-LLM/blob/main/LICENSE-MODEL) for more details.

### [](#5-contact)5\. Contact

If you have any questions, please raise an issue or contact us at [service@deepseek.com](mailto:service@deepseek.com).

## Model overview

`deepseek-llm-67b-chat` is a 67 billion parameter language model created by [DeepSeek AI](https://aimodels.fyi/creators/huggingFace/deepseek-ai). It is an advanced model trained on a vast dataset of 2 trillion tokens in both English and Chinese. The model is fine-tuned on extra instruction data compared to the `deepseek-llm-67b-base` version, making it well-suited for conversational tasks.

Related models include `deepseek-coder-6.7b-instruct` and `deepseek-coder-33b-instruct`, which are specialized for code generation and programming tasks. These models were also developed by DeepSeek AI and are reported to perform strongly on coding benchmarks.

## Model inputs and outputs

### Inputs
- **Text Prompts**: The model accepts natural language text prompts as input, which can include instructions, questions, or statements.
- **Chat History**: Earlier turns of a conversation can be included in the input, allowing the model to provide coherent and contextual responses.

### Outputs
- **Text Generations**: The primary output of the model is generated text, which can range from short responses to longer-form paragraphs or essays.

## Capabilities

The `deepseek-llm-67b-chat` model is capable of engaging in open-ended conversations, answering questions, and generating coherent text on a wide variety of topics. It has demonstrated strong performance on benchmarks evaluating language understanding, reasoning, and generation.

## What can I use it for?

The `deepseek-llm-67b-chat` model can be used for a variety of applications, such as:

- **Conversational AI Assistants**: The model can be used to power intelligent chatbots and virtual assistants that can engage in natural dialogue.
- **Content Generation**: The model can be used to generate text for articles, stories, or other creative writing tasks.
- **Question Answering**: The model can be used to answer questions on a wide range of topics, making it useful for educational or research applications.

## Things to try

One interesting aspect of the `deepseek-llm-67b-chat` model is its ability to maintain context and engage in multi-turn conversations. You can try providing the model with a series of related prompts and see how it responds, building upon the prior context. This can help showcase the model's coherence and understanding of the overall dialogue.
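
One way to try this is to keep the conversation as a growing list of messages and resend the whole list on every turn. The sketch below shows only the bookkeeping; `record_turn` and `next_request` are hypothetical helper names, and the assistant reply is a hand-written placeholder rather than real model output.

```python
from typing import Dict, List

Message = Dict[str, str]


def record_turn(history: List[Message], user_text: str, reply: str) -> List[Message]:
    """Return a new history with one completed user/assistant exchange appended."""
    return history + [
        {"role": "user", "content": user_text},
        {"role": "assistant", "content": reply},
    ]


def next_request(history: List[Message], user_text: str) -> List[Message]:
    """Return the full message list to send for the next user question."""
    return history + [{"role": "user", "content": user_text}]


history: List[Message] = []
# Placeholder reply for illustration; in practice this is the decoded model output.
history = record_turn(history, "Name a prime number.", "7 is a prime number.")
messages = next_request(history, "Is the next integer also prime?")
print([m["role"] for m in messages])  # ['user', 'assistant', 'user']
```

The `messages` list can be passed to `tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")`, so the model sees the earlier exchange when answering the follow-up question.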

Another thing to explore is the model's performance on specialized tasks, such as code generation or mathematical problem-solving. By fine-tuning or prompting the model appropriately, you may be able to unlock additional capabilities beyond open-ended conversation.