![DeepSeek Coder](https://github.com/deepseek-ai/DeepSeek-Coder/blob/main/pictures/logo.png?raw=true)

[\[Homepage\]](https://www.deepseek.com/) | [\[ Chat with DeepSeek Coder\]](https://coder.deepseek.com/) | [\[Discord\]](https://discord.gg/Tc7c45Zzu5) | [\[Wechat()\]](https://github.com/guoday/assert/blob/main/QR.png?raw=true)

* * *

### [](#1-introduction-of-deepseek-coder)1\. Introduction of Deepseek Coder

Deepseek Coder is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language in both English and Chinese. We provide various sizes of the code model, ranging from 1B to 33B versions. Each model is pre-trained on project-level code corpus by employing a window size of 16K and a extra fill-in-the-blank task, to support project-level code completion and infilling. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various benchmarks.

*   **Massive Training Data**: Trained from scratch fon 2T tokens, including 87% code and 13% linguistic data in both English and Chinese languages.
    
*   **Highly Flexible & Scalable**: Offered in model sizes of 1.3B, 5.7B, 6.7B, and 33B, enabling users to choose the setup most suitable for their requirements.
    
*   **Superior Model Performance**: State-of-the-art performance among publicly available code models on HumanEval, MultiPL-E, MBPP, DS-1000, and APPS benchmarks.
    
*   **Advanced Code Completion Capabilities**: A window size of 16K and a fill-in-the-blank task, supporting project-level code completion and infilling tasks.
    

### [](#2-model-summary)2\. Model Summary

deepseek-coder-6.7b-instruct is a 6.7B parameter model initialized from deepseek-coder-6.7b-base and fine-tuned on 2B tokens of instruction data.

*   **Home Page:** [DeepSeek](https://deepseek.com/)
*   **Repository:** [deepseek-ai/deepseek-coder](https://github.com/deepseek-ai/deepseek-coder)
*   **Chat With DeepSeek Coder:** [DeepSeek-Coder](https://coder.deepseek.com/)

### [](#3-how-to-use)3\. How to Use

Here give some examples of how to use our model.

#### [](#chat-model-inference)Chat Model Inference

    from transformers import AutoTokenizer, AutoModelForCausalLM
    tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-coder-6.7b-instruct", trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained("deepseek-ai/deepseek-coder-6.7b-instruct", trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()
    messages=[
        { 'role': 'user', 'content': "write a quick sort algorithm in python."}
    ]
    inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
    # tokenizer.eos_token_id is the id of <|EOT|> token
    outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, top_k=50, top_p=0.95, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)
    print(tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True))
    

### [](#4-license)4\. License

This code repository is licensed under the MIT License. The use of DeepSeek Coder models is subject to the Model License. DeepSeek Coder supports commercial use.

See the [LICENSE-MODEL](https://github.com/deepseek-ai/deepseek-coder/blob/main/LICENSE-MODEL) for more details.

### [](#5-contact)5\. Contact

If you have any questions, please raise an issue or contact us at [agi\_code@deepseek.com](mailto:agi_code@deepseek.com).

## Model Overview

`deepseek-coder-6.7b-instruct` is a 6.7B parameter language model developed by [DeepSeek AI](https://aimodels.fyi/creators/huggingFace/deepseek-ai) that has been fine-tuned on 2B tokens of instruction data. It is part of the DeepSeek Coder family of code models, which are composed of models ranging from 1B to 33B parameters, all trained from scratch on a massive 2T token corpus of 87% code and 13% natural language data in English and Chinese.

The DeepSeek Coder models, including the `deepseek-coder-6.7b-instruct` model, are designed to excel at coding tasks. They achieve state-of-the-art performance on benchmarks like HumanEval, MultiPL-E, MBPP, DS-1000, and APPS, thanks to their large training data and advanced architecture. The models leverage a 16K window size and a fill-in-the-blank task to support project-level code completion and infilling.

Other similar models in the DeepSeek Coder family include the [deepseek-coder-33b-instruct](https://aimodels.fyi/models/huggingFace/deepseek-coder-33b-instruct-deepseek-ai) model, which is a larger 33B parameter version, and the [Magicoder-S-DS-6.7B](https://aimodels.fyi/models/huggingFace/magicoder-s-ds-67b-ise-uiuc) model, which was fine-tuned from the `deepseek-coder-6.7b-base` model using a novel approach called OSS-Instruct to generate more diverse and realistic instruction data.

## Model Inputs and Outputs

### Inputs
- **Natural language instructions**: The model can take in natural language instructions or prompts related to coding tasks, such as "write a quick sort algorithm in python."

### Outputs
- **Generated code**: The model outputs the generated code that attempts to fulfill the provided instruction or prompt.

## Capabilities

The `deepseek-coder-6.7b-instruct` model is highly capable at a wide range of coding tasks, from writing algorithms and functions to generating entire programs. Due to its large training dataset and advanced architecture, the model is able to produce high-quality, contextual code that often performs well on benchmarks. 

For example, when prompted to "write a quick sort algorithm in python", the model can generate the following code:

```python
def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left) + middle + quicksort(right)
```

This demonstrates the model's ability to understand coding concepts and generate complete, working solutions to algorithmic problems.

## What Can I Use It For?

The `deepseek-coder-6.7b-instruct` model can be leveraged for a variety of coding-related applications and tasks, such as:

- **Code generation**: Automatically generate code snippets, functions, or even entire programs based on natural language instructions or prompts.
- **Code completion**: Use the model to intelligently complete partially written code, suggesting the most relevant and appropriate next steps.
- **Code refactoring**: Leverage the model to help refactor existing code, improving its structure, readability, and performance.
- **Prototyping and ideation**: Quickly generate code to explore and experiment with new ideas, without having to start from scratch.

Companies or developers working on tools and applications related to software development, coding, or programming could potentially use this model to enhance their offerings and improve developer productivity.

## Things to Try

Some interesting things to try with the `deepseek-coder-6.7b-instruct` model include:

- **Exploring different programming languages**: Test the model's capabilities across a variety of programming languages, not just Python, to see how it performs.
- **Prompting for complex algorithms and architectures**: Challenge the model with more advanced coding tasks, like generating entire software systems or complex data structures, to push the limits of its abilities.
- **Combining with other tools**: Integrate the model into your existing development workflows and tools, such as IDEs or code editors, to streamline and enhance the coding process.
- **Experimenting with fine-tuning**: Try fine-tuning the model on your own datasets or tasks to further customize its performance for your specific needs.

By exploring the full range of the `deepseek-coder-6.7b-instruct` model's capabilities, you can unlock new possibilities for improving and automating your coding workflows.