![DeepSeek Coder](https://github.com/deepseek-ai/DeepSeek-Coder/blob/main/pictures/logo.png?raw=true)

[\[Homepage\]](https://www.deepseek.com/) | [\[ Chat with DeepSeek Coder\]](https://coder.deepseek.com/) | [\[Discord\]](https://discord.gg/Tc7c45Zzu5) | [\[Wechat()\]](https://github.com/guoday/assert/blob/main/QR.png?raw=true)

* * *

### [](#1-introduction-of-deepseek-coder)1\. Introduction of Deepseek Coder

Deepseek Coder is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language in both English and Chinese. We provide various sizes of the code model, ranging from 1B to 33B versions. Each model is pre-trained on project-level code corpus by employing a window size of 16K and a extra fill-in-the-blank task, to support project-level code completion and infilling. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various benchmarks.

*   **Massive Training Data**: Trained from scratch on 2T tokens, including 87% code and 13% linguistic data in both English and Chinese languages.
    
*   **Highly Flexible & Scalable**: Offered in model sizes of 1.3B, 5.7B, 6.7B, and 33B, enabling users to choose the setup most suitable for their requirements.
    
*   **Superior Model Performance**: State-of-the-art performance among publicly available code models on HumanEval, MultiPL-E, MBPP, DS-1000, and APPS benchmarks.
    
*   **Advanced Code Completion Capabilities**: A window size of 16K and a fill-in-the-blank task, supporting project-level code completion and infilling tasks.
    

### [](#2-model-summary)2\. Model Summary

deepseek-coder-1.3b-instruct is a 1.3B parameter model initialized from deepseek-coder-1.3b-base and fine-tuned on 2B tokens of instruction data.

*   **Home Page:** [DeepSeek](https://deepseek.com/)
*   **Repository:** [deepseek-ai/deepseek-coder](https://github.com/deepseek-ai/deepseek-coder)
*   **Chat With DeepSeek Coder:** [DeepSeek-Coder](https://coder.deepseek.com/)

### [](#3-how-to-use)3\. How to Use

Here give some examples of how to use our model.

#### [](#chat-model-inference)Chat Model Inference

    from transformers import AutoTokenizer, AutoModelForCausalLM
    tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-coder-1.3b-instruct", trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained("deepseek-ai/deepseek-coder-1.3b-instruct", trust_remote_code=True, torch_dtype=torch.bfloat16).cuda()
    messages=[
        { 'role': 'user', 'content': "write a quick sort algorithm in python."}
    ]
    inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
    # tokenizer.eos_token_id is the id of <|EOT|> token
    outputs = model.generate(inputs, max_new_tokens=512, do_sample=False, top_k=50, top_p=0.95, num_return_sequences=1, eos_token_id=tokenizer.eos_token_id)
    print(tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True))
    

### [](#4-license)4\. License

This code repository is licensed under the MIT License. The use of DeepSeek Coder models is subject to the Model License. DeepSeek Coder supports commercial use.

See the [LICENSE-MODEL](https://github.com/deepseek-ai/deepseek-coder/blob/main/LICENSE-MODEL) for more details.

### [](#5-contact)5\. Contact

If you have any questions, please raise an issue or contact us at [agi\_code@deepseek.com](mailto:agi_code@deepseek.com).

## Model overview

The `deepseek-coder-1.3b-instruct` model is a 1.3 billion parameter language model trained by [DeepSeek AI](https://aimodels.fyi/creators/huggingFace/deepseek-ai) that is specifically designed for coding tasks. It is part of the DeepSeek Coder series, which includes models ranging from 1B to 33B parameters. The DeepSeek Coder models are trained on a massive dataset of 2 trillion tokens, with 87% of the data being code and 13% being natural language text in both English and Chinese. This allows the models to excel at a wide range of coding-related tasks.

Similar models in the DeepSeek Coder series include the [deepseek-coder-33b-instruct](https://aimodels.fyi/models/huggingFace/deepseek-coder-33b-instruct-deepseek-ai), [deepseek-coder-6.7b-instruct](https://aimodels.fyi/models/huggingFace/deepseek-coder-67b-instruct-deepseek-ai), [deepseek-coder-1.3b-base](https://aimodels.fyi/models/huggingFace/deepseek-coder-13b-base-deepseek-ai), [deepseek-coder-33b-base](https://aimodels.fyi/models/huggingFace/deepseek-coder-33b-base-deepseek-ai), and [deepseek-coder-6.7b-base](https://aimodels.fyi/models/huggingFace/deepseek-coder-67b-base-deepseek-ai). These models offer a range of sizes and capabilities to suit different needs.

## Model inputs and outputs

The `deepseek-coder-1.3b-instruct` model takes in natural language prompts and generates code outputs. The model can be used for a variety of coding-related tasks, such as code generation, code completion, and code insertion.

### Inputs
- Natural language prompts and instructions related to coding tasks

### Outputs
- Generated code in various programming languages
- Completed or inserted code snippets based on the input prompt

## Capabilities

The `deepseek-coder-1.3b-instruct` model excels at a wide range of coding-related tasks, including writing algorithms, implementing data structures, and solving coding challenges. For example, the model can generate a quick sort algorithm in Python when given the prompt "write a quick sort algorithm". It can also complete or insert code snippets into existing code, helping to streamline the programming workflow.

## What can I use it for?

The `deepseek-coder-1.3b-instruct` model can be used for a variety of applications that require coding or programming capabilities. Some potential use cases include:

- Developing prototypes or proofs of concept: The model can generate code to quickly test ideas and explore new concepts.
- Automating repetitive coding tasks: The model can assist with tasks like code formatting, refactoring, or boilerplate generation.
- Enhancing developer productivity: The model's code completion and insertion capabilities can help developers write code more efficiently.
- Educational and training purposes: The model can be used to teach programming concepts or provide feedback on coding assignments.

## Things to try

One interesting aspect of the `deepseek-coder-1.3b-instruct` model is its ability to work at the project level, thanks to its large training dataset and specialized pre-training tasks. This means the model can generate or complete code that is contextually relevant to a larger codebase, rather than just producing standalone snippets. Try providing the model with a partial code file and see how it can suggest relevant completions or insertions to extend the functionality.

Another interesting experiment would be to combine the `deepseek-coder-1.3b-instruct` model with other AI-powered tools, such as code editors or IDE plugins. This could create a powerful coding assistant that can provide intelligent, context-aware code suggestions and help streamline the development workflow.