deepseek-coder-7b-instruct-v1.5

Maintainer: deepseek-ai

Total Score: 88

Last updated: 5/28/2024


  • Model Link: View on HuggingFace
  • API Spec: View on HuggingFace
  • Github Link: No Github link provided
  • Paper Link: No paper link provided


Model overview

The deepseek-coder-7b-instruct-v1.5 is a large language model developed by DeepSeek AI, a company focused on building advanced AI systems. The model was trained on a massive 2 trillion token dataset composed of 87% code and 13% natural language in both English and Chinese. It was first pre-trained on this corpus with a next-token prediction objective, then fine-tuned on 2 billion tokens of instruction data to give it strong coding capabilities.

Compared to similar DeepSeek Coder models like the deepseek-coder-6.7b-instruct, deepseek-coder-33b-instruct, and deepseek-coder-1.3b-base, the deepseek-coder-7b-instruct-v1.5 lands in the middle of the size spectrum at 7 billion parameters. It aims to balance powerful coding capabilities with reasonable computational requirements.

Model inputs and outputs

The deepseek-coder-7b-instruct-v1.5 model is a text-to-text transformer that can generate natural language responses to prompts. Its key capabilities center around coding tasks like code completion, code generation, and code understanding.

Inputs

  • Natural language prompts describing a coding task or problem
  • Partially completed code snippets with gaps for the model to fill in

Outputs

  • Generated code to complete a given task or fill in missing code
  • Natural language responses explaining code or providing insights

Capabilities

The deepseek-coder-7b-instruct-v1.5 model excels at a variety of coding-related tasks. It can generate working code for algorithms and functions, complete partially written code, and even explain coding concepts in plain language. For example, you can prompt the model to "write a quicksort algorithm in Python" and it will generate a full implementation. Or you can give it a partially written function and ask it to "fill in the missing code".
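
For instance, prompting "write a quicksort algorithm in Python" would typically yield an implementation along these lines (a representative sketch of the kind of output to expect, not verbatim model output):

    def quicksort(arr):
        # Base case: empty or single-element lists are already sorted
        if len(arr) <= 1:
            return arr
        pivot, rest = arr[0], arr[1:]
        # Partition around the pivot, then sort each side recursively
        left = [x for x in rest if x <= pivot]
        right = [x for x in rest if x > pivot]
        return quicksort(left) + [pivot] + quicksort(right)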

Beyond just generating code, the model also demonstrates strong understanding of programming languages and concepts. You can ask it to "explain how a hash table works" or "compare the time complexity of bubble sort and quicksort", and it will provide clear and insightful explanations.

What can I use it for?

The deepseek-coder-7b-instruct-v1.5 model opens up a wide range of potential use cases for developers and data scientists. Some key applications include:

  • Automating routine coding tasks like boilerplate generation, refactoring, and bug fixing
  • Enabling more natural and conversational programming interfaces for users
  • Powering intelligent programming assistants that can explain concepts and provide coding help
  • Accelerating prototyping and ideation by generating starting points for new projects

The model's broad capabilities also make it useful beyond just coding, such as for technical writing, documentation generation, and even creative ideation for software products.
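
To try these use cases hands-on, the model can be loaded with the Hugging Face transformers library. Below is a minimal sketch; the chat-template call follows the standard transformers pattern, and the exact recommended prompt format should be verified against the model's HuggingFace page:

    # Minimal sketch: loading deepseek-ai/deepseek-coder-7b-instruct-v1.5 locally.
    # Assumes a GPU with enough memory for a 7B model in bfloat16 (~14 GB).
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "deepseek-ai/deepseek-coder-7b-instruct-v1.5"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )

    messages = [{"role": "user", "content": "Write a quicksort algorithm in Python."}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    # Greedy decoding keeps the output deterministic for quick experiments
    outputs = model.generate(inputs, max_new_tokens=512, do_sample=False)
    print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))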

Things to try

One interesting aspect of the deepseek-coder-7b-instruct-v1.5 model is its ability to work at both the granular code level and the broader project/repository level. You can prompt it with just a few lines of code and have it complete or explain that specific snippet. But you can also give it a larger codebase context, like the sample project files provided, and have it generate relevant new code or provide overall insights.

This multi-scale capability allows for some unique experiments, like prompting the model with a partially written function and asking it to not just fill in the missing pieces, but to also suggest improvements or alternative implementations. Or you could have it analyze an entire project and propose higher-level refactorings or design changes.
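
One way to set up that experiment is an instruction that wraps a partial function (a hypothetical prompt, shown purely for illustration):

    Complete the function below, then suggest one alternative implementation
    and briefly compare the trade-offs of the two approaches:

    def merge_intervals(intervals):
        """Merge overlapping [start, end] intervals into a minimal list."""
        # TODO: implement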

The model's strong performance on benchmarks like HumanEval, MultiPL-E, and APPS also makes it an intriguing subject for further testing and exploration by the developer community.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Models


deepseek-coder-1.3b-instruct

Maintainer: deepseek-ai

Total Score: 83

The deepseek-coder-1.3b-instruct model is a 1.3 billion parameter language model trained by DeepSeek AI that is specifically designed for coding tasks. It is part of the DeepSeek Coder series, which includes models ranging from 1B to 33B parameters. The DeepSeek Coder models are trained on a massive dataset of 2 trillion tokens, with 87% of the data being code and 13% being natural language text in both English and Chinese. This allows the models to excel at a wide range of coding-related tasks.

Similar models in the DeepSeek Coder series include the deepseek-coder-33b-instruct, deepseek-coder-6.7b-instruct, deepseek-coder-1.3b-base, deepseek-coder-33b-base, and deepseek-coder-6.7b-base. These models offer a range of sizes and capabilities to suit different needs.

Model inputs and outputs

The deepseek-coder-1.3b-instruct model takes in natural language prompts and generates code outputs. The model can be used for a variety of coding-related tasks, such as code generation, code completion, and code insertion.

Inputs

  • Natural language prompts and instructions related to coding tasks

Outputs

  • Generated code in various programming languages
  • Completed or inserted code snippets based on the input prompt

Capabilities

The deepseek-coder-1.3b-instruct model excels at a wide range of coding-related tasks, including writing algorithms, implementing data structures, and solving coding challenges. For example, the model can generate a quick sort algorithm in Python when given the prompt "write a quick sort algorithm". It can also complete or insert code snippets into existing code, helping to streamline the programming workflow.

What can I use it for?

The deepseek-coder-1.3b-instruct model can be used for a variety of applications that require coding or programming capabilities. Some potential use cases include:

  • Developing prototypes or proofs of concept: the model can generate code to quickly test ideas and explore new concepts
  • Automating repetitive coding tasks: the model can assist with tasks like code formatting, refactoring, or boilerplate generation
  • Enhancing developer productivity: the model's code completion and insertion capabilities can help developers write code more efficiently
  • Educational and training purposes: the model can be used to teach programming concepts or provide feedback on coding assignments

Things to try

One interesting aspect of the deepseek-coder-1.3b-instruct model is its ability to work at the project level, thanks to its large training dataset and specialized pre-training tasks. This means the model can generate or complete code that is contextually relevant to a larger codebase, rather than just producing standalone snippets. Try providing the model with a partial code file and see how it can suggest relevant completions or insertions to extend the functionality.

Another interesting experiment would be to combine the deepseek-coder-1.3b-instruct model with other AI-powered tools, such as code editors or IDE plugins. This could create a powerful coding assistant that can provide intelligent, context-aware code suggestions and help streamline the development workflow.
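
Because of its small size, the 1.3B variant is easy to test locally. Here is a minimal sketch using the transformers pipeline API (the plain-string prompt is a simplification; the chat template on the model card is the recommended way to format instructions):

    # Minimal sketch: the 1.3B model is small enough for a consumer GPU or CPU.
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="deepseek-ai/deepseek-coder-1.3b-instruct",
    )
    result = generator(
        "Write a Python function that checks whether a string is a palindrome.",
        max_new_tokens=256,
    )
    print(result[0]["generated_text"])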



deepseek-coder-6.7b-instruct

Maintainer: deepseek-ai

Total Score: 306

deepseek-coder-6.7b-instruct is a 6.7B parameter language model developed by DeepSeek AI that has been fine-tuned on 2B tokens of instruction data. It is part of the DeepSeek Coder family of code models, which range from 1B to 33B parameters, all trained from scratch on a massive 2T token corpus of 87% code and 13% natural language data in English and Chinese.

The DeepSeek Coder models, including deepseek-coder-6.7b-instruct, are designed to excel at coding tasks. They achieve state-of-the-art performance on benchmarks like HumanEval, MultiPL-E, MBPP, DS-1000, and APPS, thanks to their large training data and advanced architecture. The models leverage a 16K window size and a fill-in-the-blank task to support project-level code completion and infilling.

Other similar models in the DeepSeek Coder family include the deepseek-coder-33b-instruct model, a larger 33B parameter version, and the Magicoder-S-DS-6.7B model, which was fine-tuned from the deepseek-coder-6.7b-base model using a novel approach called OSS-Instruct to generate more diverse and realistic instruction data.

Model inputs and outputs

Inputs

  • Natural language instructions: the model can take in natural language instructions or prompts related to coding tasks, such as "write a quick sort algorithm in python"

Outputs

  • Generated code: the model outputs generated code that attempts to fulfill the provided instruction or prompt

Capabilities

The deepseek-coder-6.7b-instruct model is highly capable at a wide range of coding tasks, from writing algorithms and functions to generating entire programs. Due to its large training dataset and advanced architecture, the model is able to produce high-quality, contextual code that often performs well on benchmarks. For example, when prompted to "write a quick sort algorithm in python", the model can generate the following code:

    def quicksort(arr):
        if len(arr) <= 1:
            return arr
        pivot = arr[len(arr) // 2]
        left = [x for x in arr if x < pivot]
        middle = [x for x in arr if x == pivot]
        right = [x for x in arr if x > pivot]
        return quicksort(left) + middle + quicksort(right)

This demonstrates the model's ability to understand coding concepts and generate complete, working solutions to algorithmic problems.

What can I use it for?

The deepseek-coder-6.7b-instruct model can be leveraged for a variety of coding-related applications and tasks, such as:

  • Code generation: automatically generate code snippets, functions, or even entire programs based on natural language instructions or prompts
  • Code completion: use the model to intelligently complete partially written code, suggesting the most relevant and appropriate next steps
  • Code refactoring: leverage the model to help refactor existing code, improving its structure, readability, and performance
  • Prototyping and ideation: quickly generate code to explore and experiment with new ideas, without having to start from scratch

Companies or developers working on tools and applications related to software development, coding, or programming could potentially use this model to enhance their offerings and improve developer productivity.

Things to try

Some interesting things to try with the deepseek-coder-6.7b-instruct model include:

  • Exploring different programming languages: test the model's capabilities across a variety of programming languages, not just Python, to see how it performs
  • Prompting for complex algorithms and architectures: challenge the model with more advanced coding tasks, like generating entire software systems or complex data structures, to push the limits of its abilities
  • Combining with other tools: integrate the model into your existing development workflows and tools, such as IDEs or code editors, to streamline and enhance the coding process
  • Experimenting with fine-tuning: try fine-tuning the model on your own datasets or tasks to further customize its performance for your specific needs

By exploring the full range of the deepseek-coder-6.7b-instruct model's capabilities, you can unlock new possibilities for improving and automating your coding workflows.
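
To exercise the infilling capability specifically, the DeepSeek Coder repository documents a fill-in-the-middle prompt format for the base models, roughly as sketched below (the special-token spelling should be verified against that repository; the instruct variants are normally driven through the chat template instead):

    <｜fim▁begin｜>def remove_non_ascii(text: str) -> str:
        """Remove all non-ASCII characters from the input string."""
    <｜fim▁hole｜>
        return result<｜fim▁end｜>

Given such a prompt, the model generates only the code that belongs in the hole, conditioned on both the prefix and the suffix.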



deepseek-coder-33b-instruct

Maintainer: deepseek-ai

Total Score: 403

deepseek-coder-33b-instruct is a 33B parameter AI model developed by DeepSeek AI that is specialized for coding tasks. The model is part of a series of code language models, each trained from scratch on 2T tokens with a composition of 87% code and 13% natural language in both English and Chinese. DeepSeek Coder offers various model sizes ranging from 1B to 33B parameters, enabling users to choose the setup best suited for their needs. The 33B version has been fine-tuned on 2B tokens of instruction data to enhance its coding capabilities.

Similar models include StarCoder2-15B, a 15B parameter model trained on 600+ programming languages, and StarCoder, a 15.5B parameter model trained on 80+ programming languages.

Model inputs and outputs

Inputs

  • Free-form natural language instructions for coding tasks

Outputs

  • Relevant code snippets or completions in response to the input instructions

Capabilities

deepseek-coder-33b-instruct has demonstrated state-of-the-art performance on a range of coding benchmarks, including HumanEval, MultiPL-E, MBPP, DS-1000, and APPS. The model's advanced code completion capabilities are enabled by a large 16K context window and a fill-in-the-blank training task, allowing it to handle project-level coding tasks.

What can I use it for?

deepseek-coder-33b-instruct can be used for a variety of coding-related tasks, such as:

  • Generating code snippets or completing partially written code based on natural language instructions
  • Assisting with refactoring, debugging, or improving existing code
  • Aiding in the development of new software applications by providing helpful code suggestions and insights

The flexibility of the model's different size versions allows users to choose the most suitable setup for their specific needs and resources.

Things to try

One interesting aspect of deepseek-coder-33b-instruct is its ability to handle both English and Chinese inputs, making it a versatile tool for developers working in multilingual environments. You could try providing the model with instructions or prompts in both languages and observe how it responds.

Another interesting avenue to explore is the model's performance on more complex, multi-step coding tasks. By carefully crafting prompts that require the model to write, test, and refine code, you can push the boundaries of its capabilities and gain deeper insights into its strengths and limitations.
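
A staged prompt along these lines (a hypothetical example) is one way to set up such a multi-step task:

    1. Write a Python function that parses ISO-8601 timestamps without third-party libraries.
    2. Write unit tests for it, covering leap years and timezone offsets.
    3. Walk through your tests and fix any bugs you find in the function.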



DeepSeek-Coder-V2-Instruct

Maintainer: deepseek-ai

Total Score: 149

DeepSeek-Coder-V2 is an open-source Mixture-of-Experts (MoE) code language model that builds upon the capabilities of the earlier DeepSeek-V2 model. Compared to its predecessor, DeepSeek-Coder-V2 demonstrates significant advancements in various aspects of code-related tasks, as well as reasoning and general capabilities. The model was further pre-trained from an intermediate checkpoint of DeepSeek-V2 with an additional 6 trillion tokens, enhancing its coding and mathematical reasoning abilities while maintaining comparable performance in general language tasks.

One key distinction is that DeepSeek-Coder-V2 expands its support for programming languages from 86 to 338, and extends the context length from 16K to 128K, making it a more flexible and powerful code intelligence tool. The model's impressive performance on benchmarks like HumanEval, MultiPL-E, MBPP, DS-1000, and APPS further underscores its capabilities compared to other open-source code models, as highlighted in the paper.

Model inputs and outputs

DeepSeek-Coder-V2 is a text-to-text model that can handle a wide range of code-related tasks, from code generation and completion to code understanding and reasoning. The model takes in natural language prompts or partial code snippets as input and generates relevant code or text outputs.

Inputs

  • Natural language prompts describing a coding task or problem
  • Incomplete or partial code snippets that the model can complete or expand upon

Outputs

  • Generated code in a variety of programming languages
  • Explanations or insights about the provided code
  • Solutions to coding problems or challenges

Capabilities

DeepSeek-Coder-V2 demonstrates impressive capabilities in a variety of code-related tasks, including but not limited to:

  • Code generation: the model can generate complete, functioning code in response to natural language prompts, such as "Write a quicksort algorithm in Python."
  • Code completion: DeepSeek-Coder-V2 can intelligently complete partially provided code, filling in the missing parts based on the context
  • Code understanding: the model can analyze and explain existing code, providing insights into its logic, structure, and potential improvements
  • Mathematical reasoning: in addition to coding skills, DeepSeek-Coder-V2 also exhibits strong mathematical reasoning capabilities, making it a valuable tool for solving algorithmic problems

What can I use it for?

With its robust coding and reasoning abilities, DeepSeek-Coder-V2 can be a valuable asset for a wide range of applications and use cases, including:

  • Automated code generation: developers can leverage the model to generate boilerplate code, implement common algorithms, or even create complete applications based on high-level requirements
  • Code assistance and productivity tools: DeepSeek-Coder-V2 can be integrated into IDEs or code editors to provide intelligent code completion, refactoring suggestions, and explanations, boosting developer productivity
  • Educational and training applications: the model can be used to create interactive coding exercises, tutorials, and learning resources for students and aspiring developers
  • AI-powered programming assistants: DeepSeek-Coder-V2 can be the foundation for building advanced programming assistants that can engage in natural language dialogue, understand user intent, and provide comprehensive code-related support

Things to try

One interesting aspect of DeepSeek-Coder-V2 is its ability to handle large-scale, project-level code contexts, thanks to its extended 128K context length. This makes the model well-suited for tasks like repository-level code completion, where it can intelligently predict and generate code based on the overall structure and context of a codebase.

Another intriguing use case is exploring the model's mathematical reasoning capabilities beyond just coding tasks. Developers can experiment with prompts that combine natural language and symbolic mathematical expressions, and observe how DeepSeek-Coder-V2 responds in terms of problem-solving, derivations, and explanations.

Overall, the versatility and advanced capabilities of DeepSeek-Coder-V2 make it a compelling open-source resource for a wide range of code-related applications and research endeavors.
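
A prompt that mixes symbolic math with code (a hypothetical example) is a quick way to probe both capabilities at once:

    Derive the closed-form growth rate of the recurrence T(n) = 2T(n/2) + n,
    then write a Python function that checks the result numerically for n up to 2**20.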
