CodeShell-7B

Maintainer: WisdomShell

Total Score: 80

Last updated: 5/17/2024


Model Link: View on HuggingFace
API Spec: View on HuggingFace
Github Link: No Github link provided
Paper Link: No paper link provided


Model overview

CodeShell-7B is a multi-language code LLM developed by the Knowledge Computing Lab of Peking University. The model has 7 billion parameters and was trained on 500 billion tokens with a context window length of 8194. On the authoritative code evaluation benchmarks HumanEval and MBPP, CodeShell-7B achieves the best performance among models of its scale.

Compared to similar models like replit-code-v1-3b, CodeShell-7B is a larger model (7B vs. 2.7B parameters), although it was trained on slightly fewer tokens (500B vs. 525B). It also provides a more comprehensive ecosystem, with open-source IDE plugins, local C++ deployment, and a multi-task evaluation system.

Model inputs and outputs

CodeShell-7B is a text-to-text model designed for code generation. The model takes in text prompts and outputs generated code.

Inputs

  • Text prompts describing a coding task or providing context for the desired output

Outputs

  • Generated code in a variety of programming languages including C++, Python, JavaScript, and more
  • The generated code is intended to be a solution to the given prompt or to continue the provided context
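For readers who want to see what this looks like in practice, here is a minimal sketch of a prompt-to-code call using the Hugging Face transformers library. The repository ID WisdomShell/CodeShell-7B and the trust_remote_code flag are assumptions based on the model's Hugging Face listing; check the model card for the exact loading instructions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository ID and trust_remote_code flag are assumptions -- check the model
# card for the exact loading instructions.
model_id = "WisdomShell/CodeShell-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps the 7B model on a single GPU
    device_map="auto",
    trust_remote_code=True,
)

# A plain-text prompt describing the coding task.
prompt = "# Write a Python function that checks whether a string is a palindrome\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Greedy decoding (do_sample=False) keeps the completion deterministic; sampling parameters can be tuned for more varied output.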

Capabilities

CodeShell-7B demonstrates impressive code generation abilities, outperforming other models of its size on benchmarks like HumanEval and MBPP. It can generate functioning code across many languages to solve a wide range of programming problems.

What can I use it for?

The CodeShell-7B model can be used for a variety of software development tasks, such as:

  • Generating code snippets or entire functions based on natural language descriptions
  • Assisting with coding by providing helpful completions and suggestions
  • Automating repetitive coding tasks
  • Prototyping new ideas and quickly generating working code
  • Enhancing developer productivity by offloading mundane coding work

The model's strong performance and comprehensive ecosystem make it a powerful tool for both individual developers and teams working on software projects.

Things to try

One interesting aspect of CodeShell-7B is its ability to generate code in multiple programming languages. You could experiment with prompting the model to translate a code snippet from one language to another, or to generate implementations of the same algorithm in different languages.
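As a concrete starting point, the sketch below frames such a translation as a plain commented prompt. The prompt wording is purely illustrative rather than an official CodeShell template, and the repository ID is again an assumption taken from the model's Hugging Face listing.

```python
from transformers import pipeline

# Assumed repository ID; trust_remote_code covers the model's custom architecture code.
generator = pipeline(
    "text-generation",
    model="WisdomShell/CodeShell-7B",
    trust_remote_code=True,
    device_map="auto",
)

# Illustrative prompt (not an official template) asking for a translation.
prompt = (
    "# Translate the following Python function to JavaScript\n"
    "# Python:\n"
    "def add(a, b):\n"
    "    return a + b\n"
    "# JavaScript:\n"
)
result = generator(prompt, max_new_tokens=64, do_sample=False)
print(result[0]["generated_text"])
```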

Another compelling use case is to provide the model with high-level requirements or user stories and have it generate the corresponding working code. This could be a great way to rapidly prototype new features or explore different design approaches.

Overall, the robust capabilities and flexible deployment options of CodeShell-7B make it a valuable tool for advancing your software development workflows and boosting productivity.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Models


CodeLlama-7b-hf

Maintainer: codellama

Total Score: 296

The CodeLlama-7b-hf is a 7 billion parameter generative text model developed by codellama and released through the Hugging Face Transformers library. It is part of the broader Code Llama collection of language models ranging in size from 7 billion to 70 billion parameters. The base CodeLlama-7b-hf model is designed for general code synthesis and understanding tasks. It is available alongside specialized variants like the CodeLlama-7b-Python-hf for Python-focused applications, and the CodeLlama-7b-Instruct-hf for safer, more controlled use cases.

Model inputs and outputs

The CodeLlama-7b-hf is an auto-regressive language model that takes in text as input and generates new text as output. It can be used for a variety of natural language processing tasks beyond just code generation.

Inputs

  • Text: The model accepts arbitrary text as input, which it then uses to generate additional text.

Outputs

  • Text: The model outputs new text, which can be used for tasks like code completion, text infilling, and language modeling.

Capabilities

The CodeLlama-7b-hf model is capable of a range of text generation and understanding tasks. It excels at code completion, where it can generate relevant code snippets to extend a given codebase. The model can also be used for code infilling, generating text to fill in gaps within existing code. Additionally, it has strong language understanding capabilities, allowing it to follow instructions and engage in open-ended dialogue.

What can I use it for?

The CodeLlama-7b-hf model is well-suited for a variety of software development and programming-related applications. Developers can use it to build intelligent code assistants that provide real-time code completion and generation. Data scientists and machine learning engineers could leverage the model's capabilities to automate the generation of boilerplate code or experiment with novel model architectures. Researchers in natural language processing may find the model useful for benchmarking and advancing the state of the art in areas like program synthesis and code understanding.

Things to try

One interesting aspect of the CodeLlama-7b-hf model is its ability to handle long-range dependencies in code. Try providing it with a partially completed function or class definition and observe how it can generate coherent and relevant code to fill in the missing parts. You can also experiment with prompting the model to explain or refactor existing code snippets, as its language understanding capabilities may allow it to provide insightful commentary and suggestions.
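For the infilling use case mentioned above, the sketch below follows the fill-in-the-middle pattern documented for Code Llama in the transformers library, where a <FILL_ME> marker splits the prompt into a prefix and suffix; verify the details against your installed transformers version.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Fill-in-the-middle sketch following the pattern documented for Code Llama in
# transformers; verify the <FILL_ME> handling against your installed version.
model_id = "codellama/CodeLlama-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# <FILL_ME> marks the gap the model should fill between the prefix and suffix.
prompt = 'def remove_non_ascii(s: str) -> str:\n    """ <FILL_ME>\n    return result'
input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"].to(model.device)
generated = model.generate(input_ids, max_new_tokens=128)

# Decode only the newly generated tokens and splice them into the gap.
filling = tokenizer.batch_decode(
    generated[:, input_ids.shape[1]:], skip_special_tokens=True
)[0]
print(prompt.replace("<FILL_ME>", filling))
```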



codegeex2-6b

Maintainer: THUDM

Total Score: 247

codegeex2-6b is the second-generation model of the multilingual code generation model CodeGeeX (KDD23). It is implemented on the ChatGLM2 architecture and trained on more code data. Thanks to the advantages of ChatGLM2, codegeex2-6b has been comprehensively improved in coding capability, surpassing larger models like StarCoder-15B on some tasks. Compared to the previous version, it has significantly better performance on the HumanEval-X benchmark: a 57% improvement in Python, 71% in C++, 54% in Java, 83% in JavaScript, 56% in Go, and 321% in Rust.

Model Inputs and Outputs

Inputs

  • Text: The model takes text input, which could be natural language prompts or code.

Outputs

  • Text: The model generates text, which could be code, natural language responses, or a combination of both.

Capabilities

codegeex2-6b is a highly capable multilingual code generation model that can handle a wide range of programming languages. It can assist with tasks such as code generation, code translation, code completion, and code explanation. The model's strong performance on the HumanEval-X benchmark demonstrates its ability to generate high-quality, idiomatic code across multiple languages.

What Can I Use It For?

codegeex2-6b can be leveraged for a variety of applications, including:

  • Automated Code Generation: The model can be used to generate code snippets or entire programs based on natural language descriptions or requirements.
  • Code Translation: The model can translate code from one programming language to another, making it easier to work with codebases in multiple languages.
  • Code Completion: The model can suggest relevant code completions as users type, improving developer productivity.
  • Code Explanation: The model can provide explanations or comments for existing code, helping with code understanding and maintenance.

Things to Try

One interesting thing to try with codegeex2-6b is to experiment with different prompting techniques. For example, you could try providing the model with a high-level description of a programming task and see how it generates the corresponding code. You could also try giving the model a partially completed code snippet and ask it to finish the implementation. By exploring the model's capabilities through diverse prompts, you can gain a better understanding of its strengths and limitations.
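As a rough sketch of how a generation call might look, the snippet below follows the loading pattern published for CodeGeeX2 (a ChatGLM2-style checkpoint loaded with trust_remote_code) and the "# language:" prompt tag its documentation describes; treat the exact arguments as assumptions to verify against the current model card.

```python
from transformers import AutoModel, AutoTokenizer

# Loading pattern follows published CodeGeeX2 usage (custom ChatGLM2-style code,
# hence trust_remote_code=True); verify against the current model card.
model_id = "THUDM/codegeex2-6b"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True).half().cuda()
model = model.eval()

# CodeGeeX2 prompts conventionally start with a "# language: <name>" tag.
prompt = "# language: Python\n# write a bubble sort function\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, top_k=1)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```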



glaive-coder-7b

Maintainer: glaiveai

Total Score: 53

The glaive-coder-7b is a 7 billion parameter code model developed by glaiveai that has been trained on a dataset of ~140k programming-related problems and solutions. It is a fine-tuned version of the CodeLlama-7b model, giving it enhanced capabilities for code-related tasks. The glaive-coder-7b model is similar to other code-focused models like glaive-function-calling-v1 and CodeShell-7B, which also aim to provide powerful code generation and assistance capabilities. However, glaive-coder-7b has been specifically trained on a larger dataset of programming problems, potentially giving it an advantage for certain coding-related tasks.

Model inputs and outputs

Inputs

  • Prompts: The model accepts prompts in a specific format, where the instruction is wrapped in [INST] tags and the user message is provided afterwards.

Outputs

  • Code and text responses: The model generates code and text responses based on the provided prompt, with the model's output wrapped in `` tags.

Capabilities

The glaive-coder-7b model is capable of both single-instruction following and multi-turn conversations related to coding tasks. It has been trained to serve as a code assistant, helping with a variety of programming-related activities such as code generation, debugging, and task completion.

What can I use it for?

The glaive-coder-7b model can be a valuable tool for developers and programmers, providing assistance with a wide range of coding-related tasks. Some potential use cases include:

  • Generating code snippets and solutions for programming challenges
  • Helping with code refactoring and optimization
  • Assisting with debugging and troubleshooting
  • Providing explanations and guidance for programming concepts

The associated Code Models Arena initiative also aims to gather user feedback and preferences to help improve the performance and usefulness of code-focused AI models like glaive-coder-7b.

Things to try

One interesting aspect of the glaive-coder-7b model is its ability to engage in multi-turn conversations, allowing users to iteratively refine and build upon their coding-related tasks. This could be particularly useful for complex programming problems that require a more interactive and collaborative approach.

Additionally, the model's strong performance on benchmarks like HumanEval and MBPP suggests that it may be a valuable tool for tasks like algorithmic problem-solving and code generation. Developers could explore using the glaive-coder-7b model to generate initial code solutions and then refine them further.

Overall, the glaive-coder-7b model appears to be a capable and versatile tool for programmers and developers, with the potential to streamline various coding-related workflows and tasks.
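A minimal sketch of a single-instruction call is shown below. The repository ID glaiveai/glaive-coder-7b and the exact [INST] ... [/INST] template are assumptions drawn from the description above; confirm the precise prompt format on the model card before relying on it.

```python
from transformers import pipeline

# The repository ID and the exact [INST] template are assumptions drawn from the
# description above -- confirm the prompt format on the model card.
generator = pipeline(
    "text-generation",
    model="glaiveai/glaive-coder-7b",
    device_map="auto",
)

prompt = "[INST] Write a Python function that reverses a linked list. [/INST]"
output = generator(prompt, max_new_tokens=256, do_sample=False)
print(output[0]["generated_text"])
```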



replit-code-v1-3b

Maintainer: replit

Total Score: 715

replit-code-v1-3b is a 2.7B causal language model developed by Replit that is focused on code completion. It has been trained on a diverse dataset of 20 programming languages, including Markdown, Java, JavaScript, Python, and more, totaling 525B tokens. Compared to similar models like StarCoder and rebel-large, replit-code-v1-3b is tailored specifically for code generation tasks.

Model inputs and outputs

replit-code-v1-3b takes text input and generates text output, with a focus on producing code snippets. The model utilizes advanced techniques like Flash Attention and ALiBi positional embeddings to enable efficient training and inference on long input sequences.

Inputs

  • Text prompts, which can include a mix of natural language and code

Outputs

  • Autoregressive text generation, with a focus on producing valid and relevant code snippets
  • The model can generate multi-line code outputs

Capabilities

replit-code-v1-3b excels at code completion tasks, where it can generate relevant and functional code to extend or complete a given programming snippet. It has been trained on a diverse set of languages, allowing it to handle a wide range of coding tasks.

What can I use it for?

The replit-code-v1-3b model is well-suited for applications that involve code generation or assistance, such as:

  • Integrated development environment (IDE) plugins that provide intelligent code completion
  • Automated code generation tools for rapid prototyping or boilerplate creation
  • Educational or learning platforms that help users learn to code by providing helpful suggestions

Things to try

One interesting thing to try with replit-code-v1-3b is to provide it with a partial code snippet and see how it can complete or extend the code. You could also experiment with providing the model with a natural language description of a programming task and see if it can generate the corresponding code.
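The sketch below shows a hedged version of such a completion call, following the loading pattern described on the replit-code-v1-3b model card (the custom architecture requires trust_remote_code=True); the sampling settings are illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Loading pattern follows the replit-code-v1-3b model card (custom architecture,
# hence trust_remote_code=True); sampling settings below are illustrative.
model_id = "replit/replit-code-v1-3b"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Give the model the start of a function and let it complete the body.
prompt = "def fibonacci(n):\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    top_p=0.95,
    temperature=0.2,
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```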
