WaveCoder: Widespread And Versatile Enhanced Code LLM
========================================================

[**\[Paper\]**](https://arxiv.org/abs/2312.14187)  [**\[GitHub\]**](https://github.com/microsoft/WaveCoder)  
[**\[Twitter\]**](https://twitter.com/TeamCodeLLM_AI)  [**\[Reddit\]**](https://www.reddit.com/r/LocalLLaMA/comments/19a1scy/wavecoderultra67b_claims_to_be_the_2nd_best_model/)  [\[Unofficial Blog\]](https://www.analyticsvidhya.com/blog/2024/01/microsofts-wavecoder-and-codeocean-revolutionize-instruction-tuning/)

Repo for "[WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation](https://arxiv.org/abs/2312.14187)"

News
-------------------

*   \[2024/04/10\] WaveCoder repo and models released on [HuggingFace](https://huggingface.co/microsoft/wavecoder-ultra-6.7b)!
*   \[2023/12/26\] WaveCoder paper released.

Introduction
-----------------------------------

WaveCoder is a series of large language models (LLMs) for the coding domain, designed to solve code-related problems through instruction tuning. Its training dataset was generated from a subset of the CodeSearchNet data using the LLM-based generator-discriminator framework that we proposed, and covers four general code-related tasks: code generation, code summarization, code translation, and code repair.

| Model | HumanEval | MBPP (500) | HumanEval Fix (Avg.) | HumanEval Explain (Avg.) |
| --- | --- | --- | --- | --- |
| GPT-4 | 85.4 | - | 47.8 | 52.1 |
| [WaveCoder-DS-6.7B](https://huggingface.co/microsoft/wavecoder-ds-6.7b) | 65.8 | 63.0 | 49.5 | 40.8 |
| [WaveCoder-Pro-6.7B](https://huggingface.co/microsoft/wavecoder-pro-6.7b) | 74.4 | 63.4 | 52.1 | 43.0 |
| [WaveCoder-Ultra-6.7B](https://huggingface.co/microsoft/wavecoder-ultra-6.7b) | 79.9 | 64.6 | 52.3 | 45.7 |

Evaluation
-------------------------------

Please refer to WaveCoder's [GitHub repo](https://github.com/microsoft/WaveCoder) for inference, evaluation, and training code.

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM
    tokenizer = AutoTokenizer.from_pretrained("microsoft/wavecoder-ultra-6.7b")
    model = AutoModelForCausalLM.from_pretrained("microsoft/wavecoder-ultra-6.7b")
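
The loading snippet above stops before generation. The following is a minimal sketch of one way to run a single instruction through the model. The Alpaca-style prompt template, the `build_prompt` and `generate` helper names, and the decoding settings are assumptions made for illustration; check the GitHub repo for the exact prompt format used in the official inference and evaluation scripts.

```python
# ASSUMPTION: an Alpaca-style template. The model card does not state the
# prompt format, so verify it against the GitHub repo before relying on it.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)


def build_prompt(instruction: str) -> str:
    """Wrap a natural language instruction in the assumed template."""
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


def generate(instruction: str,
             model_id: str = "microsoft/wavecoder-ultra-6.7b",
             max_new_tokens: int = 256) -> str:
    """Hypothetical helper: load the model and greedily decode one response."""
    # Imported lazily so build_prompt stays usable without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=dtype).to(device)

    inputs = tokenizer(build_prompt(instruction), return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)

    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("Write a Python function that reverses a string.")` downloads the weights on first use, so expect it to need several gigabytes of disk and memory.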
    

License
-------------------------

This code repository is licensed under the MIT License. Use of the DeepSeek Coder models is subject to their [License](https://github.com/deepseek-ai/DeepSeek-Coder/blob/main/LICENSE-MODEL).

Citation
---------------------------

If you find this repository helpful, please consider citing our paper:

    @article{yu2023wavecoder,
      title={Wavecoder: Widespread and versatile enhanced instruction tuning with refined data generation},
      author={Yu, Zhaojian and Zhang, Xin and Shang, Ning and Huang, Yangyu and Xu, Can and Zhao, Yishujie and Hu, Wenxiang and Yin, Qiufeng},
      journal={arXiv preprint arXiv:2312.14187},
      year={2023}
    }
    

Note
-------------

WaveCoder models are trained on the synthetic data generated by OpenAI models. Please pay attention to OpenAI's [terms of use](https://openai.com/policies/terms-of-use) when using the models and the datasets.

## Model Overview

`wavecoder-ultra-6.7b` is a large language model (LLM) developed by [Microsoft](https://aimodels.fyi/creators/huggingFace/microsoft) for the coding domain. It is part of the WaveCoder series of models, which are designed to solve code-related problems through instruction tuning. The model was trained on a dataset generated from a subset of the CodeSearchNet data using an LLM-based generator-discriminator framework, covering four general code-related tasks: code generation, code summarization, code translation, and code repair.

In the results reported above, `wavecoder-ultra-6.7b` scores 79.9 on HumanEval, 64.6 on MBPP (500), 52.3 on HumanEval Fix, and 45.7 on HumanEval Explain. It outperforms GPT-4 on HumanEval Fix (52.3 vs. 47.8) but trails it on HumanEval and HumanEval Explain. It is the highest-scoring of the three 6.7B variants in the WaveCoder series; `wavecoder-ds-6.7b` and `wavecoder-pro-6.7b` are also available.
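
To see exactly where the WaveCoder models sit relative to GPT-4, the reported scores can be compared benchmark by benchmark. The short sketch below copies the numbers from the results table in the Introduction; the `SCORES` dictionary and the `margins` helper are illustrative names, not part of any WaveCoder tooling.

```python
# Scores copied from the results table; None marks a value that was not reported.
SCORES = {
    "GPT-4": {"HumanEval": 85.4, "MBPP (500)": None,
              "HumanEval Fix": 47.8, "HumanEval Explain": 52.1},
    "WaveCoder-DS-6.7B": {"HumanEval": 65.8, "MBPP (500)": 63.0,
                          "HumanEval Fix": 49.5, "HumanEval Explain": 40.8},
    "WaveCoder-Pro-6.7B": {"HumanEval": 74.4, "MBPP (500)": 63.4,
                           "HumanEval Fix": 52.1, "HumanEval Explain": 43.0},
    "WaveCoder-Ultra-6.7B": {"HumanEval": 79.9, "MBPP (500)": 64.6,
                             "HumanEval Fix": 52.3, "HumanEval Explain": 45.7},
}


def margins(model: str, baseline: str = "GPT-4") -> dict:
    """Return score(model) - score(baseline) for every benchmark both report."""
    result = {}
    for benchmark, score in SCORES[model].items():
        base = SCORES[baseline][benchmark]
        if score is not None and base is not None:
            result[benchmark] = round(score - base, 1)
    return result


print(margins("WaveCoder-Ultra-6.7B"))
# {'HumanEval': -5.5, 'HumanEval Fix': 4.5, 'HumanEval Explain': -6.4}
```

A positive margin means the WaveCoder model scores higher than the baseline; MBPP is skipped because no GPT-4 score is listed for it.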

Similar models such as [deepseek-coder-6.7b-instruct](https://aimodels.fyi/models/huggingFace/deepseek-coder-67b-instruct-deepseek-ai), [Magicoder-S-DS-6.7B](https://aimodels.fyi/models/huggingFace/magicoder-s-ds-67b-ise-uiuc), and [deepseek-coder-6.7b-base](https://aimodels.fyi/models/huggingFace/deepseek-coder-67b-base-deepseek-ai) also focus on code-related tasks, with various approaches and training data.

## Model Inputs and Outputs

The `wavecoder-ultra-6.7b` model is a causal (decoder-only) language model, loaded through `AutoModelForCausalLM` as in the snippet above, that can be used for a variety of code-related tasks. It takes a natural language instruction, optionally accompanied by source code, as input and generates code or code-related text as output.

### Inputs
- Natural language instructions or prompts related to code generation, code summarization, code translation, or code repair.

### Outputs
- Generated code or code-related text, such as:
  - Code snippets
  - Code summaries
  - Translated code
  - Code fixes or repairs
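
As a rough illustration of what such inputs might look like, the sketch below composes one instruction string per task family. The wordings in `TEMPLATES` and the `make_instruction` helper are hypothetical; the model card does not prescribe any particular phrasing.

```python
# Hypothetical instruction wordings, one per task family named in this section.
TEMPLATES = {
    "generation": "Write a {language} function that {description}.",
    "summarization": "Summarize what the following {language} code does.\n\n{code}",
    "translation": "Translate the following {language} code to {target_language}.\n\n{code}",
    "repair": "Fix the bug in the following {language} code.\n\n{code}",
}


def make_instruction(task: str, *, language: str = "Python", description: str = "",
                     code: str = "", target_language: str = "") -> str:
    """Fill in the template for one of the four task families."""
    if task not in TEMPLATES:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(TEMPLATES)}")
    # str.format ignores keyword arguments that a given template does not use.
    return TEMPLATES[task].format(
        language=language,
        description=description,
        code=code,
        target_language=target_language,
    )


print(make_instruction("translation", target_language="Go", code="print(1)"))
```

The resulting string is the instruction only; it still needs to be wrapped in whatever prompt format the model expects before generation.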

## Capabilities

The `wavecoder-ultra-6.7b` model is capable of performing a wide range of code-related tasks, including:

- **Code Generation**: Given a natural language prompt, the model can generate relevant code snippets in various programming languages.
- **Code Summarization**: The model can summarize the functionality of a given code snippet in natural language.
- **Code Translation**: The model can translate code from one programming language to another.
- **Code Repair**: The model can identify and fix bugs or errors in a given code snippet.

The benchmark results reported above cover three of these capabilities: code generation (HumanEval 79.9, MBPP 64.6), code repair (HumanEval Fix 52.3), and code explanation (HumanEval Explain 45.7). No translation benchmark is listed.

## What Can I Use It For?

The `wavecoder-ultra-6.7b` model can support several software development use cases:

- **Automated Code Generation**: Generating code snippets from natural language descriptions, which can assist developers in rapid prototyping or coding tasks.
- **Code Documentation and Summarization**: Automatically summarizing the functionality of code segments, which can improve code readability and maintainability.
- **Code Translation**: Translating code between different programming languages, which can facilitate cross-team collaboration or porting of projects.
- **Code Repair and Debugging**: Identifying and fixing bugs or errors in code, which can streamline the debugging process.

These capabilities fit naturally into code editors, IDEs, developer productivity tools, and low-code/no-code platforms.

## Things to Try

Here are some ideas for things to try with the `wavecoder-ultra-6.7b` model:

- **Generating Code from Natural Language Prompts**: Try providing the model with natural language descriptions of programming tasks or algorithms, and see how it generates the corresponding code.
- **Summarizing Code Functionality**: Take a code snippet, provide it as input to the model, and see how it summarizes the functionality of the code in natural language.
- **Translating Code between Languages**: Experiment with providing the model with code in one programming language and see how it translates it to another language.
- **Fixing Code Bugs**: Give the model a code snippet with known bugs or errors, and observe how it identifies and repairs the issues.

By experimenting with these capabilities, you can gain a deeper understanding of the model's strengths and limitations, and explore how it can be integrated into your own projects or workflows.
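
When trying the code generation or bug fixing ideas above, it helps to check the model's answer mechanically rather than by eye. The sketch below pulls the first fenced code block out of a response and runs a few assertions against it. The `extract_code` and `passes_tests` helpers are hypothetical, the sample response is hand-written rather than real model output, and `exec` runs untrusted code, so use a sandbox for anything beyond a quick local experiment.

```python
import re

# Built indirectly so this snippet contains no literal code fence of its own.
FENCE = "`" * 3
FENCE_RE = re.compile(FENCE + r"[\w+-]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code(response: str) -> str:
    """Return the first fenced code block, or the whole response if none is found."""
    match = FENCE_RE.search(response)
    return match.group(1) if match else response


def passes_tests(code: str, tests: str) -> bool:
    """Execute the candidate code, then the tests, in one shared namespace."""
    namespace = {}
    try:
        exec(code, namespace)   # WARNING: executes untrusted code
        exec(tests, namespace)  # a failing assert raises AssertionError
    except Exception:
        return False
    return True


# Hand-written stand-in for a model response to "fix the bug in add()".
response = "Here is the fix:\n" + FENCE + "python\ndef add(a, b):\n    return a + b\n" + FENCE
print(passes_tests(extract_code(response), "assert add(2, 3) == 5"))  # True
```

This mirrors, in miniature, the execution-based checking that benchmarks such as HumanEval rely on: a response counts only if the extracted code passes the tests.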