StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation
====================================================================================================================================================================================

[![Banner](https://huggingface.co/datasets/bigcode/starcoder2-instruct-assets/resolve/main/banner.png)](https://huggingface.co/datasets/bigcode/starcoder2-instruct-assets/resolve/main/banner.png)

Model Summary
-------------------------------

We introduce StarCoder2-15B-Instruct-v0.1, the very first entirely self-aligned code Large Language Model (LLM) trained with a fully permissive and transparent pipeline. Our open-source pipeline uses StarCoder2-15B to generate thousands of instruction-response pairs, which are then used to fine-tune StarCoder2-15B itself without any human annotations or data distilled from huge proprietary LLMs.

*   **Model:** [bigcode/starcoder2-15b-instruct-v0.1](https://huggingface.co/bigcode/starcoder2-15b-instruct-v0.1)
*   **Code:** [bigcode-project/starcoder2-self-align](https://github.com/bigcode-project/starcoder2-self-align)
*   **Dataset:** [bigcode/self-oss-instruct-sc2-exec-filter-50k](https://huggingface.co/datasets/bigcode/self-oss-instruct-sc2-exec-filter-50k/)
*   **Authors:** [Yuxiang Wei](https://yuxiang.cs.illinois.edu), [Federico Cassano](https://federico.codes/), [Jiawei Liu](https://jw-liu.xyz), [Yifeng Ding](https://yifeng-ding.com), [Naman Jain](https://naman-ntc.github.io), [Harm de Vries](https://www.harmdevries.com), [Leandro von Werra](https://twitter.com/lvwerra), [Arjun Guha](https://www.khoury.northeastern.edu/home/arjunguha/main/home/), [Lingming Zhang](https://lingming.cs.illinois.edu).

[![self-alignment pipeline](https://huggingface.co/datasets/bigcode/starcoder2-instruct-assets/resolve/main/method.png)](https://huggingface.co/datasets/bigcode/starcoder2-instruct-assets/resolve/main/method.png)

Use
-----------

### Intended use

The model is designed to respond to **coding-related instructions in a single turn**. Instructions in other styles may result in less accurate responses.

Here is an example to get started with the model using the [transformers](https://huggingface.co/docs/transformers/index) library:

    import transformers
    import torch
    
    pipeline = transformers.pipeline(
        model="bigcode/starcoder2-15b-instruct-v0.1",
        task="text-generation",
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
    
    def respond(instruction: str, response_prefix: str) -> str:
        messages = [{"role": "user", "content": instruction}]
        prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False)
        prompt += response_prefix
    
        terminators = [
            pipeline.tokenizer.eos_token_id,
            pipeline.tokenizer.convert_tokens_to_ids("###"),
        ]
    
        result = pipeline(
            prompt,
            max_length=256,
            num_return_sequences=1,
            do_sample=False,
            eos_token_id=terminators,
            pad_token_id=pipeline.tokenizer.eos_token_id,
            truncation=True,
        )
        response = response_prefix + result[0]["generated_text"][len(prompt) :].split("###")[0].rstrip()
        return response
    
    
    instruction = "Write a quicksort function in Python with type hints and a 'less_than' parameter for custom sorting criteria."
    response_prefix = ""
    
    print(respond(instruction, response_prefix))
    

Here is the expected output:

    Here's how you can implement a quicksort function in Python with type hints and a 'less_than' parameter for custom sorting criteria:
    
    ```python
    from typing import TypeVar, Callable
    
    T = TypeVar('T')
    
    def quicksort(items: list[T], less_than: Callable[[T, T], bool] = lambda x, y: x < y) -> list[T]:
        if len(items) <= 1:
            return items
    
        pivot = items[0]
        less = [x for x in items[1:] if less_than(x, pivot)]
        greater = [x for x in items[1:] if not less_than(x, pivot)]
        return quicksort(less, less_than) + [pivot] + quicksort(greater, less_than)
    ```
    

### Bias, Risks, and Limitations

StarCoder2-15B-Instruct-v0.1 is primarily fine-tuned for Python code generation tasks that can be verified through execution, which may lead to certain biases and limitations. For example, the model might not adhere strictly to instructions that dictate the output format. In these situations, it's beneficial to provide a **response prefix** or a **one-shot example** to steer the model's output. Additionally, the model may have limitations with other programming languages and out-of-domain coding tasks.

The model also inherits the bias, risks, and limitations from its base StarCoder2-15B model. For more information, please refer to the [StarCoder2-15B model card](https://huggingface.co/bigcode/starcoder2-15b).
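As a rough illustration of the two steering techniques mentioned above, the sketch below builds a one-shot instruction and a response prefix that opens a code fence. The helper names `build_one_shot_instruction` and `code_only_prefix`, and the exact wording of the one-shot wrapper, are hypothetical and not part of the model's API; treat them as one possible way to structure the inputs.

```python
FENCE = "`" * 3  # three backticks, built this way to keep this example's own fence intact


def build_one_shot_instruction(task: str, example_task: str, example_answer: str) -> str:
    """Prepend one worked example so the model can imitate its output format.

    The wording of this wrapper is an assumption, not a documented prompt format.
    """
    return (
        "Follow the format of the example.\n\n"
        f"Example instruction: {example_task}\n"
        f"Example response:\n{example_answer}\n\n"
        f"Now do the same for this instruction: {task}"
    )


def code_only_prefix(language: str = "python") -> str:
    """A response prefix that opens a code fence, nudging the model to start with code."""
    return f"{FENCE}{language}\n"


instruction = build_one_shot_instruction(
    task="Write a function that reverses a string.",
    example_task="Write a function that adds two integers.",
    example_answer=f"{FENCE}python\ndef add(a: int, b: int) -> int:\n    return a + b\n{FENCE}",
)
response_prefix = code_only_prefix()

# With the `respond` helper from the getting-started snippet, the call would be:
# print(respond(instruction, response_prefix))
```

Because `respond` appends the prefix to the prompt and also prepends it to the returned text, the response should begin with the opened code fence rather than with introductory prose.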

Evaluation on EvalPlus, LiveCodeBench, and DS-1000
-------------------------------------------------------------------------------------------------------

[![EvalPlus](https://huggingface.co/datasets/bigcode/starcoder2-instruct-assets/resolve/main/evalplus.png)](https://huggingface.co/datasets/bigcode/starcoder2-instruct-assets/resolve/main/evalplus.png)

[![LiveCodeBench and DS-1000](https://huggingface.co/datasets/bigcode/starcoder2-instruct-assets/resolve/main/lcb-ds1000.png)](https://huggingface.co/datasets/bigcode/starcoder2-instruct-assets/resolve/main/lcb-ds1000.png)

Training Details
-------------------------------------

### Hyperparameters

*   **Optimizer:** Adafactor
*   **Learning rate:** 1e-5
*   **Epochs:** 4
*   **Batch size:** 64
*   **Warmup ratio:** 0.05
*   **Scheduler:** Linear
*   **Sequence length:** 1280
*   **Dropout:** Not applied
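To make the schedule concrete, here is a minimal sketch of how the learning rate, epoch count, batch size, warmup ratio, and linear scheduler could combine into a step count and a per-step learning rate. The training-set size of 50,000 is an assumption inferred from the dataset name, and the rounding of warmup steps is a common convention rather than something stated here, so the numbers are illustrative only.

```python
import math

# Values taken from the hyperparameter list.
LEARNING_RATE = 1e-5
EPOCHS = 4
BATCH_SIZE = 64
WARMUP_RATIO = 0.05

# Assumption: roughly 50,000 training examples, inferred from the dataset name.
NUM_EXAMPLES = 50_000

steps_per_epoch = math.ceil(NUM_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS
# Assumption: warmup steps are rounded up from ratio * total steps.
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def linear_schedule(step: int) -> float:
    """Learning rate at `step`: linear warmup to the peak, then linear decay to zero."""
    if step < warmup_steps:
        return LEARNING_RATE * step / warmup_steps
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / (total_steps - warmup_steps)


print(steps_per_epoch, total_steps, warmup_steps)  # prints: 782 3128 157
```

Under these assumptions the peak learning rate of 1e-5 is reached after 157 optimizer steps and decays linearly to zero by step 3128.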

### Hardware

1 x NVIDIA A100 80GB

Resources
-----------------------

*   **Model:** [bigcode/starcoder2-15b-instruct-v0.1](https://huggingface.co/bigcode/starcoder2-15b-instruct-v0.1)
*   **Code:** [bigcode-project/starcoder2-self-align](https://github.com/bigcode-project/starcoder2-self-align)
*   **Dataset:** [bigcode/self-oss-instruct-sc2-exec-filter-50k](https://huggingface.co/datasets/bigcode/self-oss-instruct-sc2-exec-filter-50k/)

## Model overview

`starcoder2-15b-instruct-v0.1` is the very first entirely self-aligned code Large Language Model (LLM) trained with a fully permissive and transparent pipeline. It was developed by [bigcode](https://aimodels.fyi/creators/huggingFace/bigcode), an organization focused on building open-source AI models. The model was trained using an open-source pipeline that generates thousands of instruction-response pairs, which are then used to fine-tune the base `starcoder2-15b` model without any human annotations or distilled data from huge proprietary LLMs.

This self-alignment approach contrasts with the typical instruction-tuning process, which often relies on distilled data from large, closed-source models. By using a fully transparent and permissive pipeline, `starcoder2-15b-instruct-v0.1` aims to provide a more ethical and accountable code generation model.

The `starcoder2-15b` model, which serves as the base for this instruction-tuned version, is a 15B parameter model trained on over 600 programming languages from [The Stack v2](https://huggingface.co/datasets/bigcode/the-stack-v2-train) dataset. It uses Grouped Query Attention and a context window of 16,384 tokens with sliding window attention of 4,096 tokens to enable efficient and high-quality code generation.
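For readers unfamiliar with sliding window attention, the sketch below builds a toy causal mask in which each token attends only to itself and a fixed number of preceding tokens. The function name, the toy sizes, and the exact boundary convention are assumptions for illustration; they are not taken from the StarCoder2 implementation.

```python
def sliding_window_causal_mask(seq_len: int, window: int) -> list[list[bool]]:
    """mask[i][j] is True when query position i may attend to key position j.

    Assumed convention: a token sees itself and the (window - 1) tokens before it.
    """
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


mask = sliding_window_causal_mask(seq_len=6, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
# x.....
# xx....
# xxx...
# .xxx..
# ..xxx.
# ...xxx
```

Each row has at most `window` allowed positions, so the attention cost per token stays bounded as the sequence grows.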

## Model inputs and outputs

### Inputs

- **Instruction**: A natural language description of a task or request, such as "Write a function that computes the square root."

### Outputs

- **Generated code**: The model's attempt to generate code that fulfills the given instruction, such as a function that computes the square root.

## Capabilities

`starcoder2-15b-instruct-v0.1` is designed to respond to coding-related instructions in a single turn. It can generate code snippets for tasks like algorithm implementation, data processing, and software development. Because it is primarily fine-tuned on Python tasks, results in other programming languages may be weaker. The generated code is not guaranteed to be correct or efficient, as the model may introduce bugs or suboptimal solutions.

## What can I use it for?

You can use `starcoder2-15b-instruct-v0.1` to help with a variety of coding-related tasks, such as:

- Prototyping new algorithms or features
- Automating repetitive coding tasks
- Generating boilerplate code or scaffolding
- Exploring different programming approaches to a problem

While the model can be a useful tool, it's important to review and test any generated code before using it in a production environment. The [search index](https://huggingface.co/spaces/bigcode/search-v2) provided by the BigCode project can help you identify the origin of generated code and ensure proper attribution.

## Things to try

One interesting aspect of `starcoder2-15b-instruct-v0.1` is that it was trained through a fully self-aligned and transparent pipeline. This approach aims to address some of the ethical and accountability concerns surrounding large language models trained on proprietary data.

You could try providing the model with more complex or open-ended instructions to see how it responds, or experiment with the model's ability to generate code in different programming languages. Additionally, you could explore using the model in conjunction with other tools, such as unit testing frameworks, to validate the correctness of the generated code.
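One way to pair the model with tests is sketched below: pull the fenced Python block out of a Markdown response, execute it in a fresh namespace, and run a few assertions against it. The `response` string is a stand-in that reuses the quicksort from the expected output earlier in this card, and `extract_python_block` and `passes_tests` are hypothetical helpers. Note that `exec` runs arbitrary code, so untrusted model output should be executed in a sandbox.

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep this example's own fence intact

# Stand-in for a model response; the code is the quicksort from the expected output.
response = f"""Here's how you can implement a quicksort function:

{FENCE}python
from typing import TypeVar, Callable

T = TypeVar('T')

def quicksort(items: list[T], less_than: Callable[[T, T], bool] = lambda x, y: x < y) -> list[T]:
    if len(items) <= 1:
        return items

    pivot = items[0]
    less = [x for x in items[1:] if less_than(x, pivot)]
    greater = [x for x in items[1:] if not less_than(x, pivot)]
    return quicksort(less, less_than) + [pivot] + quicksort(greater, less_than)
{FENCE}
"""


def extract_python_block(text: str) -> str:
    """Return the first fenced Python block in a Markdown response."""
    match = re.search(FENCE + r"python\n(.*?)" + FENCE, text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no fenced Python block found")
    return match.group(1)


def passes_tests(code: str) -> bool:
    """Execute generated code in a fresh namespace and check it with assertions."""
    namespace: dict = {}
    try:
        exec(code, namespace)  # caution: only run untrusted code inside a sandbox
        quicksort = namespace["quicksort"]
        assert quicksort([3, 1, 2]) == [1, 2, 3]
        assert quicksort([]) == []
        assert quicksort([1, 2, 3], less_than=lambda x, y: x > y) == [3, 2, 1]
    except Exception:
        return False
    return True


print(passes_tests(extract_python_block(response)))  # prints: True
```

The same pattern extends to a real test framework: write the extracted code to a module and let the test runner import it, instead of calling `exec` directly.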