Abacaj

Models by this creator


phi-2-super

abacaj

Total Score: 83

phi-2-super is a Transformer-based language model developed by maintainer abacaj. It is based on the microsoft/phi-2 model, with additional fine-tuning using supervised fine-tuning (SFT) and cDPO, a conservative variant of direct preference optimization. This combination of techniques aims to improve the model's safety, controllability, and overall performance. The phi-2-super model was trained on a diverse dataset that includes synthetic NLP texts and filtered web data, with a focus on safety and educational value. Compared to the original phi-2 model, phi-2-super shows improved performance on benchmarks testing common sense, language understanding, and logical reasoning. Similar models include the original microsoft/phi-2 and the phi-2-dpo model, which also incorporates cDPO fine-tuning.

Model inputs and outputs

Inputs

* **Chat template**: The model uses a chat template with the format [INST] [/INST] for user prompts, and a continuation from the assistant starting with `` (see the sketch at the end of this entry).
* **Messages**: The model accepts a list of messages in a conversational format, each message having a "role" (either "user" or "assistant") and "content".

Outputs

* The model generates completions in response to the provided inputs, continuing the conversation or providing a relevant answer to the user's prompt.
* The output is a textual continuation, which can include natural language responses, code snippets, or a combination of both.

Capabilities

The phi-2-super model is designed to handle a variety of tasks, including question answering, chat, and code generation. It can provide informative and coherent responses to a wide range of prompts, drawing on its broad knowledge base. For example, the model can hold natural conversations, answer questions, and explain topics such as mathematics, science, and current events. It can also generate Python code snippets to solve programming problems or demonstrate particular concepts.

What can I use it for?

The phi-2-super model can be a useful tool for researchers, developers, and educators working on natural language processing and generation tasks. It can serve as a starting point for building conversational agents, question-answering systems, or code-generation applications. Note, however, that the model has not been thoroughly tested for production-level use cases. Treat its outputs as suggestions or starting points, and exercise caution when integrating the model into real-world applications.

Things to try

One interesting aspect of phi-2-super is its ability to generate both natural language responses and code snippets. Try prompts that combine text and code, and observe how the model handles the mix of formats and content types. Another avenue to explore is its performance in specialized tasks or domains, such as mathematics, science, or creative writing. By carefully crafting prompts and evaluating the outputs, you can gain insight into the model's strengths, weaknesses, and potential areas for improvement.
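As a concrete starting point, here is a minimal sketch of loading the model and generating from a chat-style prompt with the Hugging Face transformers library. It assumes the checkpoint is published on the Hub as abacaj/phi-2-super and ships a chat template; adjust the repo id, dtype, and device to your setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "abacaj/phi-2-super"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Messages use the conversational format described above; the chat
# template turns them into the [INST] ... [/INST] prompt string.
messages = [
    {"role": "user", "content": "Explain what a hash table is, with a short Python example."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```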


Updated 5/17/2024


starcoderbase-1b-sft

abacaj

Total Score: 53

The starcoderbase-1b-sft model is a large language model developed by abacaj and trained on the Stack dataset, a large collection of programming code from GitHub. It is a variant of the StarCoder model, a 15.5B-parameter model also trained on the Stack dataset using techniques such as multi-query attention and a fill-in-the-middle objective. The starcoderbase-1b-sft model is a smaller, 1B-parameter version that has been further fine-tuned on a specific dataset, likely for code-related tasks. It can be seen as a more specialized version of the larger StarCoder model, tailored for certain applications. Similar models include the phi-2-super model, which is also a large language model fine-tuned for code-related tasks, as well as FalCoder 7B and the various StarCoder and StarCoderBase models developed by the BigCode project.

Model inputs and outputs

Inputs

* The starcoderbase-1b-sft model takes natural language prompts related to coding tasks, such as instructions for writing a specific function or algorithm.

Outputs

* The model generates text completions that attempt to provide the requested code or solution, based on the input prompt and the model's training on programming languages and code.
* The output is not guaranteed to be correct or functional, as the model may hallucinate or generate code with errors. The outputs can, however, serve as a starting point for further development and refinement.

Capabilities

The starcoderbase-1b-sft model can generate code snippets and solutions in response to natural language prompts. It can assist with various programming tasks, such as writing functions, implementing algorithms, or solving coding problems. These capabilities stem from its training on a large corpus of programming code, which allows it to understand and generate code in a variety of languages and styles. Note, however, that the model is not an instruction-following model, and commands like "Write a function that computes the square root" may not work as well as more specific, code-shaped prompts.

What can I use it for?

The starcoderbase-1b-sft model can be a valuable tool for programmers and developers, streamlining various coding tasks and providing a starting point for further development. Some potential use cases include:

* Generating code snippets or prototypes from natural language descriptions
* Assisting with debugging and troubleshooting by generating potential solutions
* Exploring different approaches to solving a coding problem
* Automating repetitive or boilerplate code

When using the model's outputs, carefully review and validate the generated code to ensure it meets your requirements and is free of errors and security vulnerabilities.

Things to try

Some interesting things to try with the starcoderbase-1b-sft model (see the sketches after this list):

* Provide prompts that combine natural language instructions with specific technical requirements or constraints, and see how the model responds.
* Explore the model's capabilities in different programming languages or domains, such as web development, data analysis, or machine learning.
* Experiment with different generation parameters, such as temperature or top-k sampling, to see how they affect the quality and diversity of the outputs.
* Compare the model's performance on your specific tasks or datasets to similar models, such as phi-2-super or FalCoder 7B, to understand its relative strengths and weaknesses.
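As a concrete starting point for these experiments, here is a minimal sketch of prompting the model for a code completion with the Hugging Face transformers library. It assumes the checkpoint is published on the Hub as abacaj/starcoderbase-1b-sft; swap in the actual repo id if it differs.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "abacaj/starcoderbase-1b-sft"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# A concrete, code-shaped prompt tends to work better than a bare
# natural-language command for this kind of code model.
prompt = '''def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number."""
'''
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```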
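For the generation-parameter experiment, a small follow-up sketch (reusing model, tokenizer, and inputs from the block above) sweeps the sampling temperature to show how determinism trades off against output diversity:

```python
# Reuses model, tokenizer, and inputs from the sketch above.
for temperature in (0.2, 0.6, 1.0):
    output_ids = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,        # sample instead of greedy decoding
        temperature=temperature,
        top_k=50,              # restrict sampling to the 50 most likely tokens
    )
    print(f"--- temperature={temperature} ---")
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```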
By experimenting and exploring the model's capabilities, you can gain a better understanding of how to effectively leverage it in your own projects and workflows.


Updated 5/17/2024