migtissera

Models by this creator


SynthIA-7B-v1.3

migtissera

Total Score: 142

The SynthIA-7B-v1.3 is a Mistral-7B-v0.1 model trained on Orca-style datasets and fine-tuned for instruction following and long-form conversation. It is released by migtissera under the Apache 2.0 license. Similar models include neural-chat-7b-v3-1 and neural-chat-7b-v3-3, which are also fine-tuned 7B language models; SynthIA-7B-v1.3, however, focuses on instruction following and open-ended conversation rather than the more specialized tasks of those models.

Model inputs and outputs

Inputs

Instruction: An instruction or prompt for the AI assistant to elaborate on using Tree of Thoughts and Chain of Thought reasoning.

Outputs

Natural language response: A coherent, step-by-step response that addresses the given instruction or prompt.

Capabilities

The SynthIA-7B-v1.3 model demonstrates strong capabilities in open-ended instruction following and long-form conversation. It can break down complex topics, explore relevant sub-topics, and construct clear reasoning to answer questions or address prompts. Its performance is evaluated to be on par with other leading 7B language models.

What can I use it for?

The SynthIA-7B-v1.3 model is well suited to applications that require an AI assistant to engage in substantive, multi-turn dialogue, such as virtual agents, chatbots, or question-answering systems that need to provide detailed, thoughtful responses. Its ability to follow instructions and reason through problems also makes it a good fit for educational and research applications.

Things to try

One interesting aspect of the SynthIA-7B-v1.3 model is its use of "Tree of Thoughts" and "Chain of Thought" reasoning. You could experiment with prompts that ask the model to explicitly outline its step-by-step reasoning, exploring how it builds a logical flow of ideas toward the final response. You could also test the model's ability to handle open-ended, multi-part instructions or prompts that require flexible, contextual understanding.
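To experiment with this style of prompting, you could wrap the instruction in a system prompt that asks for Tree of Thoughts / Chain of Thought reasoning. The sketch below is a minimal, hypothetical helper: the SYSTEM wording and the SYSTEM/USER/ASSISTANT layout paraphrase the approach described above and are assumptions, not the verified release format.

```python
def build_synthia_prompt(instruction: str) -> str:
    # Assumption: the SYSTEM line paraphrases the Tree of Thoughts /
    # Chain of Thought instruction described above; the exact wording
    # and the SYSTEM/USER/ASSISTANT layout are illustrative, not the
    # model's documented prompt template.
    system = (
        "Elaborate on the topic using a Tree of Thoughts, and backtrack "
        "when necessary to construct a clear, cohesive Chain of Thought "
        "reasoning."
    )
    return f"SYSTEM: {system}\nUSER: {instruction}\nASSISTANT:"

prompt = build_synthia_prompt("Explain step by step why the sky is blue.")
print(prompt)
```

The resulting string would then be passed to whatever generation API you use to serve the model; varying the SYSTEM line is an easy way to probe how explicitly the model lays out its reasoning.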


Updated 5/21/2024


HelixNet

migtissera

Total Score: 96

HelixNet is a deep learning architecture consisting of three Mistral-7B LLMs: an actor, a critic, and a regenerator. The actor produces an initial response to a given system context and question; the critic then critiques that answer to guide its revision; and the regenerator takes in the critique and regenerates the answer. This actor-critic design is inspired by reinforcement learning algorithms, and the name comes from the spiral structure of a DNA molecule, symbolizing the intertwined nature of the three networks.

Model inputs and outputs

Inputs

System-context: The context for the task or question.

Question: The question or prompt to be answered.

Outputs

Response: The initial response generated by the actor LLM.

Critique: The feedback provided by the critic LLM on the initial response.

Regenerated response: The final answer generated by the regenerator LLM based on the critique.

Capabilities

HelixNet produces accurate, well-crafted responses, owing to the entropy preservation of the regenerator. The actor network was trained on a large, high-quality dataset, while the critic network was trained on a smaller but carefully curated dataset.

What can I use it for?

HelixNet can be used for language generation tasks that benefit from iterative refinement, such as producing high-quality, coherent text. The architecture is particularly useful for conversational AI, question answering, and content generation, where the model can leverage the critic's feedback to improve the quality of the output.

Things to try

One interesting aspect of HelixNet is the critic network, which provides intelligent feedback to refine the initial response. You could experiment with different types of questions or system contexts and observe how the critic and regenerator work together to improve the overall quality of the output.
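The actor → critic → regenerator flow described above can be sketched as a small pipeline. In this sketch, each LLM is abstracted as a string-to-string callable, and the prompt templates (QUESTION/ANSWER/CRITIQUE labels) are illustrative assumptions, not the released HelixNet formats.

```python
from typing import Callable

def helixnet_pipeline(
    system_context: str,
    question: str,
    actor: Callable[[str], str],
    critic: Callable[[str], str],
    regenerator: Callable[[str], str],
) -> dict:
    """One pass of the actor -> critic -> regenerator loop described above.

    Each callable stands in for one of the three Mistral-7B LLMs; the
    prompt templates below are illustrative assumptions, not HelixNet's
    documented formats.
    """
    # 1. The actor drafts an initial answer from the system context and question.
    draft = actor(f"{system_context}\nQUESTION: {question}\nANSWER:")
    # 2. The critic reviews the draft and produces a critique.
    critique = critic(f"QUESTION: {question}\nANSWER: {draft}\nCRITIQUE:")
    # 3. The regenerator rewrites the answer in light of the critique.
    final = regenerator(
        f"QUESTION: {question}\nANSWER: {draft}\n"
        f"CRITIQUE: {critique}\nREVISED ANSWER:"
    )
    return {"response": draft, "critique": critique, "regenerated": final}
```

With real models, each callable would wrap a call to the corresponding LLM's generation API; because any string-to-string function satisfies the interface, the control flow can be exercised with simple stubs before wiring in the three networks.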


Updated 5/21/2024