[](#mixtral--moe--2x34b)Mixtral MOE 2x34B
=========================================

*   [One of Best Model reviewed by reddit community](https://www.reddit.com/r/LocalLLaMA/comments/1916896/llm_comparisontest_confirm_leaderboard_big_news/)
*   [Another review by reddit community](https://www.reddit.com/r/LocalLLaMA/comments/191mvlp/i_have_tried_mixtral_34bx2_moe_also_named_yi/)

Highest score Model ranked by Open LLM Leaderboard (2024-01-10)

*   [Average Score 76.66](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

This is my first English & Chinese MoE Model based on

*   \[jondurbin/bagel-dpo-34b-v0.2\]
*   \[SUSTech/SUS-Chat-34B\]

gpu code example

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM
    import math
    
    ## v2 models
    model_path = "cloudyu/Mixtral_34Bx2_MoE_60B"
    
    tokenizer = AutoTokenizer.from_pretrained(model_path, use_default_system_prompt=False)
    model = AutoModelForCausalLM.from_pretrained(
        model_path, torch_dtype=torch.float32, device_map='auto',local_files_only=False, load_in_4bit=True
    )
    print(model)
    prompt = input("please input prompt:")
    while len(prompt) > 0:
      input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda")
    
      generation_output = model.generate(
        input_ids=input_ids, max_new_tokens=500,repetition_penalty=1.2
      )
      print(tokenizer.decode(generation_output[0]))
      prompt = input("please input prompt:")
    

CPU example

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM
    import math
    
    ## v2 models
    model_path = "cloudyu/Mixtral_34Bx2_MoE_60B"
    
    tokenizer = AutoTokenizer.from_pretrained(model_path, use_default_system_prompt=False)
    model = AutoModelForCausalLM.from_pretrained(
            model_path, torch_dtype=torch.bfloat16, device_map='cpu'
    )
    print(model)
    prompt = input("please input prompt:")
    while len(prompt) > 0:
      input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    
      generation_output = model.generate(
        input_ids=input_ids, max_new_tokens=500,repetition_penalty=1.2
      )
      print(tokenizer.decode(generation_output[0]))
      prompt = input("please input prompt:")
    

Output Examples:

    please input prompt:write a story about yosemite
    write a story about yosemite national park
    Yosemite National Park is located in the Sierra Nevada Mountains of California, USA. It was established on October 1st, 1890 and covers an area of approximately 747,956 acres (302,687 hectares). The park boasts some of America's most iconic natural wonders such as Yosemite Valley, Half Dome, El Capitan, Bridalveil Fall, Tuolumne Meadows, Glacier Point, Mariposa Grove, and many more breathtaking landscapes that attract millions of visitors each year.
    
    The history of Yosemite dates back to over seven million years ago when glaciers carved out its stunning granite cliffs and valleys. Native American tribes like Miwok and Paiute have lived here for thousands of years before European explorers arrived during the mid-nineteenth century. In fact, it was John Muir - one of Americas greatest conservationists who helped establish this region as a protected wilderness area by advocating for its preservation through his writings and activism.
    
    Today, Yosemite offers various recreational activities including hiking, rock climbing, camping, fishing, horseback riding, wildlife watching, photography, and winter sports like skiing and snowshoeing. Visitors can also enjoy ranger programs, guided tours, educational exhibits at visitor centers, or simply take time to appreciate nature while strolling along scenic trails surrounded by towering sequoia trees, cascading waterfalls, and crystal clear lakes.
    
    In addition to preserving these awe-inspiring vistas, Yosemite plays a crucial role in protecting numerous plant and animal species found within its boundaries. Some notable inhabitants include black bears, mountain lions, mule deer, coyotes, bobcats, golden eagles, peregrine falcons, bighorn sheep, and several types of fish native to the Merced River which runs through the heart of the valley.
    
    As we continue our journey into the future, let us remember the importance of safeguarding places like Yosemite so they may remain pristine sanctuaries where both humans and animals alike can thrive together amidst unspoiled beauty.</s>
    please input prompt:
    
    </s>

## Model overview

The `Mixtral_34Bx2_MoE_60B` is a large language model developed by the researcher `cloudyu`. It is a Mixture of Experts (MoE) model based on the [jondurbin/bagel-dpo-34b-v0.2](https://aimodels.fyi/models/huggingFace/bagel-dpo-34b-v02-jondurbin) and [SUSTech/SUS-Chat-34B](https://aimodels.fyi/models/huggingFace/sus-chat-34b-sustech) models. The model has been trained on a large corpus of data and has demonstrated strong performance on various benchmarks, ranking highly on the Open LLM Leaderboard.

## Model inputs and outputs

The `Mixtral_34Bx2_MoE_60B` model takes natural language text as input and generates coherent, contextual responses. The model can handle a wide range of tasks, from open-ended conversations to more specialized applications like language translation, question answering, and text generation.

### Inputs
- Natural language text

### Outputs
- Generated natural language text responses

## Capabilities

The `Mixtral_34Bx2_MoE_60B` model has demonstrated strong capabilities across a variety of tasks, including language understanding, generation, and reasoning. It has achieved high scores on benchmarks like MMLU, CMMLU, C-Eval, BBH, GSM-8K, and MATH, showcasing its abilities in areas like common sense reasoning, math, and general knowledge.

## What can I use it for?

The `Mixtral_34Bx2_MoE_60B` model can be used for a wide range of applications, from virtual assistants and chatbots to content generation and language translation. Its strong performance on benchmarks suggests it could be particularly useful for tasks that require language understanding and generation, such as:

- Conversational AI systems
- Automated writing and content generation
- Language translation
- Question answering and information retrieval
- Summarization and text simplification

## Things to try

One key aspect of the `Mixtral_34Bx2_MoE_60B` model is its use of a Mixture of Experts (MoE) architecture. This allows the model to leverage the strengths of multiple submodels, or "experts," to generate more diverse and contextually relevant responses. To take advantage of this, you could try:

- Experimenting with different prompts and tasks to see how the model performs across a range of applications
- Prompting the model to generate responses in different styles or tones to assess its flexibility
- Comparing the model's outputs to those of other large language models to understand its unique strengths and capabilities

By exploring the `Mixtral_34Bx2_MoE_60B` model in depth, you can uncover new ways to leverage its powerful language understanding and generation abilities for your own projects and research.