The Yi series models are large language models trained from scratch by developers at 01.AI.

## Model Overview

The `yi-34b-chat` model is a large language model trained from scratch by developers at [01.AI](https://aimodels.fyi/creators/replicate/01-ai). The Yi series models are the next generation of open-source large language models that show promise in language understanding, commonsense reasoning, and reading comprehension. For example, the `Yi-34B-Chat` model landed in second place (following GPT-4 Turbo) on the AlpacaEval Leaderboard, outperforming other LLMs like GPT-4, Mixtral, and Claude.

Similar models in the Yi series include the [`yi-6b`](https://aimodels.fyi/models/replicate/yi-6b-01-ai) and [`yi-34b`](https://aimodels.fyi/models/replicate/yi-34b-01-ai) models, which are also large language models trained by 01.AI. Other related models include the [`multilingual-e5-large`](https://aimodels.fyi/models/replicate/multilingual-e5-large-beautyyuyanli) text embedding model, the [`nous-hermes-2-yi-34b-gguf`](https://aimodels.fyi/models/replicate/nous-hermes-2-yi-34b-gguf-kcaverly) fine-tuned Yi-34B model, and the [`llava-13b`](https://aimodels.fyi/models/replicate/llava-13b-yorickvp) visual instruction tuning model.

## Model Inputs and Outputs

The `yi-34b-chat` model takes in a user prompt as input and generates a corresponding response. The input prompt can be a question, a statement, or any other text that the user wants the model to address. 

### Inputs
- **Prompt**: The text that the user wants the model to respond to.
- **Temperature**: A value that controls the randomness of the model's output. Lower temperatures result in more focused and deterministic responses, while higher temperatures lead to more diverse and creative outputs.
- **Top K**: The number of highest probability tokens to consider for generating the output. If > 0, only the top k tokens with the highest probability are kept (top-k filtering).
- **Top P**: A probability threshold for generating the output. If < 1.0, only the top tokens with cumulative probability >= top_p are kept (nucleus filtering).
- **Max New Tokens**: The maximum number of tokens the model should generate as output.
- **Prompt Template**: A template used to format the input prompt, with the actual prompt inserted using the `{prompt}` placeholder.
- **Repetition Penalty**: A value that penalizes the model for repeating the same tokens in the output.

### Outputs
The model generates a response text based on the provided input. The output can be a single sentence, a paragraph, or multiple paragraphs, depending on the complexity of the input prompt.

## Capabilities

The `yi-34b-chat` model demonstrates impressive capabilities in areas such as language understanding, commonsense reasoning, and reading comprehension. It has been shown to outperform other large language models in various benchmarks, including the AlpacaEval Leaderboard.

## What Can I Use It For?

The `yi-34b-chat` model can be used for a wide range of applications, including:

- **Conversational AI**: The model can be used to build chatbots and virtual assistants that can engage in natural language conversations.
- **Content Generation**: The model can be used to generate text content, such as articles, stories, or product descriptions.
- **Question Answering**: The model can be used to answer a variety of questions, drawing upon its strong language understanding and reasoning capabilities.
- **Summarization**: The model can be used to summarize long passages of text, capturing the key points and main ideas.
- **Code Generation**: The model can be used to assist developers by generating code snippets or even entire programs based on natural language prompts.

## Things to Try

One interesting aspect of the `yi-34b-chat` model is its ability to generate diverse and creative responses. By adjusting the temperature and other parameters, you can explore the model's versatility and see how it responds to different types of prompts. You can also try fine-tuning the model on your own dataset to customize its capabilities for your specific use case.

Another interesting aspect is the model's strong performance in commonsense reasoning and reading comprehension tasks. You can experiment with prompts that require the model to draw inferences, solve problems, or demonstrate its understanding of complex concepts.

Overall, the `yi-34b-chat` model offers a powerful and flexible platform for exploring the capabilities of large language models and developing innovative applications.

  ![specify theme context for images](https://raw.githubusercontent.com/01-ai/Yi/main/assets/img/Yi_logo_icon_light.svg)

Yi Vision Language Model
========================

### Better Bilingual Multimodal Model

 [Hugging Face](https://huggingface.co/01-ai)   [ModelScope](https://www.modelscope.cn/organization/01ai/)   [WiseModel](https://wisemodel.cn/organization/01.AI)

 Ask questions or discuss ideas on [GitHub](https://github.com/01-ai/Yi/discussions) !

 Join us  [WeChat (Chinese)](https://github.com/01-ai/Yi/issues/43#issuecomment-1827285245) !

 Grow at [Yi Learning Hub](https://github.com/01-ai/Yi/blob/main/docs/learning_hub.md) !

* * *

 Table of Contents

*   [What is Yi-VL?](#what-is-yi-vl)
    *   [Overview](#overview)
    *   [Models](#models)
    *   [Features](#features)
    *   [Architecture](#architecture)
    *   [Training](#training)
    *   [Limitations](#limitations)
*   [Why Yi-VL?](#why-yi-vl)
    *   [Tech report](#tech-report)
    *   [Benchmarks](#benchmarks)
    *   [Showcases](#showcases)
*   [How to use Yi-VL?](#how-to-use-yi-vl)
    *   [Quick start](#quick-start)
    *   [Hardware requirements](#hardware-requirements)
*   [Misc.](#misc)
    *   [Acknowledgements and attributions](#acknowledgements-and-attributions)
        *   [List of used open-source projects](#list-of-used-open-source-projects)
    *   [License](#license)

* * *

[](#what-is-yi-vl)What is Yi-VL?
================================

[](#overview)Overview
---------------------

*   **Yi Vision Language (Yi-VL)** model is the open-source, multimodal version of the Yi **Large Language Model (LLM)** series, enabling content comprehension, recognition, and multi-round conversations about images.
    
*   Yi-VL demonstrates exceptional performance, **ranking first** among all existing open-source models in the latest benchmarks including [MMMU](https://mmmu-benchmark.github.io/#leaderboard) in English and [CMMMU](https://mmmu-benchmark.github.io/#leaderboard) in Chinese (based on data available up to January 2024).
    
*   Yi-VL-34B is the **first** open-source 34B vision language model worldwide.
    

[](#models)Models
-----------------

Yi-VL has released the following versions.

Model

Download

Yi-VL-34B

 [ Hugging Face](https://huggingface.co/01-ai/Yi-VL-34B)  [ ModelScope](https://www.modelscope.cn/models/01ai/Yi-VL-34B/summary)

Yi-VL-6B

 [ Hugging Face](https://huggingface.co/01-ai/Yi-VL-6B)  [ ModelScope](https://www.modelscope.cn/models/01ai/Yi-VL-6B/summary)

[](#features)Features
---------------------

Yi-VL offers the following features:

*   Multi-round text-image conversations: Yi-VL can take both text and images as inputs and produce text outputs. Currently, it supports multi-round visual question answering with one image.
    
*   Bilingual text support: Yi-VL supports conversations in both English and Chinese, including text recognition in images.
    
*   Strong image comprehension: Yi-VL is adept at analyzing visuals, making it an efficient tool for tasks like extracting, organizing, and summarizing information from images.
    
*   Fine-grained image resolution: Yi-VL supports image understanding at a higher resolution of 448448.
    

[](#architecture)Architecture
-----------------------------

Yi-VL adopts the [LLaVA](https://github.com/haotian-liu/LLaVA) architecture, which is composed of three primary components:

*   Vision Transformer (ViT): it's initialized with [CLIP ViT-H/14 model](https://huggingface.co/laion/CLIP-ViT-H-14-laion2B-s32B-b79K) and used for image encoding.
    
*   Projection Module: it's designed to align image features with text feature space, consisting of a two-layer Multilayer Perceptron (MLP) with layer normalizations.
    
*   Large Language Model (LLM): it's initialized with [Yi-34B-Chat](https://huggingface.co/01-ai/Yi-34B-Chat) or [Yi-6B-Chat](https://huggingface.co/01-ai/Yi-6B-Chat), demonstrating exceptional proficiency in understanding and generating both English and Chinese.
    

[![image/png](https://cdn-uploads.huggingface.co/production/uploads/656d9adce8bf55919aca7c3f/EGVHSWG4kAcX01xDaoeXS.png)](https://cdn-uploads.huggingface.co/production/uploads/656d9adce8bf55919aca7c3f/EGVHSWG4kAcX01xDaoeXS.png)

[](#training)Training
---------------------

### [](#training-process)Training process

Yi-VL is trained to align visual information well to the semantic space of Yi LLM, which undergoes a comprehensive three-stage training process:

*   Stage 1: The parameters of ViT and the projection module are trained using an image resolution of 224224. The LLM weights are frozen. The training leverages an image caption dataset comprising 100 million image-text pairs from [LAION-400M](https://laion.ai/blog/laion-400-open-dataset/). The primary objective is to enhance the ViT's knowledge acquisition within our specified architecture and to achieve better alignment between the ViT and the LLM.
    
*   Stage 2: The image resolution of ViT is scaled up to 448448, and the parameters of ViT and the projection module are trained. It aims to further boost the model's capability for discerning intricate visual details. The dataset used in this stage includes about 25 million image-text pairs, such as [LAION-400M](https://laion.ai/blog/laion-400-open-dataset/), [CLLaVA](https://huggingface.co/datasets/LinkSoul/Chinese-LLaVA-Vision-Instructions), [LLaVAR](https://llavar.github.io/), [Flickr](https://www.kaggle.com/datasets/hsankesara/flickr-image-dataset), [VQAv2](https://paperswithcode.com/dataset/visual-question-answering-v2-0), [RefCOCO](https://github.com/lichengunc/refer/tree/master), [Visual7w](http://ai.stanford.edu/~yukez/visual7w/) and so on.
    
*   Stage 3: The parameters of the entire model (that is, ViT, projection module, and LLM) are trained. The primary goal is to enhance the model's proficiency in multimodal chat interactions, thereby endowing it with the ability to seamlessly integrate and interpret visual and linguistic inputs. To this end, the training dataset encompasses a diverse range of sources, totalling approximately 1 million image-text pairs, including [GQA](https://cs.stanford.edu/people/dorarad/gqa/download.html), [VizWiz VQA](https://vizwiz.org/tasks-and-datasets/vqa/), [TextCaps](https://opendatalab.com/OpenDataLab/TextCaps), [OCR-VQA](https://ocr-vqa.github.io/), [Visual Genome](https://homes.cs.washington.edu/~ranjay/visualgenome/api.html), [LAION GPT4V](https://huggingface.co/datasets/laion/gpt4v-dataset) and so on. To ensure data balancing, we impose a cap on the maximum data contribution from any single source, restricting it to no more than 50,000 pairs.
    

Below are the parameters configured for each stage.

Stage

Global batch size

Learning rate

Gradient clip

Epochs

Stage 1, 2

4096

1e-4

0.5

1

Stage 3

256

2e-5

1.0

2

### [](#training-resource-consumption)Training resource consumption

*   The training consumes 128 NVIDIA A800 (80G) GPUs.
    
*   The total training time amounted to approximately 10 days for Yi-VL-34B and 3 days for Yi-VL-6B.
    

[](#limitations)Limitations
---------------------------

This is the initial release of the Yi-VL, which comes with some known limitations. It is recommended to carefully evaluate potential risks before adopting any models.

*   Feature limitation
    
    *   Visual question answering is supported. Other features like text-to-3D and image-to-video are not yet supported.
        
    *   A single image rather than several images can be accepted as an input.
        
*   Hallucination problem
    
    *   There is a certain possibility of generating content that does not exist in the image.
        
    *   In scenes containing multiple objects, some objects might be incorrectly identified or described with insufficient detail.
        
*   Resolution issue
    
    *   Yi-VL is trained on images with a resolution of 448448. During inference, inputs of any resolution are resized to 448448. Low-resolution images may result in information loss, and more fine-grained images (above 448) do not bring in extra knowledge.
*   Other limitations of the Yi LLM.
    

[](#why-yi-vl)Why Yi-VL?
========================

[](#tech-report)Tech report
---------------------------

For detailed capabilities of the Yi series model, see [Yi: Open Foundation Models by 01.AI](https://arxiv.org/abs/2403.04652).

### [](#citation)Citation

    @misc{ai2024yi,
        title={Yi: Open Foundation Models by 01.AI},
        author={01. AI and : and Alex Young and Bei Chen and Chao Li and Chengen Huang and Ge Zhang and Guanwei Zhang and Heng Li and Jiangcheng Zhu and Jianqun Chen and Jing Chang and Kaidong Yu and Peng Liu and Qiang Liu and Shawn Yue and Senbin Yang and Shiming Yang and Tao Yu and Wen Xie and Wenhao Huang and Xiaohui Hu and Xiaoyi Ren and Xinyao Niu and Pengcheng Nie and Yuchi Xu and Yudong Liu and Yue Wang and Yuxuan Cai and Zhenyu Gu and Zhiyuan Liu and Zonghong Dai},
        year={2024},
        eprint={2403.04652},
        archivePrefix={arXiv},
        primaryClass={cs.CL}
    }
    

[](#benchmarks)Benchmarks
-------------------------

Yi-VL outperforms all existing open-source models in [MMMU](https://mmmu-benchmark.github.io) and [CMMMU](https://cmmmu-benchmark.github.io), two advanced benchmarks that include massive multi-discipline multimodal questions (based on data available up to January 2024).

*   MMMU

[![image/png](https://cdn-uploads.huggingface.co/production/uploads/656d9adce8bf55919aca7c3f/kCmXuwLbLvequ93kjh3mg.png)](https://cdn-uploads.huggingface.co/production/uploads/656d9adce8bf55919aca7c3f/kCmXuwLbLvequ93kjh3mg.png)

*   CMMMU

[![image/png](https://cdn-uploads.huggingface.co/production/uploads/656d9adce8bf55919aca7c3f/6YuSakMCg3D2AozixdoZ0.png)](https://cdn-uploads.huggingface.co/production/uploads/656d9adce8bf55919aca7c3f/6YuSakMCg3D2AozixdoZ0.png)

[](#showcases)Showcases
-----------------------

Below are some representative examples of detailed description and visual question answering, showcasing the capabilities of Yi-VL.

*   English

[![image/png](https://cdn-uploads.huggingface.co/production/uploads/64cc65d786d8dc0caa6ab3cd/F_2bIVwMtVamygbVqtb8E.png)](https://cdn-uploads.huggingface.co/production/uploads/64cc65d786d8dc0caa6ab3cd/F_2bIVwMtVamygbVqtb8E.png)

*   Chinese

[![image/png](https://cdn-uploads.huggingface.co/production/uploads/656d9adce8bf55919aca7c3f/l_tLzugFtHk1dkVsFJE7B.png)](https://cdn-uploads.huggingface.co/production/uploads/656d9adce8bf55919aca7c3f/l_tLzugFtHk1dkVsFJE7B.png)

[](#how-to-use-yi-vl)How to use Yi-VL?
======================================

[](#quick-start)Quick start
---------------------------

Please refer to [Yi GitHub Repo](https://github.com/01-ai/Yi/tree/main/VL) for details.

[](#hardware-requirements)Hardware requirements
-----------------------------------------------

For model inference, the recommended GPU examples are:

*   Yi-VL-6B: RTX 3090, RTX 4090, A10, A30
    
*   Yi-VL-34B: 4  RTX 4090, A800 (80 GB)
    

[](#misc)Misc.
==============

[](#acknowledgements-and-attributions)Acknowledgements and attributions
-----------------------------------------------------------------------

This project makes use of open-source software/components. We acknowledge and are grateful to these developers for their contributions to the open-source community.

### [](#list-of-used-open-source-projects)List of used open-source projects

1.  LLaVA

*   Authors: Haotian Liu, Chunyuan Li, Qingyang Wu, Yuheng Li, and Yong Jae Lee
*   Source: [https://github.com/haotian-liu/LLaVA](https://github.com/haotian-liu/LLaVA)
*   License: Apache-2.0 license
*   Description: The codebase is based on LLaVA code.

2.  OpenClip

*   Authors: Gabriel Ilharco, Mitchell Wortsman, Ross Wightman, Cade Gordon, Nicholas Carlini, Rohan Taori, Achal Dave, Vaishaal Shankar, Hongseok Namkoong, John Miller, Hannaneh Hajishirzi, Ali Farhadi, and Ludwig Schmidt
*   Source: [https://huggingface.co/laion/CLIP-ViT-H-14-laion2B-s32B-b79K](https://huggingface.co/laion/CLIP-ViT-H-14-laion2B-s32B-b79K)
*   License: MIT
*   Description: The ViT is initialized using the weights of OpenClip.

**Notes**

*   This attribution does not claim to cover all open-source components used. Please check individual components and their respective licenses for full details.
    
*   The use of the open-source components is subject to the terms and conditions of the respective licenses.
    

We appreciate the open-source community for their invaluable contributions to the technology world.

[](#license)License
-------------------

Please refer to the [acknowledgments and attributions](#acknowledgments_and_attributions) as well as individual components, for the license of source code.

The Yi series models are fully open for academic research and free for commercial use, permissions of which are automatically granted upon application.

All usage must adhere to the [Yi Series Models Community License Agreement 2.1](https://huggingface.co/01-ai/Yi-VL-34B/blob/main/LICENSE).

For free commercial use, you only need to send an email to get official commercial permission.

## Model overview

The `Yi-VL-34B` model is the open-source, multimodal version of the Yi Large Language Model (LLM) series developed by the team at [01.AI](https://aimodels.fyi/creators/huggingFace/01-ai). This model demonstrates exceptional performance, ranking first among all existing open-source models in the latest benchmarks including MMMU in English and CMMMU in Chinese. It is the first open-source 34B vision language model worldwide.

The Yi-VL series includes several model versions, such as the `Yi-VL-34B` and `Yi-VL-6B`. These models are capable of multi-round text-image conversations, allowing users to engage in visual question answering with a single image. Additionally, the Yi-VL models support bilingual text in both English and Chinese.

## Model inputs and outputs

### Inputs
- Text prompts
- Images

### Outputs
- Text responses based on the provided inputs

## Capabilities

The `Yi-VL-34B` model can handle multi-round text-image conversations, allowing users to engage in visual question answering with a single image. The model also supports bilingual text in both English and Chinese, making it a versatile tool for cross-language communication.

## What can I use it for?

The `Yi-VL-34B` model can be used in a variety of applications that require multimodal understanding and generation, such as visual question answering, image captioning, and language-guided image editing. Potential use cases include building interactive chatbots, developing AI-powered virtual assistants, and creating educational or entertainment applications that seamlessly integrate text and visual content.

## Things to try

Experiment with the `Yi-VL-34B` model's capabilities by engaging in multi-round conversations about images, asking questions about the content, and exploring its ability to understand and respond to both text and visual inputs. Additionally, try using the model's bilingual support to converse with users in different languages and facilitate cross-cultural communication.

![](https://raw.githubusercontent.com/01-ai/Yi/main/assets/img/Yi_logo_icon_light.svg)

[ GitHub](https://github.com/01-ai)  [ Discord](https://discord.gg/hYUwWddeAu)  [ Twitter](https://twitter.com/01ai_yi)  [ WeChat](https://github.com/01-ai/Yi-1.5/issues/2)  
[ Paper](https://arxiv.org/abs/2403.04652)  [ FAQ](https://github.com/01-ai/Yi/tree/main?tab=readme-ov-file#faq)  [ Learning Hub](https://github.com/01-ai/Yi/tree/main?tab=readme-ov-file#learning-hub)

[](#intro)Intro
===============

Yi-1.5 is an upgraded version of Yi. It is continuously pre-trained on Yi with a high-quality corpus of 500B tokens and fine-tuned on 3M diverse fine-tuning samples.

Compared with Yi, Yi-1.5 delivers stronger performance in coding, math, reasoning, and instruction-following capability, while still maintaining excellent capabilities in language understanding, commonsense reasoning, and reading comprehension.

Model

Context Length

Pre-trained Tokens

Yi-1.5

4K

3.6T

[](#models)Models
=================

*   Chat models
    
    Name
    
    Download
    
    Yi-1.5-34B-Chat
    
     [ Hugging Face](https://huggingface.co/collections/01-ai/yi-15-2024-05-663f3ecab5f815a3eaca7ca8)  [ ModelScope](https://www.modelscope.cn/organization/01ai)
    
    Yi-1.5-9B-Chat
    
     [ Hugging Face](https://huggingface.co/collections/01-ai/yi-15-2024-05-663f3ecab5f815a3eaca7ca8)  [ ModelScope](https://www.modelscope.cn/organization/01ai)
    
    Yi-1.5-6B-Chat
    
     [ Hugging Face](https://huggingface.co/collections/01-ai/yi-15-2024-05-663f3ecab5f815a3eaca7ca8)  [ ModelScope](https://www.modelscope.cn/organization/01ai)
    
*   Base models
    
    Name
    
    Download
    
    Yi-1.5-34B
    
     [ Hugging Face](https://huggingface.co/collections/01-ai/yi-15-2024-05-663f3ecab5f815a3eaca7ca8)  [ ModelScope](https://www.modelscope.cn/organization/01ai)
    
    Yi-1.5-9B
    
     [ Hugging Face](https://huggingface.co/collections/01-ai/yi-15-2024-05-663f3ecab5f815a3eaca7ca8)  [ ModelScope](https://www.modelscope.cn/organization/01ai)
    
    Yi-1.5-6B
    
     [ Hugging Face](https://huggingface.co/collections/01-ai/yi-15-2024-05-663f3ecab5f815a3eaca7ca8)  [ ModelScope](https://www.modelscope.cn/organization/01ai)
    

[](#benchmarks)Benchmarks
=========================

*   Chat models
    
    Yi-1.5-34B-Chat is on par with or excels beyond larger models in most benchmarks.
    
    [![image/png](https://cdn-uploads.huggingface.co/production/uploads/656d9adce8bf55919aca7c3f/KcsJ9Oc1VnEmfCDEJc5cd.png)](https://cdn-uploads.huggingface.co/production/uploads/656d9adce8bf55919aca7c3f/KcsJ9Oc1VnEmfCDEJc5cd.png)
    
    Yi-1.5-9B-Chat is the top performer among similarly sized open-source models.
    
    [![image/png](https://cdn-uploads.huggingface.co/production/uploads/656d9adce8bf55919aca7c3f/xf6pLg5jqRCwjlh6m3t6_.png)](https://cdn-uploads.huggingface.co/production/uploads/656d9adce8bf55919aca7c3f/xf6pLg5jqRCwjlh6m3t6_.png)
    
*   Base models
    
    Yi-1.5-34B is on par with or excels beyond larger models in some benchmarks.
    
    [![image/png](https://cdn-uploads.huggingface.co/production/uploads/656d9adce8bf55919aca7c3f/BwU7QM-03dZvZzwdIE1xY.png)](https://cdn-uploads.huggingface.co/production/uploads/656d9adce8bf55919aca7c3f/BwU7QM-03dZvZzwdIE1xY.png)
    
    Yi-1.5-9B is the top performer among similarly sized open-source models.
    
    [![image/png](https://cdn-uploads.huggingface.co/production/uploads/656d9adce8bf55919aca7c3f/y-EYSYPT-3aWLJ0x8R94F.png)](https://cdn-uploads.huggingface.co/production/uploads/656d9adce8bf55919aca7c3f/y-EYSYPT-3aWLJ0x8R94F.png)
    

[](#quick-start)Quick Start
===========================

For getting up and running with Yi-1.5 models quickly, see [README](https://github.com/01-ai/Yi-1.5).

## Model overview

`Yi-1.5-34B-Chat` is an upgraded version of the Yi language model, developed by the team at [01.AI](https://aimodels.fyi/creators/huggingFace/01-ai). Compared to the original Yi model, Yi-1.5-34B-Chat has been continuously pre-trained on a high-quality corpus of 500B tokens and fine-tuned on 3M diverse samples. This allows it to deliver stronger performance in areas like coding, math, reasoning, and instruction-following, while still maintaining excellent capabilities in language understanding, commonsense reasoning, and reading comprehension. The model is available in several different sizes, including `Yi-1.5-9B-Chat` and `Yi-1.5-6B-Chat`, catering to different use cases and hardware constraints.

## Model inputs and outputs

The `Yi-1.5-34B-Chat` model can accept a wide range of natural language inputs, including text prompts, instructions, and questions. It can then generate coherent and contextually appropriate responses, making it a powerful tool for conversational AI applications. The model's large scale and diverse training data allow it to engage in thoughtful discussions, provide detailed explanations, and even tackle complex tasks like coding and mathematical problem-solving.

### Inputs
- Natural language text prompts
- Conversational queries and instructions
- Requests for analysis, explanation, or task completion

### Outputs
- Coherent and contextually relevant responses
- Detailed explanations and task completions
- Creative and innovative solutions to open-ended problems

## Capabilities

The `Yi-1.5-34B-Chat` model demonstrates impressive capabilities across a variety of domains. It excels at language understanding, commonsense reasoning, and reading comprehension, allowing it to engage in natural, context-aware conversations. The model also shines in areas like coding, math, and reasoning, where it can provide insightful solutions and explanations. Additionally, the model's strong instruction-following capability makes it well-suited for tasks that require following complex guidelines or steps.

## What can I use it for?

The `Yi-1.5-34B-Chat` model has a wide range of potential applications, from conversational AI assistants and chatbots to educational tools and creative writing aids. Developers could leverage the model's language understanding and generation capabilities to build virtual assistants that can engage in natural, context-sensitive dialogues. Educators could use the model to create interactive learning experiences, providing personalized explanations and feedback to students. Businesses could explore using the model for customer service, content generation, or even internal task automation.

## Things to try

One interesting aspect of the `Yi-1.5-34B-Chat` model is its ability to engage in open-ended, contextual reasoning. Users can provide the model with complex prompts or instructions and observe how it formulates thoughtful, creative responses. For example, you could ask the model to solve a challenging math problem, provide a detailed analysis of a historical event, or generate a unique story based on a given premise. The model's versatility and problem-solving skills make it a valuable tool for exploring the boundaries of conversational AI and language understanding.

## Model overview

The `yi-6b` models are large language models trained from scratch by developers at [01.AI](https://aimodels.fyi/creators/replicate/01-ai). They are targeted as bilingual language models trained on a 3T multilingual corpus, aiming to be one of the strongest LLMs worldwide. The Yi series models show promise in language understanding, commonsense reasoning, reading comprehension, and more. For example, the Yi-34B-Chat model ranked second (following GPT-4 Turbo) on the AlpacaEval Leaderboard, outperforming other LLMs like GPT-4, Mixtral, and Claude.

The Yi series models adopt the Transformer architecture like the Llama models, reducing the effort required to build from scratch and enabling the utilization of the same tools within the AI ecosystem. However, the Yi series models are not derivatives of Llama, as they do not use Llama's weights. Instead, they have independently created their own high-quality training datasets, efficient training pipelines, and robust training infrastructure entirely from the ground up.

## Model inputs and outputs

The `yi-6b` models are designed to handle a wide range of natural language tasks, from text generation to question answering. They take a text prompt as input and generate a response as output.

### Inputs
- **Prompt**: The text that serves as the starting point for the model's generation.

### Outputs
- **Generated text**: The model's response to the input prompt, which can be of varying length depending on the use case.

## Capabilities

The `yi-6b` models demonstrate strong performance across a variety of benchmarks, including language understanding, commonsense reasoning, and reading comprehension. They are particularly adept at tasks that require coherent and contextual responses, such as open-ended conversations, summarization, and question answering.

## What can I use it for?

The `yi-6b` models can be used for a wide range of applications, including:

- **Content generation**: Generating engaging and coherent text for tasks like creative writing, article generation, and dialogue systems.
- **Question answering**: Answering questions on a variety of topics, drawing upon their broad knowledge base.
- **Summarization**: Concisely summarizing long-form text, such as articles or reports.
- **Language understanding**: Performing tasks that require deep language comprehension, like sentiment analysis, text classification, and natural language inference.

## Things to try

One interesting aspect of the `yi-6b` models is their ability to engage in open-ended conversations. You can try providing the models with a variety of prompts and see how they respond, exploring their conversational capabilities and ability to maintain context. Additionally, you can experiment with fine-tuning the models on specific datasets or tasks to further enhance their performance in areas of interest to you.