![](/internlm/internlm-xcomposer2d5-7b/resolve/main/logo_en.png)

**InternLM-XComposer-2.5**

[Github Repo](https://github.com/InternLM/InternLM-XComposer)

[Online Demo](https://huggingface.co/spaces/Willow123/InternLM-XComposer)

[Paper](https://huggingface.co/papers/2407.03320)

**InternLM-XComposer2.5** excels in various text-image comprehension and composition applications, achieving GPT-4V level capabilities with merely a 7B LLM backend. IXC-2.5 is trained with 24K interleaved image-text contexts and can seamlessly extend to 96K long contexts via RoPE extrapolation. This long-context capability allows IXC-2.5 to excel in tasks requiring extensive input and output contexts.
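Extending past the training length relies on a general property of rotary position embeddings (RoPE): the attention score between a query and a key depends only on their relative offset, not on their absolute positions. The sketch below is a generic pure-Python illustration of that property, not the model's actual implementation; the helper `rope_rotate`, the base of 10000, and the toy vectors are assumptions made for demonstration.

```python
import math

def rope_rotate(vec, position, base=10000.0):
    """Apply a rotary position embedding to an even-length vector."""
    dim = len(vec)
    out = []
    for i in range(0, dim, 2):
        # Each pair of dimensions rotates at its own frequency.
        theta = position * base ** (-i / dim)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

q = [0.3, -1.2, 0.7, 0.5]   # toy query vector
k = [1.0, 0.4, -0.6, 0.9]   # toy key vector

# Same relative offset (5) at a short and at a very long absolute position.
near = dot(rope_rotate(q, 10), rope_rotate(k, 5))
far = dot(rope_rotate(q, 90_005), rope_rotate(k, 90_000))
print(abs(near - far) < 1e-6)
```

Because the score is position-invariant in this sense, positions beyond the training window still produce well-formed rotations; how well a trained model actually uses them is an empirical question that this sketch does not address.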

### Import from Transformers

To load the InternLM-XComposer2.5 model using Transformers, use the following code:

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM
    ckpt_path = "internlm/internlm-xcomposer2d5-7b"
    # The tokenizer runs on the CPU, so it is not moved to the GPU.
    tokenizer = AutoTokenizer.from_pretrained(ckpt_path, trust_remote_code=True)
    # Set `torch_dtype=torch.bfloat16` to load the model in bfloat16; otherwise it is loaded as float32 and might cause an OOM error.
    model = AutoModelForCausalLM.from_pretrained(ckpt_path, torch_dtype=torch.bfloat16, trust_remote_code=True).cuda()
    model = model.eval()
    

Quickstart
-------------------------

We provide simple examples to show how to use InternLM-XComposer2.5 with Transformers.

**Video Understanding**

    import torch
    from transformers import AutoModel, AutoTokenizer
    
    torch.set_grad_enabled(False)
    
    # init model and tokenizer
    model = AutoModel.from_pretrained('internlm/internlm-xcomposer2d5-7b', torch_dtype=torch.bfloat16, trust_remote_code=True).cuda().eval()
    tokenizer = AutoTokenizer.from_pretrained('internlm/internlm-xcomposer2d5-7b', trust_remote_code=True)
    model.tokenizer = tokenizer
    
    query = 'Here are some frames of a video. Describe this video in detail'
    image = ['./examples/liuxiang.mp4',]
    with torch.autocast(device_type='cuda', dtype=torch.float16):
        response, his = model.chat(tokenizer, query, image, do_sample=False, num_beams=3, use_meta=True)
    print(response)
    #The video opens with a shot of an athlete, dressed in a red and yellow uniform with the word "CHINA" emblazoned across the front, preparing for a race. 
    #The athlete, Liu Xiang, is seen in a crouched position, focused and ready, with the Olympic rings visible in the background, indicating the prestigious setting of the Olympic Games. As the race commences, the athletes are seen sprinting towards the hurdles, their determination evident in their powerful strides. 
    #The camera captures the intensity of the competition, with the athletes' numbers and times displayed on the screen, providing a real-time update on their performance. The race reaches a climax as Liu Xiang, still in his red and yellow uniform, triumphantly crosses the finish line, his arms raised in victory. 
    #The crowd in the stands erupts into cheers, their excitement palpable as they witness the athlete's success. The video concludes with a close-up shot of Liu Xiang, still basking in the glory of his victory, as the Olympic rings continue to symbolize the significance of the event.
    
    query = 'tell me the athlete code of Liu Xiang'
    image = ['./examples/liuxiang.mp4',]
    with torch.autocast(device_type='cuda', dtype=torch.float16):
        response, _ = model.chat(tokenizer, query, image, history=his, do_sample=False, num_beams=3, use_meta=True)
    print(response)
    #The athlete code of Liu Xiang, as displayed on his uniform in the video, is "1363".
**Multi-Image Multi-Turn Dialog**

    import torch
    from transformers import AutoModel, AutoTokenizer
    
    torch.set_grad_enabled(False)
    
    # init model and tokenizer
    model = AutoModel.from_pretrained('internlm/internlm-xcomposer2d5-7b', torch_dtype=torch.bfloat16, trust_remote_code=True).cuda().eval()
    tokenizer = AutoTokenizer.from_pretrained('internlm/internlm-xcomposer2d5-7b', trust_remote_code=True)
    model.tokenizer = tokenizer
    
    query = 'Image1 <ImageHere>; Image2 <ImageHere>; Image3 <ImageHere>; I want to buy a car from the three given cars, analyze their advantages and weaknesses one by one'
    image = ['./examples/cars1.jpg',
            './examples/cars2.jpg',
            './examples/cars3.jpg',]
    with torch.autocast(device_type='cuda', dtype=torch.float16):
        response, his = model.chat(tokenizer, query, image, do_sample=False, num_beams=3, use_meta=True)
    print(response)
    #To analyze the advantages and disadvantages of each car, we need to consider factors such as brand reputation, performance, design, cost, and maintenance. 
    #1. Mercedes-Benz: - Advantages: Known for its luxury and reliability, the Mercedes-Benz brand offers a high level of comfort, advanced technology, and superior craftsmanship. The vehicle in the image appears to be an SUV, which is versatile for both city driving and off-road conditions. - Disadvantages: Typically, Mercedes-Benz vehicles are more expensive compared to other brands, and they may require more frequent maintenance due to their luxury status. 
    #2. Bugatti: - Advantages: Bugatti is renowned for producing some of the fastest and most powerful cars in the world. The vehicle in the image is a sports car, likely offering an exhilarating driving experience with its high-performance engine and advanced aerodynamics. - Disadvantages: Bugatti cars are extremely expensive, making them less accessible to the average consumer. They also require specialized knowledge for maintenance and may not be suitable for everyday driving due to their high-performance nature. 
    #3. BMW: - Advantages: BMW is a well-respected manufacturer known for its innovative engineering and design. The car in the image appears to be an electric vehicle (EV), which offers advantages such as lower emissions, reduced noise pollution, and potentially lower operating costs. - Disadvantages: EVs like the one in the image may have a shorter range compared to traditional gasoline-powered vehicles, limiting their suitability for long-distance travel. 
    #Additionally, the initial cost of purchasing an EV can be higher than that of a conventional car. In conclusion, each car has its own set of advantages and disadvantages. The Mercedes-Benz offers luxury and versatility, the Bugatti provides high performance and exclusivity, and the BMW represents innovation in electric vehicle technology. The choice ultimately depends on the buyer's priorities, such as comfort, performance, environmental impact, and budget.
    
    query = 'Image4 <ImageHere>; How about the car in Image4'
    image.append('./examples/cars4.jpg')        
    with torch.autocast(device_type='cuda', dtype=torch.float16):
        response, _ = model.chat(tokenizer, query, image, do_sample=False, num_beams=3, history= his, use_meta=True)
    print(response)
    #The car in Image4 is a red sports car, which appears to be a Ferrari. Ferrari is a renowned Italian manufacturer known for producing some of the most iconic and high-performance sports cars in the world. - Advantages: Ferrari vehicles are synonymous with speed, luxury, and engineering excellence. 
    #The car in the image likely offers an exhilarating driving experience with its powerful engine, advanced aerodynamics, and high-quality craftsmanship. The red color adds to the car's aesthetic appeal, making it stand out on the road. - Disadvantages: Ferrari cars are extremely expensive, making them less accessible to the average consumer. 
    #They also require specialized knowledge for maintenance and may not be suitable for everyday driving due to their high-performance nature. In conclusion, the Ferrari in Image4 represents a pinnacle of automotive engineering and design, offering unmatched performance and luxury. 
    #However, its high cost and specialized maintenance requirements make it less practical for everyday use compared to the other vehicles in the images.
**High Resolution Image Understanding**

    import torch
    from transformers import AutoModel, AutoTokenizer
    
    torch.set_grad_enabled(False)
    
    # init model and tokenizer
    model = AutoModel.from_pretrained('internlm/internlm-xcomposer2d5-7b', torch_dtype=torch.bfloat16, trust_remote_code=True).cuda().eval()
    tokenizer = AutoTokenizer.from_pretrained('internlm/internlm-xcomposer2d5-7b', trust_remote_code=True)
    model.tokenizer = tokenizer
    
    query = 'Analyze the given image in a detail manner'
    image = ['./examples/dubai.png']
    with torch.autocast(device_type='cuda', dtype=torch.float16):
        response, _ = model.chat(tokenizer, query, image, do_sample=False, num_beams=3, use_meta=True)
    print(response)
    #The infographic is a visual representation of various facts about Dubai. It begins with a statement about Palm Jumeirah, highlighting it as the largest artificial island visible from space. It then provides a historical context, noting that in 1968, there were only a few cars in Dubai, contrasting this with the current figure of more than 1.5 million vehicles. 
    #The infographic also points out that Dubai has the world's largest Gold Chain, with 7 of the top 10 tallest hotels located there. Additionally, it mentions that the crime rate is near 0%, and the income tax rate is also 0%, with 20% of the world's total cranes operating in Dubai. Furthermore, it states that 17% of the population is Emirati, and 83% are immigrants.
    #The Dubai Mall is highlighted as the largest shopping mall in the world, with 1200 stores. The infographic also notes that Dubai has no standard address system, with no zip codes, area codes, or postal services. It mentions that the Burj Khalifa is so tall that its residents on top floors need to wait longer to break fast during Ramadan. 
    #The infographic also includes information about Dubai's climate-controlled City, with the Royal Suite at Burj Al Arab costing $24,000 per night. Lastly, it notes that the net worth of the four listed billionaires is roughly equal to the GDP of Honduras.
**Instruction to Webpage**

    import torch
    from transformers import AutoModel, AutoTokenizer
    
    torch.set_grad_enabled(False)
    
    # init model and tokenizer
    model = AutoModel.from_pretrained('internlm/internlm-xcomposer2d5-7b', torch_dtype=torch.bfloat16, trust_remote_code=True).cuda().eval()
    tokenizer = AutoTokenizer.from_pretrained('internlm/internlm-xcomposer2d5-7b', trust_remote_code=True)
    model.tokenizer = tokenizer
    
    query = 'A website for Research institutions. The name is Shanghai AI lab. Top Navigation Bar is blue.Below left, an image shows the logo of the lab. In the right, there is a passage of text below that describes the mission of the laboratory.There are several images to show the research projects of Shanghai AI lab.'
    with torch.autocast(device_type='cuda', dtype=torch.float16):
        response = model.write_webpage(query, seed=202, task='Instruction-aware Webpage Generation', repetition_penalty=3.0)
    print(response)
    # see the Instruction-aware Webpage Generation.html 
    

See the [Instruction to Webpage](https://github.com/InternLM/InternLM-XComposer/blob/main/examples/Instruction-aware_Webpage_Generation.html) results here.
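To inspect a generated page in a browser, the returned string can be written to an `.html` file. The following is a minimal sketch that assumes `response` is a plain HTML string; `save_webpage` is a hypothetical helper (not part of the model's API), and the sample `response` is a stand-in for the real output.

```python
import tempfile
from pathlib import Path

def save_webpage(html: str, path: str) -> Path:
    """Write generated HTML to `path`, creating parent folders if needed."""
    target = Path(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(html, encoding="utf-8")
    return target

# Stand-in for the string returned by model.write_webpage(...).
response = "<!DOCTYPE html><html><body><h1>Shanghai AI lab</h1></body></html>"

# A temporary folder keeps the example free of side effects in the working directory.
out_dir = Path(tempfile.mkdtemp())
saved = save_webpage(response, str(out_dir / "pages" / "webpage.html"))
print(saved)
```

The same helper would apply to the resume and screenshot examples, provided they also return HTML as a string.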

**Resume to Webpage**

    import torch
    from transformers import AutoModel, AutoTokenizer
    
    torch.set_grad_enabled(False)
    
    # init model and tokenizer
    model = AutoModel.from_pretrained('internlm/internlm-xcomposer2d5-7b', torch_dtype=torch.bfloat16, trust_remote_code=True).cuda().eval()
    tokenizer = AutoTokenizer.from_pretrained('internlm/internlm-xcomposer2d5-7b', trust_remote_code=True)
    model.tokenizer = tokenizer
    
    ## the input should be a resume in markdown format
    query = './examples/resume.md'
    with torch.autocast(device_type='cuda', dtype=torch.float16):
        response = model.resume_2_webpage(query, seed=202, repetition_penalty=3.0)
    print(response)
    

See the [Resume to Webpage](https://github.com/InternLM/InternLM-XComposer/blob/main/examples/Resume-to-Personal_Page.html) results here.

**Screenshot to Webpage**

    import torch
    from transformers import AutoModel, AutoTokenizer
    
    torch.set_grad_enabled(False)
    
    # init model and tokenizer
    model = AutoModel.from_pretrained('internlm/internlm-xcomposer2d5-7b', torch_dtype=torch.bfloat16, trust_remote_code=True).cuda().eval()
    tokenizer = AutoTokenizer.from_pretrained('internlm/internlm-xcomposer2d5-7b', trust_remote_code=True)
    model.tokenizer = tokenizer
    
    query = 'Generate the HTML code of this web image with Tailwind CSS.'
    image = ['./examples/screenshot.jpg']
    with torch.autocast(device_type='cuda', dtype=torch.float16):
        response = model.screen_2_webpage(query, image, seed=202, repetition_penalty=3.0)
    print(response)
    

See the [Screenshot to Webpage](https://github.com/InternLM/InternLM-XComposer/blob/main/examples/Screenshot-to-Webpage.html) results here.

**Write Article**

    import torch
    from transformers import AutoModel, AutoTokenizer
    
    torch.set_grad_enabled(False)
    
    # init model and tokenizer
    model = AutoModel.from_pretrained('internlm/internlm-xcomposer2d5-7b', torch_dtype=torch.bfloat16, trust_remote_code=True).cuda().eval()
    tokenizer = AutoTokenizer.from_pretrained('internlm/internlm-xcomposer2d5-7b', trust_remote_code=True)
    model.tokenizer = tokenizer
    
    query = 'Please write a blog based on the title: French Pastries: A Sweet Indulgence'
    with torch.autocast(device_type='cuda', dtype=torch.float16):
        response = model.write_artical(query, seed=8192)
    print(response)
    #French Pastries: A Sweet Indulgence
    #The French are well known for their love of pastries, and it's a love that is passed down through generations. When one visits France, they are treated to an assortment of baked goods that can range from the delicate macaron to the rich and decadent chocolate mousse. While there are many delicious types of pastries found in France, five stand out as being the most iconic. Each of these pastries has its own unique qualities that make it special.
    #1. Croissant
    #One of the most famous pastries from France is the croissant. It is a buttery, flaky pastry that is best enjoyed fresh from the bakery. The dough is laminated with butter, giving it its signature layers. Croissants are typically eaten for breakfast or brunch, often accompanied by coffee or hot chocolate.
    #2. Macaron
    #The macaron is a small, delicate French confection made from almond flour, powdered sugar, and egg whites. The macaron itself is sandwiched with a ganache or jam filling. They come in a variety of colors and flavors, making them a popular choice for both casual snacking and upscale desserts.
    #3. Madeleine
    #The madeleine is a small shell-shaped cake that is light and sponge-like. It is often flavored with lemon or orange zest and sometimes dipped in chocolate. Madeleines are perfect for an afternoon snack with tea or coffee.
    #4. Éclair
    #The éclair is a long, thin pastry filled with cream and topped with chocolate glaze. It is a classic French treat that is both sweet and satisfying. Éclairs can be found in bakeries all over France and are often enjoyed with a cup of hot chocolate.
    #5. Tarte Tatin
    #The tarte Tatin is an apple tart that is known for its caramelized apples and puff pastry crust. It is named after the Tatin sisters who created the recipe in the late 19th century. Tarte Tatin is best served warm with a scoop of vanilla ice cream.
    #These pastries are just a few of the many delicious treats that France has to offer. Whether you are a seasoned traveler or a first-time visitor, indulging in French pastries is a must-do activity. So go ahead, treat yourself; you deserve it!

### Open Source License

The code is licensed under Apache-2.0, while model weights are fully open for academic research and also allow free commercial usage. To apply for a commercial license, please fill in the [application form (English)](https://wj.qq.com/s2/12727483/5dba/). For other questions or collaborations, please contact [internlm@pjlab.org.cn](mailto:internlm@pjlab.org.cn).

## Model overview

`internlm-xcomposer2d5-7b` is a text-image comprehension and composition model developed by [internlm](https://aimodels.fyi/creators/huggingFace/internlm). It is based on the [InternLM2](https://github.com/InternLM/InternLM) language model and handles a variety of multimodal tasks, achieving GPT-4V level capabilities with just a 7B parameter LLM backbone.

The model is trained with 24K-token interleaved image-text contexts and can extend to 96K-token contexts via RoPE extrapolation. This long-context capability allows `internlm-xcomposer2d5-7b` to excel at tasks requiring extensive input and output contexts, such as detailed video understanding and complex image description.

Similar models developed by the internlm team include the [internlm-xcomposer2-vl-7b](https://aimodels.fyi/models/huggingFace/internlm-xcomposer2-vl-7b-internlm), a vision-language large model (VLLM) for advanced text-image comprehension and composition, and the [internlm-xcomposer2-4khd-7b](https://aimodels.fyi/models/huggingFace/internlm-xcomposer2-4khd-7b-internlm), a VLLM with 4K resolution image understanding capabilities.

## Model inputs and outputs

### Inputs
- **Text query**: The text prompt describing the task or request, such as "Describe this video in detail."
- **Image(s)**: The image(s) to be processed and understood in the context of the text query.
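
In the quickstart's multi-image examples, each image in the list is matched to an `<ImageHere>` placeholder in the text query. A small sketch of building such queries is shown below; `build_image_query` is a hypothetical helper written for illustration, and the `ImageN <ImageHere>;` layout simply mirrors the quickstart prompts rather than a documented requirement.

```python
def build_image_query(instruction: str, n_images: int, start: int = 1) -> str:
    """Prefix an instruction with one numbered <ImageHere> tag per image."""
    tags = [f"Image{i} <ImageHere>" for i in range(start, start + n_images)]
    return "; ".join(tags + [instruction])

# First turn: three images.
first = build_image_query(
    "I want to buy a car from the three given cars, "
    "analyze their advantages and weaknesses one by one",
    n_images=3,
)
# Follow-up turn: one additional image, numbered after the first three.
follow_up = build_image_query("How about the car in Image4", n_images=1, start=4)

print(first)
print(follow_up)
```

The number of placeholders should match the number of new images passed for that turn.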

### Outputs
- **Detailed response**: A long-form, coherent text response describing the image(s) in detail, tailored to the provided text query.

## Capabilities

`internlm-xcomposer2d5-7b` excels at a variety of text-image understanding and generation tasks. For example, it can provide detailed video summaries, as demonstrated in the quickstart example, where it generates a comprehensive description of a video featuring an athlete competing in the Olympics. The model's long-context capability allows it to maintain coherence and focus over lengthy inputs and outputs.

## What can I use it for?

`internlm-xcomposer2d5-7b` can be leveraged for a wide range of applications that require deep understanding and generation of text-image content. Some potential use cases include:

- **Content creation**: Generating detailed descriptions, captions, or stories to accompany images and videos for use in marketing, social media, or editorial content.
- **Visual question answering**: Answering complex questions about the contents and details of images.
- **Multimodal assistants**: Building AI assistants that can understand and respond to queries involving both text and visual information.
- **Artistic and creative applications**: Assisting with the ideation and description of conceptual artwork or illustrations.

## Things to try

One interesting aspect of `internlm-xcomposer2d5-7b` is its ability to engage in multi-turn, context-aware conversations about visual content. The quickstart examples demonstrate how the model can provide an initial detailed description of a video or a set of images, and then answer follow-up queries about specific details. This interactive, iterative process of understanding and describing visual information is worth exploring.

Another key feature of the model is its long-context capability, which allows it to maintain coherence and focus over lengthy inputs and outputs. Experimenting with prompts that involve extensive background information or complex, multi-part queries could uncover the full extent of this capability and unlock new use cases.

**InternLM**

![](https://github.com/InternLM/InternLM/assets/22529082/b9788105-8892-4398-8b47-b513a292378e)

**InternLM** [_HOT_](https://internlm.intern-ai.org.cn/)

[![evaluation](https://github.com/InternLM/InternLM/assets/22529082/f80a2a58-5ddf-471a-8da4-32ab65c8fd3b)](https://github.com/internLM/OpenCompass/)

[Github Repo](https://github.com/InternLM/InternLM)  [Reporting Issues](https://github.com/InternLM/InternLM/issues/new)

Introduction
-----------------------------

The Shanghai Artificial Intelligence Laboratory, in collaboration with SenseTime Technology, the Chinese University of Hong Kong, and Fudan University, has officially released the 20 billion parameter pretrained model, InternLM-20B. InternLM-20B was pre-trained on over **2.3T** Tokens containing high-quality English, Chinese, and code data. Additionally, the Chat version has undergone SFT and RLHF training, enabling it to better and more securely meet users' needs.

In terms of model structure, InternLM-20B opted for a deeper architecture, with a depth set at 60 layers. This surpasses the conventional 7B and 13B models that utilize 32 or 40 layers. When parameters are limited, increasing the number of layers can enhance the model's overall capability. Furthermore, compared to InternLM-7B, the pre-training data used for InternLM-20B underwent higher quality cleansing and was supplemented with data rich in knowledge and designed for reinforcing understanding and reasoning capabilities. As a result, it exhibits significant improvements in understanding, reasoning, mathematical, and programming abilities, all of which test the technical proficiency of language models. Overall, InternLM-20B features the following characteristics:

*   Outstanding overall performance
*   Strong utility invocation capability
*   Supports a 16k context length (through inference extrapolation)
*   Better value alignment.

Performance Evaluation
-------------------------------------------------

On the 5 capability dimensions proposed by OpenCompass, InternLM-20B has achieved excellent results (the bolded scores represent the best performances within the 13B-33B parameter range).

| Capability | Llama-13B | Llama2-13B | Baichuan2-13B | InternLM-20B | Llama-33B | Llama-65B | Llama2-70B |
|---|---|---|---|---|---|---|---|
| Language | 42.5 | 47 | 47.5 | **55** | 44.6 | 47.1 | 51.6 |
| Knowledge | 58.2 | 58.3 | 48.9 | 60.1 | **64** | 66 | 67.7 |
| Understanding | 45.5 | 50.9 | 58.1 | **67.3** | 50.6 | 54.2 | 60.8 |
| Reasoning | 42.7 | 43.6 | 44.2 | **54.9** | 46.4 | 49.8 | 55 |
| Examination | 37.3 | 45.2 | 51.8 | **62.5** | 47.4 | 49.7 | 57.3 |
| Overall | 43.8 | 47.3 | 49.4 | **59.2** | 48.9 | 51.9 | 57.4 |
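
The bolding rule (best score among models in the 13B-33B range) can be reproduced from the capability scores above. The snippet is a small illustrative sketch: the numbers are copied from the table, while the names `MODELS`, `IN_RANGE`, `SCORES`, and `best_in_range` are introduced here only for the example.

```python
MODELS = ["Llama-13B", "Llama2-13B", "Baichuan2-13B", "InternLM-20B",
          "Llama-33B", "Llama-65B", "Llama2-70B"]
# Models that fall inside the 13B-33B parameter range.
IN_RANGE = MODELS[:5]

# One row per capability dimension, in the same column order as MODELS.
SCORES = {
    "Language":      [42.5, 47, 47.5, 55, 44.6, 47.1, 51.6],
    "Knowledge":     [58.2, 58.3, 48.9, 60.1, 64, 66, 67.7],
    "Understanding": [45.5, 50.9, 58.1, 67.3, 50.6, 54.2, 60.8],
    "Reasoning":     [42.7, 43.6, 44.2, 54.9, 46.4, 49.8, 55],
    "Examination":   [37.3, 45.2, 51.8, 62.5, 47.4, 49.7, 57.3],
    "Overall":       [43.8, 47.3, 49.4, 59.2, 48.9, 51.9, 57.4],
}

def best_in_range(scores):
    """Return, per dimension, the in-range model with the highest score."""
    best = {}
    for dimension, row in scores.items():
        by_model = dict(zip(MODELS, row))
        best[dimension] = max(IN_RANGE, key=lambda name: by_model[name])
    return best

for dimension, model in best_in_range(SCORES).items():
    print(f"{dimension}: {model}")
```

InternLM-20B leads every dimension in this range except Knowledge, where Llama-33B scores higher.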

The table below compares the performance of mainstream open-source models on some influential and typical datasets.

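| Category | Benchmark | Llama-13B | Llama2-13B | Baichuan2-13B | InternLM-20B | Llama-33B | Llama-65B | Llama2-70B |
|---|---|---|---|---|---|---|---|---|
| Examination | MMLU | 47.73 | 54.99 | 59.55 | **62.05** | 58.73 | 63.71 | 69.75 |
| Examination | C-Eval (val) | 31.83 | 41.4 | **59.01** | 58.8 | 37.47 | 40.36 | 50.13 |
| Examination | AGI-Eval | 22.03 | 30.93 | 37.37 | **44.58** | 33.53 | 33.92 | 40.02 |
| Knowledge | BoolQ | 78.75 | 82.42 | 67 | **87.46** | 84.43 | 86.61 | 87.74 |
| Knowledge | TriviaQA | 52.47 | 59.36 | 46.61 | 57.26 | **66.24** | 69.79 | 70.71 |
| Knowledge | NaturalQuestions | 20.17 | 24.85 | 16.32 | 25.15 | **30.89** | 33.41 | 34.16 |
| Understanding | CMRC | 9.26 | 31.59 | 29.85 | **68.78** | 14.17 | 34.73 | 43.74 |
| Understanding | CSL | 55 | 58.75 | 63.12 | **65.62** | 57.5 | 59.38 | 60 |
| Understanding | RACE (middle) | 53.41 | 63.02 | 68.94 | **86.35** | 64.55 | 72.35 | 81.55 |
| Understanding | RACE (high) | 47.63 | 58.86 | 67.18 | **83.28** | 62.61 | 68.01 | 79.93 |
| Understanding | XSum | 20.37 | 23.37 | 25.23 | **35.54** | 20.55 | 19.91 | 25.38 |
| Reasoning | WinoGrande | 64.64 | 64.01 | 67.32 | **69.38** | 66.85 | 69.38 | 69.77 |
| Reasoning | BBH | 37.93 | 45.62 | 48.98 | **52.51** | 49.98 | 58.38 | 64.91 |
| Reasoning | GSM8K | 20.32 | 29.57 | **52.62** | **52.62** | 42.3 | 54.44 | 63.31 |
| Reasoning | PIQA | 79.71 | 79.76 | 78.07 | 80.25 | **81.34** | 82.15 | 82.54 |
| Programming | HumanEval | 14.02 | 18.9 | 17.07 | **25.61** | 17.68 | 18.9 | 26.22 |
| Programming | MBPP | 20.6 | 26.8 | 30.8 | **35.6** | 28.4 | 33.6 | 39.6 |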

Overall, InternLM-20B comprehensively outperforms open-source models in the 13B parameter range in terms of overall capabilities, and on reasoning evaluation sets it approaches or even surpasses the performance of Llama-65B.

Import from Transformers
-----------------------------------------------------

To load the InternLM 20B model using Transformers, use the following code:

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM
    tokenizer = AutoTokenizer.from_pretrained("internlm/internlm-chat-20b", trust_remote_code=True)
    # Set `torch_dtype=torch.bfloat16` to load model in bfloat16, otherwise it will be loaded as float32 and cause OOM Error.
    model = AutoModelForCausalLM.from_pretrained("internlm/internlm-chat-20b", torch_dtype=torch.bfloat16, trust_remote_code=True).cuda()
    model = model.eval()
    output, history = model.chat(tokenizer, "Hello! Today is sunny, it is time to go out")
    print(output)
    # Hello! Today is sunny, and it sounds like a great day to go out and enjoy the weather. What would you like to do?
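
The comment about `torch_dtype` comes down to simple arithmetic on weight storage. As a rough sketch (weights only, ignoring activations, the KV cache, and framework overhead, and treating the parameter count as exactly 20 billion), the function `weight_memory_gib` below is a hypothetical helper for the estimate:

```python
def weight_memory_gib(n_params: float, bytes_per_param: int) -> float:
    """Approximate memory needed to hold the weights, in GiB."""
    return n_params * bytes_per_param / 2**30

n_params = 20e9  # assumed round figure for a 20B model
for name, size in (("float32", 4), ("bfloat16", 2)):
    print(f"{name}: {weight_memory_gib(n_params, size):.1f} GiB")
```

Under these assumptions float32 weights need roughly 75 GiB and bfloat16 weights roughly 37 GiB, which is why the half-precision setting matters on a single accelerator.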
    

The responses can be streamed using `stream_chat`:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    model_path = "internlm/internlm-chat-20b"
    model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.bfloat16, trust_remote_code=True)
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    
    model = model.eval()
    length = 0
    for response, history in model.stream_chat(tokenizer, "Hello", history=[]):
        print(response[length:], flush=True, end="")
        length = len(response)
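
The loop above slices `response[length:]`, which suggests that each iteration yields the full response generated so far rather than only the new piece. The sketch below isolates that pattern with a stand-in generator; `fake_stream_chat` and `iter_deltas` are hypothetical names for illustration and do not call the model.

```python
def fake_stream_chat(text, step=4):
    """Stand-in generator: yields (cumulative_response, history) pairs."""
    for end in range(step, len(text) + step, step):
        yield text[:end], []

def iter_deltas(stream):
    """Turn cumulative responses into the newly generated pieces."""
    seen = 0
    for response, _history in stream:
        yield response[seen:]
        seen = len(response)

full_text = "Hello! How can I help you today?"
pieces = list(iter_deltas(fake_stream_chat(full_text)))
print("".join(pieces))
```

Joining the pieces reproduces the full response, so each piece can be forwarded to a UI as soon as it arrives.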
    

**Limitations:** Although we have made efforts to ensure the safety of the model during the training process and to encourage the model to generate text that complies with ethical and legal requirements, the model may still produce unexpected outputs due to its size and probabilistic generation paradigm. For example, the generated responses may contain biases, discrimination, or other harmful content. Please do not propagate such content. We are not responsible for any consequences resulting from the dissemination of harmful information.

Open Source License
-------------------------------------------

The code is licensed under Apache-2.0, while model weights are fully open for academic research and also allow **free** commercial usage. To apply for a commercial license, please fill in the [application form (English)](https://wj.qq.com/s2/12727483/5dba/) or the [application form (Chinese)](https://wj.qq.com/s2/12725412/f7c1/). For other questions or collaborations, please contact [internlm@pjlab.org.cn](mailto:internlm@pjlab.org.cn).


## Model overview

`internlm-chat-20b` is a large language model developed by the Shanghai Artificial Intelligence Laboratory, in collaboration with SenseTime Technology, the Chinese University of Hong Kong, and Fudan University. The model has 20 billion parameters and was pre-trained on over 2.3 trillion tokens of high-quality English, Chinese, and code data. Compared to smaller 7B and 13B models, `internlm-chat-20b` has a deeper architecture with 60 layers, which can enhance the model's overall capability when parameters are limited.

The model has undergone SFT and RLHF training, enabling it to better and more securely meet users' needs. It exhibits significant improvements in understanding, reasoning, mathematical, and programming abilities compared to smaller models like [Llama-13B](https://aimodels.fyi/models/huggingFace/llama-2-13b-hf-nousresearch), [Llama2-13B](https://aimodels.fyi/models/huggingFace/llama-2-13b-hf-nousresearch), and [Baichuan2-13B](https://aimodels.fyi/models/huggingFace/baichuan2-13b-chat-baichuan-inc).

## Model inputs and outputs

### Inputs
- Text prompts in natural language

### Outputs
- Generated text responses to the input prompts

## Capabilities

`internlm-chat-20b` has demonstrated excellent overall performance, strong utility invocation capability, and supports a 16k context length through inference extrapolation. It also exhibits better value alignment compared to other large language models.

On the 5 capability dimensions proposed by OpenCompass, `internlm-chat-20b` achieves the best overall score (59.2) within the 13B-33B parameter range and leads on four of the five dimensions (Llama-33B scores higher on Knowledge), outperforming models like [Llama-13B](https://aimodels.fyi/models/huggingFace/llama-2-13b-hf-nousresearch), [Llama2-13B](https://aimodels.fyi/models/huggingFace/llama-2-13b-hf-nousresearch), and [Baichuan2-13B](https://aimodels.fyi/models/huggingFace/baichuan2-13b-chat-baichuan-inc).

## What can I use it for?

`internlm-chat-20b` can be used for a variety of natural language processing tasks, including text generation, question answering, language translation, and code generation. The model's strong performance on understanding, reasoning, and programming tasks makes it a powerful tool for developers and researchers working on advanced AI applications.

## Things to try

One interesting aspect of `internlm-chat-20b` is its ability to support a 16k context length through inference extrapolation, which is significantly longer than the 4096 context length of many other large language models. This could enable the model to handle longer-form text generation tasks or applications that require maintaining context over longer sequences.
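Before sending a long prompt, it can help to check that the prompt and the planned output fit in the window. The sketch below is illustrative only: it assumes "16k" means 16,384 tokens, and it uses whitespace splitting as a stand-in for the model's real tokenizer, so `fits_in_context` and `CONTEXT_WINDOW` are hypothetical names rather than part of any InternLM API.

```python
from typing import Callable, List

CONTEXT_WINDOW = 16 * 1024  # assumption: "16k" means 16,384 tokens


def fits_in_context(
    prompt: str,
    max_new_tokens: int,
    tokenize: Callable[[str], List[str]] = str.split,  # stand-in for a real tokenizer
    window: int = CONTEXT_WINDOW,
) -> bool:
    """Return True if the prompt plus the generation budget fits in the window."""
    return len(tokenize(prompt)) + max_new_tokens <= window


short_prompt = "Summarize the following report. " * 100  # 400 whitespace tokens
long_prompt = "word " * 20_000                           # 20,000 whitespace tokens

print(fits_in_context(short_prompt, max_new_tokens=512))
print(fits_in_context(long_prompt, max_new_tokens=512))
```

With a real tokenizer, pass a callable that returns the model's token IDs in place of `str.split`; the counts will differ from whitespace counts.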

[](#internlm)InternLM
=====================

![](https://github.com/InternLM/InternLM/assets/22529082/b9788105-8892-4398-8b47-b513a292378e)

**InternLM** [_HOT_](https://internlm.intern-ai.org.cn/)

[![evaluation](https://github.com/InternLM/InternLM/assets/22529082/f80a2a58-5ddf-471a-8da4-32ab65c8fd3b)](https://github.com/internLM/OpenCompass/)

[Github Repo](https://github.com/InternLM/InternLM)  [Reporting Issues](https://github.com/InternLM/InternLM/issues/new)  [Technical Report](https://arxiv.org/abs/2403.17297)

Join us on [Discord](https://discord.gg/xa29JuW87d) and [WeChat](https://github.com/InternLM/InternLM/assets/25839884/a6aad896-7232-4220-ac84-9e070c2633ce)

[](#introduction)Introduction
-----------------------------

InternLM2.5 has open-sourced a 7 billion parameter base model and a chat model tailored for practical scenarios. The model has the following characteristics:

*   **Outstanding reasoning capability**: State-of-the-art performance on Math reasoning, surpassing models like Llama3 and Gemma2-9B.
    
*   **1M Context window**: Nearly perfect at finding needles in the haystack with 1M-long context, with leading performance on long-context tasks like LongBench. Try it with [LMDeploy](https://github.com/InternLM/InternLM/blob/main/chat/lmdeploy.md) for 1M-context inference.
    
*   **Stronger tool use**: InternLM2.5 supports gathering information from more than 100 web pages; the corresponding implementation will be released in [Lagent](https://github.com/InternLM/lagent/tree/main) soon. InternLM2.5 also has better tool-use capabilities in instruction following, tool selection, and reflection. See [examples](https://github.com/InternLM/InternLM/blob/main/agent/lagent.md).
    

[](#internlm25-7b-chat)InternLM2.5-7B-Chat
------------------------------------------

### [](#performance-evaluation)Performance Evaluation

We conducted a comprehensive evaluation of InternLM using the open-source evaluation tool [OpenCompass](https://github.com/internLM/OpenCompass/). The evaluation covered five dimensions of capabilities: disciplinary competence, language competence, knowledge competence, inference competence, and comprehension competence. Here are some of the evaluation results, and you can visit the [OpenCompass leaderboard](https://rank.opencompass.org.cn) for more evaluation results.

| Dataset\\Models | Qwen2-7B-Instruct | Yi-1.5-9B-Chat | GLM-4-9B-Chat | Llama-3-8B-Instruct | Gemma2-9B-IT | InternLM2.5-7B-Chat | Llama-3-70B-Instruct |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MMLU | 70.8 | 71.0 | 71.4 | 68.4 | 70.9 | 72.8 | 80.5 |
| CMMLU | 80.9 | 74.5 | 74.5 | 53.3 | 60.3 | 78.0 | 70.1 |
| BBH | 65 | 69.6 | 69.6 | 65.4 | 68.2 | 71.6 | 80.5 |
| MATH | 48.6 | 51.1 | 51.1 | 27.9 | 46.9 | 60.7 | 47.1 |
| GSM8K | 82.9 | 80.1 | 85.3 | 72.9 | 88.9 | 86.0 | 92.8 |
| GPQA | 38.4 | 37.9 | 36.9 | 26.3 | 33.8 | 38.4 | 38.9 |

*   The evaluation results were obtained from [OpenCompass](https://github.com/internLM/OpenCompass/) (entries marked with \* come from the original papers); the evaluation configuration can be found in the configuration files provided by [OpenCompass](https://github.com/internLM/OpenCompass/).
*   The evaluation data may have numerical differences due to the version iteration of [OpenCompass](https://github.com/internLM/OpenCompass/), so please refer to the latest evaluation results of [OpenCompass](https://github.com/internLM/OpenCompass/).
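To read the table above programmatically, the following sketch finds the best-scoring model per benchmark among the 7B to 9B models. The scores are copied from the table; the `leaders` helper and the choice to exclude the 70B model are illustrative assumptions, not part of OpenCompass.

```python
# Scores copied from the table above, in the same column order.
MODELS = [
    "Qwen2-7B-Instruct", "Yi-1.5-9B-Chat", "GLM-4-9B-Chat", "Llama-3-8B-Instruct",
    "Gemma2-9B-IT", "InternLM2.5-7B-Chat", "Llama-3-70B-Instruct",
]
SCORES = {
    "MMLU":  [70.8, 71.0, 71.4, 68.4, 70.9, 72.8, 80.5],
    "CMMLU": [80.9, 74.5, 74.5, 53.3, 60.3, 78.0, 70.1],
    "BBH":   [65.0, 69.6, 69.6, 65.4, 68.2, 71.6, 80.5],
    "MATH":  [48.6, 51.1, 51.1, 27.9, 46.9, 60.7, 47.1],
    "GSM8K": [82.9, 80.1, 85.3, 72.9, 88.9, 86.0, 92.8],
    "GPQA":  [38.4, 37.9, 36.9, 26.3, 33.8, 38.4, 38.9],
}


def leaders(scores, models, exclude=()):
    """Map each benchmark to the best-scoring model(s), ignoring excluded ones."""
    result = {}
    for bench, row in scores.items():
        candidates = [(m, s) for m, s in zip(models, row) if m not in exclude]
        best = max(s for _, s in candidates)
        # Keep every model that ties for the best score, in column order.
        result[bench] = [m for m, s in candidates if s == best]
    return result


# Compare only the similarly sized models by leaving out the 70B reference.
small = leaders(SCORES, MODELS, exclude={"Llama-3-70B-Instruct"})
for bench, names in small.items():
    print(f"{bench}: {', '.join(names)}")
```

On these numbers, InternLM2.5-7B-Chat leads MMLU, BBH, and MATH among the smaller models and ties on GPQA, while Qwen2-7B-Instruct leads CMMLU and Gemma2-9B-IT leads GSM8K.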

**Limitations:** Although we have made efforts to ensure the safety of the model during the training process and to encourage the model to generate text that complies with ethical and legal requirements, the model may still produce unexpected outputs due to its size and probabilistic generation paradigm. For example, the generated responses may contain biases, discrimination, or other harmful content. Please do not propagate such content. We are not responsible for any consequences resulting from the dissemination of harmful information.

### [](#import-from-transformers)Import from Transformers

To load the InternLM2.5 7B Chat model using Transformers, use the following code:

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM
    tokenizer = AutoTokenizer.from_pretrained("internlm/internlm2_5-7b-chat", trust_remote_code=True)
    # Set `torch_dtype=torch.float16` to load the model in float16; otherwise it is loaded as float32, which may cause an out-of-memory (OOM) error.
    model = AutoModelForCausalLM.from_pretrained("internlm/internlm2_5-7b-chat", torch_dtype=torch.float16, trust_remote_code=True).cuda()
    model = model.eval()
    response, history = model.chat(tokenizer, "hello", history=[])
    print(response)
    # Hello! How can I help you today?
    response, history = model.chat(tokenizer, "please provide three suggestions about time management", history=history)
    print(response)
    

The responses can be streamed using `stream_chat`:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    model_path = "internlm/internlm2_5-7b-chat"
    model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16, trust_remote_code=True).cuda()
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    
    model = model.eval()
    length = 0
    for response, history in model.stream_chat(tokenizer, "Hello", history=[]):
        print(response[length:], flush=True, end="")
        length = len(response)
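
The slicing in the loop above implies that each iteration receives the cumulative response so far, so only the new suffix is printed. The sketch below isolates that pattern without loading a model. `fake_stream_chat` is a hypothetical stand-in for `model.stream_chat`, and the shape of its `history` value is an assumption made for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

History = List[Tuple[str, str]]  # assumed shape: (query, response) pairs


def fake_stream_chat(query: str) -> Iterator[Tuple[str, History]]:
    """Hypothetical stand-in for `model.stream_chat`: yields the cumulative reply so far."""
    reply = "Hello! How can I help you today?"
    for end in range(6, len(reply) + 6, 6):
        partial = reply[:end]
        yield partial, [(query, partial)]


def iter_deltas(stream: Iterable[Tuple[str, History]]) -> Iterator[str]:
    """Turn cumulative responses into the newly added text of each step."""
    length = 0
    for response, _history in stream:
        yield response[length:]
        length = len(response)


deltas = list(iter_deltas(fake_stream_chat("Hello")))
print(deltas)
print("".join(deltas))
```

Joining the deltas reproduces the full reply, which is why printing with `end=""` and `flush=True` gives a typewriter-style display.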
    

[](#deployment)Deployment
-------------------------

### [](#lmdeploy)LMDeploy

LMDeploy is a toolkit for compressing, deploying, and serving LLMs, developed by the MMRazor and MMDeploy teams.

    pip install lmdeploy
    

You can run batch inference locally with the following python code:

    import lmdeploy
    pipe = lmdeploy.pipeline("internlm/internlm2_5-7b-chat")
    response = pipe(["Hi, pls intro yourself", "Shanghai is"])
    print(response)
    

Or you can launch an OpenAI compatible server with the following command:

    lmdeploy serve api_server internlm/internlm2_5-7b-chat --model-name internlm2_5-7b-chat --server-port 23333 
    

Then you can send a chat request to the server:

    curl http://localhost:23333/v1/chat/completions \
        -H "Content-Type: application/json" \
        -d '{
        "model": "internlm2_5-7b-chat",
        "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Introduce deep learning to me."}
        ]
        }'
    

Find more details in the [LMDeploy documentation](https://lmdeploy.readthedocs.io/en/latest/)
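The same request as the `curl` call above can be built from Python with only the standard library. This is a minimal sketch: `build_chat_request` is a hypothetical helper, and the host and port are taken from the server command shown earlier. The actual send is left commented out because it needs a running server.

```python
import json
import urllib.request


def build_chat_request(base_url, model, user_message,
                       system_message="You are a helpful assistant."):
    """Build a POST request for an OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_message},
        ],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request(
    "http://localhost:23333", "internlm2_5-7b-chat", "Introduce deep learning to me."
)
print(request.full_url)
print(request.get_method())

# Sending requires a running server, so it is left commented out:
# with urllib.request.urlopen(request) as resp:
#     print(json.load(resp))
```

Because the endpoint path is the same for any OpenAI-compatible server, changing `base_url` is enough to point the helper at a different port.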

### [](#vllm)vLLM

Launch OpenAI compatible server with `vLLM>=0.3.2`:

    pip install vllm
    

    python -m vllm.entrypoints.openai.api_server --model internlm/internlm2_5-7b-chat --served-model-name internlm2_5-7b-chat --trust-remote-code
    

Then you can send a chat request to the server:

    curl http://localhost:8000/v1/chat/completions \
        -H "Content-Type: application/json" \
        -d '{
        "model": "internlm2_5-7b-chat",
        "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Introduce deep learning to me."}
        ]
        }'
    

Find more details in the [vLLM documentation](https://docs.vllm.ai/en/latest/index.html)
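An OpenAI-compatible server returns a JSON body in which the reply text sits under `choices[0].message.content`. The sketch below extracts it. The `sample_body` string is a hand-written, hypothetical response used only for illustration, not output captured from a real server, and `extract_reply` is an illustrative helper name.

```python
import json

# Hypothetical response body, shaped like an OpenAI chat completion.
sample_body = """
{
  "id": "example-id",
  "object": "chat.completion",
  "model": "internlm2_5-7b-chat",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Deep learning is a branch of machine learning."},
      "finish_reason": "stop"
    }
  ]
}
"""


def extract_reply(body: str) -> str:
    """Return the assistant text from a chat completion response body."""
    data = json.loads(body)
    choices = data.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


print(extract_reply(sample_body))
```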

[](#open-source-license)Open Source License
-------------------------------------------

The code is licensed under Apache-2.0, while model weights are fully open for academic research and also allow **free** commercial usage. To apply for a commercial license, please fill in the [application form (English)](https://wj.qq.com/s2/12727483/5dba/)/[application form (Chinese)](https://wj.qq.com/s2/12725412/f7c1/). For other questions or collaborations, please contact [internlm@pjlab.org.cn](mailto:internlm@pjlab.org.cn).

[](#citation)Citation
---------------------

    @misc{cai2024internlm2,
          title={InternLM2 Technical Report},
          author={Zheng Cai and Maosong Cao and Haojiong Chen and Kai Chen and Keyu Chen and Xin Chen and Xun Chen and Zehui Chen and Zhi Chen and Pei Chu and Xiaoyi Dong and Haodong Duan and Qi Fan and Zhaoye Fei and Yang Gao and Jiaye Ge and Chenya Gu and Yuzhe Gu and Tao Gui and Aijia Guo and Qipeng Guo and Conghui He and Yingfan Hu and Ting Huang and Tao Jiang and Penglong Jiao and Zhenjiang Jin and Zhikai Lei and Jiaxing Li and Jingwen Li and Linyang Li and Shuaibin Li and Wei Li and Yining Li and Hongwei Liu and Jiangning Liu and Jiawei Hong and Kaiwen Liu and Kuikun Liu and Xiaoran Liu and Chengqi Lv and Haijun Lv and Kai Lv and Li Ma and Runyuan Ma and Zerun Ma and Wenchang Ning and Linke Ouyang and Jiantao Qiu and Yuan Qu and Fukai Shang and Yunfan Shao and Demin Song and Zifan Song and Zhihao Sui and Peng Sun and Yu Sun and Huanze Tang and Bin Wang and Guoteng Wang and Jiaqi Wang and Jiayu Wang and Rui Wang and Yudong Wang and Ziyi Wang and Xingjian Wei and Qizhen Weng and Fan Wu and Yingtong Xiong and Chao Xu and Ruiliang Xu and Hang Yan and Yirong Yan and Xiaogui Yang and Haochen Ye and Huaiyuan Ying and Jia Yu and Jing Yu and Yuhang Zang and Chuyu Zhang and Li Zhang and Pan Zhang and Peng Zhang and Ruijie Zhang and Shuo Zhang and Songyang Zhang and Wenjian Zhang and Wenwei Zhang and Xingcheng Zhang and Xinyue Zhang and Hui Zhao and Qian Zhao and Xiaomeng Zhao and Fengzhe Zhou and Zaida Zhou and Jingming Zhuo and Yicheng Zou and Xipeng Qiu and Yu Qiao and Dahua Lin},
          year={2024},
          eprint={2403.17297},
          archivePrefix={arXiv},
          primaryClass={cs.CL}
    }
    


## Model overview

The `internlm2-5-7b-chat` model is a 7 billion parameter language model developed by [internlm](https://aimodels.fyi/creators/huggingFace/internlm). It is part of the InternLM family of models, which also includes the `internlm2-chat-7b` and `internlm-chat-7b` models. The InternLM models are known for their outstanding reasoning capabilities, long-context support, and stronger tool use abilities compared to other open-source models of similar size.

The `internlm2-5-7b-chat` model specifically demonstrates state-of-the-art performance on math reasoning tasks, surpassing models like Llama3 and Gemma2-9B. It also excels at finding relevant information in contexts up to 1M tokens long, with near-perfect needle-in-a-haystack retrieval and leading results on long-context benchmarks such as LongBench. Additionally, the model supports gathering information from over 100 web pages, with the corresponding implementation to be released in the [Lagent](https://github.com/InternLM/lagent/tree/main) project soon.

## Model inputs and outputs

### Inputs
- Natural language text prompts for the model to generate a response to.

### Outputs
- Generated natural language text responses to the input prompts.

## Capabilities

The `internlm2-5-7b-chat` model showcases several advanced capabilities. It demonstrates outstanding reasoning skills, particularly in mathematical tasks, outperforming models such as Llama3 and Gemma2-9B. The model can also process long input contexts of up to 1M tokens, making it highly effective at "finding needles in haystacks" for tasks that require gathering and synthesizing information from large amounts of text.

Additionally, the `internlm2-5-7b-chat` model has stronger tool use abilities compared to other open-source models. It can leverage over 100 web pages to gather information, and the upcoming [Lagent](https://github.com/InternLM/lagent/tree/main) project will further expand its tool utilization capabilities for complex, multi-step tasks.

## What can I use it for?

The `internlm2-5-7b-chat` model's advanced reasoning, long-context, and tool use capabilities make it well-suited for a variety of applications, such as:

- Answering complex, multi-part questions that require gathering and synthesizing information from large amounts of text
- Solving challenging mathematical and logical problems
- Assisting with research and analysis tasks that involve sifting through large volumes of information
- Developing intelligent virtual assistants and chatbots with sophisticated language understanding and reasoning abilities

## Things to try

One key aspect to explore with the `internlm2-5-7b-chat` model is its impressive ability to process and reason over long input contexts. Try providing the model with prompts that require it to draw insights and connections from extensive amounts of text, and observe how it is able to efficiently locate and integrate relevant information to formulate a coherent response.

Another intriguing area to investigate is the model's evolving tool use capabilities. As the [Lagent](https://github.com/InternLM/lagent/tree/main) project progresses, experiment with prompts that involve the model leveraging various tools and data sources to tackle complex, multi-step tasks. This will help uncover the model's potential to serve as a versatile and adaptable assistant for a wide range of applications.

[](#internlm)InternLM
=====================

![](https://github.com/InternLM/InternLM/assets/22529082/b9788105-8892-4398-8b47-b513a292378e)

**InternLM** [_HOT_](https://internlm.intern-ai.org.cn/)

[![evaluation](https://github.com/InternLM/InternLM/assets/22529082/f80a2a58-5ddf-471a-8da4-32ab65c8fd3b)](https://github.com/internLM/OpenCompass/)

[Github Repo](https://github.com/InternLM/InternLM)  [Reporting Issues](https://github.com/InternLM/InternLM/issues/new)

[](#introduction)Introduction
-----------------------------

InternLM has open-sourced a 7 billion parameter base model and a chat model tailored for practical scenarios. The model has the following characteristics:

*   It leverages trillions of high-quality tokens for training to establish a powerful knowledge base.
*   It supports an 8k context window length, enabling longer input sequences and stronger reasoning capabilities.
*   It provides a versatile toolset for users to flexibly build their own workflows.

[](#internlm-7b)InternLM-7B
---------------------------

### [](#performance-evaluation)Performance Evaluation

We conducted a comprehensive evaluation of InternLM using the open-source evaluation tool [OpenCompass](https://github.com/internLM/OpenCompass/). The evaluation covered five dimensions of capabilities: disciplinary competence, language competence, knowledge competence, inference competence, and comprehension competence. Here are some of the evaluation results, and you can visit the [OpenCompass leaderboard](https://opencompass.org.cn/rank) for more evaluation results.

| Datasets\\Models | **InternLM-Chat-7B** | **InternLM-7B** | LLaMA-7B | Baichuan-7B | ChatGLM2-6B | Alpaca-7B | Vicuna-7B |
| --- | --- | --- | --- | --- | --- | --- | --- |
| C-Eval(Val) | 53.2 | 53.4 | 24.2 | 42.7 | 50.9 | 28.9 | 31.2 |
| MMLU | 50.8 | 51.0 | 35.2\* | 41.5 | 46.0 | 39.7 | 47.3 |
| AGIEval | 42.5 | 37.6 | 20.8 | 24.6 | 39.0 | 24.1 | 26.4 |
| CommonSenseQA | 75.2 | 59.5 | 65.0 | 58.8 | 60.0 | 68.7 | 66.7 |
| BUSTM | 74.3 | 50.6 | 48.5 | 51.3 | 55.0 | 48.8 | 62.5 |
| CLUEWSC | 78.6 | 59.1 | 50.3 | 52.8 | 59.8 | 50.3 | 52.2 |
| MATH | 6.4 | 7.1 | 2.8 | 3.0 | 6.6 | 2.2 | 2.8 |
| GSM8K | 34.5 | 31.2 | 10.1 | 9.7 | 29.2 | 6.0 | 15.3 |
| HumanEval | 14.0 | 10.4 | 14.0 | 9.2 | 9.2 | 9.2 | 11.0 |
| RACE(High) | 76.3 | 57.4 | 46.9\* | 28.1 | 66.3 | 40.7 | 54.0 |

*   The evaluation results were obtained from [OpenCompass 20230706](https://github.com/internLM/OpenCompass/) (entries marked with \* come from the original papers); the evaluation configuration can be found in the configuration files provided by [OpenCompass](https://github.com/internLM/OpenCompass/).
*   The evaluation data may have numerical differences due to the version iteration of [OpenCompass](https://github.com/internLM/OpenCompass/), so please refer to the latest evaluation results of [OpenCompass](https://github.com/internLM/OpenCompass/).
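One way to read the first two columns of the table above is to compute how much the chat model gains or loses relative to the base model on each benchmark. The sketch below does that with scores copied from the table; the variable names are illustrative, and differences are rounded to one decimal place to avoid floating-point noise.

```python
# (InternLM-Chat-7B, InternLM-7B) scores copied from the table above.
BENCHMARKS = {
    "C-Eval(Val)":   (53.2, 53.4),
    "MMLU":          (50.8, 51.0),
    "AGIEval":       (42.5, 37.6),
    "CommonSenseQA": (75.2, 59.5),
    "BUSTM":         (74.3, 50.6),
    "CLUEWSC":       (78.6, 59.1),
    "MATH":          (6.4, 7.1),
    "GSM8K":         (34.5, 31.2),
    "HumanEval":     (14.0, 10.4),
    "RACE(High)":    (76.3, 57.4),
}

# Chat score minus base score, per benchmark.
gains = {name: round(chat - base, 1) for name, (chat, base) in BENCHMARKS.items()}

largest_gain = max(gains, key=gains.get)
regressions = [name for name, delta in gains.items() if delta < 0]

for name, delta in sorted(gains.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:14s} {delta:+.1f}")
print("largest gain:", largest_gain)
print("chat below base on:", regressions)
```

On these numbers the chat model gains most on the language-understanding sets (BUSTM, CLUEWSC, RACE) and is slightly below the base model on C-Eval, MMLU, and MATH.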

**Limitations:** Although we have made efforts to ensure the safety of the model during the training process and to encourage the model to generate text that complies with ethical and legal requirements, the model may still produce unexpected outputs due to its size and probabilistic generation paradigm. For example, the generated responses may contain biases, discrimination, or other harmful content. Please do not propagate such content. We are not responsible for any consequences resulting from the dissemination of harmful information.

### [](#import-from-transformers)Import from Transformers

To load the InternLM 7B Chat model using Transformers, use the following code:

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM
    tokenizer = AutoTokenizer.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True)
    # Set `torch_dtype=torch.float16` to load the model in float16; otherwise it is loaded as float32, which may cause an out-of-memory (OOM) error.
    model = AutoModelForCausalLM.from_pretrained("internlm/internlm-chat-7b", torch_dtype=torch.float16, trust_remote_code=True).cuda()
    model = model.eval()
    response, history = model.chat(tokenizer, "hello", history=[])
    print(response)
    # Hello! How can I help you today?
    response, history = model.chat(tokenizer, "please provide three suggestions about time management", history=history)
    print(response)
    # Sure, here are three tips for effective time management:
    #
    # 1. Prioritize tasks based on importance and urgency: Make a list of all your tasks and categorize them into "important and urgent," "important but not urgent," and "not important but urgent." Focus on completing the tasks in the first category before moving on to the others.
    # 2. Use a calendar or planner: Write down deadlines and appointments in a calendar or planner so you don't forget them. This will also help you schedule your time more effectively and avoid overbooking yourself.
    # 3. Minimize distractions: Try to eliminate any potential distractions when working on important tasks. Turn off notifications on your phone, close unnecessary tabs on your computer, and find a quiet place to work if possible.
    # 
    # Remember, good time management skills take practice and patience. Start with small steps and gradually incorporate these habits into your daily routine.
    

The responses can be streamed using `stream_chat`:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    model_path = "internlm/internlm-chat-7b"
    model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16, trust_remote_code=True).cuda()
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    
    model = model.eval()
    length = 0
    for response, history in model.stream_chat(tokenizer, "Hello", history=[]):
        print(response[length:], flush=True, end="")
        length = len(response)
    

[](#open-source-license)Open Source License
-------------------------------------------

The code is licensed under Apache-2.0, while model weights are fully open for academic research and also allow **free** commercial usage. To apply for a commercial license, please fill in the [application form (English)](https://wj.qq.com/s2/12727483/5dba/)/[application form (Chinese)](https://wj.qq.com/s2/12725412/f7c1/). For other questions or collaborations, please contact [internlm@pjlab.org.cn](mailto:internlm@pjlab.org.cn).


## Model overview

`internlm-chat-7b` is a 7 billion parameter AI language model developed by InternLM, a collaboration between the Shanghai Artificial Intelligence Laboratory, SenseTime Technology, the Chinese University of Hong Kong, and Fudan University. The model was trained on a vast dataset of over 2 trillion high-quality tokens, establishing a powerful knowledge base. To enable longer input sequences and stronger reasoning capabilities, it supports an 8k context window length. Compared to other models in the 7B parameter range, [InternLM-7B](https://aimodels.fyi/models/huggingFace/internlm-7b-internlm) and [InternLM-Chat-7B](https://aimodels.fyi/models/huggingFace/internlm-chat-7b-internlm) demonstrate significantly stronger performance across a range of benchmarks, including disciplinary competence, language competence, knowledge competence, inference competence, and comprehensive understanding.

## Model inputs and outputs

`internlm-chat-7b` is a text-to-text language model that can be used for a variety of natural language processing tasks. The model takes plain text as input and generates text as output. Some key highlights include:

### Inputs
- **Natural language prompts**: The model can accept a wide range of natural language prompts, from simple queries to multi-sentence instructions.
- **Context length**: The model supports an 8k context window, allowing it to reason over longer input sequences.

### Outputs
- **Natural language responses**: The model generates human-readable text responses, which can range from short phrases to multi-paragraph passages.
- **Versatile toolset**: The model provides a flexible toolset, enabling users to build their own custom workflows and applications.

## Capabilities

`internlm-chat-7b` demonstrates strong performance across a range of benchmarks, including disciplinary competence, language competence, knowledge competence, inference competence, and comprehensive understanding. For example, on the MMLU benchmark, the model achieves a score of 50.8, outperforming the LLaMA-7B, Baichuan-7B, and Alpaca-7B models. Similarly, on the AGIEval benchmark, the model scores 42.5, again surpassing the comparison models.

## What can I use it for?

With its robust knowledge base, strong reasoning capabilities, and versatile toolset, `internlm-chat-7b` can be applied to a wide range of natural language processing tasks and applications. Some potential use cases include:

- **Content creation**: Generate high-quality written content, such as articles, reports, and stories.
- **Question answering**: Provide informative and well-reasoned responses to a variety of questions.
- **Task assistance**: Help users complete tasks by understanding natural language instructions and generating relevant outputs.
- **Conversational AI**: Engage in natural, contextual dialogues and provide helpful responses to users.

## Things to try

One interesting aspect of `internlm-chat-7b` is its ability to handle longer input sequences. Try providing the model with more detailed, multi-sentence prompts and observe how it is able to leverage the extended context to generate more coherent and informative responses. Additionally, experiment with the model's versatile toolset to see how you can customize and extend its capabilities to suit your specific needs.

[](#internlm)InternLM
=====================

![](https://github.com/InternLM/InternLM/assets/22529082/b9788105-8892-4398-8b47-b513a292378e)

**InternLM** [_HOT_](https://internlm.intern-ai.org.cn/)

[![evaluation](https://github.com/InternLM/InternLM/assets/22529082/f80a2a58-5ddf-471a-8da4-32ab65c8fd3b)](https://github.com/internLM/OpenCompass/)

[Github Repo](https://github.com/InternLM/InternLM)  [Reporting Issues](https://github.com/InternLM/InternLM/issues/new)

[](#introduction)Introduction
-----------------------------

InternLM has open-sourced a 7 billion parameter base model tailored for practical scenarios. The model has the following characteristics:

*   It leverages trillions of high-quality tokens for training to establish a powerful knowledge base.
*   It provides a versatile toolset for users to flexibly build their own workflows.

[](#internlm-7b)InternLM-7B
---------------------------

### [](#performance-evaluation)Performance Evaluation

We conducted a comprehensive evaluation of InternLM using the open-source evaluation tool [OpenCompass](https://github.com/internLM/OpenCompass/). The evaluation covered five dimensions of capabilities: disciplinary competence, language competence, knowledge competence, inference competence, and comprehension competence. Here are some of the evaluation results, and you can visit the [OpenCompass leaderboard](https://opencompass.org.cn/rank) for more evaluation results.

| Datasets\\Models | **InternLM-Chat-7B** | **InternLM-7B** | LLaMA-7B | Baichuan-7B | ChatGLM2-6B | Alpaca-7B | Vicuna-7B |
| --- | --- | --- | --- | --- | --- | --- | --- |
| C-Eval(Val) | 53.2 | 53.4 | 24.2 | 42.7 | 50.9 | 28.9 | 31.2 |
| MMLU | 50.8 | 51.0 | 35.2\* | 41.5 | 46.0 | 39.7 | 47.3 |
| AGIEval | 42.5 | 37.6 | 20.8 | 24.6 | 39.0 | 24.1 | 26.4 |
| CommonSenseQA | 75.2 | 59.5 | 65.0 | 58.8 | 60.0 | 68.7 | 66.7 |
| BUSTM | 74.3 | 50.6 | 48.5 | 51.3 | 55.0 | 48.8 | 62.5 |
| CLUEWSC | 78.6 | 59.1 | 50.3 | 52.8 | 59.8 | 50.3 | 52.2 |
| MATH | 6.4 | 7.1 | 2.8 | 3.0 | 6.6 | 2.2 | 2.8 |
| GSM8K | 34.5 | 31.2 | 10.1 | 9.7 | 29.2 | 6.0 | 15.3 |
| HumanEval | 14.0 | 10.4 | 14.0 | 9.2 | 9.2 | 9.2 | 11.0 |
| RACE(High) | 76.3 | 57.4 | 46.9\* | 28.1 | 66.3 | 40.7 | 54.0 |

*   The evaluation results were obtained from [OpenCompass 20230706](https://github.com/internLM/OpenCompass/) (results marked with \* are taken from the original papers); the evaluation configuration can be found in the configuration files provided by [OpenCompass](https://github.com/internLM/OpenCompass/).
*   Scores may differ between versions of [OpenCompass](https://github.com/internLM/OpenCompass/), so please refer to the latest [OpenCompass](https://github.com/internLM/OpenCompass/) evaluation results.

**Limitations:** Although we have made efforts to ensure the safety of the model during the training process and to encourage the model to generate text that complies with ethical and legal requirements, the model may still produce unexpected outputs due to its size and probabilistic generation paradigm. For example, the generated responses may contain biases, discrimination, or other harmful content. Please do not propagate such content. We are not responsible for any consequences resulting from the dissemination of harmful information.

### [](#import-from-transformers)Import from Transformers

To load the InternLM 7B model using Transformers, use the following code:

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM
    tokenizer = AutoTokenizer.from_pretrained("internlm/internlm-7b", trust_remote_code=True)
    # Set `torch_dtype=torch.float16` to load model in float16, otherwise it will be loaded as float32 and might cause OOM Error.
    model = AutoModelForCausalLM.from_pretrained("internlm/internlm-7b", torch_dtype=torch.float16, trust_remote_code=True).cuda()
    model = model.eval()
    inputs = tokenizer(["A beautiful flower"], return_tensors="pt")
    for k,v in inputs.items():
        inputs[k] = v.cuda()
    gen_kwargs = {"max_length": 128, "top_p": 0.8, "temperature": 0.8, "do_sample": True, "repetition_penalty": 1.1}
    output = model.generate(**inputs, **gen_kwargs)
    output = tokenizer.decode(output[0].tolist(), skip_special_tokens=True)
    print(output)
    # <s> A beautiful flower box made of white rose wood. It is a perfect gift for weddings, birthdays and anniversaries.
    # All the roses are from our farm Roses Flanders. Therefor you know that these flowers last much longer than those in store or online!</s>
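
The `gen_kwargs` above turn on sampling (`do_sample=True`) with `temperature=0.8` and `top_p=0.8`. As a rough illustration of what those two settings do to a next-token distribution, here is a minimal pure-Python sketch. The function name `top_p_filter` and the toy logits are made up for this example, and it is not the Transformers implementation.

```python
import math

def top_p_filter(logits, temperature=0.8, top_p=0.8):
    """Return {token_index: probability} kept by temperature + nucleus filtering."""
    # Temperature scaling: values below 1.0 sharpen the distribution.
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax.
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the smallest set of most likely tokens whose cumulative mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    # Renormalize over the kept tokens so they sum to 1.
    kept_mass = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_mass for i in kept}

toy_logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical scores for a 4-token vocabulary
filtered = top_p_filter(toy_logits)
print(filtered)
```

With these toy values only the two most likely tokens survive the filter; the next token would then be drawn at random from that reduced set.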
    

[](#open-source-license)Open Source License
-------------------------------------------

The code is licensed under Apache-2.0, while model weights are fully open for academic research and also allow **free** commercial usage. To apply for a commercial license, please fill in the [application form (English)](https://wj.qq.com/s2/12727483/5dba/)/[application form (Chinese)](https://wj.qq.com/s2/12725412/f7c1/). For other questions or collaborations, please contact [internlm@pjlab.org.cn](mailto:internlm@pjlab.org.cn).


## Model overview

`InternLM-7B` is a 7 billion parameter large language model developed by the Shanghai Artificial Intelligence Laboratory. The model has been trained on a vast amount of high-quality data, including web text, books, and code, to establish a strong knowledge base. It provides a versatile toolset for users to build their own workflows. `InternLM-7B` is part of the InternLM model series, which also includes the `InternLM-Chat-7B` model, a version fine-tuned for conversational abilities.

Compared to similar models like [LLaMA-7B](https://aimodels.fyi/models/huggingFace/llama-7b-facebookai), [Baichuan-7B](https://aimodels.fyi/models/huggingFace/baichuan-7b-baichuan-inc), and [ChatGLM2-6B](https://aimodels.fyi/models/huggingFace/chatglm2-6b-tsinghua), `InternLM-7B` scores higher on several of the reported benchmarks (for example C-Eval, MMLU, MATH, and GSM8K), though not on every one. The evaluation spans disciplinary competence, language competence, knowledge competence, inference competence, and comprehension competence.

## Model inputs and outputs

### Inputs
- Free-form text input
- Can handle input sequences up to 8,192 tokens in length

### Outputs
- Free-form text output
- Generates coherent and contextually relevant responses

## Capabilities

`InternLM-7B` excels at a wide range of natural language processing tasks, including question answering, task completion, and open-ended conversation. It has shown particularly strong performance on Chinese and English language understanding, as well as reasoning and mathematical abilities.

For example, on the MMLU (Massive Multitask Language Understanding) benchmark, `InternLM-7B` achieves a score of 51.0%, outperforming models like [LLaMA-7B](https://aimodels.fyi/models/huggingFace/llama-7b-facebookai) (35.2%) and [Baichuan-7B](https://aimodels.fyi/models/huggingFace/baichuan-7b-baichuan-inc) (41.5%). On the GSM8K (Grade School Math) benchmark, `InternLM-7B` scores 31.2%, again surpassing [LLaMA-7B](https://aimodels.fyi/models/huggingFace/llama-7b-facebookai) (10.1%) and [Baichuan-7B](https://aimodels.fyi/models/huggingFace/baichuan-7b-baichuan-inc) (9.7%).

## What can I use it for?

`InternLM-7B` can be used for a wide range of natural language processing applications, such as content generation, question answering, task completion, and open-ended dialogue. Its strong performance on Chinese and English language understanding and reasoning makes it a valuable tool for multilingual applications.

Potential use cases include:
- Chatbots and virtual assistants
- Automated writing and content generation
- Language translation and multilingual support
- Educational and tutoring applications
- Research and analysis tasks requiring natural language understanding

## Things to try

One interesting aspect of `InternLM-7B` is its ability to handle longer input sequences, up to 8,192 tokens, thanks to its optimized architecture. This can be particularly useful for tasks that require reasoning over longer contexts, such as summarization, question answering, or task completion over multi-step instructions.

Additionally, the model's strong performance on mathematical and reasoning tasks suggests it could be a valuable tool for applications that involve quantitative analysis or problem-solving, such as financial forecasting, scientific research, or even software engineering.
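When working near the context limit, it helps to budget tokens before calling `generate`. In Transformers, `max_length` counts prompt tokens plus generated tokens, so the room left for new text shrinks as the prompt grows. A minimal sketch, taking the 8,192-token figure quoted above and using a hypothetical helper name (`remaining_new_tokens`):

```python
CONTEXT_LIMIT = 8192  # token limit quoted in the text above

def remaining_new_tokens(prompt_tokens, context_limit=CONTEXT_LIMIT):
    """How many tokens can still be generated after a prompt of the given length."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Clamp at zero: a prompt longer than the limit leaves no room at all.
    return max(context_limit - prompt_tokens, 0)

print(remaining_new_tokens(6000))
print(remaining_new_tokens(9000))
```

In practice the prompt length would come from the tokenizer, for example the length of `inputs["input_ids"][0]`.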

[](#internlm)InternLM
=====================

![](https://github.com/InternLM/InternLM/assets/22529082/b9788105-8892-4398-8b47-b513a292378e)

**InternLM** [_HOT_](https://internlm.intern-ai.org.cn/)

[![evaluation](https://github.com/InternLM/InternLM/assets/22529082/f80a2a58-5ddf-471a-8da4-32ab65c8fd3b)](https://github.com/internLM/OpenCompass/)

[Github Repo](https://github.com/InternLM/InternLM)  [Reporting Issues](https://github.com/InternLM/InternLM/issues/new)

[](#introduction)Introduction
-----------------------------

InternLM2 has open-sourced a 20 billion parameter base model and a chat model tailored for practical scenarios. The model has the following characteristics:

*   **200K Context window**: Nearly perfect at finding needles in the haystack with 200K-long context, with leading performance on long-context tasks like LongBench and L-Eval. Try it with [LMDeploy](https://github.com/InternLM/lmdeploy) for 200K-context inference.
    
*   **Outstanding comprehensive performance**: Significantly better than the last generation in all dimensions, especially in reasoning, math, code, chat experience, instruction following, and creative writing, with leading performance among open-source models in similar sizes. In some evaluations, InternLM2-Chat-20B may match or even surpass ChatGPT (GPT-3.5).
    
*   **Code interpreter & Data analysis**: With a code interpreter, InternLM2-Chat-20B obtains performance comparable to GPT-4 on GSM8K and MATH. InternLM2-Chat also provides data analysis capability.
    
*   **Stronger tool use**: Based on better tool utilization-related capabilities in instruction following, tool selection and reflection, InternLM2 can support more kinds of agents and multi-step tool calling for complex tasks. See [examples](https://github.com/InternLM/lagent).
    

[](#internlm2-chat-20b)InternLM2-Chat-20B
-----------------------------------------

### [](#performance-evaluation)Performance Evaluation

We conducted a comprehensive evaluation of InternLM using the open-source evaluation tool [OpenCompass](https://github.com/internLM/OpenCompass/). The evaluation covered five dimensions of capabilities: disciplinary competence, language competence, knowledge competence, inference competence, and comprehension competence. Here are some of the evaluation results, and you can visit the [OpenCompass leaderboard](https://opencompass.org.cn/rank) for more evaluation results.

| Dataset\\Models | InternLM2-7B | InternLM2-Chat-7B | InternLM2-20B | InternLM2-Chat-20B | ChatGPT | GPT-4 |
| --- | --- | --- | --- | --- | --- | --- |
| MMLU | 65.8 | 63.7 | 67.7 | 66.5 | 69.1 | 83.0 |
| AGIEval | 49.9 | 47.2 | 53.0 | 50.3 | 39.9 | 55.1 |
| BBH | 65.0 | 61.2 | 72.1 | 68.3 | 70.1 | 86.7 |
| GSM8K | 70.8 | 70.7 | 76.1 | 79.6 | 78.2 | 91.4 |
| MATH | 20.2 | 23.0 | 25.5 | 31.9 | 28.0 | 45.8 |
| HumanEval | 43.3 | 59.8 | 48.8 | 67.1 | 73.2 | 74.4 |
| MBPP(Sanitized) | 51.8 | 51.4 | 63.0 | 65.8 | 78.9 | 79.0 |

*   The evaluation results were obtained from [OpenCompass](https://github.com/internLM/OpenCompass/) (results marked with \* are taken from the original papers); the evaluation configuration can be found in the configuration files provided by [OpenCompass](https://github.com/internLM/OpenCompass/).
*   Scores may differ between versions of [OpenCompass](https://github.com/internLM/OpenCompass/), so please refer to the latest [OpenCompass](https://github.com/internLM/OpenCompass/) evaluation results.

**Limitations:** Although we have made efforts to ensure the safety of the model during the training process and to encourage the model to generate text that complies with ethical and legal requirements, the model may still produce unexpected outputs due to its size and probabilistic generation paradigm. For example, the generated responses may contain biases, discrimination, or other harmful content. Please do not propagate such content. We are not responsible for any consequences resulting from the dissemination of harmful information.

### [](#import-from-transformers)Import from Transformers

To load the InternLM2-Chat-20B model using Transformers, use the following code:

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM
    tokenizer = AutoTokenizer.from_pretrained("internlm/internlm2-chat-20b", trust_remote_code=True)
    # Set `torch_dtype=torch.float16` to load model in float16, otherwise it will be loaded as float32 and cause OOM Error.
    model = AutoModelForCausalLM.from_pretrained("internlm/internlm2-chat-20b", torch_dtype=torch.float16, trust_remote_code=True).cuda()
    model = model.eval()
    response, history = model.chat(tokenizer, "hello", history=[])
    print(response)
    # Hello! How can I help you today?
    response, history = model.chat(tokenizer, "please provide three suggestions about time management", history=history)
    print(response)
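
`model.chat` returns the reply together with an updated `history`, which is passed back in on the next turn. The sketch below mimics that calling pattern with a stand-in function so it runs without downloading weights. `fake_chat` and the list-of-(query, response)-pairs layout of `history` are assumptions made for illustration, not the model's actual implementation.

```python
def fake_chat(query, history):
    """Stand-in for model.chat: echoes the query and returns an extended history."""
    response = f"(reply to: {query})"
    # Return a new list rather than mutating the caller's history.
    return response, history + [(query, response)]

history = []
for query in ["hello", "please provide three suggestions about time management"]:
    response, history = fake_chat(query, history)

print(len(history))   # number of completed turns
print(history[0][0])  # first user query
```

Passing `history=[]` again instead of the returned value would start a fresh conversation with no memory of earlier turns.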
    

The responses can be streamed using `stream_chat`:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    model_path = "internlm/internlm2-chat-20b"
    model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16, trust_remote_code=True).cuda()
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    
    model = model.eval()
    length = 0
    for response, history in model.stream_chat(tokenizer, "Hello", history=[]):
        print(response[length:], flush=True, end="")
        length = len(response)
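
The slicing in the loop above suggests that each `response` yielded by `stream_chat` holds the full text generated so far, which is why only `response[length:]` is printed. A small sketch with a stand-in generator shows the same bookkeeping (`fake_stream_chat` is hypothetical and only imitates that cumulative behavior):

```python
def fake_stream_chat(text, step=4):
    """Stand-in for model.stream_chat: yields the cumulative text in growing prefixes."""
    for end in range(step, len(text) + step, step):
        yield text[:end], []  # (response so far, history placeholder)

chunks = []
length = 0
for response, history in fake_stream_chat("Hello! How can I help you today?"):
    chunks.append(response[length:])  # only the newly generated part
    length = len(response)

print("".join(chunks))
```

Joining the printed pieces reproduces the full reply exactly once, with no repeated prefixes.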
    

[](#open-source-license)Open Source License
-------------------------------------------

The code is licensed under Apache-2.0, while model weights are fully open for academic research and also allow **free** commercial usage. To apply for a commercial license, please fill in the [application form (English)](https://wj.qq.com/s2/12727483/5dba/)/[application form (Chinese)](https://wj.qq.com/s2/12725412/f7c1/). For other questions or collaborations, please contact [internlm@pjlab.org.cn](mailto:internlm@pjlab.org.cn).


## Model Overview

`internlm2-chat-20b` is a 20 billion parameter language model developed by [InternLM](https://aimodels.fyi/creators/huggingFace/internlm). It is an open-sourced model that has been fine-tuned for practical chat scenarios, building on InternLM's previous 7 billion parameter base model. Compared to the earlier version, `internlm2-chat-20b` exhibits significantly improved performance across a variety of benchmarks, including reasoning, math, code, chat experience, instruction following, and creative writing. In some evaluations, it may even match or surpass the capabilities of ChatGPT (GPT-3.5).

The model's 200,000 token context window allows it to excel at long-context tasks, and it also provides strong code interpretation and data analysis capabilities. Additionally, it demonstrates an enhanced ability to utilize tools and follow multi-step instructions, enabling it to support more complex agent workflows.

## Model Inputs and Outputs

### Inputs
- Text input

### Outputs
- Generated text

## Capabilities

`internlm2-chat-20b` has outstanding comprehensive performance, outperforming similar-sized open-source models across a range of benchmarks. It exhibits leading capabilities in areas such as reasoning, math, code, chat experience, instruction following, and creative writing. The model's 200,000 token context window allows it to excel at long-context tasks, and it also provides strong code interpretation and data analysis capabilities.

## What Can I Use It For?

You can use `internlm2-chat-20b` for a variety of natural language tasks, such as:

- **Chatbots and conversational agents**: The model's strong chat experience and instruction following abilities make it well-suited for building engaging conversational AI assistants.
- **Content generation**: The model's capabilities in areas like creative writing and text generation can be leveraged to produce high-quality content for various applications.
- **Problem-solving and task assistance**: The model's reasoning, math, and code interpretation skills can aid in solving complex problems and automating multi-step workflows.
- **Data analysis**: The model's data analysis capabilities can be utilized to extract insights and generate reports from structured and unstructured data.

## Things to Try

One interesting aspect of `internlm2-chat-20b` is its ability to perform well on long-context tasks, thanks to its 200,000 token context window. You can try prompting the model with long-form inputs and observe how it maintains coherence and provides relevant and insightful responses. Additionally, you can explore the model's versatility by testing its capabilities across a diverse range of domains, from creative writing to technical problem-solving.

**InternLM**

![](https://github.com/InternLM/InternLM/assets/22529082/b9788105-8892-4398-8b47-b513a292378e)

**InternLM** [_HOT_](https://internlm.intern-ai.org.cn/)

[![evaluation](https://github.com/InternLM/InternLM/assets/22529082/f80a2a58-5ddf-471a-8da4-32ab65c8fd3b)](https://github.com/internLM/OpenCompass/)

[Github Repo](https://github.com/InternLM/InternLM)  [Reporting Issues](https://github.com/InternLM/InternLM/issues/new)

[](#introduction)Introduction
-----------------------------

The Shanghai Artificial Intelligence Laboratory, in collaboration with SenseTime Technology, the Chinese University of Hong Kong, and Fudan University, has officially released the 20 billion parameter pretrained model, InternLM-20B. InternLM-20B was pre-trained on over **2.3T** tokens of high-quality English, Chinese, and code data. Additionally, the Chat version has undergone SFT and RLHF training, enabling it to meet users' needs better and more safely.

In terms of model structure, InternLM-20B opts for a deeper architecture, with 60 layers. This exceeds the 32 or 40 layers used by conventional 7B and 13B models. When the parameter budget is limited, increasing the number of layers can enhance the model's overall capability. Furthermore, compared to InternLM-7B, the pre-training data for InternLM-20B underwent higher-quality cleaning and was supplemented with knowledge-rich data designed to reinforce understanding and reasoning. As a result, it shows significant improvements in understanding, reasoning, mathematical, and programming abilities, all of which test the technical proficiency of language models. Overall, InternLM-20B features the following characteristics:

*   Outstanding overall performance
*   Strong utility invocation capability
*   Supports a 16k context length (through inference extrapolation)
*   Better value alignment.
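
To make the depth-versus-width trade-off concrete, a common back-of-the-envelope estimate for a decoder-only Transformer is roughly `12 * layers * hidden_size**2` parameters in the blocks, ignoring embeddings and normalization layers. The sketch applies it to the 60-layer depth stated above. The hidden size of 5120 is an assumption used for illustration and is not taken from this document.

```python
def approx_block_params(num_layers, hidden_size):
    """Rough Transformer block parameter count: ~4*h^2 (attention) + ~8*h^2 (MLP) per layer."""
    return 12 * num_layers * hidden_size ** 2

# 60 layers is from the text; hidden_size=5120 is an assumed value.
deep = approx_block_params(num_layers=60, hidden_size=5120)
# A 40-layer model with the same assumed width, for comparison.
shallow = approx_block_params(num_layers=40, hidden_size=5120)

print(f"60 layers: {deep / 1e9:.1f}B, 40 layers: {shallow / 1e9:.1f}B")
```

Under this estimate, the same width lands in the 13B class at 40 layers and close to 20B at 60 layers, which matches the text's point that depth is one way to spend a parameter budget.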

[](#performance-evaluation)Performance Evaluation
-------------------------------------------------

On the 5 capability dimensions proposed by OpenCompass, InternLM-20B has achieved excellent results (the bolded scores represent the best performances within the 13B-33B parameter range).

| Capability | Llama-13B | Llama2-13B | Baichuan2-13B | InternLM-20B | Llama-33B | Llama-65B | Llama2-70B |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Language | 42.5 | 47 | 47.5 | **55** | 44.6 | 47.1 | 51.6 |
| Knowledge | 58.2 | 58.3 | 48.9 | 60.1 | **64** | 66 | 67.7 |
| Understanding | 45.5 | 50.9 | 58.1 | **67.3** | 50.6 | 54.2 | 60.8 |
| Reasoning | 42.7 | 43.6 | 44.2 | **54.9** | 46.4 | 49.8 | 55 |
| Examination | 37.3 | 45.2 | 51.8 | **62.5** | 47.4 | 49.7 | 57.3 |
| Overall | 43.8 | 47.3 | 49.4 | **59.2** | 48.9 | 51.9 | 57.4 |
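
The bolding rule can be checked mechanically. The sketch below copies the capability scores listed above and, for each dimension, picks the top model among those in the 13B-33B range (so Llama-65B and Llama2-70B are excluded). The dictionary layout and the helper name `best_in_range` are illustrative choices.

```python
MODELS = ["Llama-13B", "Llama2-13B", "Baichuan2-13B", "InternLM-20B",
          "Llama-33B", "Llama-65B", "Llama2-70B"]
IN_RANGE = MODELS[:5]  # models with 13B-33B parameters

SCORES = {
    "Language":      [42.5, 47, 47.5, 55, 44.6, 47.1, 51.6],
    "Knowledge":     [58.2, 58.3, 48.9, 60.1, 64, 66, 67.7],
    "Understanding": [45.5, 50.9, 58.1, 67.3, 50.6, 54.2, 60.8],
    "Reasoning":     [42.7, 43.6, 44.2, 54.9, 46.4, 49.8, 55],
    "Examination":   [37.3, 45.2, 51.8, 62.5, 47.4, 49.7, 57.3],
    "Overall":       [43.8, 47.3, 49.4, 59.2, 48.9, 51.9, 57.4],
}

def best_in_range(row):
    """Return the name of the highest-scoring model among IN_RANGE."""
    by_model = dict(zip(MODELS, row))
    return max(IN_RANGE, key=lambda name: by_model[name])

winners = {dim: best_in_range(row) for dim, row in SCORES.items()}
for dim, name in winners.items():
    print(f"{dim}: {name}")
```

InternLM-20B comes out on top in every dimension except Knowledge, where Llama-33B leads within the range.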

The table below compares the performance of mainstream open-source models on some influential and typical datasets.

| Category | Benchmark | Llama-13B | Llama2-13B | Baichuan2-13B | InternLM-20B | Llama-33B | Llama-65B | Llama2-70B |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Examination | MMLU | 47.73 | 54.99 | 59.55 | **62.05** | 58.73 | 63.71 | 69.75 |
| | C-Eval (val) | 31.83 | 41.4 | **59.01** | 58.8 | 37.47 | 40.36 | 50.13 |
| | AGI-Eval | 22.03 | 30.93 | 37.37 | **44.58** | 33.53 | 33.92 | 40.02 |
| Knowledge | BoolQ | 78.75 | 82.42 | 67 | **87.46** | 84.43 | 86.61 | 87.74 |
| | TriviaQA | 52.47 | 59.36 | 46.61 | 57.26 | **66.24** | 69.79 | 70.71 |
| | NaturalQuestions | 20.17 | 24.85 | 16.32 | 25.15 | **30.89** | 33.41 | 34.16 |
| Understanding | CMRC | 9.26 | 31.59 | 29.85 | **68.78** | 14.17 | 34.73 | 43.74 |
| | CSL | 55 | 58.75 | 63.12 | **65.62** | 57.5 | 59.38 | 60 |
| | RACE (middle) | 53.41 | 63.02 | 68.94 | **86.35** | 64.55 | 72.35 | 81.55 |
| | RACE (high) | 47.63 | 58.86 | 67.18 | **83.28** | 62.61 | 68.01 | 79.93 |
| | XSum | 20.37 | 23.37 | 25.23 | **35.54** | 20.55 | 19.91 | 25.38 |
| Reasoning | WinoGrande | 64.64 | 64.01 | 67.32 | **69.38** | 66.85 | 69.38 | 69.77 |
| | BBH | 37.93 | 45.62 | 48.98 | **52.51** | 49.98 | 58.38 | 64.91 |
| | GSM8K | 20.32 | 29.57 | **52.62** | **52.62** | 42.3 | 54.44 | 63.31 |
| | PIQA | 79.71 | 79.76 | 78.07 | 80.25 | **81.34** | 82.15 | 82.54 |
| Programming | HumanEval | 14.02 | 18.9 | 17.07 | **25.61** | 17.68 | 18.9 | 26.22 |
| | MBPP | 20.6 | 26.8 | 30.8 | **35.6** | 28.4 | 33.6 | 39.6 |

Overall, InternLM-20B comprehensively outperforms open-source models in the 13B parameter range in terms of overall capabilities, and on reasoning evaluation sets it approaches or even surpasses the performance of Llama-65B.

[](#import-from-transformers)Import from Transformers
-----------------------------------------------------

To load the InternLM 20B model using Transformers, use the following code:

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM
    tokenizer = AutoTokenizer.from_pretrained("internlm/internlm-20b", trust_remote_code=True)
    # Set `torch_dtype=torch.bfloat16` to load model in bfloat16, otherwise it will be loaded as float32 and cause OOM Error.
    model = AutoModelForCausalLM.from_pretrained("internlm/internlm-20b", torch_dtype=torch.bfloat16, trust_remote_code=True).cuda()
    model = model.eval()
    inputs = tokenizer(["Coming to the beautiful nature, we found"], return_tensors="pt")
    for k,v in inputs.items():
        inputs[k] = v.cuda()
    gen_kwargs = {"max_length": 128, "top_p": 0.8, "temperature": 0.8, "do_sample": True, "repetition_penalty": 1.05}
    output = model.generate(**inputs, **gen_kwargs)
    output = tokenizer.decode(output[0].tolist(), skip_special_tokens=True)
    print(output)
    # Coming to the beautiful nature, we found not only various mountains, rivers, trees, and flowers but also many birds and beasts. Birds are the ones we are most familiar with; some are soaring in the sky, some are hopping on the ground, while others perch on trees...
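
The comment about `torch_dtype` comes down to bytes per parameter. The sketch below gives a quick estimate of the memory needed for the weights alone, assuming exactly 20 billion parameters (the real count differs slightly) and ignoring activations and the KV cache. The helper name `weight_memory_gib` is made up for this example.

```python
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}

def weight_memory_gib(num_params, dtype):
    """Memory for the weights only, in GiB (2**30 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30

num_params = 20e9  # assumed round figure for a "20B" model
fp32 = weight_memory_gib(num_params, "float32")
bf16 = weight_memory_gib(num_params, "bfloat16")
print(f"float32: {fp32:.1f} GiB, bfloat16: {bf16:.1f} GiB")
```

Halving the bytes per parameter halves the weight footprint, which is often the difference between fitting on a single accelerator and running out of memory.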
    

**Limitations:** Although we have made efforts to ensure the safety of the model during the training process and to encourage the model to generate text that complies with ethical and legal requirements, the model may still produce unexpected outputs due to its size and probabilistic generation paradigm. For example, the generated responses may contain biases, discrimination, or other harmful content. Please do not propagate such content. We are not responsible for any consequences resulting from the dissemination of harmful information.

[](#open-source-license)Open Source License
-------------------------------------------

The code is licensed under Apache-2.0, while model weights are fully open for academic research and also allow **free** commercial usage. To apply for a commercial license, please fill in the [application form (English)](https://wj.qq.com/s2/12727483/5dba/)/[application form (Chinese)](https://wj.qq.com/s2/12725412/f7c1/). For other questions or collaborations, please contact [internlm@pjlab.org.cn](mailto:internlm@pjlab.org.cn).


## Model Overview

The `internlm-20b` model is a 20 billion parameter pretrained language model developed by the Shanghai Artificial Intelligence Laboratory in collaboration with SenseTime Technology, the Chinese University of Hong Kong, and Fudan University. Compared to smaller models like [internlm-7b](https://aimodels.fyi/models/huggingFace/internlm-7b-internlm) and [internlm-chat-7b](https://aimodels.fyi/models/huggingFace/internlm-chat-7b-internlm), the `internlm-20b` model has a deeper architecture with 60 layers, allowing it to achieve significant improvements in understanding, reasoning, mathematical, and programming abilities.

The model was trained on over 2.3 trillion tokens of high-quality English, Chinese, and code data. It also underwent SFT and RLHF training for the chat version, enabling it to better and more securely meet users' needs. On the 5 capability dimensions proposed by [OpenCompass](https://github.com/internLM/OpenCompass/), the `internlm-20b` model achieved excellent results, outperforming other large models in the 13B-33B parameter range.

## Model Inputs and Outputs

### Inputs
- **Text**: The `internlm-20b` model can accept text input for language modeling and generation tasks.

### Outputs
- **Text**: The model generates coherent and contextual text outputs based on the input.
- **Utility invocation**: The model has strong utility invocation capabilities, allowing it to perform various tasks like calculations, programming, and data analysis.

## Capabilities

The `internlm-20b` model performs well across a wide range of language tasks, including understanding, reasoning, mathematics, and programming. Within the 13B-33B parameter range it posts the top reported scores on benchmarks such as MMLU, AGI-Eval, and BBH, and ties for the top GSM8K score, while trailing Baichuan2-13B slightly on C-Eval. The model's 16k context length, reached through inference extrapolation, also lets it handle longer input sequences.

## What Can I Use It For?

The `internlm-20b` model can be a valuable tool for a variety of applications, such as:

- **Content generation**: The model can be used to generate high-quality text content, including articles, stories, and dialogue, across various domains.
- **Question answering and knowledge retrieval**: The model's strong understanding and reasoning capabilities make it suitable for building question-answering systems and knowledge retrieval applications.
- **Code generation and programming assistance**: The model's programming abilities allow it to assist with code generation, debugging, and software development tasks.
- **Data analysis and visualization**: The model can be used to extract insights from data and generate visual representations of findings.

## Things to Try

One interesting aspect of the `internlm-20b` model is its strong utility invocation capability. You can try prompting the model to perform various tasks like mathematical calculations, unit conversions, or even simple programming. The model's ability to understand and execute these types of instructions is a testament to its technical proficiency and versatility.

Another area to explore is the model's performance on long-context tasks. Given its 16k context length, you can experiment with providing the model with extensive background information and prompts that require reasoning across a large amount of text. This can help you understand the model's strengths in handling complex, multi-faceted scenarios.
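
To make the long-context experiment concrete, the sketch below checks whether a prompt still leaves room for generation inside a fixed context window. It is only an illustration: `CONTEXT_TOKENS` assumes that "16k" means 16 × 1024 tokens, and `count_tokens` is a hypothetical whitespace-based stand-in for the model's real tokenizer, which will give different counts.

```python
# Assumption: "16k" is read as 16 * 1024 tokens; check the model config for the real limit.
CONTEXT_TOKENS = 16 * 1024


def count_tokens(text: str) -> int:
    """Hypothetical stand-in: counts whitespace-separated words, not real tokens."""
    return len(text.split())


def fits_in_context(prompt: str, max_new_tokens: int,
                    context_tokens: int = CONTEXT_TOKENS) -> bool:
    """Return True if the prompt plus the requested generation fits in the window."""
    return count_tokens(prompt) + max_new_tokens <= context_tokens


background = "word " * 16000  # 16,000 pseudo-tokens of background text
print(fits_in_context(background, max_new_tokens=256))  # True: 16000 + 256 <= 16384
print(fits_in_context(background, max_new_tokens=512))  # False: 16000 + 512 > 16384
```

In practice you would replace `count_tokens` with the length of the tokenizer's output for the same text.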

[](#internlm)InternLM
=====================

![](https://github.com/InternLM/InternLM/assets/22529082/b9788105-8892-4398-8b47-b513a292378e)

**InternLM** [_HOT_](https://internlm.intern-ai.org.cn/)

[![evaluation](https://github.com/InternLM/InternLM/assets/22529082/f80a2a58-5ddf-471a-8da4-32ab65c8fd3b)](https://github.com/internLM/OpenCompass/)

[Github Repo](https://github.com/InternLM/InternLM)  [Reporting Issues](https://github.com/InternLM/InternLM/issues/new)

[](#introduction)Introduction
-----------------------------

The InternLM2 release includes an open-source 7 billion parameter base model and a chat model tailored for practical scenarios. The models have the following characteristics:

*   **200K Context window**: Nearly perfect at finding needles in the haystack with 200K-long context, with leading performance on long-context tasks like LongBench and L-Eval. Try it with [LMDeploy](https://github.com/InternLM/lmdeploy) for 200K-context inference.
    
*   **Outstanding comprehensive performance**: Significantly better than the previous generation in all dimensions, especially in reasoning, math, code, chat experience, instruction following, and creative writing, with leading performance among open-source models of similar size. In some evaluations, InternLM2-Chat-20B may match or even surpass ChatGPT (GPT-3.5).
    
*   **Code interpreter & Data analysis**: With a code interpreter, InternLM2-Chat-20B achieves performance comparable to GPT-4 on GSM8K and MATH. InternLM2-Chat also provides data analysis capability.
    
*   **Stronger tool use**: With improved instruction following, tool selection, and reflection, InternLM2 can support more kinds of agents and multi-step tool calling for complex tasks. See [examples](https://github.com/InternLM/lagent).
    

[](#internlm2-chat-7b)InternLM2-Chat-7B
---------------------------------------

### [](#performance-evaluation)Performance Evaluation

We conducted a comprehensive evaluation of InternLM using the open-source evaluation tool [OpenCompass](https://github.com/internLM/OpenCompass/). The evaluation covered five dimensions of capabilities: disciplinary competence, language competence, knowledge competence, inference competence, and comprehension competence. Here are some of the evaluation results, and you can visit the [OpenCompass leaderboard](https://opencompass.org.cn/rank) for more evaluation results.

| Dataset\\Models | InternLM2-7B | InternLM2-Chat-7B | InternLM2-20B | InternLM2-Chat-20B | ChatGPT | GPT-4 |
| --- | --- | --- | --- | --- | --- | --- |
| MMLU | 65.8 | 63.7 | 67.7 | 66.5 | 69.1 | 83.0 |
| AGIEval | 49.9 | 47.2 | 53.0 | 50.3 | 39.9 | 55.1 |
| BBH | 65.0 | 61.2 | 72.1 | 68.3 | 70.1 | 86.7 |
| GSM8K | 70.8 | 70.7 | 76.1 | 79.6 | 78.2 | 91.4 |
| MATH | 20.2 | 23.0 | 25.5 | 31.9 | 28.0 | 45.8 |
| HumanEval | 43.3 | 59.8 | 48.8 | 67.1 | 73.2 | 74.4 |
| MBPP(Sanitized) | 51.8 | 51.4 | 63.0 | 65.8 | 78.9 | 79.0 |

*   The evaluation results were obtained with [OpenCompass](https://github.com/internLM/OpenCompass/) (data marked with \* come from the original papers). The evaluation configuration can be found in the configuration files provided by [OpenCompass](https://github.com/internLM/OpenCompass/).
*   The numbers may differ between [OpenCompass](https://github.com/internLM/OpenCompass/) versions, so please refer to the latest evaluation results from [OpenCompass](https://github.com/internLM/OpenCompass/).
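
The statement that InternLM2-Chat-20B may match or surpass ChatGPT "in some evaluations" can be checked directly against the table above. The following sketch copies the chat-model, ChatGPT, and GPT-4 columns into a dictionary and lists the benchmarks where one model scores at least as high as another; the helper name `benchmarks_where` is ours, introduced only for illustration.

```python
# Scores copied from the evaluation table above (OpenCompass results).
SCORES = {
    "MMLU": {"InternLM2-Chat-7B": 63.7, "InternLM2-Chat-20B": 66.5, "ChatGPT": 69.1, "GPT-4": 83.0},
    "AGIEval": {"InternLM2-Chat-7B": 47.2, "InternLM2-Chat-20B": 50.3, "ChatGPT": 39.9, "GPT-4": 55.1},
    "BBH": {"InternLM2-Chat-7B": 61.2, "InternLM2-Chat-20B": 68.3, "ChatGPT": 70.1, "GPT-4": 86.7},
    "GSM8K": {"InternLM2-Chat-7B": 70.7, "InternLM2-Chat-20B": 79.6, "ChatGPT": 78.2, "GPT-4": 91.4},
    "MATH": {"InternLM2-Chat-7B": 23.0, "InternLM2-Chat-20B": 31.9, "ChatGPT": 28.0, "GPT-4": 45.8},
    "HumanEval": {"InternLM2-Chat-7B": 59.8, "InternLM2-Chat-20B": 67.1, "ChatGPT": 73.2, "GPT-4": 74.4},
    "MBPP(Sanitized)": {"InternLM2-Chat-7B": 51.4, "InternLM2-Chat-20B": 65.8, "ChatGPT": 78.9, "GPT-4": 79.0},
}


def benchmarks_where(model: str, baseline: str, scores: dict = SCORES) -> list:
    """List the benchmarks on which `model` scores at least as high as `baseline`."""
    return [name for name, row in scores.items() if row[model] >= row[baseline]]


print(benchmarks_where("InternLM2-Chat-20B", "ChatGPT"))  # ['AGIEval', 'GSM8K', 'MATH']
print(benchmarks_where("InternLM2-Chat-7B", "ChatGPT"))   # ['AGIEval']
print(benchmarks_where("InternLM2-Chat-20B", "GPT-4"))    # []
```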

**Limitations:** Although we have made efforts to ensure the safety of the model during the training process and to encourage the model to generate text that complies with ethical and legal requirements, the model may still produce unexpected outputs due to its size and probabilistic generation paradigm. For example, the generated responses may contain biases, discrimination, or other harmful content. Please do not propagate such content. We are not responsible for any consequences resulting from the dissemination of harmful information.

### [](#import-from-transformers)Import from Transformers

To load the InternLM2 7B Chat model using Transformers, use the following code:

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM
    tokenizer = AutoTokenizer.from_pretrained("internlm/internlm2-chat-7b", trust_remote_code=True)
    # Set `torch_dtype=torch.float16` to load model in float16, otherwise it will be loaded as float32 and cause OOM Error.
    model = AutoModelForCausalLM.from_pretrained("internlm/internlm2-chat-7b", torch_dtype=torch.float16, trust_remote_code=True).cuda()
    model = model.eval()
    response, history = model.chat(tokenizer, "hello", history=[])
    print(response)
    # Hello! How can I help you today?
    response, history = model.chat(tokenizer, "please provide three suggestions about time management", history=history)
    print(response)
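
The comment about `torch_dtype` in the snippet above comes down to simple arithmetic: each parameter takes 4 bytes in float32 and 2 bytes in float16. The estimate below is a rough sketch that assumes exactly 7 billion parameters and counts only the weights, not activations or the KV cache, so real usage will be higher.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


PARAMS_7B = 7e9  # assumption: "7B" taken as exactly 7 billion parameters

print(weight_memory_gb(PARAMS_7B, 4))  # float32: 28.0
print(weight_memory_gb(PARAMS_7B, 2))  # float16: 14.0
```

Halving the bytes per parameter halves the weight footprint, which is why the float32 default is more likely to run out of memory on a single GPU.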
    

The responses can be streamed using `stream_chat`:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    model_path = "internlm/internlm2-chat-7b"
    model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16, trust_remote_code=True).cuda()
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    
    model = model.eval()
    length = 0
    for response, history in model.stream_chat(tokenizer, "Hello", history=[]):
        print(response[length:], flush=True, end="")
        length = len(response)
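
The loop above prints `response[length:]`, which suggests that each step of `stream_chat` yields the full reply so far rather than only the new text. The sketch below isolates that pattern without loading a model. `fake_stream_chat` is a hypothetical stand-in for `model.stream_chat`, and the `(query, reply)` history format is an assumption made for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

History = List[Tuple[str, str]]


def fake_stream_chat(query: str) -> Iterator[Tuple[str, History]]:
    """Hypothetical stand-in for model.stream_chat: yields the cumulative reply so far."""
    reply = ""
    for piece in ["Hello", "!", " How can I", " help?"]:
        reply += piece
        yield reply, [(query, reply)]


def iter_deltas(stream: Iterable[Tuple[str, History]]) -> Iterator[str]:
    """Turn cumulative responses into the text newly added at each step."""
    length = 0
    for response, _history in stream:
        yield response[length:]
        length = len(response)


deltas = list(iter_deltas(fake_stream_chat("Hello")))
print(deltas)           # ['Hello', '!', ' How can I', ' help?']
print("".join(deltas))  # Hello! How can I help?
```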
    

[](#open-source-license)Open Source License
-------------------------------------------

The code is licensed under Apache-2.0, while model weights are fully open for academic research and also allow **free** commercial usage. To apply for a commercial license, please fill in the [application form (English)](https://wj.qq.com/s2/12727483/5dba/)/[application form (Chinese)](https://wj.qq.com/s2/12725412/f7c1/). For other questions or collaborations, please contact [internlm@pjlab.org.cn](mailto:internlm@pjlab.org.cn).

## Model overview

The `internlm2-chat-7b` model is a 7 billion parameter language model developed by [internlm](https://aimodels.fyi/creators/huggingFace/internlm), a team that has also open-sourced larger models like the `internlm2-chat-20b`. This model is optimized for practical conversational scenarios, with capabilities that surpass other open-source models of similar size. 

The `internlm2-chat-7b` model has several key characteristics. It supports a 200K context window and performs well on long-context benchmarks such as LongBench and L-Eval. It also demonstrates strong performance across a variety of areas, including reasoning, math, code, chat experience, instruction following, and creative writing. Notably, the larger `internlm2-chat-20b` version may even match or exceed ChatGPT in some evaluations.

The InternLM2-Chat models also provide code interpreter and data analysis capabilities; with a code interpreter, `internlm2-chat-20b` achieves performance comparable to GPT-4 on GSM8K and MATH. Additionally, the `internlm2` series demonstrates improved tool utilization, enabling more flexible multi-step workflows for complex tasks.

## Model inputs and outputs

### Inputs
- **Text prompts**: The `internlm2-chat-7b` model accepts natural language text prompts as input.

### Outputs
- **Generated text**: The model outputs generated text responses based on the provided prompts.

## Capabilities

The `internlm2-chat-7b` model performs well across a range of areas, including reasoning, math, code, chat experience, instruction following, and creative writing. For example, in the OpenCompass evaluation it scored 23.0 on the MATH dataset, compared with 28.0 for ChatGPT and 45.8 for GPT-4.

## What can I use it for?

The `internlm2-chat-7b` model can be used for a variety of language-based tasks, such as:

- **Conversational AI**: The model's strong chat experience capabilities make it well-suited for building conversational AI assistants.
- **Content generation**: The model's creative writing abilities allow it to generate high-quality text, such as articles, stories, or poems.
- **Code generation and assistance**: The model's code interpreter and programming capabilities can be leveraged to assist with code-related tasks.

## Things to try

One interesting aspect of the `internlm2-chat-7b` model is its ability to handle long-form contexts. You can experiment with providing the model with longer prompts or sequences of text to see how it performs on tasks that require understanding and reasoning over extended information.

Additionally, you can explore the model's capabilities in areas like math, coding, and data analysis by prompting it with relevant tasks and evaluating its responses. The [OpenCompass](https://github.com/internLM/OpenCompass/) evaluation tool provides a comprehensive way to benchmark the model's performance across various domains.

![](/internlm/internlm-xcomposer2-vl-7b/resolve/main/logo_en.png)

**InternLM-XComposer2**

[Github Repo](https://github.com/InternLM/InternLM-XComposer)

[Paper](https://arxiv.org/abs/2401.16420)

**InternLM-XComposer2** is a vision-language large model (VLLM) based on [InternLM2](https://github.com/InternLM/InternLM) for advanced text-image comprehension and composition.

We release InternLM-XComposer2 series in two versions:

*   InternLM-XComposer2-VL: The pretrained VLLM model with InternLM2 as the initialization of the LLM, achieving strong performance on various multimodal benchmarks.
*   InternLM-XComposer2: The finetuned VLLM for _Free-form Interleaved Text-Image Composition_.

### [](#import-from-transformers)Import from Transformers

To load the InternLM-XComposer2-VL-7B model using Transformers, use the following code:

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM
    ckpt_path = "internlm/internlm-xcomposer2-vl-7b"
    # The tokenizer runs on the CPU and has no `.cuda()` method; only the model is moved to the GPU.
    tokenizer = AutoTokenizer.from_pretrained(ckpt_path, trust_remote_code=True)
    # Set `torch_dtype=torch.float16` to load model in float16, otherwise it will be loaded as float32 and might cause OOM Error.
    model = AutoModelForCausalLM.from_pretrained(ckpt_path, torch_dtype=torch.float16, trust_remote_code=True).cuda()
    model = model.eval()
    

[](#quickstart)Quickstart
-------------------------

We provide a simple example to show how to use InternLM-XComposer2 with Transformers.

    import torch
    from transformers import AutoModel, AutoTokenizer
    
    torch.set_grad_enabled(False)
    
    # init model and tokenizer
    model = AutoModel.from_pretrained('internlm/internlm-xcomposer2-vl-7b', trust_remote_code=True).cuda().eval()
    tokenizer = AutoTokenizer.from_pretrained('internlm/internlm-xcomposer2-vl-7b', trust_remote_code=True)
    
    query = '<ImageHere>Please describe this image in detail.'
    image = './image1.webp'
    with torch.cuda.amp.autocast():
      response, _ = model.chat(tokenizer, query=query, image=image, history=[], do_sample=False)
    print(response)
    # The image features a quote by Oscar Wilde, "Live life with no excuses, travel with no regret,"
    # set against a backdrop of a breathtaking sunset. The sky is painted in hues of pink and orange,
    # creating a serene atmosphere. Two silhouetted figures stand on a cliff, overlooking the horizon.
    # They appear to be hiking or exploring, embodying the essence of the quote.
    # The overall scene conveys a sense of adventure and freedom, encouraging viewers to embrace life without hesitation or regrets.
    

### [](#open-source-license)Open Source License

The code is licensed under Apache-2.0, while model weights are fully open for academic research and also allow free commercial usage. To apply for a commercial license, please fill in the application form (English). For other questions or collaborations, please contact [internlm@pjlab.org.cn](mailto:internlm@pjlab.org.cn).

## Model overview

`internlm-xcomposer2-vl-7b` is a vision-language large model (VLLM) based on [InternLM2](https://aimodels.fyi/models/huggingFace/internlm-7b-internlm) for advanced text-image comprehension and composition. The model was developed by [internlm](https://aimodels.fyi/creators/huggingFace/internlm), who have also released the `internlm-xcomposer` model for similar capabilities. `internlm-xcomposer2-vl-7b` achieves strong performance on various multimodal benchmarks by leveraging the powerful InternLM2 as the initialization for the language model component.

## Model inputs and outputs

`internlm-xcomposer2-vl-7b` is a large multimodal model that can accept both text and image inputs. The model can generate detailed textual descriptions of images, as well as compose text and images together in creative ways.

### Inputs
- **Text**: The model can take text prompts as input, such as instructions or queries about an image.
- **Images**: The model can accept images of various resolutions and aspect ratios, up to 4K resolution.

### Outputs
- **Text**: The model can generate coherent and detailed textual responses based on the input image and text prompt.
- **Interleaved text-image compositions**: The model can create unique compositions by generating text that is interleaved with the input image.

## Capabilities

`internlm-xcomposer2-vl-7b` demonstrates strong multimodal understanding and generation capabilities. It can accurately describe the contents of images, answer questions about them, and even compose new text-image combinations. The model's performance rivals or exceeds other state-of-the-art vision-language models, making it a powerful tool for tasks like image captioning, visual question answering, and creative text-image generation.

## What can I use it for?

`internlm-xcomposer2-vl-7b` can be used for a variety of multimodal applications, such as:

- **Image captioning**: Generate detailed textual descriptions of images.
- **Visual question answering**: Answer questions about the contents of images.
- **Text-to-image composition**: Create unique compositions by generating text that is interleaved with an input image.
- **Multimodal content creation**: Combine text and images in creative ways for applications like advertising, education, and entertainment.

The model's strong performance and efficient design make it well-suited for both academic research and commercial use cases.

## Things to try

One interesting aspect of `internlm-xcomposer2-vl-7b` is its ability to handle high-resolution images at any aspect ratio. This allows the model to perceive fine-grained visual details, which can be beneficial for tasks like optical character recognition (OCR) and scene text understanding. You could try inputting images with small text or complex visual scenes to see how the model performs.

Additionally, the model's strong multimodal capabilities enable interesting creative applications. You could experiment with generating text-image compositions on a variety of topics, from abstract concepts to specific scenes or narratives. The model's ability to interweave text and images in novel ways opens up possibilities for innovative multimodal content creation.

[](#internlm25-7b-chat-gguf-model)InternLM2.5-7B-Chat GGUF Model
================================================================

[](#introduction)Introduction
-----------------------------

The `internlm2_5-7b-chat` model in GGUF format can be utilized by [llama.cpp](https://github.com/ggerganov/llama.cpp), a highly popular open-source framework for Large Language Model (LLM) inference, across a variety of hardware platforms, both locally and in the cloud. This repository offers `internlm2_5-7b-chat` models in GGUF format in both half precision and various low-bit quantized versions, including `q5_0`, `q5_k_m`, `q6_k`, and `q8_0`.

The following sections first present the installation procedure, then explain how to download the models, and finally illustrate model inference and service deployment through specific examples.

[](#installation)Installation
-----------------------------

We recommend building `llama.cpp` from source. The following code snippet provides an example for the Linux CUDA platform. For instructions on other platforms, please refer to the [official guide](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#build).

*   Step 1: create a conda environment and install cmake

    conda create --name internlm2 python=3.10 -y
    conda activate internlm2
    pip install cmake
    

*   Step 2: clone the source code and build the project

    git clone --depth=1 https://github.com/ggerganov/llama.cpp.git
    cd llama.cpp
    cmake -B build -DGGML_CUDA=ON
    cmake --build build --config Release -j
    

All the built targets can be found in the subdirectory `build/bin`.

In the following sections, we assume that the working directory is at the root directory of `llama.cpp`.
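
Before moving on, it can help to confirm that the build produced the two executables used later in this guide, `llama-cli` and `llama-server`. The check below is a small sketch that assumes it is run from the `llama.cpp` root; the helper name `missing_binaries` is ours.

```python
from pathlib import Path
from typing import Iterable, List


def missing_binaries(bin_dir: str,
                     names: Iterable[str] = ("llama-cli", "llama-server")) -> List[str]:
    """Return the expected executables that are not present in bin_dir."""
    root = Path(bin_dir)
    return [name for name in names if not (root / name).is_file()]


missing = missing_binaries("build/bin")
if missing:
    print("Missing build targets:", ", ".join(missing))
else:
    print("llama-cli and llama-server are ready in build/bin")
```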

[](#download-models)Download models
-----------------------------------

As mentioned in the [introduction section](#introduction), this repository includes several models with varying levels of computational precision. You can download the appropriate model based on your requirements. For instance, `internlm2_5-7b-chat-fp16.gguf` can be downloaded as follows:

    pip install huggingface-hub
    huggingface-cli download internlm/internlm2_5-7b-chat-gguf internlm2_5-7b-chat-fp16.gguf --local-dir . --local-dir-use-symlinks False
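
The same download can also be scripted from Python with `hf_hub_download` from the `huggingface-hub` package installed above. This is a minimal sketch rather than the repository's documented workflow; the wrapper name `download_gguf` is ours, and the call needs network access when it is actually invoked.

```python
def download_gguf(filename: str = "internlm2_5-7b-chat-fp16.gguf",
                  repo_id: str = "internlm/internlm2_5-7b-chat-gguf",
                  local_dir: str = ".") -> str:
    """Download one GGUF file from the Hugging Face Hub and return its local path."""
    # Imported lazily so this module loads even if huggingface-hub is not installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename, local_dir=local_dir)


# Example (requires network access):
# path = download_gguf()
# print(path)
```

To fetch a quantized variant instead, pass the corresponding file name as listed in the repository's file browser.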
    

[](#inference)Inference
-----------------------

You can use `llama-cli` to run inference. For a detailed explanation of `llama-cli`, please refer to [this guide](https://github.com/ggerganov/llama.cpp/blob/master/examples/main/README.md).

    build/bin/llama-cli \
        --model internlm2_5-7b-chat-fp16.gguf \
        --predict 512 \
        --ctx-size 4096 \
        --gpu-layers 32 \
        --temp 0.8 \
        --top-p 0.8 \
        --top-k 50 \
        --seed 1024 \
        --color \
        --prompt "<|im_start|>system\nYou are an AI assistant whose name is InternLM (书生·浦语).\n- InternLM (书生·浦语) is a conversational language model that is developed by Shanghai AI Laboratory (上海人工智能实验室). It is designed to be helpful, honest, and harmless.\n- InternLM (书生·浦语) can understand and communicate fluently in the language chosen by the user such as English and 中文.<|im_end|>\n" \
        --interactive \
        --multiline-input \
        --conversation \
        --verbose \
        --logdir workdir/logdir \
        --in-prefix "<|im_start|>user\n" \
        --in-suffix "<|im_end|>\n<|im_start|>assistant\n"
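
The `--prompt`, `--in-prefix`, and `--in-suffix` values above wrap the system message and each user turn in `<|im_start|>` and `<|im_end|>` markers. The sketch below assembles the same layout in Python so the resulting prompt can be inspected. Closing earlier assistant turns with `<|im_end|>` is an assumption based on how the system and user turns are delimited, and `build_prompt` is an illustrative helper, not part of `llama.cpp`.

```python
from typing import List, Tuple


def build_prompt(system: str, history: List[Tuple[str, str]], query: str) -> str:
    """Assemble a prompt using the <|im_start|>/<|im_end|> markers from the command above."""
    parts = [f"<|im_start|>system\n{system}<|im_end|>\n"]
    for user_msg, assistant_msg in history:
        parts.append(f"<|im_start|>user\n{user_msg}<|im_end|>\n")
        # Assumption: completed assistant turns are closed the same way as user turns.
        parts.append(f"<|im_start|>assistant\n{assistant_msg}<|im_end|>\n")
    # The final turn ends with an open assistant header so the model continues from there.
    parts.append(f"<|im_start|>user\n{query}<|im_end|>\n<|im_start|>assistant\n")
    return "".join(parts)


prompt = build_prompt(
    "You are a helpful assistant.",
    [("hello", "Hello! How can I help you today?")],
    "Thanks",
)
print(prompt)
```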
    

[](#serving)Serving
-------------------

`llama.cpp` provides an OpenAI API compatible server - `llama-server`. You can deploy `internlm2_5-7b-chat-fp16.gguf` into a service like this:

    ./build/bin/llama-server -m ./internlm2_5-7b-chat-fp16.gguf -ngl 32
    

At the client side, you can access the service through OpenAI API:

    from openai import OpenAI
    client = OpenAI(
        api_key='YOUR_API_KEY',
        base_url='http://localhost:8080/v1'
    )
    model_name = client.models.list().data[0].id
    response = client.chat.completions.create(
      model=model_name,
      messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Please provide three suggestions about time management"},
      ],
      temperature=0.8,
      top_p=0.8
    )
    print(response)
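
If you prefer not to depend on the `openai` package, the same request can be sketched with the Python standard library. The example below builds a POST request for the `/v1/chat/completions` route and extracts the reply text from an OpenAI-style response body. The helper names are ours, the payload omits the `model` field on the assumption that the server serves a single model, and nothing is sent until `urlopen` is called.

```python
import json
import urllib.request
from typing import Any, Dict, List


def build_chat_request(base_url: str, messages: List[Dict[str, str]],
                       temperature: float = 0.8,
                       top_p: float = 0.8) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {"messages": messages, "temperature": temperature, "top_p": top_p}
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def first_reply(completion: Dict[str, Any]) -> str:
    """Extract the assistant text from an OpenAI-style chat completion body."""
    return completion["choices"][0]["message"]["content"]


request = build_chat_request(
    "http://localhost:8080/v1",
    [{"role": "user", "content": "hello"}],
)
print(request.full_url)  # http://localhost:8080/v1/chat/completions

# With the server from the previous step running:
# with urllib.request.urlopen(request) as resp:
#     print(first_reply(json.load(resp)))
```

With the `openai` client shown earlier, the equivalent of `first_reply` is `response.choices[0].message.content`.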

## Model overview

The `internlm2_5-7b-chat` model from `internlm` is a large language model optimized for practical chat scenarios. It has outstanding reasoning capabilities, outperforming models like [Llama3](https://aimodels.fyi/models/huggingFace/llama-2-7b-gguf-thebloke) and [Gemma2-9B](https://aimodels.fyi/models/huggingFace/llama-2-7b-gguf-thebloke) on tasks like math reasoning. The model also has a 1M context window, allowing it to excel at long-context tasks like LongBench. Additionally, it supports gathering information from over 100 web pages, showing stronger tool utilization capabilities compared to other models.

## Model inputs and outputs

### Inputs
- Text prompts

### Outputs 
- Generated text responses

## Capabilities

The `internlm2_5-7b-chat` model demonstrates state-of-the-art performance on a variety of benchmarks, including MMLU, CMMLU, BBH, MATH, GSM8K and GPQA. It excels at tasks that require strong reasoning, world knowledge, and language understanding.

## What can I use it for?

The `internlm2_5-7b-chat` model is well-suited for a variety of natural language processing tasks, such as chatbots, question-answering systems, and text generation applications. Its robust capabilities make it a powerful tool for developers and researchers looking to build advanced AI assistants or incorporate language understanding into their projects. By leveraging the model's strong performance on benchmarks and its ability to gather information from multiple sources, you can create more intelligent and capable systems.

## Things to try

One key capability of the `internlm2_5-7b-chat` model is its ability to handle long-context tasks. You can try using it with the [LMDeploy](https://github.com/InternLM/InternLM/blob/main/chat/lmdeploy.md) tool to leverage its 1M-context inference capabilities. Additionally, you can explore its strong tool utilization skills by integrating it with the [Lagent](https://github.com/InternLM/lagent/tree/main) framework, which allows the model to gather information from a wide range of web sources.