Several previous studies have considered language- and domain-specific large language models (LLMs) as separate topics. This study explores the combination of a non-English language and a high-demand industry domain, focusing on a Japanese business-specific LLM. This type of model requires expertise in the business domain, strong language skills, and regular updates of its knowledge. We trained a 13-billion-parameter LLM from scratch using a new dataset of business texts and patents, and continually pretrained it with the latest business documents. Furthermore, we propose a new benchmark for Japanese business domain question answering (QA) and evaluate our models on it. The results show that our pretrained model improves QA accuracy without losing general knowledge, and that continual pretraining enhances adaptation to new information. Our pretrained model and business domain benchmark are publicly available.

## Overview

- The paper explores pretraining and updating a large language model (LLM) for the Japanese business domain.
- The researchers trained a 13-billion-parameter LLM from scratch on a new dataset of business texts and patents, then continually pretrained it on the latest business documents.
- They propose a new benchmark for Japanese business domain question answering (QA) and evaluate their models on it.

## Plain English Explanation

In this study, the researchers wanted to create a language model that would be particularly useful for the Japanese business domain. According to the paper, such a model needs business expertise, strong Japanese language skills, and regular updates to its knowledge. Rather than adapting an existing model, they trained a 13-billion-parameter model from scratch.

The training data was a new dataset of Japanese business texts and patents. This initial training, called "pretraining," lets the model learn the vocabulary, writing style, and topics that are common in the Japanese business world. The researchers then kept training the same model on the latest business documents, a step called "continual pretraining," so that its knowledge stays current.

After pretraining, the researchers tested their models on a new benchmark they built for question answering (QA) in the Japanese business domain. They report that the pretrained model improves QA accuracy without losing general knowledge, and that continual pretraining helps the model adapt to new information.

The key insight here is that while large language models can be very powerful, they may not be optimally suited for specific domains or tasks. By further training these models on domain-specific data, researchers can create more specialized and effective AI systems for real-world applications, such as [assisting with business operations](https://aimodels.fyi/papers/arxiv/large-language-models-expansion-spoken-language-understanding) or [automating certain business processes](https://aimodels.fyi/papers/arxiv/sambalingo-teaching-large-language-models-new-languages).

## Technical Explanation

The researchers trained a 13-billion-parameter LLM from scratch on a new dataset of Japanese business texts and patents. They then continually pretrained the model on the latest business documents so that its knowledge tracks recent information. The paper frames these choices around three requirements for a business-specific model: domain expertise, strong language skills, and regular knowledge updates.
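The idea behind continual pretraining, where the same model keeps training on newer documents, can be illustrated with a toy sketch. The example below is not the authors' method: it swaps the 13-billion-parameter LLM for a tiny add-one-smoothed bigram model, and `BigramLM`, `base_corpus`, and `recent_docs` are hypothetical names and data invented for illustration. It only shows that continuing to train an existing model on recent text improves its fit to that text.

```python
import math
from collections import Counter


class BigramLM:
    """Toy add-one-smoothed bigram model; a stand-in for a real LLM."""

    def __init__(self):
        self.bigrams = Counter()   # counts of (previous token, token) pairs
        self.contexts = Counter()  # counts of each token used as a context
        self.vocab = set()

    def train(self, sentences):
        """Update counts in place, so repeated calls continue training."""
        for sentence in sentences:
            tokens = ["<s>"] + sentence.split()
            self.vocab.update(tokens)
            for prev, cur in zip(tokens, tokens[1:]):
                self.bigrams[(prev, cur)] += 1
                self.contexts[prev] += 1

    def perplexity(self, sentences):
        """Perplexity of the sentences under the current counts."""
        v = len(self.vocab) + 1  # one extra slot for unseen tokens
        log_prob, n = 0.0, 0
        for sentence in sentences:
            tokens = ["<s>"] + sentence.split()
            for prev, cur in zip(tokens, tokens[1:]):
                p = (self.bigrams[(prev, cur)] + 1) / (self.contexts[prev] + v)
                log_prob += math.log(p)
                n += 1
        return math.exp(-log_prob / n)


# Hypothetical corpora, invented for illustration only.
base_corpus = [
    "the company reported revenue growth",
    "the patent covers a battery design",
]
recent_docs = [
    "the company announced a merger",
    "the merger covers battery assets",
]

model = BigramLM()
model.train(base_corpus)                    # stage 1: pretraining from scratch
ppl_before = model.perplexity(recent_docs)
model.train(recent_docs)                    # stage 2: continual pretraining
ppl_after = model.perplexity(recent_docs)
print(f"perplexity on recent documents: {ppl_before:.2f} -> {ppl_after:.2f}")
```

For brevity the sketch measures perplexity on the same recent documents it trains on; a real evaluation would use held-out documents, and would also check performance on older data to detect forgetting.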

To evaluate the models, the researchers propose a new benchmark for Japanese business domain question answering (QA). The reported results show that the pretrained model improves QA accuracy without losing general knowledge, and that continual pretraining enhances adaptation to new information.
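A comparison of this kind comes down to scoring each model on a domain benchmark and on a general-knowledge benchmark, then looking at both differences. The sketch below assumes exact-match scoring and uses made-up predictions and references; the function names, the scoring rule, and the data are illustrative assumptions, not the paper's benchmark or metric.

```python
def accuracy(predictions, references):
    """Exact-match accuracy after trimming whitespace and case-folding."""
    if not references or len(predictions) != len(references):
        raise ValueError("predictions and references must be non-empty and aligned")
    correct = sum(
        p.strip().casefold() == r.strip().casefold()
        for p, r in zip(predictions, references)
    )
    return correct / len(references)


def compare(base, domain, business_refs, general_refs):
    """Return the domain model's accuracy change on each benchmark."""
    return {
        "business_gain": accuracy(domain["business"], business_refs)
        - accuracy(base["business"], business_refs),
        "general_change": accuracy(domain["general"], general_refs)
        - accuracy(base["general"], general_refs),
    }


# Hypothetical answers, invented for illustration only.
business_refs = ["2019", "Tokyo", "yes"]
general_refs = ["Paris", "4"]
base_model = {"business": ["2018", "Tokyo", "no"], "general": ["Paris", "4"]}
domain_model = {"business": ["2019", "Tokyo", "no"], "general": ["Paris", "4"]}

report = compare(base_model, domain_model, business_refs, general_refs)
print(report)
```

A positive `business_gain` together with a `general_change` near zero is the pattern the paper describes: better domain QA accuracy without a loss of general knowledge.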

The key contributions of this work are the 13-billion-parameter Japanese business-specific model, the continual pretraining step that keeps it up to date, and the new business domain QA benchmark. Both the pretrained model and the benchmark are publicly available.

## Critical Analysis

The researchers acknowledge several limitations in their study. First, the dataset used for pretraining, while substantial, may not have fully captured the breadth and complexity of the Japanese business domain. There may be important genres or topics that were underrepresented in the training data, which could limit the model's performance on certain tasks.

Additionally, the evaluation tasks used in the study, while relevant to the business domain, may not be fully representative of the real-world challenges faced by companies and industry professionals. The researchers suggest that future work should involve more extensive testing with end-users to better understand the practical applications and limitations of the updated language model.

Another concern is possible bias in the pretraining data or the model's outputs. As with any large language model, there is a risk that the system may perpetuate or amplify societal biases present in the training data. The researchers do not address this issue in depth, and further investigation into the fairness and ethics of the updated model would be valuable.

Despite these limitations, the study represents an important step towards creating more specialized and effective language models for real-world business applications. The researchers' approach of domain-specific pretraining, continual updating, and benchmark-based evaluation could serve as a useful blueprint for similar efforts in other industries or languages, such as [creating German-focused language models for the healthcare domain](https://aimodels.fyi/papers/arxiv/comprehensive-study-german-language-models-clinical-biomedical) or [developing Chinese-centric language models](https://aimodels.fyi/papers/arxiv/chinese-tiny-llm-pretraining-chinese-centric-large).

## Conclusion

This paper demonstrates the value of pretraining and updating large language models for specific domains, using the Japanese business sector as a case study. By training a 13-billion-parameter LLM from scratch on business texts and patents and then continually pretraining it on recent business documents, the researchers created a specialized model that improves accuracy on their Japanese business QA benchmark without losing general knowledge.

The insights from this work could inform the development of similar domain-specific language models in other industries and languages, potentially leading to more effective AI-powered tools and services for businesses and professionals. As large language models continue to advance, it will be important for researchers and developers to explore ways to tailor these powerful systems to the unique needs and challenges of different real-world applications and contexts.