## Model overview

The `moss-moon-003-sft` model is a conversational language model developed by [fnlp](https://aimodels.fyi/creators/huggingFace/fnlp). It was initialized with the [CodeGen](https://arxiv.org/abs/2203.13474) model and further pre-trained on 100B Chinese tokens and 20B English tokens, seeing a total of 700B tokens during pre-training. The model was then fine-tuned on ~1.1M multi-turn conversational data, allowing it to follow instructions in multi-turn dialogues and refuse inappropriate requests.

The `moss-moon-003-sft` model is part of a family of MOSS models, which also includes the `moss-moon-003-base` base model, the `moss-moon-003-sft-plugin` model fine-tuned on plugin-augmented data, and various quantized versions of these models (e.g. `moss-moon-003-sft-int4`, `moss-moon-003-sft-int8`). The final MOSS-003 model, which demonstrated better factuality, safety, and more stable response quality, will be open-sourced in the near future.

## Model inputs and outputs

### Inputs

- **Text**: The `moss-moon-003-sft` model takes text input in the form of prompts or dialogue history. It can handle both English and Chinese text.

### Outputs

- **Text**: The model generates text responses in the language specified by the user, which can be either English or Chinese.

## Capabilities

The `moss-moon-003-sft` model has been trained to be helpful, honest, and harmless. It can understand and communicate fluently in both English and Chinese, and perform a wide range of language-based tasks. The model can follow instructions in multi-turn dialogues, refuse inappropriate requests, and provide additional relevant details to answer questions in-depth and comprehensively.

Some example use cases for the `moss-moon-003-sft` model include:

- **Web search**: The model can be used to search the web and provide summaries of the results, as shown in the [example](https://github.com/OpenLMLab/MOSS/blob/main/examples/example_moss_search.gif).
- **Math and coding**: The model can solve simple math problems and write basic code, as demonstrated in the [examples](https://github.com/OpenLMLab/MOSS/blob/main/examples/example_moss_calculate.png) and [examples](https://github.com/OpenLMLab/MOSS/blob/main/examples/example_moss_code_1.png).
- **Text-to-image generation**: The model can use text-to-image plugins to generate images based on user descriptions, as shown in the [example](https://github.com/OpenLMLab/MOSS/blob/main/examples/example_moss_text2img.png).
- **Chinese language tasks**: The model has strong Chinese language capabilities, as evidenced by the [examples](https://github.com/OpenLMLab/MOSS/blob/main/examples/example_moss_chinese_1.png), [examples](https://github.com/OpenLMLab/MOSS/blob/main/examples/example_moss_chinese_2.png), and [examples](https://github.com/OpenLMLab/MOSS/blob/main/examples/example_moss_chinese_3.png).
- **Harmlessness**: The model has been trained to refuse requests for harmful or unethical actions, as shown in the [example](https://github.com/OpenLMLab/MOSS/blob/main/examples/example_moss_harmless.png).

## What can I use it for?

The `moss-moon-003-sft` model can be used in a variety of applications that require natural language processing and generation, such as:

- **Chatbots and virtual assistants**: The model's ability to engage in multi-turn dialogues and understand both English and Chinese makes it a suitable choice for building chatbots and virtual assistants.
- **Content generation**: The model can be used to generate text content, such as articles, stories, or product descriptions, in both English and Chinese.
- **Code generation**: The model's capability to write basic code can be leveraged for tasks like automated programming, code completion, or code generation.
- **Multilingual translation**: While the model is not specifically designed for translation, its understanding of both English and Chinese can be used for rudimentary translation between the two languages.

## Things to try

One interesting aspect of the `moss-moon-003-sft` model is its ability to refuse inappropriate requests. This feature can be useful in building safe and ethical AI systems that prioritize user wellbeing and avoid causing harm. Developers can experiment with the model's response to different types of prompts, both benign and potentially harmful, to better understand its safety and alignment capabilities.

Another interesting aspect is the model's strong performance on Chinese language tasks. Developers working on applications targeting Chinese-speaking users can explore the model's capabilities in areas like content generation, question answering, and language understanding for the Chinese language.

Finally, the availability of quantized versions of the `moss-moon-003-sft` model (e.g. `moss-moon-003-sft-int4`, `moss-moon-003-sft-int8`) presents an opportunity to experiment with deploying the model on hardware with limited memory resources, such as edge devices or mobile phones. Developers can test the performance and quality trade-offs of these quantized models to find the best fit for their specific use cases.