pygmalion-13b-4bit-128g

Maintainer: notstoic

Total Score: 143

Last updated: 5/17/2024

  • Model Link: View on HuggingFace
  • API Spec: View on HuggingFace
  • Github Link: No Github link provided
  • Paper Link: No paper link provided

Model overview

The pygmalion-13b-4bit-128g model is a quantized version of the pre-trained Pygmalion-13B language model. It has been quantized to 4-bit precision using the GPTQ method with a group size of 128, which shrinks the model substantially while preserving much of the original model's performance. The reduced memory footprint makes it well-suited to fast GPU inference on a single consumer card.

The pygmalion-13b-4bit-128g model is similar to other quantized language models like alpaca-30b-lora-int4 and the stable-vicuna-13B-GPTQ model, which also leverage quantization techniques to reduce model size.
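
As a minimal sketch of loading this checkpoint for GPU inference, assuming the auto-gptq and transformers packages and that the quantized weights live at the Hugging Face repo id notstoic/pygmalion-13b-4bit-128g (an assumption based on the maintainer name):

```python
# Minimal sketch: load a 4-bit GPTQ checkpoint onto a single GPU.
# Assumes `pip install auto-gptq transformers` and that the repo id
# below is where the quantized weights are hosted.
from auto_gptq import AutoGPTQForCausalLM
from transformers import AutoTokenizer

repo_id = "notstoic/pygmalion-13b-4bit-128g"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id, use_fast=True)
model = AutoGPTQForCausalLM.from_quantized(
    repo_id,
    device="cuda:0",       # the 4-bit weights fit on a single consumer GPU
    use_safetensors=True,  # assumption: weights are shipped as safetensors
)
```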

Model inputs and outputs

Inputs

  • Text prompts: The model accepts text prompts as input, which can be used to guide the model's language generation.

Outputs

  • Generated text: The model outputs generated text, which can be used for a variety of natural language processing tasks such as text generation, summarization, and question answering.
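
Pygmalion-family models are conventionally prompted with a persona block followed by dialogue turns. A hedged sketch, continuing from the loading example above (the persona and <START> markers follow the upstream Pygmalion convention; the character name and sampling values are illustrative):

```python
# Sketch of a Pygmalion-style persona prompt; exact markers may vary
# between releases, so check the model card.
prompt = (
    "Aria's Persona: Aria is a curious, upbeat archivist who loves trivia.\n"
    "<START>\n"
    "You: What's the oldest library you know of?\n"
    "Aria:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(
    **inputs,
    max_new_tokens=128,  # illustrative sampling settings
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
# Decode only the newly generated tokens, not the echoed prompt.
reply = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(reply)
```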

Capabilities

The pygmalion-13b-4bit-128g model is a powerful text generation model that can be used for a variety of tasks, such as writing creative stories, generating responses to prompts, and engaging in open-ended conversations. It has been trained on a large corpus of text data and can generate coherent, context-aware text. However, as with many language models, it may also generate biased or harmful content, and should be used with caution.

What can I use it for?

The pygmalion-13b-4bit-128g model can be used for a variety of natural language processing tasks, such as:

  • Text generation: The model can be used to generate text, such as creative stories, poems, or even news articles, based on user prompts.
  • Chatbots and conversational agents: The model can be fine-tuned and used as the foundation for building chatbots and conversational agents that can engage in natural language interactions (a bare-bones chat loop is sketched below).
  • Question answering: The model can be used to answer questions on a wide range of topics, by generating relevant and informative responses.

However, it's important to note that the model was not trained to be safe or harmless, and may generate biased or inappropriate content. It should be used with caution and appropriate safeguards in place.
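
As a sketch of the chatbot use case, here is a bare-bones REPL around the model loaded earlier. No safety filtering is shown, so the caveats above apply, and the stop-string handling is deliberately simplistic:

```python
# Bare-bones chat loop: accumulate turns and regenerate each time.
persona = "Aria's Persona: A cautious, helpful assistant.\n<START>\n"
history = []

while True:
    user = input("You: ")
    if user.strip().lower() in {"quit", "exit"}:
        break
    history.append(f"You: {user}")
    prompt = persona + "\n".join(history) + "\nAria:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=200,
                         do_sample=True, temperature=0.7)
    reply = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                             skip_special_tokens=True)
    reply = reply.split("\nYou:")[0].strip()  # cut any hallucinated user turn
    print("Aria:", reply)
    history.append(f"Aria: {reply}")
```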

Things to try

One interesting thing to try with the pygmalion-13b-4bit-128g model is to explore its capabilities in generating coherent and context-aware text. You can try providing the model with various prompts and observe how it responds, paying attention to the model's ability to maintain a consistent tone, personality, and narrative throughout the generated text.

Another interesting avenue to explore is the model's performance on specific tasks, such as question answering or text summarization. You can design test cases and benchmarks to assess the model's strengths and limitations in these areas, and compare its performance to other similar models.
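
One concrete way to run such a probe is to hold the prompt fixed and sweep a sampling parameter, then compare the outputs by eye or against a rubric. A minimal sketch (model and tokenizer as loaded earlier; the temperature grid is illustrative):

```python
# Fix the prompt, vary temperature, and compare tone and coherence.
probe = "You: Tell me a short story about a lighthouse keeper.\nAria:"

for temperature in (0.3, 0.7, 1.1):  # illustrative grid
    inputs = tokenizer(probe, return_tensors="pt").to(model.device)
    out = model.generate(
        **inputs,
        max_new_tokens=120,
        do_sample=True,
        temperature=temperature,
        top_p=0.9,
    )
    print(f"--- temperature={temperature} ---")
    print(tokenizer.decode(out[0][inputs['input_ids'].shape[1]:],
                           skip_special_tokens=True))
```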




Related Models

pygmalion-6b_dev-4bit-128g

Maintainer: mayaeary

Total Score: 121

The pygmalion-6b_dev-4bit-128g is a GPTQ-quantized version of the PygmalionAI/pygmalion-6b model, created by mayaeary. It uses the GPTQ quantization method with a group size of 128 and 4-bit precision, which reduces the model size and inference time compared to the original 6B-parameter model while aiming to maintain high performance. Similar GPTQ-quantized models include the pygmalion-13b-4bit-128g and the Pygmalion-13B-SuperHOT-8K-GPTQ, which apply the GPTQ technique to the larger 13B-parameter Pygmalion models. The Mythalion-13B-GPTQ and WizardCoder-Python-13B-V1.0-GPTQ are other examples of GPTQ-quantized large language models.

Model inputs and outputs

Inputs

  • Text: The model is a text-to-text transformer, so it takes textual prompts as input.

Outputs

  • Text: The model generates relevant text responses based on the input prompt.

Capabilities

The pygmalion-6b_dev-4bit-128g model can be used for a variety of natural language processing tasks such as text generation, language modeling, and conversational AI. As a quantized version of the original Pygmalion 6B model, it maintains strong performance while significantly reducing the model size and inference time.

What can I use it for?

The pygmalion-6b_dev-4bit-128g model could be used in a wide range of applications that require generating relevant and coherent text, such as chatbots, content creation assistants, or language translation tools. Its smaller model size and faster inference make it well-suited for deployment on resource-constrained devices or in real-time applications.

Things to try

One interesting aspect of the pygmalion-6b_dev-4bit-128g model is the tradeoff between model size/inference speed and performance. Users could experiment with different GPTQ quantization hyperparameters, such as group size and bit precision, to find the optimal balance for their specific use case, as sketched below. Additionally, comparing the performance of this model to the larger Pygmalion models or other GPTQ-quantized LLMs could yield valuable insights.
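
A hedged sketch of that hyperparameter experiment using the auto-gptq package (the calibration text and the grid of settings are placeholders; real calibration needs a representative dataset):

```python
# Sketch: re-quantize the base model under different GPTQ settings.
# Assumes `pip install auto-gptq transformers`.
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from transformers import AutoTokenizer

base_id = "PygmalionAI/pygmalion-6b"
tokenizer = AutoTokenizer.from_pretrained(base_id)

# Placeholder calibration set; use a representative corpus in practice.
calibration = [tokenizer("You: Hi there!\nBot: Hello!", return_tensors="pt")]

for bits, group_size in [(4, 128), (4, 32), (8, 128)]:  # illustrative grid
    cfg = BaseQuantizeConfig(bits=bits, group_size=group_size)
    model = AutoGPTQForCausalLM.from_pretrained(base_id, quantize_config=cfg)
    model.quantize(calibration)
    model.save_quantized(f"pygmalion-6b-{bits}bit-{group_size}g")
```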


stable-vicuna-13B-GPTQ

Maintainer: TheBloke

Total Score: 218

The stable-vicuna-13B-GPTQ is a quantized version of CarperAI's StableVicuna 13B model, created by TheBloke. It was produced by merging the deltas from the CarperAI repository with the original LLaMA 13B weights, then quantizing the model to 4-bit using the GPTQ-for-LLaMa tool. This allows for more efficient inference on GPU hardware compared to the full-precision model. TheBloke also provides GGML format models for CPU and GPU inference, as well as an unquantized float16 model for further fine-tuning.

Model inputs and outputs

Inputs

  • Text prompts, which can be in the format (see the helper sketched at the end of this entry):

    Human: your prompt here
    Assistant:

Outputs

  • Fluent, coherent text responses to the provided prompts, generated in an autoregressive manner.

Capabilities

The stable-vicuna-13B-GPTQ model is capable of engaging in open-ended conversational tasks, answering questions, and generating text on a wide variety of subjects. It has been trained using reinforcement learning from human feedback (RLHF) to improve its safety and helpfulness.

What can I use it for?

The stable-vicuna-13B-GPTQ model could be used for projects requiring a capable and flexible language model, such as chatbots, question-answering systems, text generation, and more. The quantized nature of the model allows for efficient inference on GPU hardware, making it suitable for real-time applications.

Things to try

One interesting thing to try with the stable-vicuna-13B-GPTQ model is using it as a starting point for further fine-tuning on domain-specific datasets. The unquantized float16 model provided by TheBloke would be well-suited for this purpose, as the quantization process can sometimes reduce the model's performance on certain tasks.
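
A small helper for the Human/Assistant template shown under Inputs above (the exact delimiters vary between releases; some prepend "### " to each role, so check the model card):

```python
# Wrap a user message in the two-role template described above.
def build_prompt(user_message: str) -> str:
    # Some StableVicuna releases use "### Human:" / "### Assistant:".
    return f"Human: {user_message}\nAssistant:"

prompt = build_prompt("Summarize GPTQ quantization in two sentences.")
```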


alpaca-30b-lora-int4

Maintainer: elinas

Total Score: 69

The alpaca-30b-lora-int4 model is a 30 billion parameter language model created by the maintainer elinas. It is a LoRA (Low-Rank Adaptation) trained model that has been quantized to 4-bit precision using the GPTQ method. This allows the model to be smaller in size and require less VRAM for inference, while maintaining reasonable performance. The maintainer provides several versions of the quantized model, including ones with different group sizes to balance model accuracy and memory usage.

This model is based on the larger llama-30b model, which was originally created by Meta. The LoRA fine-tuning was done by the team at Baseten. The maintainer elinas has further optimized the model through quantization and provided multiple versions for different hardware requirements.

Model inputs and outputs

Inputs

  • Text: The model takes text inputs, which can be prompts, instructions, or conversations. It is designed to be used in a conversational setting.

Outputs

  • Text: The model generates relevant text responses based on the input. It can be used for tasks like question answering, text generation, and dialogue.

Capabilities

The alpaca-30b-lora-int4 model is a capable language model that can handle a variety of text-based tasks. It performs well on common benchmarks like C4, PTB, and Wikitext2. The quantized versions of the model allow for more efficient inference on hardware with limited VRAM, while still maintaining good performance.

What can I use it for?

This model can be useful for a wide range of natural language processing projects, such as building chatbots, virtual assistants, or content generation tools. The smaller quantized versions may be particularly helpful for deploying language models on edge devices or in resource-constrained environments.

Things to try

One key feature of this model is the ability to run it in a deterministic mode by turning off sampling, which is helpful for applications that require consistent outputs. Additionally, the maintainer recommends using an instruction-based prompting format for best results, which can help the model follow the desired task more effectively. Both are sketched below.
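
The deterministic mode and instruction-style prompting mentioned above can be combined in one short sketch, assuming a model and tokenizer are already loaded as in the earlier examples (the Alpaca-style template is a common convention for these fine-tunes, not confirmed from the model card; greedy decoding makes repeated runs identical):

```python
# Alpaca-style instruction prompt; check the model card for exact wording.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nList three benefits of 4-bit quantization.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=False,  # deterministic: greedy decoding, repeatable output
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```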


Pygmalion-13B-SuperHOT-8K-GPTQ

Maintainer: TheBloke

Total Score: 69

The Pygmalion-13B-SuperHOT-8K-GPTQ model is a merge of TehVenom's Pygmalion 13B and Kaio Ken's SuperHOT 8K, quantized to 4-bit using GPTQ-for-LLaMa. It offers up to 8K context size, which has been tested to work with ExLlama and text-generation-webui. Similar models include the Wizard-Vicuna-13B-Uncensored-SuperHOT-8K-GPTQ, which combines Eric Hartford's Wizard Vicuna 13B Uncensored with Kaio Ken's SuperHOT 8K, and the Llama-2-13B-GPTQ and Llama-2-7B-GPTQ models, which are GPTQ versions of Meta's Llama 2 models.

Model inputs and outputs

Inputs

  • The model accepts natural language text as input.

Outputs

  • The model generates natural language text as output.

Capabilities

The Pygmalion-13B-SuperHOT-8K-GPTQ model is capable of engaging in open-ended conversations and generating coherent and contextual text. Its extended 8K context size allows it to maintain continuity and coherence over longer passages of text.

What can I use it for?

This model could be used for a variety of natural language processing tasks, such as:

  • Open-ended chatbots and assistants: The model's capabilities make it well-suited for building conversational AI assistants that can engage in open-ended dialogue.
  • Content generation: The model could be used to generate text for creative writing, storytelling, and other content creation purposes.
  • Question answering and knowledge retrieval: With its large knowledge base, the model could be used to answer questions and retrieve information on a wide range of topics.

Things to try

One key aspect of this model is its ability to maintain coherence and context over longer passages of text due to the increased 8K context size. This could be particularly useful for applications that require a strong sense of narrative or conversational flow, such as interactive fiction, roleplaying, or virtual assistants. Developers could explore ways to leverage this extended context to create more immersive and coherent experiences for users, such as by allowing the model to maintain character personalities, world-building details, and the progression of a storyline over longer interactions; a history-trimming helper is sketched below.
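
A hedged sketch of keeping a long roleplay session inside the 8K window by trimming the oldest turns while always preserving the persona block (the token budget and trimming policy are illustrative):

```python
# Trim oldest dialogue turns so persona + history fit an 8K-token context.
MAX_CONTEXT = 8192
REPLY_HEADROOM = 512  # illustrative room left for the generated reply

def fit_history(persona: str, turns: list[str], tokenizer) -> str:
    budget = MAX_CONTEXT - REPLY_HEADROOM
    kept = list(turns)
    while kept:
        candidate = persona + "\n".join(kept)
        if len(tokenizer(candidate)["input_ids"]) <= budget:
            return candidate
        kept.pop(0)  # drop the oldest turn first; persona is always kept
    return persona
```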
