Togethercomputer

Rank:

Average Model Cost: $0.0000

Number of Runs: 72,897

Models by this creator

RedPajama-INCITE-Chat-3B-v1

RedPajama-INCITE-Chat-3B-v1 is a text generation model developed by Together Computer. It is a transformer-based model with 3 billion parameters, trained to facilitate interactive, dynamic conversations: given a conversation history and a prompt, it generates human-like responses. The model has been trained on a large corpus of text from the internet, which enables it to produce coherent and contextually relevant replies.
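
As a rough illustration of how such a chat model is typically used, here is a minimal sketch with the Hugging Face transformers library. The Hub id, the "<human>:"/"<bot>:" transcript format, and the sampling settings are assumptions for illustration, not details stated in this listing.

```python
# Hedged sketch: chat-style generation with transformers.
# Assumptions: the Hub id, the <human>/<bot> transcript format, fp16 on a GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "togethercomputer/RedPajama-INCITE-Chat-3B-v1"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

# Conversation history plus the new user turn, in the assumed transcript format.
prompt = "<human>: What are three good uses for a small chat model?\n<bot>:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
    top_p=0.7,
)
# Print only the newly generated tokens, not the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```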

$-/run

10.9K

Huggingface

GPT-JT-6B-v1

GPT-JT-6B-v1 is a large language model based on EleutherAI's GPT-J (6B). It uses UL2 training objectives, allowing the model to see bidirectional context of the prompt. The model was trained on a diverse mixture of datasets, including Chain-of-Thought (CoT), the Public Pool of Prompts (P3), and Natural Instructions (NI). GPT-JT-6B-v1 improves performance on classification tasks over the original GPT-J and even outperforms most models with more than 100B parameters. It was trained with the AdamW optimizer at a learning rate of 1e-5 and a global batch size of 64, using both data parallelism and pipeline parallelism. Input sequences are truncated to 2048 tokens, and sequences shorter than 2048 tokens are concatenated together. The model was trained on the Together Research Computer.
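
Because the description highlights classification, the sketch below shows a few-shot classification prompt run through a text-generation pipeline. The Hub id and the prompt layout are assumptions chosen for illustration.

```python
# Hedged sketch: few-shot sentiment classification with a text-generation pipeline.
# The Hub id and the prompt layout are assumptions for illustration only.
from transformers import pipeline

generator = pipeline("text-generation", model="togethercomputer/GPT-JT-6B-v1")  # assumed id

prompt = (
    "Review: The battery died after two hours.\nSentiment: negative\n\n"
    "Review: Setup was effortless and the screen is gorgeous.\nSentiment: positive\n\n"
    "Review: It does the job, nothing more.\nSentiment:"
)
# Greedy decoding; only a couple of new tokens are needed for the label.
result = generator(prompt, max_new_tokens=2, do_sample=False, return_full_text=False)
print(result[0]["generated_text"].strip())
```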

$-/run

10.8K

Huggingface

RedPajama-INCITE-Base-3B-v1

RedPajama-INCITE-Base-3B-v1 is a 2.8-billion-parameter pretrained language model developed by Together Computer with input from various institutions and researchers. It was trained on 3,072 V100 GPUs provided as part of the INCITE 2023 project and is licensed under Apache 2.0. The model can be used for a range of natural language processing tasks, but it has limitations and should be used responsibly and within its intended scope; misuse for illegal, unethical, or harmful activities is strictly prohibited. The training data, hardware, optimizer, and hyperparameters are documented, and community support is available on the Together Discord channel.
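
Since this is a base (non-instruction-tuned) model, it is normally used for plain text completion. The sketch below shows that pattern; the Hub id and generation settings are assumptions.

```python
# Hedged sketch: plain text completion with the base model on CPU.
# The Hub id and generation settings are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "togethercomputer/RedPajama-INCITE-Base-3B-v1"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # float32 on CPU by default

inputs = tokenizer("The RedPajama project aims to", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```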

$-/run

8.8K

Huggingface

RedPajama-INCITE-Instruct-3B-v1

RedPajama-INCITE-Instruct-3B-v1 is a language model developed by Together and leaders from the open-source AI community. It is a 2.8-billion-parameter model fine-tuned for few-shot, instruction-based tasks in the style of GPT-JT, and it can be used for text generation. Both GPU and CPU inference are supported. The model should be used responsibly and within its intended scope; misuse, such as engaging in illegal or unethical activities, is strictly prohibited. It has limitations and may not always provide accurate or relevant answers, particularly for complex or ambiguous questions. The training data and procedure are documented, and the community is invited to contribute towards improving the model.
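
The sketch below shows single-instruction inference in the simplest possible form. The Hub id and the bare instruction-style prompt are assumptions, not the documented prompt format.

```python
# Hedged sketch: single-instruction inference on CPU.
# The Hub id and the bare instruction prompt are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "togethercomputer/RedPajama-INCITE-Instruct-3B-v1"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Summarize in one sentence: Large language models can follow written instructions."
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=48, do_sample=False)
# Decode only the completion.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```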

$-/run

8.5K

Huggingface

RedPajama-INCITE-7B-Chat

RedPajama-INCITE-7B-Chat is a language model developed by Together and leaders from the open-source AI community. It has 6.9 billion parameters, is licensed under Apache 2.0, and is fine-tuned on OASST1 and Dolly2 to enhance its chatting ability. Both GPU and CPU inference are supported, with some hardware considerations at this size. The model is intended for language modeling and must be used responsibly and ethically; misuse, such as engaging in illegal or unethical activities, is strictly prohibited. It has limitations in providing accurate and relevant answers, particularly for complex or ambiguous questions, and contributions and collaborations to improve its performance are encouraged. The training data is available, and training used 8 A100 GPUs with the Adam optimizer at a learning rate of 1e-5. The model is supported by the Together Discord community.
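
One common way to fit a ~7B model on a smaller GPU is 8-bit quantized loading. The sketch below shows that approach; the Hub id, the chat transcript format, and the use of bitsandbytes quantization are assumptions rather than instructions from this listing.

```python
# Hedged sketch: loading the 6.9B chat model in 8-bit to fit a smaller GPU.
# Requires the bitsandbytes and accelerate packages; the Hub id and the
# <human>/<bot> prompt format are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "togethercomputer/RedPajama-INCITE-7B-Chat"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)

prompt = "<human>: Suggest a name for a coffee shop run by robots.\n<bot>:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.8)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```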

$-/run

8.0K

Huggingface

Pythia-Chat-Base-7B

Feel free to try out our OpenChatKit feedback app!

Pythia-Chat-Base-7B-v0.16 is based on EleutherAI's Pythia-7B model and is fine-tuned with data focusing on dialog-style interactions. We focused the tuning on several tasks such as question answering, classification, extraction, and summarization, and fine-tuned the model with a collection of 43 million high-quality instructions. Together partnered with LAION and Ontocord.ai, who both helped curate the dataset the model is based on; you can read more about this process and the availability of the dataset in LAION's blog post.

In addition to the fine-tuning above, Pythia-Chat-Base-7B-v0.16 has undergone further fine-tuning on a small amount of feedback data. This allows the model to better adapt to human preferences in conversations.

One notable feature of Pythia-Chat-Base-7B-v0.16 is its ability to run inference on a 12GB GPU thanks to int8 quantization, which keeps the dialogue capabilities intact while making the model accessible to a wider range of users and hardware configurations.

Model Details
- Developed by: Together Computer
- Model type: Language Model
- Language(s): English
- License: Apache 2.0
- Model description: A 7B-parameter open-source chat model, fine-tuned from EleutherAI's Pythia with over 40M instructions on 100% carbon-negative compute
- Resources for more information: the project's GitHub repository

Quick Start
- GPU inference: requires a GPU with 24GB memory.
- GPU inference in int8: requires a GPU with 12GB memory.
- CPU inference: also supported.

Strengths of the model
There are several tasks that OpenChatKit excels at out of the box, including:
- Summarization and question answering within context
- Extraction
- Classification
In addition, the model does well on few-shot prompts. For both classification and extraction, the model performs even better with few shots, as in most HELM tasks. Contact us if you're interested in trying few-shot prompts with the model.

Weaknesses of the model
That said, there are several areas where we have more work to do, and we need your help. Some of these include:
- Knowledge-based closed question answering: the chatbot may hallucinate and give incorrect results. Be sure to fact-check, and if possible provide feedback with the corrected information.
- Coding tasks: the chatbot was not trained on a large enough corpus of source code to excel at writing code. We welcome contributions of additional datasets to improve this.
- Repetition: sometimes the chatbot will repeat its response. We're working to improve this, but in the meantime you can click the refresh button to start a new conversation.
- Context switching: if you change the topic in the middle of a conversation, the chatbot often cannot make the switch automatically and will continue to give answers related to the prior topic.
- Creative writing and longer answers: the chatbot does not generate long, creative text such as an essay or story.
We are excited to work with you to address these weaknesses by getting your feedback, bolstering datasets, and improving accuracy.

Uses

Direct Use
The model is intended for research purposes. Possible research areas and tasks include:
- Safe deployment of models which have the potential to generate harmful content
- Probing and understanding the limitations and biases of dialogue models or language models
- Generation of artworks and use in design and other artistic processes
- Applications in educational or creative tools
- Research on dialogue models or language models
Excluded uses are described below.

Misuse, Malicious Use, and Out-of-Scope Use
The OpenChatKit community provides Pythia-Chat-Base-7B-v0.16 as an open-source tool for building chatbots. The community is not responsible for any misuse, malicious use, or out-of-scope use of the model. It is the responsibility of the end user to ensure that the model is used in a responsible and ethical manner.

Pythia-Chat-Base-7B-v0.16 is designed for use in chatbot applications and may not perform well for other use cases outside its intended scope. For example, it may not be suitable for use in safety-critical applications or for making decisions that have a significant impact on individuals or society. It is important to consider the limitations of the model and to use it only for its intended purpose.

Misuse of the model, such as using it to engage in illegal or unethical activities, is strictly prohibited and goes against the principles of the OpenChatKit community project. Using the model to generate content that is cruel to individuals is a misuse of this model. This includes, but is not limited to:
- Generating fake news, misinformation, or propaganda
- Promoting hate speech, discrimination, or violence against individuals or groups
- Impersonating individuals or organizations without their consent
- Engaging in cyberbullying or harassment
- Defamatory content
- Spamming or scamming
- Sharing confidential or sensitive information without proper authorization
- Violating the terms of use of the model or the data used to train it
- Creating automated bots for malicious purposes such as spreading malware, phishing scams, or spamming

Limitations
Pythia-Chat-Base-7B-v0.16, like other language-model-based chatbots, has limitations that should be taken into consideration. For example, the model may not always provide accurate or relevant answers, particularly for questions that are complex, ambiguous, or outside of its training data. We therefore welcome contributions from individuals and organizations, and encourage collaboration towards creating a more robust and inclusive chatbot.

Training

Training Data
Please refer to togethercomputer/OpenDataHub.

Training Procedure
- Hardware: 8 x A100 GPUs
- Optimizer: 8bit-AdamW
- Gradient accumulation: 4
- Batch: 4 x 4 x 16 x 2048 = 524,288 tokens
- Learning rate: warmup to 1e-5 for 100 steps, then kept constant

Community
Join us on the Together Discord.
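
To make the dialog-style usage concrete, here is a minimal multi-turn sketch that keeps the conversation history in the prompt. The Hub id, the "<human>:"/"<bot>:" transcript format, and the fp16 GPU loading are assumptions; as noted above, int8 quantization can instead be used to fit a 12GB GPU.

```python
# Hedged sketch: a minimal multi-turn chat loop that carries the dialogue history.
# Assumptions: the Hub id, the <human>/<bot> transcript format, fp16 on a 24GB GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "togethercomputer/Pythia-Chat-Base-7B"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

history = ""
for user_turn in ["What is extraction in NLP?", "Give me a one-line example."]:
    history += f"<human>: {user_turn}\n<bot>:"
    inputs = tokenizer(history, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=96, do_sample=True, temperature=0.7)
    # Keep only the bot's reply, cutting off any hallucinated next user turn.
    reply = tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    ).split("<human>:")[0].strip()
    history += f" {reply}\n"
    print(f"bot: {reply}")
```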

$-/run

2.5K

Huggingface

GPT-JT-Moderation-6B

GPT-JT-Moderation-6B v1 is a moderation model: a GPT-JT model fine-tuned on Ontocord.ai's OIG-moderation dataset v0.1. It can be used to moderate other chatbot models, including GPT-NeoXT-Chat-Base-20B. In chat applications, the moderation model runs in tandem with the main chatbot, checking both the user question and the bot answer for inappropriate content. If needed, the moderation model intervenes, overriding the main chatbot's response and indicating to the user that the request could not be answered.

Limitations and Bias
The model's performance is limited by the quality and representativeness of its training data, which we will continue to improve. The model may produce false positives or false negatives, leading to unnecessary confusion; we welcome any feedback or comments.

Training Data
- allenai/prosocial-dialog
- A small subset of LAION's OIG dataset to augment casual queries
The processed data can be found in the OIG-moderation repository.

Training Procedure
- Hardware: 8 x A100 GPUs
- Optimizer: AdamW
- Gradient accumulation: 1
- Batch: 16 x 4 = 64
- Learning rate: warmup to 1e-5 for 100 steps, then kept constant

Community
Join us on the Together Discord.
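
To illustrate the "runs in tandem" workflow, here is a rough sketch of a moderation gate wrapped around a chatbot reply. The Hub id, the prompt template, and the label string being checked are hypothetical stand-ins; the real moderation prompt and label set come from the OIG-moderation dataset and are not reproduced in this listing.

```python
# Hedged sketch of a moderation gate around a chat exchange. The Hub id, the
# prompt template, and the "casual" label check are hypothetical stand-ins.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "togethercomputer/GPT-JT-Moderation-6B"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

def needs_intervention(user_message: str, bot_reply: str) -> bool:
    # Hypothetical prompt template: ask the moderation model to label the exchange.
    prompt = (
        f"Conversation:\nuser: {user_message}\nbot: {bot_reply}\n"
        "Moderation label:"
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=8, do_sample=False)
    label = tokenizer.decode(
        output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    ).strip().lower()
    # Hypothetical rule: anything other than a "casual" label triggers intervention.
    return "casual" not in label

# In a chat application, the gate would override the main model's reply, e.g.:
# if needs_intervention(question, answer): answer = "Sorry, I can't answer that request."
```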

$-/run

390

Huggingface
