GeorgiaTechResearchInstitute

Models by this creator


starcoder-gpteacher-code-instruct

GeorgiaTechResearchInstitute

Total Score

79

The starcoder-gpteacher-code-instruct model is a fine-tuned version of the BigCode StarCoder model that has been trained on the GPTeacher codegen dataset. The model is maintained by the Georgia Tech Research Institute. The base StarCoder models are 15.5B-parameter models trained on over 80 programming languages from The Stack (v1.2) dataset. They use Multi-Query Attention, a context window of 8,192 tokens, and were trained using the Fill-in-the-Middle objective on 1 trillion tokens.

Model inputs and outputs

Inputs

- **Instruction**: A text prompt describing a task for the model to complete, such as "Write a function that computes the square root."
- **Input**: Additional context information that the model can use to generate the requested output.

Outputs

- **Response**: The model's attempt at completing the requested task, generating code or text to fulfill the instruction.

Capabilities

The starcoder-gpteacher-code-instruct model is capable of following code-related instructions and generating relevant responses. For example, given the prompt "Write a function that computes the square root", the model may generate the following Python function:

    import math

    def sqrt(x):
        return math.sqrt(x)

What can I use it for?

The starcoder-gpteacher-code-instruct model could be useful for a variety of applications that require generating code or text based on instructions, such as:

- Automated code generation and assisted programming
- Technical assistance and question answering for developers
- Prototyping and experimentation with new ideas

Things to try

One interesting thing to try with the starcoder-gpteacher-code-instruct model is using the Tech Assistant prompt to prompt it into behaving as a technical assistant. This can help the model better understand and respond to code-related instructions. Another idea is to experiment with the model's ability to generate code in different programming languages by providing instructions that specify the desired language.
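As a starting point, here is a minimal sketch of the instruction/input/response flow described above, assuming the model follows the standard Hugging Face causal-LM interface. The `### Instruction / ### Input / ### Response` template and the example input text are assumptions based on the GPTeacher/Alpaca instruction format, not a confirmed specification; check the model card for the exact prompt template.

```python
# Minimal usage sketch (assumptions: standard transformers causal-LM API,
# Alpaca/GPTeacher-style prompt template). Requires transformers, torch,
# and accelerate for device_map="auto".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "GeorgiaTechResearchInstitute/starcoder-gpteacher-code-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Assumed prompt template: instruction, optional input context, then response.
prompt = (
    "### Instruction:\n"
    "Write a function that computes the square root.\n\n"
    "### Input:\n"
    "Use the Python standard library.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same pattern can be used to probe other languages: changing the instruction to, say, "Write a function in Rust that computes the square root" is one way to try the multi-language generation mentioned above.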


Updated 5/28/2024


galpaca-30b

GeorgiaTechResearchInstitute

Total Score

55

The galpaca-30b is a large language model developed by the Georgia Tech Research Institute. It is a fine-tuned version of the GALACTICA 30B model, which was trained on a large-scale scientific corpus to perform a variety of scientific tasks. The GALACTICA models range in size from 125M to 120B parameters, with galpaca-30b being the "large" 30B-parameter variant. The galpaca-30b model was further fine-tuned on the Alpaca dataset, a collection of 52K instruction-response pairs designed to enhance the instruction-following capabilities of pre-trained language models. This fine-tuning was done using a modified version of the Self-Instruct framework.

Model inputs and outputs

Inputs

- **Freeform text**: The galpaca-30b model can accept arbitrary freeform text as input, such as instructions, questions, or prompts.

Outputs

- **Generated text**: Based on the input text, the model will generate relevant output text. This can include answers to questions, responses to instructions, or continuations of the provided prompt.

Capabilities

The galpaca-30b model demonstrates strong performance on a range of scientific tasks, including citation prediction, scientific question answering, mathematical reasoning, and summarization. It outperforms several existing language models on knowledge-intensive tasks, thanks to its large-scale training on scientific data. However, the model is also prone to hallucination, meaning it can generate factually incorrect information, especially for less popular scientific concepts. Additionally, while the model exhibits lower toxicity levels than other large language models, it still shows some biases.

What can I use it for?

The primary intended users of the GALACTICA models, including galpaca-30b, are researchers studying the application of language models to scientific domains. The model could be used to build various kinds of scientific tooling, such as literature discovery, scientific question answering, and mathematical reasoning assistants. That said, the maintainers caution against using the model in production environments without proper safeguards, due to the risk of hallucination and bias.

Things to try

Given the model's strengths in scientific tasks, users may want to experiment with prompts related to various scientific fields, such as requesting explanations of scientific concepts, generating research paper abstracts, or solving mathematical problems. However, it's important to be aware of the model's limitations and not rely on its outputs as authoritative sources of information.
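For readers who want to try the kinds of scientific prompts described above, here is a minimal sketch using the Hugging Face transformers API. The Alpaca-style prompt wording and the example instruction are assumptions (galpaca-30b was fine-tuned on Alpaca-format instruction-response pairs, so this is a reasonable starting point, but the exact template used during fine-tuning may differ; consult the model card).

```python
# Minimal usage sketch (assumptions: standard transformers causal-LM API,
# Alpaca-style prompt). A 30B-parameter model needs substantial GPU memory;
# half precision and device_map="auto" (via accelerate) help spread the load.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "GeorgiaTechResearchInstitute/galpaca-30b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Assumed Alpaca-style prompt; the instruction is an illustrative scientific query.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n"
    "Explain the role of the citric acid cycle in cellular metabolism.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Given the hallucination risk noted above, outputs from prompts like this should be treated as drafts to verify against primary sources rather than as authoritative answers.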


Updated 5/28/2024