BigCode

Rank:

Average Model Cost: $0.0000

Number of Runs: 1,034,543

Models by this creator

santacoder

SantaCoder is a 1.1-billion-parameter code generation model trained on GitHub code in Python, Java, and JavaScript. It was trained with the Fill-in-the-Middle objective and uses multi-query attention, and it can complete code from context such as a function signature or a code comment. As with other code models, the generated code is not guaranteed to be correct, efficient, or secure.
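As a rough illustration only (not taken from the model card), the sketch below loads the santacoder checkpoint from the Hugging Face Hub with the transformers library and completes a function signature; the prompt and generation settings are illustrative assumptions.

```python
# Minimal sketch: prompt SantaCoder with a function signature and let it
# complete the body. Prompt and settings are illustrative, not official.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/santacoder"  # Hub ID of the model described above

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# The santacoder repo ships custom modeling code, so trust_remote_code=True
# is typically required when loading it through AutoModelForCausalLM.
model = AutoModelForCausalLM.from_pretrained(checkpoint, trust_remote_code=True)

prompt = "def fibonacci(n):"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(inputs.input_ids, max_new_tokens=64)
print(tokenizer.decode(outputs[0]))
```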

Read more

$-/run

917.7K

Huggingface

starcoder

The StarCoder model is a 15.5-billion-parameter language model trained on over 80 programming languages with the Fill-in-the-Middle objective, using a context window of 8192 tokens. It is designed to assist with programming tasks and can generate code snippets from surrounding context; however, the generated code is not guaranteed to work as intended and may be inefficient or contain bugs. Because the model was trained on GitHub code rather than instruction data, instruction-style prompts do not work well. Training took 24 days on 512 Tesla A100 GPUs. The model is licensed under the BigCode OpenRAIL-M v1 license agreement.
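Since the description mentions the Fill-in-the-Middle objective, here is a hedged sketch of FIM-style prompting with transformers. The `<fim_prefix>`/`<fim_suffix>`/`<fim_middle>` sentinel strings and the toy prefix/suffix are assumptions to verify against the StarCoder tokenizer's special tokens, and access to the checkpoint on the Hub may require accepting the BigCode OpenRAIL-M license.

```python
# Hedged sketch of Fill-in-the-Middle prompting: the model is asked to fill
# in the code between a given prefix and suffix.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)  # ~15.5B params; needs a large GPU

prefix = "def print_hello():\n    "
suffix = "\n    return greeting\n"
# Prefix-suffix-middle prompt layout; sentinel tokens are the tokenizer's
# published FIM special tokens (verify against the model card).
prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(inputs.input_ids, max_new_tokens=32)
print(tokenizer.decode(outputs[0]))
```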

Read more

$-/run

58.4K

Huggingface

starpii

StarPII is a Named Entity Recognition (NER) model trained to detect Personally Identifiable Information (PII) in code datasets. It is built on top of the StarEncoder model and uses a token classification head with 6 target classes: Names, Emails, Keys, Passwords, IP addresses, and Usernames. The model was first trained on a pseudo-labeled dataset to improve performance on rare PII entities such as keys, then fine-tuned on an annotated PII dataset. Its performance may vary across data types and programming languages, so caution should be exercised when processing sensitive data.
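A minimal sketch of running StarPII through the standard token-classification pipeline follows. The example string is synthetic, the exact entity label names come from the model's configuration, and access to the checkpoint on the Hub may require accepting the model's terms, so treat the details as assumptions.

```python
# Sketch: detect PII entities in a code snippet with a token-classification
# pipeline. Example input and printed fields are illustrative.
from transformers import pipeline

pii_detector = pipeline(
    "token-classification",
    model="bigcode/starpii",
    aggregation_strategy="simple",  # merge sub-word tokens into whole entities
)

code = 'smtp_login(user="jane.doe@example.com", password="hunter2")'
for entity in pii_detector(code):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```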

Read more

$-/run

18.5K

Huggingface

starcoderbase

StarCoderBase is the 15.5-billion-parameter base model of the StarCoder family, trained on 80+ programming languages from The Stack (v1.2) with the Fill-in-the-Middle objective and a context window of 8192 tokens. Given surrounding context such as a function signature or a comment, it generates code completions; StarCoder itself is a version of StarCoderBase that was further trained on Python. As with the other BigCode models, the generated code is not guaranteed to work as intended and may contain bugs or security issues.
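For illustration, here is a sketch of loading the 15.5B-parameter base checkpoint in bfloat16 with automatic device placement (which relies on the accelerate package); the checkpoint name matches the Hub ID, but the prompt and generation settings are only examples.

```python
# Sketch: load StarCoderBase in half precision and generate a completion
# from a comment-style prompt. Memory-saving options shown are one choice,
# not the only way to run the model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoderbase"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    torch_dtype=torch.bfloat16,  # roughly halves memory vs. float32
    device_map="auto",           # spreads layers across available devices (needs accelerate)
)

prompt = "# Write a function that reverses a linked list\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0]))
```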

Read more

$-/run

13.9K

Huggingface

gpt_bigcode-santacoder

The gpt_bigcode-santacoder model is a large language model trained on GitHub code in Python, Java, and JavaScript. It can generate code snippets based on given context, such as function signatures or code comments. However, the generated code is not guaranteed to work flawlessly, as it may be inefficient, contain bugs, or have security vulnerabilities. The model was trained using the GPT-2 architecture with multi-query attention and the Fill-in-the-Middle objective. It was trained for 600K steps using 236 billion tokens on a cluster of 96 Tesla V100 GPUs, taking 6.2 days to complete. The model is available under the CodeML Open RAIL-M v0.1 license.

Read more

$-/run

13.0K

Huggingface

starcoder-megatron

StarCoder-Megatron is the Megatron-LM version of StarCoder; see the StarCoder model page for full documentation. The StarCoder models are 15.5B-parameter models trained on 80+ programming languages from The Stack (v1.2), with opt-out requests excluded. The model uses Multi-Query Attention, a context window of 8192 tokens, and was trained with the Fill-in-the-Middle objective on 1 trillion tokens. It can also be tried in the StarCoder Playground. Repository: bigcode/Megatron-LM. Project website: bigcode-project.org. Paper: 💫 StarCoder: May the source be with you! Point of contact: contact@bigcode-project.org. Languages: 80+ programming languages.

Intended use: the model was trained on GitHub code, so it is not an instruction model, and prompts like "Write a function that computes the square root." do not work well. Using the Tech Assistant prompt, however, it can be turned into a capable technical assistant. Fill-in-the-middle generation uses special tokens to identify the prefix, middle, and suffix parts of the input and output.

Attribution and other requirements: the pretraining dataset was filtered for permissive licenses only. Nevertheless, the model can generate source code verbatim from the dataset, and that code's license might require attribution and/or other specific terms that must be respected. A search index is provided that lets you find where generated code came from in the pretraining data and apply the proper attribution.

Limitations: the model has been trained on source code from 80+ programming languages, with English as the predominant natural language. It can generate code snippets given some context, but the generated code is not guaranteed to work as intended; it can be inefficient and contain bugs or exploits. See the paper for an in-depth discussion of the model's limitations.

Training: GPT-2 architecture with multi-query attention and the Fill-in-the-Middle objective; 250k pretraining steps over 1 trillion tokens in bfloat16 precision; 512 Tesla A100 GPUs for 24 days; orchestration with Megatron-LM and neural networks in PyTorch (apex for BF16 where applicable).

License: the model is licensed under the BigCode OpenRAIL-M v1 license agreement.

Read more

$-/run

0

Huggingface
