Bigcode
Rank:
Average Model Cost: $0.0000
Number of Runs: 1,034,543
Models by this creator
santacoder
SantaCoder is a 1.1B-parameter code generation model trained on the Python, Java, and JavaScript subsets of The Stack. Given surrounding code as context, it completes code and supports fill-in-the-middle infilling, using multi-query attention and the Fill-in-the-Middle training objective. It is not an instruction-following model, and its output is not guaranteed to be correct or secure, so generated code should be reviewed before use.
$-/run
917.7K
Huggingface
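A minimal completion sketch for the santacoder entry above, assuming the Hugging Face transformers library and the bigcode/santacoder checkpoint; some revisions of that checkpoint ship custom modeling code, so trust_remote_code is passed defensively.

from transformers import pipeline

# Load santacoder as a plain text-generation pipeline; trust_remote_code is
# only needed for checkpoint revisions that ship custom modeling code.
generator = pipeline(
    "text-generation",
    model="bigcode/santacoder",
    trust_remote_code=True,
)

# Give the model the start of a function and let it complete the body.
prompt = "def fibonacci(n):\n    "
result = generator(prompt, max_new_tokens=48, do_sample=False)
print(result[0]["generated_text"])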
starcoder
The StarCoder model is a language model that has been trained on over 80 programming languages using the Fill-in-the-Middle objective. It has 15.5 billion parameters and a context window of 8192 tokens. The model is designed to assist with programming tasks and can generate code snippets. However, it is not guaranteed to produce working code and may have limitations and potential errors. The model was trained on GitHub code and does not work well with instruction-based prompts. The training time for the model was 24 days using 512 Tesla A100 GPUs. The model is licensed under the BigCode OpenRAIL-M v1 license agreement.
$-/run
58.4K
Huggingface
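A sketch of plain code completion with the starcoder entry above. It assumes transformers, PyTorch, and accelerate are installed, that the BigCode OpenRAIL-M license for the gated bigcode/starcoder checkpoint has been accepted on the Hugging Face Hub, and that enough GPU memory is available for a 15.5B-parameter model.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder"  # gated: requires accepting the license
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    torch_dtype=torch.float16,   # half precision to reduce memory
    device_map="auto",           # spread layers across available devices
)

# Complete a function from its signature and docstring.
prompt = 'def read_json(path):\n    """Load a JSON file and return its contents."""\n'
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))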
starpii
StarPII is a Named Entity Recognition (NER) model trained to detect Personally Identifiable Information (PII) in code datasets. It is built on top of the StarEncoder model and uses a token classification head with six target classes: names, emails, keys, passwords, IP addresses, and usernames. The model was first trained on a pseudo-labeled dataset to improve performance on rare PII entities such as keys, then fine-tuned on an annotated PII dataset. Its performance may vary across data types and programming languages, and caution should be exercised when processing sensitive data.
$-/run
18.5K
Huggingface
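A sketch of PII detection with the starpii entry above, assuming transformers and the bigcode/starpii checkpoint; the exact label strings emitted by the checkpoint may differ from the class names listed in the description.

from transformers import pipeline

# Token-classification pipeline; aggregation merges sub-word pieces into spans.
pii_detector = pipeline(
    "token-classification",
    model="bigcode/starpii",
    aggregation_strategy="simple",
)

code = 'SMTP_USER = "alice"\nSMTP_PASSWORD = "hunter2"  # contact alice@example.com'
for entity in pii_detector(code):
    print(entity["entity_group"], repr(entity["word"]), round(entity["score"], 3))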
starcoderbase
StarCoderBase is the 15.5B-parameter base model behind StarCoder, trained on more than 80 programming languages from The Stack using the Fill-in-the-Middle objective, multi-query attention, and an 8192-token context window. Given code as context, it generates completions across many languages; the StarCoder checkpoint is this model further fine-tuned on Python. As with the other BigCode models, generated code is not guaranteed to be correct and should be reviewed before use.
$-/run
13.9K
Huggingface
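Because starcoderbase shares StarCoder's architecture and tokenizer, the generation sketch under the starcoder entry applies with only the checkpoint id swapped. The variant below sketches memory-conscious loading, assuming the optional accelerate and bitsandbytes packages are installed.

from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoderbase"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# 8-bit weights roughly halve memory relative to float16, at some speed cost.
model = AutoModelForCausalLM.from_pretrained(
    checkpoint,
    device_map="auto",
    load_in_8bit=True,
)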
gpt_bigcode-santacoder
The gpt_bigcode-santacoder model is a large language model trained on GitHub code in Python, Java, and JavaScript. It can generate code snippets based on given context, such as function signatures or code comments. However, the generated code is not guaranteed to work flawlessly, as it may be inefficient, contain bugs, or have security vulnerabilities. The model was trained using the GPT-2 architecture with multi-query attention and the Fill-in-the-Middle objective. It was trained for 600K steps using 236 billion tokens on a cluster of 96 Tesla V100 GPUs, taking 6.2 days to complete. The model is available under the CodeML Open RAIL-M v0.1 license.
$-/run
13.0K
Huggingface
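A sketch of completion from a code comment for the gpt_bigcode-santacoder entry above, assuming the bigcode/gpt_bigcode-santacoder checkpoint id; because the GPTBigCode architecture is built into recent transformers releases, no remote code should be needed, unlike older santacoder revisions.

from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/gpt_bigcode-santacoder"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Context in the form of a comment plus a function signature, as described above.
prompt = "# return the n-th Fibonacci number\ndef fib(n):\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=48, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))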
starencoder
$-/run
1.2K
Huggingface
santacoder-fast-inference
$-/run
116
Huggingface
starcoder-megatron
This is the Megatron version of StarCoder; you can play with the model on the StarCoder Playground, and the StarCoder model page has the full documentation. The StarCoder models are 15.5B-parameter models trained on 80+ programming languages from The Stack (v1.2), with opt-out requests excluded. The model uses multi-query attention and a context window of 8192 tokens, and was trained with the Fill-in-the-Middle objective on 1 trillion tokens.

Repository: bigcode/Megatron-LM. Project website: bigcode-project.org. Paper: 💫 StarCoder: May the source be with you! Point of contact: contact@bigcode-project.org. Languages: 80+ programming languages.

Intended use: The model was trained on GitHub code, so it is not an instruction model; prompts like "Write a function that computes the square root." do not work well. By using the Tech Assistant prompt, however, it can be turned into a capable technical assistant. Fill-in-the-middle uses special tokens to mark the prefix, middle, and suffix parts of the input and output (see the sketch after this entry).

Attribution and other requirements: The pretraining dataset was filtered for permissive licenses only. Nevertheless, the model can generate source code verbatim from the dataset, and that code's license might require attribution and/or other specific conditions that must be respected. A search index is provided that lets you search the pretraining data to identify where generated code came from and apply the proper attribution.

Limitations: The model has been trained on source code from 80+ programming languages; the predominant natural language in the source code is English, although other languages are also present. The model can generate code snippets given some context, but the generated code is not guaranteed to work as intended: it can be inefficient and contain bugs or exploits. See the paper for an in-depth discussion of the model's limitations.

Training: GPT-2 architecture with multi-query attention and the Fill-in-the-Middle objective; 250k pretraining steps over 1 trillion tokens in bfloat16; 512 Tesla A100 GPUs for 24 days; orchestration with Megatron-LM, neural networks in PyTorch, BF16 (if applicable) via apex.

License: The model is licensed under the BigCode OpenRAIL-M v1 license agreement.
$-/run
0
Huggingface
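The fill-in-the-middle layout mentioned in the starcoder-megatron entry above can be sketched as follows. Since the Megatron checkpoint itself is meant to be consumed by Megatron-LM, the sketch assumes the transformers-format bigcode/starcoder checkpoint and the <fim_prefix>/<fim_suffix>/<fim_middle> sentinel tokens from its tokenizer.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, torch_dtype=torch.float16, device_map="auto"
)

# The prompt carries the code before the hole (prefix) and after it (suffix);
# the model generates the missing middle after the <fim_middle> sentinel.
prefix = "def print_hello_world():\n    "
suffix = "\n    print('Done')\n"
prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0]))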