Pszemraj

Rank:

Average Model Cost: $0.0000

Number of Runs: 156,638

Models by this creator

flan-t5-large-grammar-synthesis

The flan-t5-large-grammar-synthesis model is a fine-tuned version of google/flan-t5-large designed specifically for grammar correction, trained on an expanded version of the JFLEG dataset. It is intended to perform "single-shot grammar correction" on text that may contain multiple grammatical mistakes, without changing the semantic meaning of the text. It can be used in a variety of applications, such as correcting errors in transcriptions or in text generated by language models. The model has been converted to ONNX format and can be loaded with Hugging Face's Optimum library, and smaller checkpoints are available for faster inference. It is still a work in progress, and its outputs should be checked for correctness.
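The description above maps onto the standard transformers text2text-generation pipeline; the snippet below is a minimal sketch of that usage, assuming the Hub id pszemraj/flan-t5-large-grammar-synthesis (the example sentence and generation settings are illustrative, not taken from the model card):

```python
from transformers import pipeline

# Load the grammar-correction model as a text2text-generation pipeline.
corrector = pipeline(
    "text2text-generation",
    model="pszemraj/flan-t5-large-grammar-synthesis",
)

# A sentence with several grammatical mistakes (illustrative example).
raw_text = "i can has cheezburger, there car is red and they is going to the store"

# Single-shot correction: the model rewrites the text without changing its meaning.
result = corrector(raw_text, max_length=128)
print(result[0]["generated_text"])
```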

Read more

$-/run

72.1K

Huggingface

led-base-book-summary

led-base-book-summary is a summarization model based on the Longformer Encoder-Decoder (LED) architecture, trained to condense long, technical material into concise summaries. It is designed for summarizing narratives, articles, papers, textbooks, and other documents, and accepts inputs of up to 16,384 tokens. It was fine-tuned on the BookSum dataset and produces insightful, SparkNotes-esque summaries. The checkpoint for this trained model is pszemraj/led-base-16384-finetuned-booksum, and other variants trained on different datasets are also available. The model can be used by creating a pipeline object and feeding text into it; alternatively, simplified usage is available through the Python package textsum, which provides simple interfaces for applying summarization models to text documents.
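As a rough illustration of the pipeline-based usage described above, here is a minimal sketch using the transformers summarization pipeline, assuming the Hub id pszemraj/led-base-book-summary; the generation parameters are illustrative, not values from the model card:

```python
import torch
from transformers import pipeline

# Load the LED-based summarizer; it accepts long inputs (up to 16,384 tokens).
summarizer = pipeline(
    "summarization",
    model="pszemraj/led-base-book-summary",
    device=0 if torch.cuda.is_available() else -1,
)

long_text = "..."  # replace with the document to summarize

# Generate a SparkNotes-style summary; settings here are illustrative.
result = summarizer(
    long_text,
    min_length=16,
    max_length=256,
    no_repeat_ngram_size=3,
)
print(result[0]["summary_text"])
```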

Read more

$-/run

40.3K

Huggingface

grammar-synthesis-small

The grammar-synthesis-small model is a fine-tuned version of the T5 language model trained specifically for grammar correction. It is designed to correct grammatical errors in text without changing its semantic meaning, which makes it useful for tasks like correcting transcriptions or text generated by other models. It was trained on an expanded version of the JFLEG dataset. However, it is still a work in progress and may not always produce correct outputs. Training used Transformers 4.20.1, PyTorch 1.11.0+cu113, Datasets 2.3.2, and Tokenizers 0.12.1.
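For a sense of how this might look without the pipeline wrapper, the following is a hedged sketch that loads the checkpoint directly with AutoTokenizer and AutoModelForSeq2SeqLM, assuming the Hub id pszemraj/grammar-synthesis-small; the input sentence and generation settings are illustrative:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "pszemraj/grammar-synthesis-small"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "she dont like going too the storr on mondays"  # illustrative input

# Encode, generate a corrected version, and decode back to text.
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```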

Read more

$-/run

31.4K

Huggingface

long-t5-tglobal-base-16384-book-summary

The long-t5-tglobal-base-16384-book-summary model is a fine-tuned version of google/long-t5-tglobal-base, trained on the kmfoda/booksum dataset for 30+ epochs. It is designed to summarize long texts and produce SparkNotes-like summaries of arbitrary topics. During training, reference summaries longer than 1,024 tokens were filtered out. The model performs well in terms of factual consistency, but its summaries are not foolproof and should be double-checked for accuracy. It can be used for a variety of applications, though it has limitations and leaves room for further improvement. The training procedure and hyperparameters are documented on the model card.
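To make the intended usage concrete, here is a minimal, hedged sketch with the transformers summarization pipeline; it also checks the tokenized input length against the 16,384-token window mentioned in the model name. The generation settings are illustrative:

```python
from transformers import AutoTokenizer, pipeline

model_id = "pszemraj/long-t5-tglobal-base-16384-book-summary"
tokenizer = AutoTokenizer.from_pretrained(model_id)
summarizer = pipeline("summarization", model=model_id, tokenizer=tokenizer)

document = "..."  # replace with the long text to summarize

# Check that the document fits within the 16,384-token input window.
n_tokens = len(tokenizer.encode(document))
print(f"input length: {n_tokens} tokens")

# Illustrative settings; tune for your own length/quality trade-off.
summary = summarizer(
    document,
    max_length=512,
    min_length=32,
    no_repeat_ngram_size=3,
)
print(summary[0]["summary_text"])
```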

Read more

$-/run

6.0K

Huggingface

led-large-book-summary

This model is a fine-tuned version of allenai/led-large-16384 on the BookSum dataset (kmfoda/booksum). It aims to generalize well and be useful for summarizing lengthy text in both academic and everyday settings. It handles inputs of up to 16,384 tokens; see the Colab demo or try the demo on Spaces.

Basic Usage

To improve summary quality, use encoder_no_repeat_ngram_size=3 when calling the pipeline object. This setting encourages the model to use new vocabulary and construct an abstractive summary. Load the model into a pipeline object, then feed the text into it (a hedged sketch of both steps follows this description). Important: for optimal summary quality, use the global attention mask when decoding, as demonstrated in the community notebook's generate_answer(batch) function. If you are facing compute constraints, consider using the base version, pszemraj/led-base-book-summary.

Training Information

Data: the model was fine-tuned on the BookSum dataset; during training, the chapter column was the input and summary_text was the output.

Procedure: fine-tuning ran for 13+ epochs on the BookSum dataset. Notably, the final four epochs combined the training and validation sets as 'train' to enhance generalization.

Hyperparameters: training used different settings across stages.
- Initial three epochs: low learning rate (5e-05), batch size of 1, 4 gradient accumulation steps, and a linear learning rate scheduler.
- Intermediate epochs: learning rate reduced to 4e-05, batch size increased to 2, 16 gradient accumulation steps, and a cosine learning rate scheduler with a 0.05 warmup ratio.
- Final two epochs: learning rate further reduced to 2e-05, batch size reverted to 1, gradient accumulation steps kept at 16, and a cosine learning rate scheduler with a lower warmup ratio (0.03).

Versions: Transformers 4.19.2, PyTorch 1.11.0+cu113, Datasets 2.2.2, Tokenizers 0.12.1.

Simplified Usage with TextSum

To streamline the use of this and other models, the creator developed a Python package named textsum, which offers simple interfaces for applying summarization models to text documents of arbitrary length. Install textsum, then use it in Python with this model (a sketch is included below). Currently implemented interfaces include a Python API, a command-line interface (CLI), and a demo/web UI; for detailed explanations and documentation, check the README or the wiki.

Related Models

Other related models, also trained on the BookSum dataset:
- LED-large continued: an experiment with further fine-tuning
- Long-T5-tglobal-base
- BigBird-Pegasus-Large-K
- Pegasus-X-Large
- Long-T5-tglobal-XL

There are also other variants trained on other datasets on the creator's Hugging Face profile; feel free to try them out.
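The two usage snippets referenced above were not included in this listing; what follows is a minimal reconstruction, not the author's exact code. First, a sketch of the pipeline-based usage with the recommended encoder_no_repeat_ngram_size=3 (the other generation settings are illustrative):

```python
import torch
from transformers import pipeline

# Load led-large-book-summary into a summarization pipeline.
summarizer = pipeline(
    "summarization",
    model="pszemraj/led-large-book-summary",
    device=0 if torch.cuda.is_available() else -1,
)

wall_of_text = "..."  # the lengthy text to summarize

# encoder_no_repeat_ngram_size=3 encourages new vocabulary and a more abstractive summary.
result = summarizer(
    wall_of_text,
    min_length=16,
    max_length=256,
    no_repeat_ngram_size=3,
    encoder_no_repeat_ngram_size=3,
    num_beams=4,
)
print(result[0]["summary_text"])
```

Second, a sketch of the textsum route; the Summarizer class, model_name_or_path argument, and summarize_string method are taken from the textsum README and should be verified against the installed version:

```python
# pip install textsum
from textsum.summarize import Summarizer

# Point the wrapper at this checkpoint (names assumed from the textsum README).
summarizer = Summarizer(model_name_or_path="pszemraj/led-large-book-summary")

long_document = "..."  # text of arbitrary length
print(summarizer.summarize_string(long_document))
```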

Read more

$-/run

4.0K

Huggingface

bigbird-pegasus-large-K-booksum

This is BigBird-Pegasus fine-tuned on the BookSum dataset. The goal is a summarization model that 1) summarizes the source content accurately and 2) more importantly, in the creator's view, produces summaries that are easy to read and understand (unlike typical arXiv-style summaries). The model uses the BookSum dataset to provide explanatory summarization: a summary that both consolidates information and explains why that consolidated information is important. It was trained for seven epochs in total (approximately 70,000 steps) and is close to finished; it will continue to improve (slowly, now that it has already been trained for a long time) based on findings and feedback. The starting checkpoint was google/bigbird-pegasus-large-bigpatent. Example usage: create the summarizer object, define the text to be summarized, and pass it through the pipeline (a hedged sketch follows below). If you experience runtime or memory issues, try the earlier checkpoint at 40,000 steps, which is almost as good at the explanatory summarization task but runs faster. Similar summarization models fine-tuned on BookSum with different architectures include long-t5 base and LED-Large.
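A hedged sketch of the example usage described above (create the summarizer object, define the text, pass it through the pipeline); the model id is assumed to be pszemraj/bigbird-pegasus-large-K-booksum as listed, and the generation settings are illustrative:

```python
from transformers import pipeline

# Create the summarizer object from the BigBird-Pegasus BookSum checkpoint.
summarizer = pipeline(
    "summarization",
    model="pszemraj/bigbird-pegasus-large-K-booksum",
)

# Define the text to be summarized, then pass it through the pipeline.
text_to_summarize = "..."  # replace with your source text

result = summarizer(
    text_to_summarize,
    max_length=256,
    min_length=8,
    no_repeat_ngram_size=3,
)
print(result[0]["summary_text"])
```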

Read more

$-/run

272

Huggingface

Similar creators