Jinhybr

Average Model Cost: $0.0000

Number of Runs: 4,458

Models by this creator

OCR-Donut-CORD

Donut (base-sized model, fine-tuned on CORD)

Donut model fine-tuned on CORD. It was introduced in the paper OCR-free Document Understanding Transformer by Geewook Kim et al. and first released in this repository. Disclaimer: the team releasing Donut did not write a model card for this model, so this model card has been written by the Hugging Face team.

Model description

Donut consists of a vision encoder (Swin Transformer) and a text decoder (BART). Given an image, the encoder first encodes the image into a tensor of embeddings of shape (batch_size, seq_len, hidden_size), after which the decoder autoregressively generates text conditioned on that encoding.

Intended uses & limitations

This model is fine-tuned on CORD (A Consolidated Receipt Dataset for Post-OCR Parsing), a document parsing dataset. We refer to the documentation for code examples; a minimal inference sketch also follows below.
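The sketch below follows the standard Donut usage from the transformers documentation. The checkpoint id is taken from this listing and the <s_cord-v2> task prompt comes from the upstream CORD-finetuned Donut card, so treat both as assumptions rather than the author's verified usage.

```python
import re
import torch
from PIL import Image
from transformers import DonutProcessor, VisionEncoderDecoderModel

CHECKPOINT = "jinhybr/OCR-Donut-CORD"  # assumed from this listing

processor = DonutProcessor.from_pretrained(CHECKPOINT)
model = VisionEncoderDecoderModel.from_pretrained(CHECKPOINT)
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

image = Image.open("receipt.png").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values.to(device)

# CORD checkpoints are prompted with a task-start token.
decoder_input_ids = processor.tokenizer(
    "<s_cord-v2>", add_special_tokens=False, return_tensors="pt"
).input_ids.to(device)

outputs = model.generate(
    pixel_values,
    decoder_input_ids=decoder_input_ids,
    max_length=model.decoder.config.max_position_embeddings,
    pad_token_id=processor.tokenizer.pad_token_id,
    eos_token_id=processor.tokenizer.eos_token_id,
    bad_words_ids=[[processor.tokenizer.unk_token_id]],
    return_dict_in_generate=True,
)

# Strip special tokens and the task prompt, then convert the tag
# sequence into a nested Python dict of receipt fields.
sequence = processor.batch_decode(outputs.sequences)[0]
sequence = sequence.replace(processor.tokenizer.eos_token, "").replace(
    processor.tokenizer.pad_token, ""
)
sequence = re.sub(r"<.*?>", "", sequence, count=1).strip()
print(processor.token2json(sequence))
```

Because generation is conditioned on the task-start token, the decoder emits CORD-style field tags, which token2json then turns into structured output.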

Cost: $-/run
Runs: 3.1K
Source: Huggingface

OCR-LayoutLMv3

This model is a fine-tuned version of microsoft/layoutlmv3-base on the funsd-layoutlmv3 dataset. It achieves the following results on the evaluation set:

Loss: 0.9788
Precision: 0.8989
Recall: 0.9051
F1: 0.9020
Accuracy: 0.8404

Model description

LayoutLMv3 is a pre-trained multimodal Transformer for Document AI with unified text and image masking. The simple unified architecture and training objectives make LayoutLMv3 a general-purpose pre-trained model: it can be fine-tuned for text-centric tasks, including form understanding, receipt understanding, and document visual question answering, as well as image-centric tasks such as document image classification and document layout analysis.

Reference: LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking. Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei. Preprint, 2022.

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 2
eval_batch_size: 2
seed: 42
optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
lr_scheduler_type: linear
training_steps: 2000

Framework versions

Transformers 4.25.0.dev0
Pytorch 1.12.1
Datasets 2.6.1
Tokenizers 0.13.1
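A minimal token-classification inference sketch, assuming the jinhybr/OCR-LayoutLMv3 checkpoint id from this listing ships its processor files (if it does not, load the processor from microsoft/layoutlmv3-base instead). With apply_ocr=True the processor runs Tesseract (via pytesseract) to extract the words and bounding boxes that LayoutLMv3 expects.

```python
import torch
from PIL import Image
from transformers import AutoProcessor, LayoutLMv3ForTokenClassification

CHECKPOINT = "jinhybr/OCR-LayoutLMv3"  # assumed from this listing

processor = AutoProcessor.from_pretrained(CHECKPOINT, apply_ocr=True)
model = LayoutLMv3ForTokenClassification.from_pretrained(CHECKPOINT)

image = Image.open("form.png").convert("RGB")
# Runs OCR internally and packs input_ids, bbox, and pixel_values together.
encoding = processor(image, return_tensors="pt")

with torch.no_grad():
    logits = model(**encoding).logits

predictions = logits.argmax(-1).squeeze().tolist()
tokens = processor.tokenizer.convert_ids_to_tokens(
    encoding["input_ids"].squeeze().tolist()
)
for token, pred in zip(tokens, predictions):
    print(token, model.config.id2label[pred])
```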

Cost: $-/run
Runs: 523
Source: Huggingface

OCR-DocVQA-Donut

Donut (base-sized model, fine-tuned on DocVQA)

Donut model fine-tuned on DocVQA. It was introduced in the paper OCR-free Document Understanding Transformer by Geewook Kim et al. and first released in this repository. Disclaimer: the team releasing Donut did not write a model card for this model, so this model card has been written by the Hugging Face team.

Model description

Donut consists of a vision encoder (Swin Transformer) and a text decoder (BART). Given an image, the encoder first encodes the image into a tensor of embeddings of shape (batch_size, seq_len, hidden_size), after which the decoder autoregressively generates text conditioned on that encoding.

Intended uses & limitations

This model is fine-tuned on DocVQA, a document visual question answering dataset. We refer to the documentation for code examples; a minimal sketch also follows below.
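For DocVQA-style questions, recent transformers releases ship a document-question-answering pipeline that handles the Donut prompt format (<s_docvqa><s_question>...</s_question><s_answer>) internally, so the generate loop from the CORD sketch above is not needed. The checkpoint id is an assumption taken from this listing.

```python
from PIL import Image
from transformers import pipeline

# Requires a transformers version that includes the
# document-question-answering pipeline (roughly 4.22+).
dqa = pipeline("document-question-answering", model="jinhybr/OCR-DocVQA-Donut")

image = Image.open("invoice.png").convert("RGB")
print(dqa(image=image, question="What is the total amount?"))
```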

Cost: $-/run
Runs: 498
Source: Huggingface

OCR-LM-v1

This model is a fine-tuned version of jinhybr/layoutlm-funsd-pytorch on the funsd dataset. It achieves the following results on the evaluation set:

Loss: 1.1740
Answer: precision 0.7201, recall 0.8047, F1 0.7601 (809 entities)
Header: precision 0.4247, recall 0.5210, F1 0.4679 (119 entities)
Question: precision 0.8236, recall 0.8376, F1 0.8305 (1,065 entities)
Overall Precision: 0.7525
Overall Recall: 0.8053
Overall F1: 0.7780
Overall Accuracy: 0.8146

Model description, intended uses & limitations, and training and evaluation data: more information needed.

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 16
eval_batch_size: 8
seed: 42
optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 50

Framework versions

Transformers 4.23.1
Pytorch 1.12.1
Datasets 2.6.1
Tokenizers 0.13.1
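As a sketch only, the listed hyperparameters map onto a transformers TrainingArguments object as shown below. The output_dir is a placeholder, and the FUNSD preprocessing and Trainer wiring for LayoutLM token classification are omitted.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ocr-lm-v1",          # placeholder path, not from the card
    learning_rate=3e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    num_train_epochs=50,
    lr_scheduler_type="linear",
    seed=42,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the
    # TrainingArguments defaults, so no extra optimizer settings are needed.
)
```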

Cost: $-/run
Runs: 26
Source: Huggingface

text-summarization-t5base-xsum

t5-small-finetuned-xsum

This model is a fine-tuned version of t5-small on the xsum dataset. It achieves the following results on the evaluation set:

Loss: 2.4789
Rouge1: 28.282
Rouge2: 7.6989
RougeL: 22.2019
RougeLsum: 22.197
Gen Len: 18.8238

Model description, intended uses & limitations, and training and evaluation data: more information needed.

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 1
mixed_precision_training: Native AMP

Framework versions

Transformers 4.22.2
Pytorch 1.12.1+cu113
Datasets 2.5.1
Tokenizers 0.12.1
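A minimal summarization sketch. The checkpoint id is inferred from this listing's title and should be verified, and the input article is illustrative only.

```python
from transformers import pipeline

# Checkpoint id inferred from the listing title; treat as an assumption.
summarizer = pipeline(
    "summarization", model="jinhybr/text-summarization-t5base-xsum"
)

article = (
    "The local council approved a new cycling scheme on Tuesday, which will "
    "add twelve miles of protected bike lanes to the city centre by 2025."
)
# XSum targets single-sentence summaries, so cap the generation length.
result = summarizer(article, max_length=40, min_length=5, do_sample=False)
print(result[0]["summary_text"])
```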

Cost: $-/run
Runs: 19
Source: Huggingface
