Maintainer: numind

Last updated 5/28/2024


Model link: View on HuggingFace
API spec: View on HuggingFace
GitHub link: not provided
Paper link: not provided


Model overview

The NuNER-v0.1 model is an English-language entity recognition model fine-tuned from RoBERTa-base by the team at NuMind. It provides strong token embeddings for entity recognition tasks in English, and it served as the prototype for NuNER v1.0, the version reported in the paper that introduced the model.

The NuNER-v0.1 model outperforms the base RoBERTa-base model on entity recognition, achieving an F1 macro score of 0.7500 compared to 0.7129 for RoBERTa-base. Combining the last and second-to-last hidden states further improves performance to 0.7686 F1 macro.

Other notable entity recognition models include bert-base-NER, a BERT-base model fine-tuned on the CoNLL-2003 dataset, and roberta-large-ner-english, a RoBERTa-large model fine-tuned for English NER.

Model inputs and outputs


Inputs

  • Text: The model takes in raw text as input, which it then tokenizes and encodes for processing.

Outputs

  • Entity predictions: The model outputs a sequence of entity predictions for the input text, classifying each token as belonging to one of four entity types: location (LOC), organization (ORG), person (PER), or miscellaneous (MISC).
  • Token embeddings: The model can also be used to extract token-level embeddings, which can be useful for downstream tasks. The author suggests concatenating the last and second-to-last hidden states for better-quality embeddings.
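A minimal sketch of extracting such embeddings with the Hugging Face `transformers` library, assuming the model id `numind/NuNER-v0.1` (check the model card for the exact id) and that `transformers` and `torch` are installed:

```python
import torch

def concat_last_two(hidden_states):
    # Concatenate the last and second-to-last hidden states along the
    # feature axis, as suggested for better-quality token embeddings.
    return torch.cat([hidden_states[-1], hidden_states[-2]], dim=-1)

def token_embeddings(text, model_name="numind/NuNER-v0.1"):
    # model_name is an assumption based on this summary, not verified here.
    from transformers import AutoModel, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    # Shape: (batch, seq_len, 2 * hidden_size) -- 1536 features for a base model
    return concat_last_two(out.hidden_states)

if __name__ == "__main__":
    # Downloads the model on first run.
    print(token_embeddings("NuMind is based in Paris.").shape)
```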


Capabilities

The NuNER-v0.1 model is highly capable at recognizing entities in English text, surpassing the base RoBERTa model on the CoNLL-2003 NER dataset. It can accurately identify locations, organizations, people, and miscellaneous entities within input text. This makes it a powerful tool for applications that require understanding the entities mentioned in documents, such as information extraction, knowledge graph construction, or content analysis.

What can I use it for?

The NuNER-v0.1 model can be used for a variety of applications that involve identifying and extracting entities from English text. Some potential use cases include:

  • Information Extraction: The model can be used to automatically extract key entities (people, organizations, locations, etc.) from documents, articles, or other text-based data sources.
  • Knowledge Graph Construction: The entity predictions from the model can be used to populate a knowledge graph with structured information about the entities mentioned in a corpus.
  • Content Analysis: By understanding the entities present in text, the model can enable more sophisticated content analysis tasks, such as topic modeling, sentiment analysis, or text summarization.
  • Chatbots and Virtual Assistants: The entity recognition capabilities of the model can be leveraged to improve the natural language understanding of chatbots and virtual assistants, allowing them to better comprehend user queries and respond appropriately.

Things to try

One interesting aspect of the NuNER-v0.1 model is its ability to produce high-quality token embeddings by concatenating the last and second-to-last hidden states. These embeddings could be used as input features for a wide range of downstream NLP tasks, such as text classification, named entity recognition, or relation extraction. Experimenting with different ways of utilizing these embeddings, such as fine-tuning on domain-specific datasets or combining them with other model architectures, could lead to exciting new applications and performance improvements.
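One way to experiment along these lines is to train a lightweight classifier head on top of frozen NuNER embeddings. The sketch below is illustrative only: the random arrays stand in for real token embeddings and per-token labels, and scikit-learn is assumed to be installed.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Stand-ins for real data: in practice, token_embeddings would hold the
# concatenated hidden states (one 1536-dim vector per token) and labels
# the per-token entity tags from a domain-specific dataset.
token_embeddings = rng.normal(size=(200, 1536))
labels = rng.integers(0, 2, size=200)  # e.g. 0 = outside, 1 = entity

clf = LogisticRegression(max_iter=1000)
clf.fit(token_embeddings, labels)
predictions = clf.predict(token_embeddings)
print(predictions.shape)
```

Keeping the embedding model frozen makes this cheap to iterate on; only the small head is trained, so different label schemes or domains can be tried quickly.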

Another avenue to explore would be comparing the NuNER-v0.1 model's performance on different types of text data, beyond the news-based CoNLL-2003 dataset used for evaluation. Trying the model on more informal, conversational text (e.g., social media, emails, chat logs) could uncover interesting insights about its generalization capabilities and potential areas for improvement.


Related Models




bert-base-NER


The bert-base-NER model is a fine-tuned BERT model that is ready to use for Named Entity Recognition (NER) and achieves state-of-the-art performance on the task. It has been trained to recognize four types of entities: location (LOC), organization (ORG), person (PER), and miscellaneous (MISC). Specifically, it is a bert-base-cased model fine-tuned on the English version of the standard CoNLL-2003 Named Entity Recognition dataset. If you'd like a larger BERT-large model fine-tuned on the same dataset, a bert-large-NER version is also available. The maintainer, dslim, has also provided several other NER models, including distilbert-NER, bert-large-NER, and both cased and uncased versions of bert-base-NER.

Model inputs and outputs

Inputs

  • Text: The model takes a text sequence as input and predicts the named entities within that text.

Outputs

  • Named entities: The model outputs the recognized named entities, along with their type (LOC, ORG, PER, MISC) and their start/end positions within the input text.

Capabilities

The bert-base-NER model can accurately identify a variety of named entities within text, including locations, organizations, persons, and miscellaneous entities. This can be useful for applications such as information extraction, content analysis, and knowledge graph construction.

What can I use it for?

The bert-base-NER model can be used for a variety of text processing tasks that involve identifying and extracting named entities. For example, you could use it to build a search engine that lets users find information about specific people, organizations, or locations mentioned in a large corpus of text. You could also use it to automatically extract key entities from customer service logs or social media posts, which could be valuable for market research or customer sentiment analysis.

Things to try

One interesting thing to try with the bert-base-NER model is to incorporate it into a larger natural language processing pipeline. For example, you could use it to first identify the named entities in a piece of text, and then use a different model to classify the sentiment or topic of the text, focusing on the identified entities. This could lead to more accurate and nuanced text analysis. Another idea is to fine-tune the model further on a domain-specific dataset, which could help it perform better on specialized text. For instance, if you're working with legal documents, you could fine-tune the model on a corpus of legal text to improve its ability to recognize legal entities and terminology.
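A minimal usage sketch with the `transformers` pipeline API (the model id `dslim/bert-base-NER` comes from this summary; `aggregation_strategy="simple"` merges sub-word tokens into entity spans). The small helper is a simplified, hypothetical version of the BIO-tag grouping such pipelines perform:

```python
def group_bio_tags(tokens, tags):
    # Simplified BIO grouping: merge B-/I- tagged tokens into entity spans.
    entities, current = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                entities.append(current)
            current = {"type": tag[2:], "text": token}
        elif tag.startswith("I-") and current and tag[2:] == current["type"]:
            current["text"] += " " + token
        else:
            if current:
                entities.append(current)
            current = None
    if current:
        entities.append(current)
    return entities

if __name__ == "__main__":
    # Downloads the model on first run; requires transformers and torch.
    from transformers import pipeline
    ner = pipeline("ner", model="dslim/bert-base-NER",
                   aggregation_strategy="simple")
    print(ner("My name is Wolfgang and I live in Berlin"))
```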





roberta-large-ner-english


roberta-large-ner-english is an English named entity recognition (NER) model fine-tuned from the RoBERTa large model on the CoNLL-2003 dataset. The model was developed by Jean-Baptiste and can identify entities such as persons, organizations, locations, and miscellaneous entities. It was validated on email and chat data, and it outperforms other models on this type of data, particularly for entities that do not start with an uppercase letter.

Model inputs and outputs

Inputs

  • Text: Raw text to be processed for named entity recognition.

Outputs

  • Entities: A list of identified entities, each with its entity type (PER, ORG, LOC, MISC), its start and end positions in the input text, the text of the entity, and a confidence score.

Capabilities

The roberta-large-ner-english model can accurately identify a variety of named entities in English text, including people, organizations, locations, and miscellaneous entities. It performs particularly well on informal text like emails and chat messages, where entities may not always start with an uppercase letter.

What can I use it for?

You can use the roberta-large-ner-english model for a variety of natural language processing tasks that require named entity recognition, such as information extraction, question answering, and content analysis. For example, you could use it to automatically extract the key people, organizations, and locations mentioned in a set of business documents or news articles.

Things to try

One interesting thing to try with the roberta-large-ner-english model is to see how it performs on your own custom text data, especially if it is in a more informal or conversational style. You could also experiment with combining the model's output with other natural language processing techniques, such as relation extraction or sentiment analysis, to gain deeper insights from your text data.
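A usage sketch with the `transformers` pipeline (the model id `Jean-Baptiste/roberta-large-ner-english` comes from this summary; the score-threshold helper is a hypothetical convenience for filtering low-confidence predictions):

```python
def filter_by_score(entities, threshold=0.8):
    # Keep only predictions whose confidence score meets the threshold.
    return [e for e in entities if e["score"] >= threshold]

if __name__ == "__main__":
    # Downloads the model on first run; requires transformers and torch.
    from transformers import pipeline
    ner = pipeline("ner", model="Jean-Baptiste/roberta-large-ner-english",
                   aggregation_strategy="simple")
    # Lowercase entities are a noted strength of this model.
    entities = ner("apple opened a new office in paris")
    print(filter_by_score(entities))
```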





roberta-base


The roberta-base model is a transformer model pretrained on English-language data with a masked language modeling (MLM) objective. It was developed and released by the Facebook AI research team. roberta-base is case-sensitive, meaning it can distinguish between words like "english" and "English". It builds on the BERT architecture, but with some key differences in the pretraining procedure that make it more robust. Similar models include the larger roberta-large as well as the BERT-based bert-base-cased and bert-base-uncased models.

Model inputs and outputs

Inputs

  • Text: Unconstrained text input. The model expects tokenized text in the required format, which can be handled automatically using the provided tokenizer.

Outputs

  • Masked-token predictions: The model can be used for masked language modeling, where it predicts the masked tokens in the input.
  • Contextual embeddings: It can also be used as a feature extractor, producing contextual representations of the input text for downstream tasks.

Capabilities

The roberta-base model is a powerful language understanding model that can be fine-tuned on a variety of tasks such as text classification, named entity recognition, and question answering. It has been shown to achieve strong performance on benchmarks like GLUE. The model's bidirectional nature allows it to capture contextual relationships between words, which is useful for tasks that require understanding the full meaning of a sentence or passage.

What can I use it for?

The roberta-base model is primarily intended to be fine-tuned on downstream tasks, and the Hugging Face model hub provides access to many fine-tuned versions of the model for various applications. Some potential use cases include:

  • Text classification: Classifying documents, emails, or social media posts into different categories.
  • Named entity recognition: Identifying and extracting important entities (people, organizations, locations, etc.) from text.
  • Question answering: Building systems that can answer questions based on given text passages.

Things to try

One interesting thing to try with the roberta-base model is to explore its performance on tasks that require more than just language understanding, such as common-sense reasoning or multi-modal understanding. The model's strong performance on many benchmarks suggests it may be able to capture deeper semantic relationships, which could be leveraged for more advanced applications. Another interesting direction is to investigate the model's biases and limitations, as noted in the model description. Understanding the model's failure cases and developing techniques to mitigate biases could lead to more robust and equitable language AI systems.
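The masked-language-modeling usage described above can be sketched with the `transformers` fill-mask pipeline (RoBERTa uses `<mask>` as its mask token; the helper for picking the top prediction is a hypothetical convenience):

```python
def top_prediction(predictions):
    # Return the token string of the highest-scoring fill-mask prediction.
    return max(predictions, key=lambda p: p["score"])["token_str"]

if __name__ == "__main__":
    # Downloads the model on first run; requires transformers and torch.
    from transformers import pipeline
    unmasker = pipeline("fill-mask", model="roberta-base")
    predictions = unmasker("The capital of France is <mask>.")
    print(top_prediction(predictions))
```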





UniNER-7B-all


The UniNER-7B-all model is the best model from the Universal NER project. It is a large language model trained on three data sources: Pile-NER-type data and Pile-NER-definition data generated by ChatGPT, and 40 supervised datasets in the Universal NER benchmark. This robust model outperforms similar NER models like wikineural-multilingual-ner and bert-base-NER, making it a powerful tool for named entity recognition tasks.

Model inputs and outputs

The UniNER-7B-all model is a text-to-text model that can be used for named entity recognition (NER) tasks. It takes in a text input and outputs the entities identified in the text, along with their corresponding types.

Inputs

  • Text: The input text that the model will analyze to identify named entities.

Outputs

  • Entity predictions: The model's predictions of the named entities present in the input text, along with their entity types (e.g. person, location, organization).

Capabilities

The UniNER-7B-all model can accurately identify a wide range of named entities within text, including persons, locations, organizations, and more. Its training on diverse datasets allows it to perform well on a variety of text types and genres, making it a versatile tool for NER tasks.

What can I use it for?

The UniNER-7B-all model can be used for a variety of applications that require named entity recognition, such as:

  • Content analysis: Analyze news articles, social media posts, or other text-based content to identify key entities and track mentions over time.
  • Knowledge extraction: Extract structured information about entities (e.g. people, companies, locations) from unstructured text.
  • Chatbots and virtual assistants: Integrate the model into conversational AI systems to better understand user queries and provide more relevant responses.

Things to try

One interesting thing to try with the UniNER-7B-all model is to analyze text across different domains and genres, such as news articles, academic papers, and social media posts. This can help you understand the model's performance and limitations in different contexts, and identify areas where it excels or struggles. Another idea is to experiment with different prompting techniques to see how they affect the model's entity predictions. For example, you could try providing additional context or framing the task in different ways to see if it impacts the model's outputs.
