adetailer

Maintainer: Bingsu



Last updated 5/28/2024


Model Link: View on HuggingFace
API Spec: View on HuggingFace
Github Link: No Github link provided
Paper Link: No paper link provided


Model overview

The adetailer model is a set of object detection models developed by Bingsu, a Hugging Face creator. The models are trained on face, hand, person, and deepfashion2 datasets, and can detect and segment these objects with high accuracy. Several pre-trained variants are available, each specialized for a specific task, such as detecting 2D/realistic faces, hands, and persons with bounding boxes and segmentation masks.

The adetailer model is closely related to the YOLOv8 detection model and leverages the YOLO (You Only Look Once) framework. It provides a versatile solution for tasks involving the detection and segmentation of faces, hands, and persons in images.
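
As a rough sketch of how one of these detectors can be loaded, assuming the weights live in the Bingsu/adetailer repository on Hugging Face and that face_yolov8n.pt is the file name of the face detector (check the repository's file listing):

```python
from huggingface_hub import hf_hub_download
from ultralytics import YOLO

# Download one pre-trained adetailer detector from the Hugging Face Hub.
# The file name "face_yolov8n.pt" is an assumption based on the repo listing.
weights = hf_hub_download("Bingsu/adetailer", "face_yolov8n.pt")

# The checkpoints are standard YOLOv8 weights, so ultralytics can load them.
model = YOLO(weights)
```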

Model inputs and outputs

Inputs

  • Image data (either a file path, URL, or a PIL Image object)

Outputs

  • Bounding boxes around detected objects (faces, hands, persons)
  • Class labels for the detected objects
  • Segmentation masks for the detected objects (in addition to bounding boxes)
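
A minimal sketch of a prediction call and how these outputs can be read back, using the ultralytics results API (the image path is a placeholder):

```python
from ultralytics import YOLO

model = YOLO("face_yolov8n.pt")  # path to a downloaded adetailer checkpoint

# The predict call accepts a file path, URL, numpy array, or PIL Image.
results = model("portrait.jpg")

for result in results:
    print(result.boxes.xyxy)   # bounding boxes as (x1, y1, x2, y2) tensors
    print(result.boxes.cls)    # class indices for each detection
    print(result.boxes.conf)   # confidence scores
    if result.masks is not None:          # only the *-seg variants return masks
        print(result.masks.data.shape)    # (num_objects, height, width)
```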


Capabilities

The adetailer model detects and segments faces, hands, and persons in images with high accuracy, and the model card reports strong mAP (mean Average Precision) scores for each variant on its evaluation dataset.

The model's ability to provide both bounding boxes and segmentation masks for the detected objects makes it a powerful tool for applications that require precise object localization and segmentation, such as image editing, augmented reality, and computer vision tasks.

What can I use it for?

The adetailer model can be used in a variety of applications that involve the detection and segmentation of faces, hands, and persons in images. Some potential use cases include:

  • Image editing and manipulation: The model's segmentation masks can enable advanced editing techniques such as background removal, object swapping, and face/body retouching (see the sketch after this list).
  • Augmented reality: The bounding box and segmentation outputs can be used to overlay virtual elements on top of real-world objects, enabling more realistic and immersive AR experiences.
  • Computer vision and image analysis: The model's object detection and segmentation capabilities can be leveraged in various computer vision tasks, such as person tracking, gesture recognition, and clothing/fashion analysis.
  • Facial analysis and recognition: The face detection and segmentation features can be used in facial analysis applications, such as emotion recognition, age estimation, and facial landmark detection.
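
As a hedged sketch of the background-removal idea mentioned above, assuming one of the person segmentation checkpoints (the file name person_yolov8s-seg.pt is taken to be one of the adetailer variants and should be verified):

```python
import numpy as np
from PIL import Image
from ultralytics import YOLO

model = YOLO("person_yolov8s-seg.pt")  # assumed adetailer person-seg checkpoint
image = Image.open("photo.jpg").convert("RGB")

results = model(image)
masks = results[0].masks
if masks is not None:
    # Union of all detected person masks; masks come back at inference size,
    # so resize the mask image to the original resolution.
    combined = masks.data.any(dim=0).cpu().numpy().astype(np.uint8) * 255
    alpha = Image.fromarray(combined).resize(image.size)

    cutout = image.convert("RGBA")
    cutout.putalpha(alpha)  # pixels outside the mask become transparent
    cutout.save("person_cutout.png")
```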

Things to try

One interesting aspect of the adetailer model is its ability to handle a diverse range of object types, from realistic faces and hands to anime-style persons and clothing. This versatility allows you to experiment with different input images and see how the model performs across various visual styles and domains.

For example, you could try feeding the model images of anime characters, cartoon figures, or stylized illustrations to see how it handles the detection and segmentation of these more abstract object representations. Observing the model's performance on these challenging inputs can provide valuable insights into its generalization capabilities and potential areas for improvement.

Additionally, you could explore the model's segmentation outputs in more detail, examining the quality and accuracy of the provided masks for different object types. This information can be useful in determining the model's suitability for applications that require precise object isolation, such as image compositing or virtual try-on scenarios.
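
One quick way to eyeball detection and mask quality across styles is the ultralytics plotting helper, sketched here with a placeholder image:

```python
from PIL import Image
from ultralytics import YOLO

model = YOLO("face_yolov8n.pt")  # any adetailer checkpoint
results = model("anime_illustration.png")

# plot() draws the predicted boxes and masks and returns a BGR numpy array.
annotated = results[0].plot()
Image.fromarray(annotated[..., ::-1].copy()).save("annotated.png")  # BGR -> RGB
```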

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Models



YOLOv8


YOLOv8 is a state-of-the-art (SOTA) object detection model developed by Ultralytics. It builds upon the success of previous YOLO versions, introducing new features and improvements to boost performance and flexibility. YOLOv8 is designed to be fast, accurate, and easy to use, making it an excellent choice for a wide range of computer vision tasks, including object detection, instance segmentation, image classification, and pose estimation. The model has been fine-tuned on diverse datasets and has demonstrated impressive capabilities across various domains. For example, the stockmarket-pattern-detection-yolov8 model is specifically tailored for detecting stock market patterns in live trading video data, while the stockmarket-future-prediction model focuses on predicting future stock market trends. Additionally, the yolos-tiny and yolos-small models demonstrate the versatility of the YOLOS architecture, which utilizes Vision Transformers (ViT) for object detection.

Model inputs and outputs

YOLOv8 is a versatile model that can accept a variety of input formats, including images, videos, and real-time video streams. The model's primary output is the detection of objects within the input, including their bounding boxes, class labels, and confidence scores.

Inputs

  • Images: The model can process single images or batches of images.
  • Videos: The model can process video frames in real time, enabling applications such as live object detection and tracking.
  • Real-time video streams: The model can integrate with live video feeds, enabling immediate object detection and analysis.

Outputs

  • Bounding boxes: The model predicts the location of detected objects within the input using bounding box coordinates.
  • Class labels: The model classifies the detected objects and provides the corresponding class labels.
  • Confidence scores: The model outputs a confidence score for each detection, indicating the model's certainty about the prediction.

Capabilities

YOLOv8 can be applied to a wide range of computer vision tasks. Its key capabilities include:

  • Object detection: The model can identify and locate multiple objects within an image or video frame, providing bounding box coordinates, class labels, and confidence scores.
  • Instance segmentation: In addition to object detection, YOLOv8 can also perform instance segmentation, which involves precisely outlining the boundaries of each detected object.
  • Image classification: The model can classify entire images into predefined categories, such as different types of animals or scenes.
  • Pose estimation: YOLOv8 can detect and estimate the poses of people or other subjects within an image or video, identifying the key joints and limbs.

What can I use it for?

YOLOv8 is a powerful tool that can be leveraged in a variety of real-world applications. Some potential use cases include:

  • Retail and e-commerce: The model can be used for automated product detection and inventory management in retail environments, as well as for recommendation systems based on customer browsing and purchasing behavior.
  • Autonomous vehicles: YOLOv8 can be integrated into self-driving car systems, enabling real-time object detection and collision avoidance.
  • Surveillance and security: The model can be used for intelligent video analytics, such as people counting, suspicious activity detection, and license plate recognition.
  • Healthcare: YOLOv8 can be applied to medical imaging tasks, such as identifying tumors or other abnormalities in X-rays or CT scans.
  • Agriculture: The model can be used for precision farming applications, such as detecting weeds, pests, or diseased crops in aerial or ground-based imagery.

Things to try

One interesting aspect of YOLOv8 is its ability to adapt to a wide range of domains and tasks beyond the traditional object detection use case. For example, the stockmarket-pattern-detection-yolov8 and stockmarket-future-prediction models demonstrate how the core YOLOv8 architecture can be fine-tuned to tackle specialized problems in the financial domain.

Another area to explore is the use of different YOLOv8 model sizes, such as the yolos-tiny and yolos-small variants. These smaller models may be more suitable for deployment on resource-constrained devices or in real-time applications that require low latency.

Ultimately, the versatility and performance of YOLOv8 make it an attractive choice for a wide range of computer vision projects, from edge computing to large-scale enterprise deployments.
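
As a minimal sketch of the standard ultralytics inference loop (yolov8n.pt is the smallest official detection checkpoint and is downloaded automatically on first use):

```python
from ultralytics import YOLO

# Load a pretrained YOLOv8 detection model.
model = YOLO("yolov8n.pt")

# Run inference; video paths and stream URLs are handled the same way.
results = model("https://ultralytics.com/images/bus.jpg")

for box in results[0].boxes:
    label = model.names[int(box.cls)]  # class label for this detection
    print(f"{label}: conf={float(box.conf):.2f}, box={box.xyxy.tolist()}")
```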





yolos-fashionpedia


The yolos-fashionpedia model is a fine-tuned object detection model for fashion. It was developed by Valentina Feve and is based on the YOLOS architecture. The model was trained on the Fashionpedia dataset, which contains over 50,000 annotated fashion product images across 80 different categories. Similar models include yolos-tiny, a smaller YOLOS model fine-tuned on COCO, and adetailer, a suite of YOLOv8 detection models for various visual tasks like face, hand, and clothing detection.

Model inputs and outputs

Inputs

  • Image data: The yolos-fashionpedia model takes in image data as input, and is designed to detect and classify fashion products in those images.

Outputs

  • Object detection: The model outputs bounding boxes around detected fashion items, along with their predicted class labels from the 80 categories in the Fashionpedia dataset. These include items like shirts, pants, dresses, accessories, and fine-grained details like collars, sleeves, and patterns.

Capabilities

The yolos-fashionpedia model excels at accurately detecting and categorizing a wide range of fashion products within images. This can be particularly useful for applications like e-commerce, virtual try-on, and visual search, where precise product identification is crucial.

What can I use it for?

The yolos-fashionpedia model can be leveraged in a variety of fashion-related applications:

  • E-commerce product tagging: Automatically tag and categorize product images on e-commerce platforms to improve search, recommendation, and visual browsing experiences.
  • Virtual try-on: Integrate the model into virtual fitting room technologies to accurately detect garment types and sizes.
  • Visual search: Enable fashion-focused visual search engines by allowing users to query using images of products they're interested in.
  • Fashion analytics: Analyze fashion trends, inventory, and consumer preferences by processing large datasets of fashion images.

Things to try

One interesting aspect of the yolos-fashionpedia model is its ability to detect fine-grained fashion details like collars, sleeves, and patterns. Developers could experiment with using this capability to enable more advanced fashion-related features, such as:

  • Generating detailed product descriptions from images
  • Recommending complementary fashion items based on detected garment attributes
  • Analyzing runway shows or street style to identify emerging trends

By leveraging the model's detailed understanding of fashion elements, researchers and practitioners can create novel applications that go beyond basic product detection.
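
A rough sketch of running the model with the transformers YOLOS classes, assuming the checkpoint is published under the maintainer's namespace as valentinafeve/yolos-fashionpedia (verify the exact repo id on the Hub):

```python
import torch
from PIL import Image
from transformers import YolosForObjectDetection, YolosImageProcessor

checkpoint = "valentinafeve/yolos-fashionpedia"  # assumed repo id
processor = YolosImageProcessor.from_pretrained(checkpoint)
model = YolosForObjectDetection.from_pretrained(checkpoint)

image = Image.open("outfit.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")
outputs = model(**inputs)

# Convert raw logits into thresholded boxes in original image coordinates.
target_sizes = torch.tensor([image.size[::-1]])
detections = processor.post_process_object_detection(
    outputs, threshold=0.5, target_sizes=target_sizes
)[0]

for score, label, box in zip(
    detections["scores"], detections["labels"], detections["boxes"]
):
    print(model.config.id2label[label.item()], f"{score:.2f}", box.tolist())
```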





sdxl-lightning-4step


sdxl-lightning-4step is a fast text-to-image model developed by ByteDance that can generate high-quality images in just 4 steps. It is similar to other fast diffusion models like AnimateDiff-Lightning and Instant-ID MultiControlNet, which also aim to speed up the image generation process. Unlike the original Stable Diffusion model, these fast models sacrifice some flexibility and control to achieve faster generation times.

Model inputs and outputs

The sdxl-lightning-4step model takes in a text prompt and various parameters to control the output image, such as the width, height, number of images, and guidance scale. The model can output up to 4 images at a time, with a recommended image size of 1024x1024 or 1280x1280 pixels.

Inputs

  • Prompt: The text prompt describing the desired image
  • Negative prompt: A prompt that describes what the model should not generate
  • Width: The width of the output image
  • Height: The height of the output image
  • Num outputs: The number of images to generate (up to 4)
  • Scheduler: The algorithm used to sample the latent space
  • Guidance scale: The scale for classifier-free guidance, which controls the trade-off between fidelity to the prompt and sample diversity
  • Num inference steps: The number of denoising steps, with 4 recommended for best results
  • Seed: A random seed to control the output image

Outputs

  • Image(s): One or more images generated based on the input prompt and parameters

Capabilities

The sdxl-lightning-4step model can generate a wide variety of images from text prompts, from realistic scenes to imaginative and creative compositions. Its 4-step generation process produces high-quality results quickly, making it suitable for applications that require fast image generation.

What can I use it for?

The sdxl-lightning-4step model could be useful for applications that need to generate images in real time, such as video game asset generation, interactive storytelling, or augmented reality experiences. Businesses could also use the model to quickly generate product visualizations, marketing imagery, or custom artwork based on client prompts. Creatives may find the model helpful for ideation, concept development, or rapid prototyping.

Things to try

One interesting thing to try with the sdxl-lightning-4step model is to experiment with the guidance scale parameter. By adjusting the guidance scale, you can control the balance between fidelity to the prompt and diversity of the output. Lower guidance scales may result in more unexpected and imaginative images, while higher scales will produce outputs that are closer to the specified prompt.
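
Since this model is served through Replicate, a minimal sketch with the replicate Python client looks like the following (the slug bytedance/sdxl-lightning-4step is assumed from the maintainer and model name, and a version pin may be required):

```python
import replicate  # requires REPLICATE_API_TOKEN in the environment

output = replicate.run(
    "bytedance/sdxl-lightning-4step",  # may need an explicit :version suffix
    input={
        "prompt": "a watercolor painting of a lighthouse at dawn",
        "width": 1024,
        "height": 1024,
        "num_outputs": 1,
        "num_inference_steps": 4,  # the model is tuned for 4 steps
    },
)
print(output)  # URL(s) of the generated image(s)
```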





Fuyu-8B


Fuyu-8B is a multi-modal text and image transformer model developed by Adept AI. It has a simple architecture compared to other multi-modal models: a decoder-only transformer that linearly projects image patches into the first layer, bypassing the embedding lookup. This allows the model to handle arbitrary image resolutions without the need for separate high- and low-resolution training stages. The model is optimized for digital agents, supporting tasks like answering questions about graphs and diagrams, UI-based questions, and fine-grained localization on screen images.

Model inputs and outputs

Inputs

  • Text: The model can consume text inputs.
  • Images: The model can also consume image inputs of arbitrary size, treating the image tokens like the sequence of text tokens.

Outputs

  • Text: The model generates text outputs in response to the provided text and image inputs.

Capabilities

The Fuyu-8B model is designed to be a versatile multi-modal AI assistant. It can understand and reason about both text and images, enabling it to perform tasks like visual question answering, image captioning, and multimodal chat. The model's fast inference speed, with responses for large images in under 100 milliseconds, makes it well suited for real-time applications.

What can I use it for?

The Fuyu-8B model can be a powerful tool for a variety of applications, such as:

  • Digital assistants: The model's multi-modal capabilities and focus on supporting digital agents make it a great fit for building conversational AI assistants that can understand and respond to both text and image inputs.
  • Content creation: The model can be used to generate creative text formats like poetry, scripts, and marketing copy, while also incorporating relevant visual elements.
  • Visual question answering: The model can be used to build applications that answer questions about images, diagrams, and other visual content.

Things to try

One interesting aspect of the Fuyu-8B model is its ability to handle arbitrary image resolutions. This means you can experiment with feeding the model different image sizes and observe how it responds. You can also try fine-tuning the model on specific datasets or tasks to see how it adapts and improves its performance.
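
A short sketch using the transformers Fuyu classes with the adept/fuyu-8b checkpoint (the caption-style prompt follows the model card; device_map="auto" assumes accelerate is installed):

```python
from PIL import Image
from transformers import FuyuForCausalLM, FuyuProcessor

checkpoint = "adept/fuyu-8b"
processor = FuyuProcessor.from_pretrained(checkpoint)
model = FuyuForCausalLM.from_pretrained(checkpoint, device_map="auto")

# Text and image go in together; image patches are projected straight into
# the decoder, so arbitrary resolutions are accepted.
prompt = "Generate a coco-style caption.\n"
image = Image.open("chart.png").convert("RGB")
inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)

generated = model.generate(**inputs, max_new_tokens=32)
new_tokens = generated[:, inputs["input_ids"].shape[1]:]  # strip the prompt
print(processor.batch_decode(new_tokens, skip_special_tokens=True)[0])
```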
