Video Annotator - Tracking objects through video frames¶
The current tutorial illustrates how to use Ipyannotator to annotate video data.
The task of identifying objects and following them across video frames is called object tracking.
Ipyannotator allows users to explore an entire set of video frames and their labels; manually create their own datasets by drawing bounding boxes and associating labels across the frames; and improve existing annotations.
This tutorial is divided into the following steps:
Select dataset¶
This tutorial uses a minimal artificial video dataset generated by Ipyannotator. The dataset follows the MOT data format. It contains 20 images with 2 classes (rectangle and circle) and doesn’t need to be downloaded.
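Since the dataset follows the MOT convention, each ground-truth entry is one comma-separated line of the form frame, id, bb_left, bb_top, bb_width, bb_height, conf, x, y, z. A minimal parsing sketch (the function name and the sample line are illustrative, not taken from this dataset):

```python
# Sketch: parsing one line of a MOT-format ground-truth file.
# MOT stores one object per line as comma-separated values:
#   frame, object id, bb_left, bb_top, bb_width, bb_height, conf, x, y, z
def parse_mot_line(line: str) -> dict:
    frame, obj_id, left, top, width, height, *rest = line.strip().split(',')
    return {
        'frame': int(frame),
        'id': int(obj_id),
        'bbox': (float(left), float(top), float(width), float(height)),
    }

row = parse_mot_line('1,2,10,150,30,10,1,-1,-1,-1')  # illustrative values
print(row)  # {'frame': 1, 'id': 2, 'bbox': (10.0, 150.0, 30.0, 10.0)}
```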
dataset = DS.ARTIFICIAL_VIDEO
Setup annotator¶
This section will set up the paths and the input/output pair needed to classify the images.
The following cell will import the project file and the directory where the images were generated. For this tutorial we simplify the process by using the get_settings function instead of hardcoding the paths.
settings_ = get_settings(dataset)
settings_.project_file, settings_.image_dir
(Path('data/artificial/annotations.json'), 'images')
Ipyannotator uses pairs of input/output data to set up the annotation.
The video annotator uses InputImage and OutputVideoBbox as the pair to set up the annotator. InputImage provides information about the directory that contains the images to be annotated, and the images themselves. OutputVideoBbox provides the classes that can be associated with the drawn bounding boxes.
input_ = InputImage(image_dir=settings_.image_dir,
                    image_width=settings_.im_width,
                    image_height=settings_.im_height)
output_ = OutputVideoBbox(classes=['Circle', 'Rectangle'])
input_.dir
'images'
The final part of setting up Ipyannotator is the configuration of the Annotator factory with the pair of input/output data. The factory provides three types of annotator tools: explore, create, improve. The next sections will guide you through each step. Note that the annotator starts with NoOutput(); the bounding-box output is attached later, before the create step.
anni = Annotator(input_, NoOutput(), settings_)
Explore¶
The explore option allows users to navigate across the images in the dataset using the next/previous buttons. This option is used for data visualization only; improvement and additional labeling are done in the next steps.
When exploring the artificial dataset used in this tutorial you will see a red circle and a gray rectangle as the objects to be tracked. The black square represents an occlusion on the objects and is used to illustrate how the improve step works.
explorer = anni.explore()
explorer
Create¶
The create option allows users to manually create their annotated datasets.
Warning
The video annotator create option is a beta version.
Currently, video annotation allows users to draw multiple bounding boxes in every frame and associate a label with every annotated bounding box. Ipyannotator identifies each object with an indexed label starting from 0.
The next cell removes already created annotation files to create a new dataset.
import os
import shutil

dirpath = 'data/artificial/create_results'
if os.path.exists(dirpath) and os.path.isdir(dirpath):
    shutil.rmtree(dirpath)
The next cell initializes the create option.
For this tutorial, a function was defined that imitates human work, annotating the images automatically.
anni.output_item = output_
creator = anni.create()
creator
The next cell imitates human work by annotating all images automatically.
HELPER = Tutorial(dataset, settings_.project_path)
annotations = HELPER.annotate_video_bboxes(creator)
All data is stored in a JSON file with the following structure:
data_format = {k: v for i, (k, v) in enumerate(annotations.items()) if i == 0}
print(json.dumps(data_format, indent=2))
{
  "data/artificial/images/0000.jpg": {
    "bbox": [
      {
        "x": 10,
        "y": 150,
        "width": 30,
        "height": 10,
        "id": "0"
      },
      {
        "x": 30,
        "y": 30,
        "width": 40,
        "height": 40,
        "id": "1"
      }
    ],
    "labels": [
      [
        "Rectangle"
      ],
      [
        "Circle"
      ]
    ]
  }
}
Note that in the JSON file above the annotations of each frame are keyed by the path of the image. Every bounding box drawn in the annotator has the properties x, y, width, height, and id as part of the bbox field. The annotation labels are stored in the labels field; every index of the labels array corresponds to the object at the same index in the bbox array.
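The bbox/labels correspondence above can be read back with plain Python. A minimal sketch, assuming the annotation dict has exactly the shape shown above (the helper name objects_per_frame is hypothetical, not part of the Ipyannotator API):

```python
def objects_per_frame(annotations: dict) -> dict:
    """Map each frame path to a list of (object id, label) pairs,
    pairing every bbox entry with the labels entry at the same index."""
    result = {}
    for image_path, ann in annotations.items():
        result[image_path] = [
            (bbox['id'], labels[0])
            for bbox, labels in zip(ann['bbox'], ann['labels'])
        ]
    return result

# Sample dict mirroring the JSON structure printed above.
sample = {
    "data/artificial/images/0000.jpg": {
        "bbox": [{"x": 10, "y": 150, "width": 30, "height": 10, "id": "0"},
                 {"x": 30, "y": 30, "width": 40, "height": 40, "id": "1"}],
        "labels": [["Rectangle"], ["Circle"]],
    }
}
print(objects_per_frame(sample))
# {'data/artificial/images/0000.jpg': [('0', 'Rectangle'), ('1', 'Circle')]}
```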
Improve¶
The improve feature in the Ipyannotator video annotation allows users to refine the annotated dataset. This includes:
Select objects across the frames and join the trajectories drawn.
Update labels across the entire annotation.
In the example below we have an occlusion illustrated by a black square. The rectangle disappears behind the occluding object and appears again with a new object id. The video annotator allows users to join the trajectories of different objects into a new object.
Joining a trajectory:
Navigate across the annotator frames
Note that the gray rectangle disappears
Note that the gray rectangle reappears, but with a new id
Select the rectangle with the new id (marking the checkbox)
Navigate back until you see the gray rectangle with the old id
Select the rectangle with the old id (marking the checkbox)
Click on the join button
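In data terms, joining trajectories amounts to rewriting the newer object id to the older one across all frames. A minimal sketch under that assumption (join_trajectories is a hypothetical helper shown for illustration; the annotator performs this through the checkbox/join UI):

```python
def join_trajectories(annotations: dict, old_id: str, new_id: str) -> dict:
    """Merge two trajectories by re-assigning every bbox that carries
    new_id back to old_id, across all annotated frames."""
    for ann in annotations.values():
        for bbox in ann['bbox']:
            if bbox['id'] == new_id:
                bbox['id'] = old_id
    return annotations

# Illustrative frames: the rectangle reappears after the occlusion
# with a new id ('2') that we want to merge into the original id ('0').
frames = {
    'frame_0.jpg': {'bbox': [{'x': 10, 'y': 150, 'width': 30, 'height': 10, 'id': '0'}],
                    'labels': [['Rectangle']]},
    'frame_9.jpg': {'bbox': [{'x': 80, 'y': 150, 'width': 30, 'height': 10, 'id': '2'}],
                    'labels': [['Rectangle']]},
}
join_trajectories(frames, old_id='0', new_id='2')
print(frames['frame_9.jpg']['bbox'][0]['id'])  # '0'
```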
improver = anni.improve()
improver