The input metadata describes the source video: the sequence of image files (frames) that make up the video, its original URL and file name, the frame rate, the total number of frames, the height and width, and the duration.
{
"data":{
"video Files":"["000000.jpg","000001.jpg","000002.jpg"...]",
"video Original url":"https://New-York-Street-Video.mp4",
"video Original file name":"New-York-Street-Video.mp4",
"video Frames per second": "29.97002997002997",
"video Total frames": "29",
"video Height": "1080",
"video Width": "1920",
"video Duration": "0.531000"
}
}
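Every value in the metadata example is a string, so a consumer usually needs to convert the numeric fields before using them. The following is a minimal sketch of that conversion; the function name `parse_video_metadata` and the shortened result keys are illustrative assumptions, and the "video Files" entry is left out of the sample because it is abridged in the example above.

```python
from typing import Any


def parse_video_metadata(data: dict[str, str]) -> dict[str, Any]:
    """Convert the string-valued metadata fields into numbers.

    The key names on the right come from the example above; the
    shortened keys on the left are an assumption made for this sketch.
    """
    return {
        "fps": float(data["video Frames per second"]),
        "total_frames": int(data["video Total frames"]),
        "height": int(data["video Height"]),
        "width": int(data["video Width"]),
        "duration": float(data["video Duration"]),
    }


# Sample values copied from the example above ("video Files" omitted).
sample = {
    "data": {
        "video Original url": "https://New-York-Street-Video.mp4",
        "video Original file name": "New-York-Street-Video.mp4",
        "video Frames per second": "29.97002997002997",
        "video Total frames": "29",
        "video Height": "1080",
        "video Width": "1920",
        "video Duration": "0.531000",
    }
}

meta = parse_video_metadata(sample["data"])
print(meta["width"], meta["height"], meta["total_frames"])
```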
Input
The 'data' element lists the inputs defined in the project settings. In this example, the defined inputs are 'name' and 'url', so the 'data' element lists and references exactly those inputs, as they are configured in the project.
Scene output
Scene outputs describe the entire workspace. In this example, the elements 'date' and 'comments', prefixed with 'output_', are scene outputs. Scene outputs can also capture the overall scene context, such as whether it is day or night, or whether the weather is sunny or rainy.
{
"output_date":"2025-05-07",
"output_comments":"This is a test comment"
}
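One way to pick the scene outputs out of a parsed annotation is to filter on the 'output_' prefix. The sketch below assumes that the scene outputs sit at the same level as the 'output_video' workspace key described later in this section, so that key is excluded explicitly; the function name `scene_outputs` is hypothetical.

```python
def scene_outputs(annotation: dict) -> dict:
    """Return the 'output_'-prefixed entries, excluding the video workspace.

    Assumption: scene outputs and 'output_video' share one JSON object.
    """
    return {
        key: value
        for key, value in annotation.items()
        if key.startswith("output_") and key != "output_video"
    }


# Values taken from the examples in this section.
annotation = {
    "output_date": "2025-05-07",
    "output_comments": "This is a test comment",
    "output_video": [{"shapes": []}],
}

scene = scene_outputs(annotation)
print(scene)
```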
Shape output
Annotating a video involves marking shapes on each image in the sequence, which produces a list of shapes. Each marked shape is a 'shape output'. Shape outputs are often called 'nested outputs' because of how they are organized: they can either be integrated within a workspace output or presented as nested objects in the JSON output file.
In the provided JSON example, the points of a rectangle change from frame 0 to frame 15. The points vary because a single object is annotated throughout the entire sequence, so its position or shape is tracked from frame to frame. This is how multiple frames are represented in video annotation.
The workspace is the canvas on which annotations are created, and any shapes drawn on it are the workspace's outputs. For video annotations, this workspace is always named 'output_video'.
{
"output_video": [
{
"shapes": []
}
]
}
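As a rough illustration of how the 'output_video' workspace might be read, the sketch below gathers every entry of every 'shapes' list into one flat list. It relies only on the two keys shown above; the function name `all_shapes` and the contents of the individual shape objects in the second call are hypothetical.

```python
def all_shapes(annotation: dict) -> list:
    """Flatten the 'shapes' lists of all 'output_video' entries."""
    shapes = []
    for workspace in annotation.get("output_video", []):
        # A missing 'shapes' key is treated as "no shapes drawn".
        shapes.extend(workspace.get("shapes", []))
    return shapes


# The empty workspace from the example above yields no shapes.
empty = {"output_video": [{"shapes": []}]}
print(all_shapes(empty))

# Hypothetical shape objects, used only to show the flattening.
populated = {"output_video": [{"shapes": [{"id": "a"}, {"id": "b"}]}]}
print(len(all_shapes(populated)))
```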
Output Type
Multi-level menu
Multi-level menus consist of a hierarchy of nested menus. In the JSON output, the levels of this hierarchy are separated by the '|' character. For instance, in the 'object' output shown in the example, 'color' is the first level of the hierarchy and 'black and white' is the second level.
{
"object":"color|black and white"
}
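Because the levels are joined with '|', splitting the value on that character recovers the hierarchy. This is a minimal sketch; the function name `menu_levels` is hypothetical, and it assumes that option labels never contain a literal '|'.

```python
def menu_levels(value: str) -> list[str]:
    """Split a multi-level menu value into its levels, outermost first."""
    return value.split("|")


levels = menu_levels("color|black and white")
print(levels)
```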
Dropdown
A dropdown output uses a simple 'key': 'value' syntax. In this example, the dropdown is stored under the 'category' tag.
{
"category":"pedestrian"
}
Radio button
A radio button output uses the same simple 'key': 'value' syntax. In this example, the radio button is stored under the 'position' tag.
{
"position":"forward"
}
Checkbox
The syntax for a checkbox output is 'key': { 'option1': '0', 'option2': '1' }, where '1' marks a selected option and '0' an unselected one. Examples of checkboxes can be found under the 'road side' tag.
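To read a checkbox output, keep the options whose flag is '1'. The sketch below follows the 'key': { 'option1': '0', 'option2': '1' } shape described above; the option names 'left' and 'right' and the function name `selected_options` are hypothetical.

```python
def selected_options(options: dict[str, str]) -> list[str]:
    """Return the names of the options flagged with '1'."""
    return [name for name, flag in options.items() if flag == "1"]


# Hypothetical option names under the 'road side' tag.
checkbox_output = {"road side": {"left": "0", "right": "1"}}

chosen = selected_options(checkbox_output["road side"])
print(chosen)
```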
Date
The date output uses a standard date format, for example '2025-05-07'.
{
"output_date":"2025-05-07"
}
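The example value has the shape of an ISO 8601 calendar date (year, month, day). Assuming that holds for every date output, Python's standard library can parse it directly, as in this short sketch.

```python
from datetime import date

# Value taken from the example above; the ISO 8601 layout is an assumption.
annotation = {"output_date": "2025-05-07"}

parsed = date.fromisoformat(annotation["output_date"])
print(parsed.year, parsed.month, parsed.day)
```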
Text area
A text area output uses a simple 'key': 'value' syntax, where the value can be a whole paragraph of text.
{
"output_comments":"This is a test comment"
}
Frames
Multiple frames
'Multiple frames' refers to the series of individual images, or frames, that make up a video or animation. Each frame is a static image; played in sequence at a given speed, the frames create the illusion of motion.
{
"frame_number": 0
}
Visibility
The 'visibility' element indicates the visibility of a shape in a given frame. A shape might be visible in one frame but obscured by another object in the next, rendering it invisible. However, it can reappear in subsequent frames. This element helps track the shape's visibility across different frames.
Visible
{
"visibility": 1
}
Not visible
{
"visibility": 0
}
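As an illustration of how the flag might be used, the sketch below lists the frames in which a shape is visible. It assumes that each key frame of a shape is an object carrying both 'frame_number' and 'visibility'; the list layout, the sample values, and the function name `visible_frames` are hypothetical.

```python
def visible_frames(key_frames: list[dict]) -> list[int]:
    """Return the frame numbers at which the shape is marked visible."""
    return [kf["frame_number"] for kf in key_frames if kf["visibility"] == 1]


# Hypothetical key frames: the shape is hidden at frame 7 and reappears.
key_frames = [
    {"frame_number": 0, "visibility": 1},
    {"frame_number": 7, "visibility": 0},
    {"frame_number": 15, "visibility": 1},
]

frames = visible_frames(key_frames)
print(frames)
```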
Frame number
The 'frame_number' in a 'key_frame' represents its position within the overall sequence of frames.
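A frame number can be related to a position in time by dividing it by the frame rate from the input metadata. This is a sketch under two assumptions that the format itself does not state: frame 0 corresponds to time zero, and the frame rate is constant. The function name `frame_to_seconds` is hypothetical.

```python
def frame_to_seconds(frame_number: int, fps: float) -> float:
    """Approximate timestamp of a frame, assuming a constant frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame_number / fps


# Frame rate taken from the "video Frames per second" metadata example.
fps = float("29.97002997002997")

timestamp = frame_to_seconds(15, fps)
print(round(timestamp, 4))
```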