Video Instance Segmentation JSON-creation

Updated on January 15th, 2024

Data

Input metadata

The input metadata describes the video: the sequence of image files (frames) extracted from it, the original URL and file name, the frame rate, the total number of frames, the height and width, and the duration.

{
   "data":{
      "video Files":["000000.jpg","000001.jpg","000002.jpg"...],
      "video Original url":"https://New-York-Street-Video.mp4",
      "video Original file name":"New-York-Street-Video.mp4",
      "video Frames per second":"29.97002997002997",
      "video Total frames":"29",
      "video Height":"1080",
      "video Width":"1920",
      "video Duration":"0.531000"
   }
}
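
Every numeric value in the metadata example is stored as a JSON string, so a consumer will usually want to convert the values before doing arithmetic with them. The following is a minimal Python sketch, assuming the field names shown above; the helper name `parse_video_metadata` and the shortened output keys are hypothetical and not part of the format.

```python
def parse_video_metadata(data: dict) -> dict:
    """Convert the string-valued video metadata fields to numeric types."""
    return {
        "fps": float(data["video Frames per second"]),
        "total_frames": int(data["video Total frames"]),
        "height": int(data["video Height"]),
        "width": int(data["video Width"]),
        "duration": float(data["video Duration"]),
    }


# Values copied from the metadata example above.
example = {
    "video Frames per second": "29.97002997002997",
    "video Total frames": "29",
    "video Height": "1080",
    "video Width": "1920",
    "video Duration": "0.531000",
}

meta = parse_video_metadata(example)
print(meta["width"], meta["height"], meta["total_frames"])  # 1920 1080 29
```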

Input

The 'data' element lists the inputs defined in the project settings. In this example the defined inputs are 'name' and 'url', so 'data' contains those two keys.

{
   "data":{
      "url":"https://New-York-Street-Video.mp4",
      "name":"City"
   }
}

Output

Static output

Static outputs are tags whose values do not change over the course of the video, such as a car's color. They are also known as static shape outputs: they belong to the shape as a whole and remain consistent across frames.

{
   "tags":{
      "object":"color|black and white",
      "category":"pedestrian",
      "position":"forward"
   }
}

Dynamic output

Dynamic tags are nested within 'key_locations' or 'locations' and can vary from one keyframe to the next, for example which side of the road a car is on.

{
    "tags": {
        "road side": {
            "left": "1",
            "none": "0",
            "right": "0"
        }
    }
}
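
One way to read static and dynamic tags together is to overlay the tags of a key location on the shape-level tags. The sketch below assumes the layout used in this article's examples (a shape-level 'tags' object plus an optional 'tags' object inside each 'key_locations' entry); `tags_at_key_location` is a hypothetical helper, not part of any official tooling.

```python
def tags_at_key_location(shape: dict, key_location: dict) -> dict:
    """Return the static shape tags overlaid with the key location's dynamic tags."""
    merged = dict(shape.get("tags", {}))         # copy of the static tags
    merged.update(key_location.get("tags", {}))  # dynamic tags, if present
    return merged


shape = {
    "tags": {"category": "pedestrian", "position": "forward"},
    "key_locations": [
        {
            "tags": {"road side": {"left": "1", "none": "0", "right": "0"}},
            "frame_number": 0,
        },
        {"frame_number": 5},  # this key location carries no dynamic tags
    ],
}

first = tags_at_key_location(shape, shape["key_locations"][0])
second = tags_at_key_location(shape, shape["key_locations"][1])
print(sorted(first))   # ['category', 'position', 'road side']
print(sorted(second))  # ['category', 'position']
```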

Scene output

Scene outputs apply to the entire workspace rather than to an individual shape. In this example, the 'date' and 'comments' elements, prefixed with 'output_', are scene outputs. Scene outputs can also capture the overall scene context, such as whether it is day or night, or whether the weather is sunny or rainy.

{
   "output_date":"2025-05-07",
   "output_comments":"This is a test comment"
}
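
Scene outputs can be picked out of a record by their 'output_' prefix. The sketch below assumes that 'output_video' is the only 'output_' key that is not a scene output (it is the workspace described later in this article); `scene_outputs` is a hypothetical helper name.

```python
def scene_outputs(record: dict) -> dict:
    """Collect keys starting with 'output_', excluding the 'output_video' workspace."""
    return {
        key: value
        for key, value in record.items()
        if key.startswith("output_") and key != "output_video"
    }


record = {
    "name": "City",
    "output_video": [{"shapes": []}],
    "output_date": "2025-05-07",
    "output_comments": "This is a test comment",
}

result = scene_outputs(record)
print(result)
```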

Shape output

Annotating a video means marking shapes on the images in the sequence, which produces a list of shapes. Each marked shape is a 'shape output'. Shape outputs are often called 'nested outputs' because of how they are organized: they can either be nested within a workspace output or presented as nested objects in the JSON output file.

In the following JSON example, observe how the points of a rectangle change from frame 0 to frame 14. The points change because a single object is annotated throughout the sequence: each entry in 'key_locations' records the object's position and shape at the frame given by its 'frame_number'.

[
    {
        "output_video": [
            {
                "shapes": [
                    {
                        "tags": {
                            "object": "color|black and white",
                            "category": "pedestrian",
                            "position": "forward"
                        },
                        "type": "rectangle",
                        "index": 1,
                        "key_locations": [
                            {
                                "tags": {
                                    "road side": {
                                        "left": "1",
                                        "none": "0",
                                        "right": "0"
                                    }
                                },
                                "points": [
                                    [337, 668],
                                    [492, 668],
                                    [337, 972],
                                    [492, 972]
                                ],
                                "visibility": 1,
                                "frame_number": 0
                            },
                            {
                                "points": [
                                    [321, 664],
                                    [396, 664],
                                    [321, 968],
                                    [396, 968]
                                ],
                                "visibility": 1,
                                "frame_number": 5
                            },
                            {
                                "points": [
                                    [209, 666],
                                    [386, 666],
                                    [209, 970],
                                    [386, 970]
                                ],
                                "visibility": 1,
                                "frame_number": 10
                            },
                            {
                                "points": [
                                    [175, 668],
                                    [371, 668],
                                    [175, 972],
                                    [371, 972]
                                ],
                                "visibility": 1,
                                "frame_number": 14
                            }
                        ]
                    }
                ],
                "group_type": null,
                "frame_count": 15
            }
        ]
    }
]
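
To make the movement between key locations easier to see, the four corner points of each rectangle can be reduced to a bounding box. This is a minimal sketch, assuming each entry of 'points' is an [x, y] pair in pixels as the example suggests; `bounding_box` is a hypothetical helper.

```python
def bounding_box(points):
    """Return (x_min, y_min, width, height) for a list of [x, y] points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)


# First and last key locations from the example above.
key_locations = [
    {"points": [[337, 668], [492, 668], [337, 972], [492, 972]], "frame_number": 0},
    {"points": [[175, 668], [371, 668], [175, 972], [371, 972]], "frame_number": 14},
]

boxes = {loc["frame_number"]: bounding_box(loc["points"]) for loc in key_locations}
print(boxes[0])   # (337, 668, 155, 304)
print(boxes[14])  # (175, 668, 196, 304)
```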

Workspace output

The workspace serves as the canvas for creating annotations, with any shapes drawn on it constituting the workspace's outputs. Specifically for video annotations, this workspace is always referred to as 'output_video'.

{
   "output_video": [
      {
         "shapes": []
      }
   ]
}

Output Type

Multi-level menu

Multi-level menus consist of a hierarchy of nested menus. In the JSON representation, the levels of the hierarchy are separated by the '|' character. For instance, in the 'object' output below, 'color' is the first level of the hierarchy and 'black and white' is the second level.

{
   "object":"color|black and white"
}
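
A multi-level menu value can be split back into its levels on the '|' separator. The sketch below assumes that '|' never appears inside an option name, which this article does not state; `menu_levels` is a hypothetical helper.

```python
def menu_levels(value: str) -> list:
    """Split a multi-level menu value into its hierarchy levels."""
    return value.split("|")


levels = menu_levels("color|black and white")
print(levels)  # ['color', 'black and white']
```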

Dropdown

The syntax for a dropdown output is structured simply as 'key': 'value'. For example, a dropdown can be found within the 'category' tag in this scenario.

{
   "category":"pedestrian"
}

Radio button

The syntax used for a radio button output adopts a straightforward 'key': 'value' format. In the given example, a radio button is located under the 'position' tag.

{
   "position":"forward"
}

Checkbox

The syntax for a checkbox output is structured as 'key': { 'option1': '0', 'option2': '1' }, where '1' marks a selected option and '0' an unselected one. An example of a checkbox can be found under the 'road side' tag.

{
   "road side":{
      "left":"1",
      "none":"0",
      "right":"0"
   }
}
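
The selected options of a checkbox output can be read by keeping the keys whose value is the string "1". A minimal sketch, assuming the flags are always the strings "0" and "1" as in the example; `selected_options` is a hypothetical helper.

```python
def selected_options(checkbox: dict) -> list:
    """Return the option names whose flag is the string "1"."""
    return [option for option, flag in checkbox.items() if flag == "1"]


road_side = {"left": "1", "none": "0", "right": "0"}
chosen = selected_options(road_side)
print(chosen)  # ['left']
```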

Date

The date output uses the year-month-day format (YYYY-MM-DD), for example '2025-05-07'.

{
   "output_date":"2025-05-07"
}
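
A date written this way can be turned into a date object with the Python standard library. The sketch assumes the value is always in YYYY-MM-DD form, as in the example; other formats would raise a ValueError.

```python
from datetime import date

record = {"output_date": "2025-05-07"}

# date.fromisoformat accepts strings in YYYY-MM-DD form.
parsed = date.fromisoformat(record["output_date"])
print(parsed.year, parsed.month, parsed.day)  # 2025 5 7
```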

Text area

The syntax for a text area output uses a straightforward 'key': 'value' format, where the value can be a whole paragraph of text.

{
   "output_comments":"This is a test comment"
}

Frames

Multiple frames

'Multiple frames' refers to the series of individual images, or frames, that make up a video or animation. Each frame is a static image; when the frames are played in sequence at a certain speed, they create the illusion of motion.

{
   "frame_number": 0
}

Visibility

The 'visibility' element indicates the visibility of a shape in a given frame. A shape might be visible in one frame but obscured by another object in the next, rendering it invisible. However, it can reappear in subsequent frames. This element helps track the shape's visibility across different frames.

Visible

{
    "visibility": 1
}

Not visible

{
    "visibility": 0
}
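
The visibility flag can be used to list the key locations at which a shape is visible. In the sketch below the key locations are hypothetical (a shape that is hidden at frame 5 and reappears at frame 10), and `visible_frames` is a hypothetical helper.

```python
def visible_frames(key_locations):
    """Return the frame numbers of key locations whose 'visibility' is 1."""
    return [
        loc["frame_number"]
        for loc in key_locations
        if loc.get("visibility") == 1
    ]


# Hypothetical data: the shape is obscured at frame 5 and reappears at frame 10.
key_locations = [
    {"visibility": 1, "frame_number": 0},
    {"visibility": 0, "frame_number": 5},
    {"visibility": 1, "frame_number": 10},
]

frames = visible_frames(key_locations)
print(frames)  # [0, 10]
```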

Frame number

The 'frame_number' in each 'key_locations' entry gives the position of that keyframe within the overall sequence of frames. Numbering starts at 0.

{
    "frame_number": 0
}

Check the complete JSON

[
    {
       "priority":0,
       "data":{
          "video Files":[
             "000000.jpg",
             "000001.jpg",
             "000002.jpg",
             "000003.jpg",
             "000004.jpg",
             "000005.jpg",
             "000006.jpg",
             "000007.jpg",
             "000008.jpg",
             "000009.jpg",
             "000010.jpg",
             "000011.jpg",
             "000012.jpg",
             "000013.jpg",
             "000014.jpg"
          ],
          "video Original url":"https://New-York-Street-Short-Video.mp4",
          "video Original file name":"New-York-Street-Short-Video.mp4",
          "video Frames per second":"29.97002997002997",
          "video Total frames":"29",
          "video Height":"1080",
          "video Width":"1920",
          "video Duration":"0.531000",
          "url":"https://New-York-Street-Short-Video.zip",
          "name":"City",
          "output_video":[
             {
                "shapes":[
                   {
                      "tags":{
                         "object":"color|black and white",
                         "category":"pedestrian",
                         "position":"forward"
                      },
                      "type":"rectangle",
                      "index":1,
                      "key_locations":[
                         {
                            "tags":{
                               "road side":{
                                  "left":"1",
                                  "none":"0",
                                  "right":"0"
                               }
                            },
                            "points":[
                               [
                                  337,
                                  668
                               ],
                               [
                                  492,
                                  668
                               ],
                               [
                                  337,
                                  972
                               ],
                               [
                                  492,
                                  972
                               ]
                            ],
                            "visibility":1,
                            "frame_number":0
                         },
                         {
                            "points":[
                               [
                                  321,
                                  664
                               ],
                               [
                                  396,
                                  664
                               ],
                               [
                                  321,
                                  968
                               ],
                               [
                                  396,
                                  968
                               ]
                            ],
                            "visibility":1,
                            "frame_number":5
                         },
                         {
                            "points":[
                               [
                                  209,
                                  666
                               ],
                               [
                                  386,
                                  666
                               ],
                               [
                                  209,
                                  970
                               ],
                               [
                                  386,
                                  970
                               ]
                            ],
                            "visibility":1,
                            "frame_number":10
                         },
                         {
                            "points":[
                               [
                                  175,
                                  668
                               ],
                               [
                                  371,
                                  668
                               ],
                               [
                                  175,
                                  972
                               ],
                               [
                                  371,
                                  972
                               ]
                            ],
                            "visibility":1,
                            "frame_number":14
                         }
                      ]
                   }
                ],
                "group_type":null,
                "frame_count":15
             },
             {
                "shapes":[
                   {
                      "tags":{
                         "object":"color|multiple colors",
                         "category":"cab",
                         "position":"forward"
                      },
                      "type":"cuboid",
                      "index":2,
                      "key_locations":[
                         {
                            "tags":{
                               "road side":{
                                  "left":"0",
                                  "none":"0",
                                  "right":"1"
                               }
                            },
                            "points":[
                               [
                                  1281,
                                  1057
                               ],
                               [
                                  1281,
                                  648
                               ],
                               [
                                  1693,
                                  1070
                               ],
                               [
                                  1693.0,
                                  647.936117936118
                               ],
                               [
                                  1274,
                                  980
                               ],
                               [
                                  1274.0,
                                  648.3783783783783
                               ],
                               [
                                  1605.867326732673,
                                  988.4950495049504
                               ],
                               [
                                  1605.867326732673,
                                  648.3366336633665
                               ]
                            ],
                            "visibility":1,
                            "frame_number":0,
                            "key_points":[
                               [
                                  1267,
                                  650
                               ],
                               [
                                  1693,
                                  1070
                               ],
                               [
                                  1281,
                                  1057
                               ],
                               [
                                  1274,
                                  980
                               ],
                               [
                                  1281,
                                  648
                               ]
                            ]
                         },
                         {
                            "points":[
                               [
                                  1280,
                                  1059
                               ],
                               [
                                  1280,
                                  650
                               ],
                               [
                                  1716,
                                  1072
                               ],
                               [
                                  1716.0,
                                  650.0
                               ],
                               [
                                  1273,
                                  982
                               ],
                               [
                                  1273.0,
                                  650.0
                               ],
                               [
                                  1624.632117517025,
                                  990.5149451515907
                               ],
                               [
                                  1624.632117517025,
                                  650.0
                               ]
                            ],
                            "visibility":1,
                            "frame_number":5,
                            "key_points":[
                               [
                                  1267,
                                  650
                               ],
                               [
                                  1716,
                                  1072
                               ],
                               [
                                  1280,
                                  1059
                               ],
                               [
                                  1273,
                                  982
                               ],
                               [
                                  1280,
                                  650
                               ]
                            ]
                         },
                         {
                            "points":[
                               [
                                  1280,
                                  1059
                               ],
                               [
                                  1280,
                                  650
                               ],
                               [
                                  1711,
                                  1072
                               ],
                               [
                                  1711.0,
                                  650.0
                               ],
                               [
                                  1273,
                                  982
                               ],
                               [
                                  1273.0,
                                  650.0
                               ],
                               [
                                  1620.597580252196,
                                  990.5149451515908
                               ],
                               [
                                  1620.597580252196,
                                  650.0
                               ]
                            ],
                            "visibility":1,
                            "frame_number":10,
                            "key_points":[
                               [
                                  1267,
                                  650
                               ],
                               [
                                  1711,
                                  1072
                               ],
                               [
                                  1280,
                                  1059
                               ],
                               [
                                  1273,
                                  982
                               ],
                               [
                                  1280,
                                  650
                               ]
                            ]
                         }
                      ]
                   }
                ],
                "group_type":null,
                "frame_count":15
             }
          ],
          "output_date":"2025-05-07",
          "output_comments":"This is a test comment"
       }
    }
 ]
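
The complete file can be walked from the top-level list down to the individual shapes. The sketch below assumes the nesting shown above (a list of records, each with 'output_video' inside 'data'); the sample is a trimmed copy of the example that keeps only the fields the code reads, and `summarize_shapes` is a hypothetical helper.

```python
import json


def summarize_shapes(tasks):
    """Return (index, type, number of key locations) for every annotated shape."""
    summary = []
    for task in tasks:
        for workspace in task["data"].get("output_video", []):
            for shape in workspace.get("shapes", []):
                summary.append(
                    (shape["index"], shape["type"], len(shape["key_locations"]))
                )
    return summary


# Trimmed copy of the complete JSON above.
sample = json.loads("""
[
  {
    "data": {
      "name": "City",
      "output_video": [
        {
          "shapes": [
            {"type": "rectangle", "index": 1,
             "key_locations": [{"frame_number": 0}, {"frame_number": 5},
                               {"frame_number": 10}, {"frame_number": 14}]}
          ],
          "group_type": null,
          "frame_count": 15
        },
        {
          "shapes": [
            {"type": "cuboid", "index": 2,
             "key_locations": [{"frame_number": 0}, {"frame_number": 5},
                               {"frame_number": 10}]}
          ],
          "group_type": null,
          "frame_count": 15
        }
      ]
    }
  }
]
""")

summary = summarize_shapes(sample)
print(summary)  # [(1, 'rectangle', 4), (2, 'cuboid', 3)]
```

To run the same function on a real export, load the file with `json.load` instead of the inline sample.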
