This technical brief on “Using KLV Metadata for Telemetry” was written by Paul Hightower from Instrumentation Technology Systems (www.itsamerica.com). For more information, contact us.
When SMPTE created the digital video frames (SMPTE 274 and SMPTE 296), they designed in space for ancillary (ANC) data. The most commonly used ancillary data are the sound bites. Sound bites are packets of digital audio data that are embedded in each frame. When video is played, the sound packets are decoded and transformed back to the original recorded audio. Since the sound packets are embedded in the frames relevant to the audio, the sound is synchronised with the video. So, when someone is speaking, the words you hear match those formed by the mouth of the speaker in the video, and when a door slams, the sound is heard at the same moment the image shows the door hitting its stop.
Taking this a bit further, packets of data embedded with image frames can be synchronised the same way.
In ITS HD video text-time-KLV inserter products, we have designed the systems to sample data (time, and other measurements) at the start of each frame. This method adds the ability of the video system to act as a data acquisition system where the start of each frame is the strobe (the signal that says “latch the data now”). In this video-data acquisition system the sample rate is the frame rate. We insert the data in the vertical ancillary data space known as the vertical ANC (VANC). Audio packets are always inserted in the horizontal ancillary data space known as the HANC.
The general SMPTE HD/UHD video frame structure
To help give you a visual, the general frame structure of an HD frame is shown above. The cardinal points are the SAV (start of active video) and EAV (end of active video). Between these two digital markers is the imagery of the video frame on the horizontal pixel line. The VANC space is several lines above the active video area; the number of lines varies with the image format (720, 1080 and 2160 lines). The corresponding image pixel count (1280, 1920 and 3840 for 720, 1080, and 4K respectively) varies as well.
The variation defines the HANC space on each line. The principal reason for the variations is to normalise the baseband serialisation frequencies to 1.5 Gb/s for single-channel 720p/60, 1080p/30 and 1080i/60 video, 3 Gb/s for 1080p/60 and 12 Gb/s for 2160p/60. Every line of video starts with SAV, followed by video data (or VANC data above the start of active video), followed by the EAV, then a line count and CRC for the line, and finally the HANC data. The HANC data is used by the next frame. EAV is used as the equivalent of horizontal sync. The space between EAV and SAV is always the HANC space when the lines are outside of the active image area.
What is in the HANC and VANC space?
The HANC and VANC spaces are generally black pixels unless they contain information (metadata). Metadata samples are packets of information. A metadata packet is like an Ethernet packet: data is enclosed in one or more envelopes that identify the data. The data can be one piece of information or several fields of information.
Part of the envelope (ID) is an identifier used by the recipient (display, frame grabber or other video processor) to determine first whether it is data of importance to it. Once this envelope is removed, there may be another envelope that more finely identifies the data packet within. Sound packets, found in the HANC, use AES3 (Audio Engineering Society) sound encoding. This format is identified by the packet type and key number so that an audio decoder can locate the packet, process the data and recreate the sound stream you hear.
A more general-purpose packet was defined by SMPTE called the Key Length Value (KLV) packet. There is an identifier that specifies that the packet is a KLV packet and a supplemental specifier (SSID) that when present identifies the KLV packet is formatted for the VANC space.
What is KLV Metadata?
KLV metadata is a variable length packet of data that can contain any information desired. The figure above depicts a SMPTE type 2 KLV metadata packet. Type 02 is a packet specified to reside in the VANC space of a frame.
The Data Count (DC) in the outer envelope specifies how many samples (10-bit words) follow in the user data word. Within this structure, data is formatted as 8-bits of data and 2-bits as odd and even parity. Since the Data Count is only one byte long, the user data word may only be 1-255 bytes in length.
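The 10-bit word layout can be illustrated with a minimal Python sketch. It assumes the common ancillary data convention in which bit 8 carries even parity over the eight data bits and bit 9 is the inverse of bit 8; the function names are hypothetical and not part of any ITS tool.

```python
def to_anc_word(byte: int) -> int:
    """Wrap one 8-bit value in a 10-bit ancillary data word."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected an 8-bit value")
    # Assumption: bit 8 is even parity over bits 0-7, so it is set
    # when the number of ones in the data byte is odd.
    b8 = bin(byte).count("1") & 1
    # Assumption: bit 9 is simply the inverse of bit 8.
    b9 = b8 ^ 1
    return (b9 << 9) | (b8 << 8) | byte


def from_anc_word(word: int) -> int:
    """Verify both parity bits and return the 8-bit payload."""
    byte = word & 0xFF
    if to_anc_word(byte) != word:
        raise ValueError("parity error in ancillary data word")
    return byte
```

A receiver that applies `from_anc_word` to every user data word can reject a packet as soon as a single word fails the parity check.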
The content of this particular metadata packet is the KLV packet (inner envelope). The message ID (1-byte) and PSC (2-bytes) are often used by video CODECs to link and manage these data packets as part of separate MPEG streams embodied in the MPEG Transport Stream.
Just like Ethernet, the KLV metadata packet identifier is the first layer of information. It is used by a recipient to determine whether to extract the message within. In this case, the recipient will extract the packet if it generally processes KLV packets. The second layer identifier is the key (K). Keys are 16-byte numbers that are registered and are found in a published SMPTE dictionary. A range of key numbers has also been assigned to the Motion Imagery Standards Board (MISB). This subset is managed by the Government. The value (V) is the payload of the key and can be any glob of information from 1-235 bytes long.
When you look up a key number you will find a specification of how the payload of data is formatted and interpreted. It may identify the type of data (lens, pointing angle, aperture, text) contained in the value and provide a structure having fields of data where each field may be a different data type. Since the value of a KLV packet may be of variable length, the length of the value is specified in the key envelope, L. The value of L may only be from 1-235.
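The key, length and value layering can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the ITS implementation: it assumes L is a single byte, which is one reading of the 235-byte limit (a 255-byte user data budget less a 1-byte message ID, a 2-byte PSC, a 16-byte key and a 1-byte length leaves 235). General-purpose KLV uses BER-encoded lengths, which this sketch does not attempt. The function names are hypothetical.

```python
KEY_LEN = 16          # keys are 16-byte numbers
MAX_VALUE_LEN = 235   # value limit quoted in the text


def pack_klv(key: bytes, value: bytes) -> bytes:
    """Concatenate key, length and value into one KLV triplet."""
    if len(key) != KEY_LEN:
        raise ValueError("key must be 16 bytes")
    if not 1 <= len(value) <= MAX_VALUE_LEN:
        raise ValueError("value must be 1-235 bytes")
    # Assumption: L is one byte holding the number of value bytes.
    return key + bytes([len(value)]) + value


def unpack_klv(packet: bytes) -> tuple:
    """Split a KLV triplet back into (key, value)."""
    key = packet[:KEY_LEN]
    length = packet[KEY_LEN]
    value = packet[KEY_LEN + 1:KEY_LEN + 1 + length]
    if len(value) != length:
        raise ValueError("truncated value")
    return key, value
```

A recipient would compare the returned key against the keys it understands before attempting to interpret the value.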
How can KLV metadata packets be used for Telemetry?
One of the struggles in telemetry is matching imagery with data, such as when monitoring events and performance or capturing anomalies. We use video to observe the behaviour of things in motion. To understand that behaviour fully, we also measure the factors that influence it (wind, heat, moisture, impacts, stress, etc.). Data and video are instrumental in formulating and validating models of behaviour, as these models can be used to predict performance in other situations.
Since 2011, ITS has offered HD video equipment that accepts video, overlays a timestamp on each frame as it is received, and embeds KLV metadata packets. In fact, the MISB ST 0605.3 Microsecond Timestamp is a registered KLV packet. When using this key, it is specified that the value be 9 bytes long. The first byte provides key status information about the time measurement (whether locked to a reference, whether valid or invalid in some defined manner, etc.). The remaining 8 bytes are a 64-bit count of the number of microseconds that have transpired since January 1, 1970, the UNIX epoch. A 64-bit count of microseconds spans roughly 584,000 years from 1970, so this counter will be valid for quite a while.
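The 9-byte value is simple enough to sketch in Python. The sketch below assumes big-endian byte order and treats the status byte as an opaque number; the individual status bits are defined in the MISB standard and are not reproduced here. Function names are hypothetical.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def pack_timestamp(status: int, when: datetime) -> bytes:
    """Build the 9-byte value: 1 status byte + 64-bit microsecond count."""
    # Integer division of timedeltas avoids floating point rounding.
    micros = (when - EPOCH) // timedelta(microseconds=1)
    # Assumption: big-endian layout, no padding (">" disables alignment).
    return struct.pack(">BQ", status, micros)


def unpack_timestamp(value: bytes) -> tuple:
    """Return (status, UTC datetime) from a 9-byte value."""
    status, micros = struct.unpack(">BQ", value)
    return status, EPOCH + timedelta(microseconds=micros)


def rollover_years() -> float:
    """Years of microseconds an unsigned 64-bit counter can hold."""
    return 2 ** 64 / 1_000_000 / (365.25 * 86_400)
```

Evaluating `rollover_years()` gives a little over 584,000 years, which is the span an unsigned 64-bit microsecond counter covers before it wraps.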
Many KLV packets can be embedded on nearly any VANC line. The first packet must always begin immediately after the SAV marker in the frame. A unique feature of the Microsecond Timestamp is that, when used, it is also specified to be the first packet on line 9 of the frame. This fixes the location of the packet for all applications: the first packet of scan line 9.
Every ITS HD-SDI product will embed a KLV Microsecond Timestamp on every frame. The time embedded is that sampled from our internal time master at the instant of the first line of the frame at the SAV. Therefore, time is always sampled at the start of frame regardless of the frame format (720, 1080, 2160). All ITS text-time HD inserters may be externally synchronised using an IRIG B12x time code source. Most HD units also have a 12-channel embedded GPS receiver which can synchronise our time master to within 30 nanoseconds of UTC. Since our time is sampled at a single point in each video frame and the time source can be synchronised with high accuracy, the time at which a video frame arrived at our equipment (which is ideally within 100 meters of the video source) can be known with great accuracy.
In this case, our time master is like any data acquisition system. It is measuring time continuously and asynchronously to the start of video frames. However, if we sample the time with the video frame as described, we capture a very important analysis parameter: when the picture was taken. So, while our time master is counting time in microseconds, we are subsampling that count at the video frame rate in synchronism with the video capture itself. ITS is considered by our customers to be the gold standard when it comes to sampling time and stamping video frames. It is precise and accurate. Our system captures the time to 1 µsec resolution.
ITS Text-Time-KLV equipment extends this use by sampling globs of data at the same instant that time is sampled for insertion of the Microsecond Timestamp. These globs of data are then inserted as a second (or third) KLV packet as shown on the right. A data acquisition system may be offering data globs faster than the frame rate (just like our time master); the last glob received at the start of every frame will be inserted in the VANC in real time. In other words, the ITS Text-Time-KLV inserters subsample the data glob stream.
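The subsampling behaviour can be modelled offline with a short Python sketch. It only illustrates the "last glob received before the start of frame" rule on recorded timestamps, under the assumption that sample times and frame start times share one clock expressed in integer microseconds. It is not how the inserter hardware is implemented, and the function name is hypothetical.

```python
from bisect import bisect_right


def latch_per_frame(samples, frame_starts):
    """Pick, for each frame, the last sample that arrived by its start.

    samples      -- list of (time_us, payload) tuples sorted by time
    frame_starts -- list of frame start times in microseconds
    Returns one payload per frame, or None when nothing has arrived yet.
    """
    times = [t for t, _ in samples]
    latched = []
    for start in frame_starts:
        # Count of samples whose arrival time is <= the frame start.
        i = bisect_right(times, start)
        latched.append(samples[i - 1][1] if i else None)
    return latched
```

With samples arriving at 0, 5,000, 12,000 and 30,000 µs and frames starting every 16,667 µs, the second frame carries the 12,000 µs sample and the 5,000 µs sample is never embedded, which is exactly the subsampling described above.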
The ITS camera sync (CS) function can fine-tune the timing by sending a frame-synchronised strobe to a data acquisition system or sensor. The strobe is essentially a “latch data now” signal that can cause the measurement device to send a snapshot at that instant. Properly timed, this data can then be inserted in the frame, providing not only a timestamp but a set of influencing data perfectly aligned with the picture itself. No post processing is required to align the data with video. Alignment is already accomplished using this method and, if the video is recorded correctly and archived, it is permanent.
This is not entirely new with digital video. But unlike its ancestor, analog video, the ability to capture data (metadata) with video was a designed-in feature of the digital frame. The precision of the data capture can be well controlled, giving the telemetry community a new, efficient, error-free tool for video/data analysis of the behaviour of things in motion.
So how do we build a metadata packet for insertion?
While there may be other ways, let us focus on a solution that we know well, the ITS KLV tool suite. The suite comes in two central parts:
- a tool to create and define a data glob definition (or use an existing key)
- a software KLV tool that enables one to collect data from one or more sources and populate your data glob automatically. It further will encapsulate that glob in the commands needed by our HD-SDI equipment to cause them to be sampled and inserted into each video frame received.
Published, registered metadata keys can be found in the latest issue of SMPTE Recommended Practice (RP) 224.
For those familiar with the MISB, ST 0807.16 also provides a list of registered keys.
The Microsecond Timestamp is one of those keys, 06.0E.2B.34.02.05.01.01.0E.01.01.03.11.00.00.00. The details of the 9-byte format are found in ST 0605. Essentially there is a status byte followed by an 8-byte word that represents the number of microseconds that have transpired since January 1, 1970. All ITS HD video equipment is capable of both embedding the Microsecond Timestamp packet and extracting it for overlay.
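Keys written in this dotted hexadecimal notation are easy to convert to and from the 16 raw bytes that travel in the packet. The following Python sketch shows one way to do it. The helper names are hypothetical, and the prefix check merely reflects the four leading bytes shared by the keys quoted in this brief.

```python
SHARED_PREFIX = bytes.fromhex("060E2B34")  # leading bytes of the keys quoted here


def parse_key(dotted: str) -> bytes:
    """Convert 'XX.XX. ... .XX' notation into 16 raw bytes."""
    key = bytes.fromhex(dotted.replace(".", ""))
    if len(key) != 16:
        raise ValueError("a key must be exactly 16 bytes")
    return key


def format_key(key: bytes) -> str:
    """Render 16 raw bytes back into dotted hexadecimal notation."""
    return ".".join(f"{b:02X}" for b in key)


def has_shared_prefix(key: bytes) -> bool:
    """True when the key starts with 06.0E.2B.34."""
    return key.startswith(SHARED_PREFIX)


TIMESTAMP_KEY = parse_key("06.0E.2B.34.02.05.01.01.0E.01.01.03.11.00.00.00")
```

Comparing the first 16 bytes of a received packet with `TIMESTAMP_KEY` is then enough to recognise a Microsecond Timestamp.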
A general purpose MISB key is the UAS Datalink Local Metadata Set that is fully defined in MISB ST 0601.6. In this packet a universal key, 06.0E.2B.34.02.0B.01.01.0E.01.03.01.01.00.00.00, identifies the packet. The value (payload) of the key can contain up to 94 variable tags that each define specific variables that can be embedded in the packet. The tag identifies the variable, the binary format and what the binary value represents. For example, tag 56 is defined as Platform Ground Speed. It is a 1-byte unsigned integer with units of metres/second, where the least significant bit represents 1 metre/sec.
ITS Text-Time-KLV HD video equipment will accept KLV data that uses this key number, carries the correct length (which for this key is variable) and has its data organised as tag-value pairs. The ITS HD equipment will accept such packets of data and, on each frame of video received, insert a SMPTE 291M KLV type 02 packet containing these KLV values after the Microsecond Timestamp.
There is a bit of work to do if it is desired for the ITS equipment to extract, parse and overlay the values of the UAS Local Metadata set.
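To give a feel for what parsing a tag-organised payload involves, here is a minimal Python sketch. It assumes each item is laid out as a 1-byte tag, a 1-byte length and then the value bytes, and it refuses anything that would need a multi-byte tag or length. Only tag 56 (Platform Ground Speed) is interpreted because it is the only tag described in this brief; the function names are hypothetical.

```python
def parse_local_set(value: bytes) -> dict:
    """Split a payload into {tag: raw value bytes}."""
    items = {}
    pos = 0
    while pos < len(value):
        if pos + 2 > len(value):
            raise ValueError("truncated tag/length")
        tag, length = value[pos], value[pos + 1]
        # Assumption: tags and lengths fit in one byte (below 128).
        if tag >= 0x80 or length >= 0x80:
            raise ValueError("multi-byte tag or length not handled here")
        start, end = pos + 2, pos + 2 + length
        if end > len(value):
            raise ValueError("truncated item")
        items[tag] = value[start:end]
        pos = end
    return items


def platform_ground_speed(items: dict):
    """Tag 56: unsigned integer, 1 metre/sec per count; None if absent."""
    raw = items.get(56)
    return None if raw is None else int.from_bytes(raw, "big")
```

An overlay function would call `parse_local_set` once per frame and then format only the tags it has been configured to display.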
Creating your own Metadata Key
There is nothing preventing one from creating a key number (you must follow the rules and avoid keys that start with any registered book number). From there one can structure the payload to represent any value or number of values that fit within the 235 bytes available.
ITS created a tool to facilitate the development of a key payload structure, which we call the KLV KeyTemplate©. The KeyTemplate© is an MS Excel workbook designed to help you build a key number (and check that it is correctly encoded) and form the structure within the value portion of the packet. The KeyTemplate© will calculate the length (L) for you automatically. An example is shown below.
One can assign a field label (e.g. Latitude), assign a position in the 235-byte value space (e.g. 81, which is byte 81 into the 235-byte maximum value) and assign a data type (e.g. SI-MAX, which means that the value is a signed integer scaled such that the maximum value is the Full Scale Value, e.g. 360). The resolution is set by the number of bytes of this field, Input LEN (e.g. 3, meaning that each bit when scaled represents 0.000043 degrees).
For convenience, the KeyTemplate© calculates the resolution as you have set it, to help you be sure your settings for this field and its decoded value are correct. You can define up to 64 fields (provided that these fields add up to 235 bytes or less). Value representations can be signed and unsigned integers of 1-3 bytes, an ASCII string, and single precision and double precision IEEE 754 floating point numbers.
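The resolution figure quoted for a 3-byte field with a full scale of 360 can be reproduced with a short Python sketch. It assumes that SI-MAX maps the full scale value onto the largest positive count of a signed integer of the chosen width, and that the bytes are stored big-endian; both are assumptions about the encoding rather than a statement of how the KeyTemplate© works internally. Function names are hypothetical.

```python
def si_max_resolution(full_scale: float, n_bytes: int) -> float:
    """Engineering units represented by one count."""
    return full_scale / (2 ** (8 * n_bytes - 1) - 1)


def si_max_encode(x: float, full_scale: float, n_bytes: int) -> bytes:
    """Scale x so that full_scale maps to the largest positive count."""
    max_count = 2 ** (8 * n_bytes - 1) - 1
    count = round(x / full_scale * max_count)
    if abs(count) > max_count:
        raise ValueError("value exceeds full scale")
    # Assumption: big-endian, two's complement storage.
    return count.to_bytes(n_bytes, "big", signed=True)


def si_max_decode(raw: bytes, full_scale: float) -> float:
    """Recover the engineering value from the stored bytes."""
    max_count = 2 ** (8 * len(raw) - 1) - 1
    return int.from_bytes(raw, "big", signed=True) * full_scale / max_count
```

For a 3-byte field, `si_max_resolution(360, 3)` is 360 / 8,388,607, or about 0.000043 degrees per count, matching the figure above.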
The final designed key is then saved as a CSV file. This file is used by all of our KLV equipment and other software tools, as it enables encoding, collection, insertion, extraction and overlay, as well as the formation of output files (CSV) that list the records of data collected in each frame along with the timestamp of each frame.
ITS Text-Time-KLV inserters and the I-Observe recorder will accept the CSV using the supporting GUIs supplied with the equipment to embed the packets. It can also instruct our units to extract and output a record (a frame record of data plus time) to a file while video passes through or is played back (I-Observe).
ITS created a utility called KeyTest© which will accept an import of the KeyTemplate© CSV file. Once imported, you can set KeyTest© to send test data organised as you specified using the KeyTemplate©. It will send packets of data enclosed in the minimal ITS commands necessary to cause any of the ITS Text-Time-KLV inserters or the I-Observe recorders to sample the latest data packet received just before the start of the next frame as received by the equipment. This is exactly the same instant that the internal time master is sampled to stamp the video frame.
KeyTest©, then, is a key development and debugging tool to help you be sure that the key is correctly built and, when populated with live data, can work as desired. KeyTest© is included at no additional cost with all HD Text-Time-KLV inserters and the I-Observe recorders.
The file of extracted records can be viewed using the ITS KeyRead© software. This software also accepts the key CSV file and will build a form that carries the label you assigned to each field, with the data collected parsed and formatted as you specified, to render a human readable data set. Each record can be individually viewed or automatically sequenced from start to finish.
KeyRead© also enables you to output this data set to a CSV file, which may be imported into MS Excel or another suitable program for further analysis. Combined with KeyTest©, these two programs can be used to help develop and test a key before connecting any system to the live data source(s). KeyRead© is included at no additional cost with all HD Text-Time-KLV inserters and the I-Observe recorders.
KLV Key Specification
For any registered key, the corresponding key dictionary provides all the information necessary to parse, scale and format the data as structured for that key number. Similarly, the exported KeyTemplate© provides all the information necessary to parse, scale and format data collected as instructed by you in the KeyTemplate©.
Either structuring source provides a means to populate a KLV packet with the data and, downstream (or later during playback), to extract and use the data with the corresponding video.
The ITS HD Text-Time-KLV inserters and I-Observe recorders can extract KLV packets of data that are already embedded in the VANC space and save the data to a file (user named). This file can be read, parsed and formatted by KeyRead©. This enables the user to examine the data collected in the video and to output the data set to a general comma separated value (CSV) file.
DownloadVideo© is ITS software that runs on a Windows (XP, 7 or 10) platform. It is intended to be used with any of our I-Observe (model 6520) recorders. When a video clip is captured by the recorder, it may later be downloaded over Ethernet to a file. The recorder itself can do this with functions in its GUI or the embedded webserver.
However, DownloadVideo© can also command the recorder to download a clip or part of a clip (starting frame and number of frames) to a PC. These files are very large binary files as they contain uncompressed HD video. Typically, each 720 frame consumes 3.1 MB of storage and a 1080 frame consumes 6.2 MB of storage. As can be seen, clips of only a few seconds can be quite large. In the limit, ITS offers a 4TB SSD module that can capture just under 3 hours of 1080p/60 uncompressed video; a ton of data.
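The storage figures can be checked with a little arithmetic, shown here as a Python sketch. It assumes that each stored frame covers the full raster including blanking (1650 x 750 samples for 720p and 2200 x 1125 for 1080p), that video is 10-bit 4:2:2 (20 bits per pixel), and that 4 TB means 4 x 10^12 bytes. These assumptions are not stated in the brief but reproduce its numbers closely.

```python
# Assumed total raster (samples per line, lines per frame), blanking included.
RASTER = {"720p": (1650, 750), "1080p": (2200, 1125)}
BITS_PER_PIXEL = 20  # assumed 10-bit luma + 10-bit shared chroma (4:2:2)


def frame_bytes(fmt: str) -> int:
    """Bytes needed to store one uncompressed frame."""
    width, height = RASTER[fmt]
    return width * height * BITS_PER_PIXEL // 8


def record_hours(capacity_bytes: float, fmt: str, fps: int) -> float:
    """Hours of video that fit in the given capacity."""
    return capacity_bytes / (frame_bytes(fmt) * fps) / 3600
```

Under these assumptions a 720p frame is 3,093,750 bytes and a 1080p frame is 6,187,500 bytes, and `record_hours(4e12, "1080p", 60)` comes to just under 3 hours.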
Once a clip or clip segment is downloaded, DownloadVideo© can parse this video into a raw (4:2:2 subsampled as the camera provided) video data set. DownloadVideo© can also transcode this data set into an AVI file (remains uncompressed) or an MP4 file (a range of bit rate targets can be set). During transcode, any audio data present is formatted into a WAV file and synchronised to the video. Concurrently, if instructed to do so, DownloadVideo© can extract the KLV metadata captured with the video to a KeyRead© compatible file (the KeyTemplate© structure needs to have been imported into DownloadVideo©).
When playing a transcoded clip, the captured video will be displayed. In addition, a window will open that displays the key structure and all the data collected, synchronised with the frame currently shown.
Playback can be stepped forward or backward a frame at a time. This will enable examination of the picture and its corresponding data.
Once complete you can select a field of the key and search until a criterion is met for that field. DownloadVideo© will search through the data and when found display the data set. It will also display the image frame in which this data was collected.
Learn more about KLV Metadata Viewers and KLV Metadata Decoders from this recently published article.
Populating your Key with live data
So now you have designed a structure to collect data and embed it in an HD video stream. The problem is, how do you populate the structure with real data in real time? The task is to collect data from each sensor and/or measurement system, parse the data, place it in the key structure, and wrap that data in the commands needed to cause the equipment to embed the packets into the video stream.
A data acquisition system (DAS) may be programmable enough to furnish measurement data in a manner you can use directly. The key can be designed around the way and order that the DAS collects the data. If you need to collect the data from other sensors, as well as a DAS, this data must be parsed and funnelled to your key.
ITS has two solutions that facilitate collecting data in real time, organised the way necessary to match your key. The model 2110 system controller can be used as a data funnel. It has 8 serial ports, an Ethernet port and up to 8 discrete signals that can be used to detect events. The 2110 comes with a real time operating system and the computing resources that allow it to receive records, parse them and build a data packet to match your key. While custom firmware needs to be developed, the 2110 with the baseline firmware and custom firmware will form an automated way to receive data from your existing sensor and DAS systems and build key compatible packets.
Our second solution is an available GUI called the DataConcentrator©. It can collect data from multiple sources connected to a PC. The sources may be connected to network (Ethernet), serial or USB ports or a custom interface, provided the port of entry into the Windows platform is identifiable and permits intercept or parallel data capture.
The DataConcentrator© has functions that enable you to identify the source behind each data port, select the bytes to parse (if necessary) from each record received from the source, and map those bytes to the appropriate field in your key. The DataConcentrator© also uses the CSV version of the completed KeyTemplate© file as the means to structure your data. A function in the DataConcentrator© lets you drag and drop the preparsed data and map it into the appropriate fields.
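The mapping step can be pictured with a small Python sketch: bytes are sliced out of the latest record from each source and copied to fixed positions in the value buffer. This is an illustration of the idea only, not the DataConcentrator© implementation. The dictionary keys (`source`, `src_offset`, `length`, `position`) are hypothetical, and positions are treated as zero-based offsets, which may differ from the numbering used in the KeyTemplate©.

```python
def build_value(fields, records, size):
    """Assemble a key value from the latest record of each source.

    fields  -- list of dicts with hypothetical keys:
               source, src_offset, length, position
    records -- {source name: most recent record bytes from that source}
    size    -- total length of the value in bytes
    """
    value = bytearray(size)  # unmapped bytes stay zero
    for f in fields:
        record = records[f["source"]]
        chunk = record[f["src_offset"]:f["src_offset"] + f["length"]]
        if len(chunk) != f["length"]:
            raise ValueError("record from %s is too short" % f["source"])
        end = f["position"] + f["length"]
        if end > size:
            raise ValueError("field does not fit in the value")
        value[f["position"]:end] = chunk
    return bytes(value)
```

Rebuilding the value each time a record arrives keeps the most recent complete packet ready for the inserter to sample at the next start of frame.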
Once the mapping is complete, the DataConcentrator© will use the specifications embedded in the KeyTemplate to format the incoming data. When in operation, it will display the received data in formatted human readable form for monitoring and troubleshooting.
The DataConcentrator© can be set to build packets and send them to an ITS Text-Time-KLV inserter or I-Observe recorder for insertion into the video stream. The rate (packets/second) can be set from 1 to 200. At any rate faster than the incoming frame rate, the ITS inserter will embed the last complete packet received for insertion.
This process can be fine-tuned with our Camera Sync to more closely align the data with the frame.
Conclusion on KLV Metadata
The SMPTE 274, 296 and 2036-1 (4K) frames have been defined with data spaces in them, which may be used to collect telemetry data. This data is surrounded by SMPTE defined envelopes, one of which is a Key-Length-Value (KLV) packet. Each VANC KLV packet can hold from 1 to 235 bytes of data structured in any way desired. When used in this manner, the data collected is permanently aligned with the frames without any postproduction work to synchronise data and images. Data can be received at any data rate. However, in order to assure the best data-image alignment, the start of a video frame should be used to subsample the incoming KLV packets.
ITS offers a full suite of software and equipment designed to employ the KLV packet to sample, transport and permanently align images with data.
- Collect with the 2110 System Controller and/or the DataConcentrator© software
- Embed packets in video on each video frame with any ITS HD Text-Time-KLV inserter or the I-Observe recorder
- Parse and Overlay KLV data present in a video stream using any ITS HD Text-Time-KLV inserter or the I-Observe recorder
- Extract KLV metadata to a file using any ITS HD Text-Time-KLV inserter or the I-Observe recorder
- Parse and display KLV data file using KeyRead© or DownloadVideo© (when using I-Observe).
- Convert the ITS KLV data file to a CSV file using KeyRead© or DownloadVideo© for use with other commercial software.
- Test key performance using KeyTest© with KeyRead© and any of the ITS tools.
EON Instrumentation have recently purchased components of ITS Systems.