What is metadata?

(in non-technical terms)

And why is RIXML metadata important?

Metadata is usually unseen by the end user, but it is critical for organizing, searching for, and managing content. Metadata means “data about data”; it refers to the structured data that describes content in the machine-readable way that databases, search engines, and AI-powered tools need.

Metadata is a type of structured data, meaning that the different types of information are stored in a fixed, predefined manner, generally in rows and columns such as in a spreadsheet or database. Each type of data (e.g., title, subtitle, publication date, author, ticker, key topics, etc.) is stored in a separate field, allowing the systems that use it to quickly identify each type of data.

You've already met metadata

An example of metadata that you are almost certainly familiar with is the information in a library catalog. The catalog record in a library database doesn’t replace the book, nor does it contain the full text of the book. It simply describes the book with information that will make it easier to find.

Because the catalog stores each piece of information in its own field, the search engine can be more accurate: it can tell whether a search for “Brown” refers to the author’s last name, a word in the title, or part of a company name. It can also be far more efficient, because it knows where to look when you want to do an author search or a title search. It doesn’t need to search all fields, just the relevant one.
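
As a minimal sketch of field-specific searching, the Python example below searches a tiny catalog one field at a time. The records, the field names, and the `search` helper are all hypothetical, invented for illustration; they do not come from any real library system or from RIXML.

```python
# Hypothetical catalog records: each type of information has its own field.
catalog = [
    {"title": "A Field Guide to Owls", "author": "Alice Brown", "publisher": "Northgate Press"},
    {"title": "The Brown Fox", "author": "Carlos Ruiz", "publisher": "Harbor Books"},
    {"title": "Garden Planning", "author": "Mei Tan", "publisher": "Brown & Finch Publishing"},
]


def search(records, field, term):
    """Return the titles of records whose given field contains the term."""
    term = term.lower()
    return [r["title"] for r in records if term in r[field].lower()]


# The same word gives different results depending on which field is searched.
author_hits = search(catalog, "author", "Brown")
title_hits = search(catalog, "title", "Brown")
publisher_hits = search(catalog, "publisher", "Brown")

print(author_hits)     # books written by someone named Brown
print(title_hits)      # books with "Brown" in the title
print(publisher_hits)  # books from a publisher with "Brown" in its name
```

Each search looks at only one field, so an author search never returns a book that merely has “Brown” in its title.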


RIXML: the metadata for investment research

RIXML is the metadata designed specifically to describe investment research and interactions. At a very high level, you can think of RIXML as the custom cataloging for investment research and interaction records. Just as libraries have developed a standardized way to describe the books, CDs, audiobooks and other print and electronic materials in their collections, the member firms of RIXML have worked together to determine how best to describe a wide variety of investment research, as well as the details of inter-firm interactions, to ensure that this content can be found by the investment professionals and systems looking for it.

For over 25 years, the RIXML Research Standard has been used across the industry to power the databases that investment professionals use to search for the research they need, create alerts, set up email subscriptions, and perform other tasks.


What about artificial intelligence?

Artificial intelligence tools have already had, and will continue to have, a significant impact on the ways that investment research is created, described, distributed, consumed, and analyzed. One thing that hasn't changed is the importance of structured data.

The metadata in a RIXML record is designed to meet the needs of both the humans and the systems that will be using it, and includes tagging that identifies the content of the report and makes finding it easier.

In fact, the reason that artificial intelligence-powered content analysis tools need high-quality metadata is similar to the reason that the tools that power traditional searching, filtering, and alerting need it: accuracy and efficiency!

How?

Here's an example:

An end user enters a prompt such as, "summarize the key reasons for the recent upgrades to Acme Company's earnings estimates."

An AI tool can produce an answer whether or not it has RIXML metadata to work with. However, using RIXML metadata will:

  • improve accuracy by ensuring that the right input content is used
  • improve efficiency by speeding up the process for finding the content needed to answer the requester's question
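
One way to picture this is as a filtering step that runs before any AI tool reads a report. The Python sketch below is only an illustration under stated assumptions: the report records, the field names (`company`, `action`, `published`), and the `select_inputs` helper are hypothetical and are not actual RIXML tags or part of any real product.

```python
from datetime import date

# Hypothetical report records; the field names are illustrative, not RIXML tags.
reports = [
    {"id": 1, "company": "Acme Company", "action": "estimate upgrade",
     "published": date(2024, 5, 2), "text": "Placeholder text of report 1."},
    {"id": 2, "company": "Acme Company", "action": "initiation",
     "published": date(2021, 3, 9), "text": "Placeholder text of report 2."},
    {"id": 3, "company": "Other Company", "action": "estimate upgrade",
     "published": date(2024, 5, 3), "text": "Placeholder text of report 3."},
    {"id": 4, "company": "Acme Company", "action": "estimate upgrade",
     "published": date(2024, 4, 18), "text": "Placeholder text of report 4."},
]


def select_inputs(records, company, action, since):
    """Keep only the reports whose metadata matches the request."""
    return [
        r for r in records
        if r["company"] == company
        and r["action"] == action
        and r["published"] >= since
    ]


# Only the matching reports would be handed to the summarizing tool.
inputs = select_inputs(reports, "Acme Company", "estimate upgrade", date(2024, 4, 1))
selected_ids = [r["id"] for r in inputs]
print(selected_ids)
```

Because the filter checks three short fields instead of reading every report in full, the tool starts from a smaller and more relevant set of inputs.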


What metadata is in the RIXML standards?

The metadata in a RIXML record is designed to meet the needs of both the humans and the systems that will be using it, and includes tagging that identifies the content of the report, the copyright and other administrative information, and workflow tagging that facilitates content management and recordkeeping.

Descriptive tags
  • Research Standard: authorship data; tickers and other identifiers; sector and industry; country and region; key topics
  • Interactions Standard: interaction host, speakers, and participants; key topics

Administrative tags
  • Research Standard: creation date, publication date, and revision date; file type; entitlement data
  • Interactions Standard: event date; registration deadline

Structural tags
  • Research Standard: chapters; chart and graph lists
  • Interactions Standard: conference breakout groups; agendas and related materials


Common terminology brings similar content together

One of the key purposes of metadata is to improve findability. To facilitate this, it is helpful for some metadata tags to be constrained by a predefined list of options. Using a common vocabulary ensures that similar content is tagged in a consistent manner even if different firms use different terminology, regional spelling differences exist, etc. The technical term for this is an ontology, but in the RIXML standards, we call these enumeration lists.

Whenever possible, such as for identifying countries, currencies, tickers and other company identifiers, and date/time information, we use ISO standards or other similar external sources. But since the RIXML standards are designed to meet the specific needs of describing investment research content and interaction data, many of our defined lists are ones developed by our member firms. Working together, we have developed a rich set of enumeration lists to describe coverage action, intended audience, publishing action, and other terms relevant to describing investment research and interactions.
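
The idea of an enumeration list can be sketched in a few lines of Python. Everything named here is an assumption made for illustration: the `CoverageAction` values, the firm-specific labels in `SYNONYMS`, and the `normalize` helper are hypothetical and are not the actual values or tooling defined in the RIXML standards.

```python
from enum import Enum


class CoverageAction(Enum):
    """A hypothetical enumeration list; not the real RIXML values."""
    INITIATE = "Initiate"
    UPGRADE = "Upgrade"
    DOWNGRADE = "Downgrade"


# Hypothetical firm-specific labels mapped onto the shared vocabulary.
SYNONYMS = {
    "initiate": CoverageAction.INITIATE,
    "initiation of coverage": CoverageAction.INITIATE,
    "upgrade": CoverageAction.UPGRADE,
    "raised rating": CoverageAction.UPGRADE,
    "downgrade": CoverageAction.DOWNGRADE,
    "lowered rating": CoverageAction.DOWNGRADE,
}


def normalize(label):
    """Map a firm's own wording to one value from the enumeration list."""
    key = label.strip().lower()
    if key not in SYNONYMS:
        raise ValueError(f"No enumeration value for label: {label!r}")
    return SYNONYMS[key]


# Two firms describe the same event differently, but the tag is identical.
firm_a = normalize("Upgrade")
firm_b = normalize("  Raised Rating ")
print(firm_a, firm_b, firm_a is firm_b)
```

Rejecting unknown labels, rather than storing them as written, is what keeps the list constrained: a search for one enumeration value then finds every firm's matching content.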


DEFINITIONS

UNSTRUCTURED DATA means information that is not stored in a structured way – like the text in a book or an investment research report.
STRUCTURED DATA means data that is stored in a structured way – like in spreadsheets or databases with a separate column for each type of data.