Data Engineering Fundamentals, Part 1: What Is Data? What Is Metadata?
What Is Data?
Data is a set of facts, such as descriptions, observations, and numbers, used in decision-making.
We can classify data as structured, semi-structured, or unstructured.
Structured data
In structured data, sometimes called relational data, every record has the same fields or properties. All the data shares the same organization and shape, known as its schema. The shared schema makes this type of data easy to search with query languages like Structured Query Language (SQL). That makes structured data a good fit for applications like CRM systems, reservations, and inventory management.
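To make the idea of a shared schema concrete, here is a minimal sketch using Python's built-in sqlite3 module. The inventory table, its columns, and the sample rows are made up for illustration; they are not from any real system.

```python
import sqlite3


def build_inventory():
    """Create an in-memory table where every row follows the same schema."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: every row has exactly these three fields.
    conn.execute(
        "CREATE TABLE inventory ("
        "sku TEXT PRIMARY KEY, name TEXT NOT NULL, quantity INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO inventory VALUES (?, ?, ?)",
        [("A-100", "widget", 12), ("B-200", "gadget", 0), ("C-300", "gizmo", 5)],
    )
    return conn


def in_stock(conn):
    """Return the names of items with quantity above zero, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM inventory WHERE quantity > 0 ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


conn = build_inventory()
print(in_stock(conn))
```

Because every row has the same columns, a single SQL query can filter and sort the whole table without inspecting each record's shape first.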
Semi-structured data
Semi-structured data is less organized than structured data. It isn't stored in a relational format because its fields don't fit neatly into tables, rows, and columns. Instead, it carries tags that make the organization and hierarchy of the data apparent. Key/value pairs are one example. Semi-structured data is also referred to as non-relational or NoSQL ("not only SQL") data.
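As an illustration of key/value pairs that describe themselves, the sketch below parses a small JSON document with Python's built-in json module. JSON is only one common format for semi-structured data, and the records and field names here are invented for the example.

```python
import json

# Hypothetical records: each one carries its own keys, and the keys differ.
RAW = """
[
  {"id": 1, "name": "Ana", "email": "ana@example.com"},
  {"id": 2, "name": "Ben", "phones": {"home": "555-0100", "work": "555-0101"}},
  {"id": 3, "name": "Chen", "tags": ["vip"]}
]
"""


def field_names(records):
    """Return the sorted list of keys present in each record."""
    return [sorted(record.keys()) for record in records]


records = json.loads(RAW)
for names in field_names(records):
    print(names)
```

The keys act as the tags: they show how each record is organized, including nesting such as the phones object, even though no two records share exactly the same set of fields.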
Unstructured data
The organization of unstructured data is undefined. Unstructured data is often delivered as files, such as photos or videos. A video file might have an overall structure and come with semi-structured metadata, but the data that forms the video itself is unstructured. For that reason, photos, videos, and similar files are classified as unstructured data.
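The split between a file's metadata and its content can be sketched as follows. The MediaFile class, its field names, and the sample values are hypothetical and exist only to show the distinction; real media formats store metadata in their own container layouts.

```python
from dataclasses import dataclass


@dataclass
class MediaFile:
    metadata: dict  # semi-structured: tagged key/value pairs about the file
    payload: bytes  # unstructured: opaque bytes with no fields to query


def describe(media):
    """Summarize a file using only its metadata and the payload size."""
    title = media.metadata.get("title", "untitled")
    return f"{title} ({len(media.payload)} bytes)"


# Placeholder payload: 16 zero bytes standing in for real video data.
clip = MediaFile(
    metadata={"title": "demo", "codec": "h264", "duration_s": 12.5},
    payload=bytes(16),
)
print(describe(clip))
```

The metadata can be looked up by key, while the payload can only be measured or passed to a decoder; nothing in the bytes themselves names a field.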
The Difference Between Structured, Unstructured, And Semi-Structured Data