
Unstructured, Structured, and Semistructured

Structured

Structured data is information with a predefined organizational system, such as a table with fixed columns.

| Name | Date of Birth |
| --- | --- |
| Johnathon | July 20, 1990 |
| Synthia | August 17, 2000 |
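
As a minimal sketch, the table above could be represented in Python with a fixed schema. The `Person` class and its field names are assumptions made for illustration; they are not part of the original notes.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Person:
    # The schema is fixed in advance: every record has exactly these fields.
    name: str
    date_of_birth: date


# The two rows from the table above.
people = [
    Person("Johnathon", date(1990, 7, 20)),
    Person("Synthia", date(2000, 8, 17)),
]

# Because the structure is known ahead of time, queries are straightforward.
born_after_1995 = [p.name for p in people if p.date_of_birth.year > 1995]
print(born_after_1995)
```

Since every record has the same fields, filtering or sorting by any column needs no parsing step.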

Unstructured

Unstructured data is information without a predefined organizational system. It is the most widespread form of data; examples include video, audio, and articles.

All three major US stock markets closed down in their worst day since June 2020, during the Covid pandemic. The tech-heavy Nasdaq fell 6%, while the S&P 500 and the Dow dropped 4.8% and 3.9%, respectively. Apple and Nvidia, two of the US’s largest companies by market value, had lost a combined $470bn in value by midday.

Aratani, Lauren. “US Stock Markets See Worst Day since Covid Pandemic after Investors Shaken by Trump Tariffs.” The Guardian, April 3, 2025, sec. US news.
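
Because unstructured text has no fields to look up, any value of interest has to be extracted by parsing. The sketch below pulls the percentages out of one sentence of the excerpt with a regular expression. The pattern is an assumption chosen for this example and would not generalize to every way a number can be written.

```python
import re

# One sentence from the excerpt above.
text = (
    "The tech-heavy Nasdaq fell 6%, while the S&P 500 and the Dow "
    "dropped 4.8% and 3.9%, respectively."
)

# Hypothetical pattern: an integer or decimal number followed by a percent sign.
percent_pattern = re.compile(r"\d+(?:\.\d+)?%")

percentages = percent_pattern.findall(text)
print(percentages)
```

The pattern finds the numbers but not which index each one belongs to; recovering that link requires further parsing, which is what makes unstructured data harder to work with.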

Semistructured

Semistructured data is a mix of structured and unstructured data: some parts follow a predefined organizational system, while others hold free-form content.

{
"Movie": "A Minecraft Movie",
"Rating": "April 3, 2025",
"Author": "David Fear",
"Details": "We just don’t want to be the one to inform God what his creations hath wrought with this expensively cheap, 100-percent corporate mess."
}


“A Minecraft Movie | Rotten Tomatoes.” Accessed April 3, 2025. https://www.rottentomatoes.com/m/a_minecraft_movie.

In this record, the "Movie", "Rating", and "Author" fields follow a predefined organizational system. The "Details" field, however, contains unstructured text.
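
A minimal sketch of how such a record might be handled in Python, using the standard `json` module. The record is reproduced from above as valid JSON (commas between every pair of fields); the variable names are assumptions for illustration.

```python
import json

# The record from above, written as valid JSON.
raw = """
{
  "Movie": "A Minecraft Movie",
  "Rating": "April 3, 2025",
  "Author": "David Fear",
  "Details": "We just don’t want to be the one to inform God what his creations hath wrought with this expensively cheap, 100-percent corporate mess."
}
"""

record = json.loads(raw)

# The structured part: known keys can be looked up directly.
structured = {key: record[key] for key in ("Movie", "Rating", "Author")}
print(structured)

# The unstructured part: free text that needs further processing
# (here, just a naive whitespace word count).
details = record["Details"]
word_count = len(details.split())
print(word_count)
```

The keys give direct access to the structured fields, while the "Details" value is still plain text that must be analyzed like any other unstructured data.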

Data Pipeline

  1. Acquisition

  2. Cleansing

  3. Transformation

  4. Analysis

  5. Storage
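
The five steps above can be sketched as a chain of small functions. This is only an illustration under stated assumptions: the input is a hypothetical in-memory CSV string (the two rows from the structured example plus one made-up incomplete row), the function names are invented for this sketch, and "storage" writes JSON to an in-memory buffer rather than a real file or database.

```python
import csv
import io
import json
from datetime import datetime
from statistics import mean

# Hypothetical raw input; the last row is deliberately missing a name.
RAW_CSV = """name,date_of_birth
Johnathon,"July 20, 1990"
Synthia,"August 17, 2000"
,"January 1, 1980"
"""


def acquire(source):
    """1. Acquisition: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(source)))


def cleanse(rows):
    """2. Cleansing: drop rows that have any missing or blank value."""
    return [
        row for row in rows
        if all(value and value.strip() for value in row.values())
    ]


def transform(rows):
    """3. Transformation: parse the date string and keep the birth year."""
    return [
        {
            "name": row["name"],
            "birth_year": datetime.strptime(
                row["date_of_birth"], "%B %d, %Y"
            ).year,
        }
        for row in rows
    ]


def analyze(rows):
    """4. Analysis: compute a simple summary of the transformed rows."""
    return {
        "count": len(rows),
        "mean_birth_year": mean(row["birth_year"] for row in rows),
    }


def store(result, buffer):
    """5. Storage: write the result as JSON to a file-like object."""
    json.dump(result, buffer)


rows = transform(cleanse(acquire(RAW_CSV)))
summary = analyze(rows)

out = io.StringIO()
store(summary, out)
print(out.getvalue())
```

Keeping each step in its own function makes it possible to test or replace one stage, for example swapping the in-memory buffer for a file, without touching the others.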

Additional Citations

(also good additional resources/references)


Vasiliev, Yuli. Python for Data Science: A Hands-On Introduction. San Francisco: No Starch Press, 2022.