Movie Data
Web Interface

Data URL: https://data.world/popculture/imdb-5000-movie-dataset

Chuan Sun, a data scientist and Vice President at J.P. Morgan, collected statistical and qualitative movie data from three platforms: the Internet Movie Database (IMDB), Rotten Tomatoes, and The Numbers. The IMDB 5000 Movie Dataset includes financial data such as budget and gross return, categorical data such as genre and country, and comparable qualitative measures such as critic reviews, user reviews, and cast Facebook likes.
Volunteer contributors provide the data posted to the IMDB interface. Corrections to casts, demographics, and credits are approved immediately, but images, names, character names, and summaries are screened before publication. The Numbers website publishes movie financing statistics sourced from Nash Information Services, LLC, which offers consulting, analytical, and data visualization services for both industry and independent filmmaking.
The dataset records 28 variables (see below) for 5,043 movies spanning a century. It represents 66 countries, but about 75% of the movies are U.S. productions. Although Hollywood is a dominant force, most of the productions rated highest on IMDB are Western independent films. I have removed some variables to focus on the larger themes surrounding the film industry.
This wide-ranging dataset matters for questions of representation in the film industry. In particular, the countries, languages, and corresponding genres can reveal which cultures or regional industries hold more influence. For filmmakers, the data gives a broader sense of how much financing their projects need and of the trends that set filmmakers apart. For more recent films, the data includes social media as an important factor in distribution and advertising. You can explore and compare financing, review, demographic, and logistical data to reveal relationships between defining films.

Through this website, you can browse films by genre in relation to their IMDB ratings.

Here are the full 28 variables in the dataset:

"movie_title"
"color"
"num_critic_for_reviews"
"movie_facebook_likes"
"duration"
"director_name"
"director_facebook_likes"
"actor_3_name"
"actor_3_facebook_likes"
"actor_2_name"
"actor_2_facebook_likes"
"actor_1_name"
"actor_1_facebook_likes"
"gross"
"genres"
"num_voted_users"
"cast_total_facebook_likes"
"facenumber_in_poster"
"plot_keywords"
"movie_imdb_link"
"num_user_for_reviews"
"language"
"country"
"content_rating"
"budget"
"title_year"
"imdb_score"
"aspect_ratio"

Here are the variables I have excluded:

"color"
"actor_3_facebook_likes"
"actor_2_facebook_likes"
"actor_2_name"
"actor_3_name"
"facenumber_in_poster"
"movie_imdb_link"
"plot_keywords"
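
As a rough illustration of the exclusion step, here is a minimal Python sketch that drops the eight variables listed above from rows read out of a CSV. It is not the code behind this website. The helper name `drop_excluded` and the sample rows are hypothetical, and the sample carries made-up values for only a few of the 28 columns.

```python
import csv
import io

# Columns the page lists as excluded from the web interface.
EXCLUDED = {
    "color",
    "actor_3_facebook_likes",
    "actor_2_facebook_likes",
    "actor_2_name",
    "actor_3_name",
    "facenumber_in_poster",
    "movie_imdb_link",
    "plot_keywords",
}


def drop_excluded(rows, excluded=EXCLUDED):
    """Return copies of the rows without the excluded columns (hypothetical helper)."""
    return [
        {key: value for key, value in row.items() if key not in excluded}
        for row in rows
    ]


# Placeholder CSV: made-up values, not real records from the dataset.
SAMPLE_CSV = """movie_title,color,genres,imdb_score,plot_keywords
Example Movie A,Color,Action|Adventure,7.1,hypothetical|keywords
Example Movie B,Black and White,Drama,6.4,placeholder
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
trimmed = drop_excluded(rows)
print(list(trimmed[0].keys()))
```

To run this against the real file, replace the `io.StringIO(SAMPLE_CSV)` source with an opened copy of the CSV downloaded from the Data URL above.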





Select a genre:

How many records do you want to see?

What order would you like to see your data in?
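
The three prompts above map onto a filter, a limit, and a sort. The following Python sketch shows one way that could work, under two assumptions: that the "genres" field holds pipe-separated values such as "Action|Adventure", and that ordering is by "imdb_score". The function name `select_movies` and the sample rows are hypothetical, not the site's actual implementation.

```python
def select_movies(rows, genre, limit, descending=True):
    """Filter rows by genre, sort by IMDB score, keep the first `limit` rows.

    Assumes "genres" is pipe-separated and "imdb_score" parses as a float;
    rows with a missing score are skipped.
    """
    wanted = genre.strip().lower()
    matches = [
        row for row in rows
        if row.get("imdb_score")
        and wanted in (g.strip().lower() for g in row["genres"].split("|"))
    ]
    matches.sort(key=lambda row: float(row["imdb_score"]), reverse=descending)
    return matches[:max(limit, 0)]


# Placeholder rows with made-up values, not real records from the dataset.
sample = [
    {"movie_title": "Example Movie A", "genres": "Action|Adventure", "imdb_score": "7.1"},
    {"movie_title": "Example Movie B", "genres": "Drama", "imdb_score": "6.4"},
    {"movie_title": "Example Movie C", "genres": "Action|Drama", "imdb_score": "8.0"},
]

for row in select_movies(sample, genre="Action", limit=2):
    print(row["movie_title"], row["imdb_score"])
```

Passing `descending=False` returns the lowest-rated matches first, which corresponds to choosing ascending order in the last prompt.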