Polymorph ML Challenge

Challenge: https://www.kaggle.com/deepmatrix/imdb-5000-movie-dataset

Answer the following question: "Can you predict a movie's IMDB rating?"

Grading

You will be expected to explain why you chose a specific tool set, such as SparkML, scikit-learn, Keras, or XGBoost. The tools below are suggestions; feel free to use any method.

The submission will be graded on the following criteria:

Ease of use: We should be able to run your script easily
Communication: Explain in your readme.md file why your solution is performant and works well
Feature Engineering: Deriving key insights from the sample data
Creating a visual data narrative: charts and plots must be created from code, and not external standalone software like Tableau or Excel
Use of best coding practices: for the language and tools used
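As one possible illustration of the "charts from code" criterion, the sketch below writes a histogram of the rating column to a PNG file with matplotlib. The column name `imdb_score`, the output file name, and the helper `plot_score_histogram` are assumptions made for this example and are not part of the challenge.

```python
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # headless backend, so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def plot_score_histogram(df, column="imdb_score", out_path="imdb_score_hist.png"):
    """Save a histogram of `column` to `out_path` and return the path.

    `column` and `out_path` are illustrative defaults (assumptions).
    """
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.hist(df[column].dropna(), bins=20)  # missing ratings are skipped
    ax.set_xlabel(column)
    ax.set_ylabel("Number of movies")
    ax.set_title(f"Distribution of {column}")
    fig.savefig(out_path, dpi=150, bbox_inches="tight")
    plt.close(fig)  # free the figure so repeated calls do not accumulate
    return Path(out_path)
```

Because the figure is saved by the script itself, a reviewer can regenerate every chart by rerunning the code.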

Suggested Tools

Code: Python (Pandas + scikit-learn)

Visualization: any Python library, such as plotly or matplotlib
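For orientation, here is a minimal baseline sketch using the suggested Pandas + scikit-learn stack. It is only one possible starting point, not a reference solution. The target column name `imdb_score`, the choice of a random forest, and the helper `fit_baseline` are assumptions made for this example. It uses only the numeric columns and leaves all real feature engineering to the submission.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline


def fit_baseline(df, target="imdb_score", seed=0):
    """Fit a numeric-only baseline; return (model, MAE on held-out rows).

    `target` is an assumed column name; adjust it to the actual data.
    """
    df = df.dropna(subset=[target])  # rows without a rating cannot be scored
    X = df.select_dtypes(include="number").drop(columns=[target])
    y = df[target]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    # Median imputation fills missing numeric values before the forest is fit.
    model = make_pipeline(
        SimpleImputer(strategy="median"),
        RandomForestRegressor(n_estimators=100, random_state=seed),
    )
    model.fit(X_train, y_train)
    mae = mean_absolute_error(y_test, model.predict(X_test))
    return model, mae


# Example call (the CSV file name is an assumption):
# model, mae = fit_baseline(pd.read_csv("movie_metadata.csv"))
```

The held-out mean absolute error gives a simple number to compare later feature-engineering experiments against.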

How to submit

You may submit a link to a GitHub repository. Alternatively, email Polymorph a single ZIP file (< 20 MB) containing:

Working code with documentation and a README file
Any generated graphs and graphics
Documentation including any metadata for any data created
Brief description of the key insights from visualization
An explanation of tools used with a brief reason why you selected each one (in readme.md)
