@sammerry
Last active July 26, 2018 04:59
Polymorph ML Challenge

Challenge: https://www.kaggle.com/deepmatrix/imdb-5000-movie-dataset

Answer the following question: "Can you predict a movie's IMDB rating?"

Grading

You will be expected to explain why you chose a specific tool set, such as SparkML, scikit-learn, Keras, or XGBoost. The tools below are suggestions; feel free to use any method.

The submission will be graded on the following criteria:

  • Ease of use: We should be able to run your script easily
  • Communication: In your readme.md file, explain why your solution is performant and works well
  • Feature Engineering: Deriving key insights from the sample data
  • Creating a visual data narrative: Charts and plots must be created from code, not with external standalone software like Tableau or Excel
  • Use of best coding practices: For the language and tools used
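
As an illustration of the feature-engineering and charts-from-code criteria, here is a minimal sketch. It assumes the Kaggle CSV exposes a pipe-separated `genres` column and an `imdb_score` column. The three rows below are made-up stand-in data, and `add_genre_flags` and `plot_score_histogram` are hypothetical helper names, not part of the challenge.

```python
import matplotlib
matplotlib.use("Agg")  # render straight to files; no display required
import matplotlib.pyplot as plt
import pandas as pd


def add_genre_flags(df: pd.DataFrame) -> pd.DataFrame:
    """Expand the pipe-separated `genres` column (assumed name) into 0/1 columns."""
    flags = df["genres"].str.get_dummies(sep="|").add_prefix("genre_")
    return pd.concat([df.drop(columns="genres"), flags], axis=1)


def plot_score_histogram(df: pd.DataFrame, path: str) -> None:
    """Save a histogram of the `imdb_score` column (assumed name) to `path`."""
    fig, ax = plt.subplots()
    ax.hist(df["imdb_score"], bins=10)
    ax.set_xlabel("IMDB score")
    ax.set_ylabel("Number of movies")
    fig.savefig(path)
    plt.close(fig)


# Made-up stand-in rows; real data would come from the Kaggle CSV.
movies = pd.DataFrame({
    "genres": ["Action|Adventure", "Drama", "Action|Drama"],
    "imdb_score": [7.9, 6.5, 7.1],
})
features = add_genre_flags(movies)
plot_score_histogram(features, "score_hist.png")
```

Because the chart is produced by a function call, rerunning the script regenerates it from the data, which is the point of the "created from code" requirement.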

Suggested Tools

Code:

  • Python: Pandas + scikit-learn

Visualization:

  • Any Python library, such as plotly or matplotlib
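
To show how the suggested Pandas + scikit-learn stack could fit together, here is a rough baseline sketch, not a reference solution. The column names `duration`, `budget`, `num_voted_users`, and `imdb_score` are assumptions about the Kaggle CSV, the random forest is just one reasonable choice of model, and the data frame is synthetic so the example runs without the download. `evaluate_baseline` is a hypothetical helper name.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

FEATURES = ["duration", "budget", "num_voted_users"]  # assumed column names
TARGET = "imdb_score"  # assumed column name


def evaluate_baseline(df: pd.DataFrame, seed: int = 0) -> float:
    """Fit a simple regressor and return mean absolute error on a held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        df[FEATURES], df[TARGET], test_size=0.25, random_state=seed
    )
    model = make_pipeline(
        SimpleImputer(strategy="median"),  # fill missing numeric values
        RandomForestRegressor(n_estimators=50, random_state=seed),
    )
    model.fit(X_train, y_train)
    return mean_absolute_error(y_test, model.predict(X_test))


# Synthetic stand-in data so the sketch runs on its own.
rng = np.random.default_rng(0)
n = 200
demo = pd.DataFrame({
    "duration": rng.normal(110, 20, n),
    "budget": rng.lognormal(17, 1, n),
    "num_voted_users": rng.integers(100, 500_000, n),
})
demo["imdb_score"] = (5 + 0.01 * demo["duration"] + rng.normal(0, 0.5, n)).clip(1, 10)

mae = evaluate_baseline(demo)
print(f"Baseline MAE on synthetic data: {mae:.2f}")
```

With the real dataset, `demo` would be replaced by a frame loaded with `pd.read_csv` from the downloaded file, after dropping rows that lack a score. Reporting a held-out error like this gives the readme a concrete number to explain.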

How to submit

You may submit a GitHub link. Alternatively, email Polymorph a single ZIP file (< 20 MB) containing:

  • Working code with documentation and a README file
  • Any generated graphs and graphics
  • Documentation including any metadata for any data created
  • Brief description of the key insights from the visualizations
  • An explanation of tools used with a brief reason why you selected each one (in readme.md)