Music Genre Classification.ipynb
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "Music Genre Classification.ipynb",
"version": "0.3.2",
"provenance": [],
"collapsed_sections": [],
"toc_visible": true,
"machine_shape": "hm",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"accelerator": "TPU"
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/dipta007/4a1299d2be1b35f507a210b96e9bf012/notebook.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "cNnM2w-HCeb1",
"colab_type": "text"
},
"source": [
"# Music genre classification notebook"
]
},
{
"cell_type": "code",
"metadata": {
"id": "0oyOFtqJIxOK",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 122
},
"outputId": "88375ff2-40f4-4042-9912-6e3d39f3f2c1"
},
"source": [
"from google.colab import drive\n",
"drive.mount('/content/drive')"
],
"execution_count": 3,
"outputs": [
{
"output_type": "stream",
"text": [
"Go to this URL in a browser: https://accounts.google.com/o/oauth2/auth?client_id=947318989803-6bn6qk8qdgf4n4g3pfee6491hc0brc4i.apps.googleusercontent.com&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&scope=email%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdocs.test%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive.photos.readonly%20https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fpeopleapi.readonly&response_type=code\n",
"\n",
"Enter your authorization code:\n",
"··········\n",
"Mounted at /content/drive\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "2l3sppZMCydR",
"colab_type": "text"
},
"source": [
"## Importing Libraries"
]
},
{
"cell_type": "code",
"metadata": {
"id": "Gt3fyg6dCNvX",
"colab_type": "code",
"colab": {}
},
"source": [
"# Feature extraction and data preprocessing\n",
"import librosa\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"%matplotlib inline\n",
"import os\n",
"from PIL import Image\n",
"import pathlib\n",
"import csv\n",
"\n",
"# Preprocessing\n",
"from sklearn.model_selection import train_test_split\n",
"from sklearn.preprocessing import LabelEncoder, StandardScaler\n",
"\n",
"#Keras\n",
"import keras\n",
"\n",
"import warnings\n",
"warnings.filterwarnings('ignore')"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "DPe_ebYuDqr5",
"colab_type": "text"
},
"source": [
"The dataset consists of 10 genres:\n",
" * Blues\n",
" * Classical\n",
" * Country\n",
" * Disco\n",
" * Hiphop\n",
" * Jazz\n",
" * Metal\n",
" * Pop\n",
" * Reggae\n",
" * Rock"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "AfBSVfRCD3PE",
"colab_type": "text"
},
"source": [
"## Extracting the Spectrogram for every Audio"
]
},
{
"cell_type": "code",
"metadata": {
"id": "BHh3pTEVDdrT",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"outputId": "8f0a3783-63d6-4300-bb3e-30a4a525c03f"
},
"source": [
"cmap = plt.get_cmap('inferno')\n",
"\n",
"plt.figure(figsize=(10,10))\n",
"genres = 'blues classical country disco hiphop jazz metal pop reggae rock'.split()\n",
"for g in genres:\n",
"    pathlib.Path(f'img_data/{g}').mkdir(parents=True, exist_ok=True)\n",
"    for filename in os.listdir(f'./drive/My Drive/Thesis/music_genre/genres/{g}'):\n",
"        songname = f'./drive/My Drive/Thesis/music_genre/genres/{g}/{filename}'\n",
"        y, sr = librosa.load(songname, mono=True, duration=5)\n",
"        plt.specgram(y, NFFT=2048, Fs=2, Fc=0, noverlap=128, cmap=cmap, sides='default', mode='default', scale='dB')\n",
"        plt.axis('off')\n",
"        plt.savefig(f'img_data/{g}/{filename[:-3].replace(\".\", \"\")}.png')\n",
"        plt.clf()"
],
"execution_count": 7,
"outputs": [
{
"output_type": "display_data",
"data": {
"text/plain": [
"<Figure size 720x720 with 0 Axes>"
]
},
"metadata": {
"tags": []
}
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "piwUwgP5Eef9",
"colab_type": "text"
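},
"source": [
"## Extracting features from the audio\n",
"\n",
"For each track we compute the following features and store their mean values:\n",
"\n",
"* Mel-frequency cepstral coefficients (MFCC), 20 in number\n",
"* Spectral centroid\n",
"* Spectral bandwidth\n",
"* Spectral roll-off\n",
"* Zero crossing rate\n",
"* Chroma frequencies\n",
"* Root-mean-square (RMS) energy"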
]
},
{
"cell_type": "code",
"metadata": {
"id": "__g8tX8pDeIL",
"colab_type": "code",
"colab": {}
},
"source": [
"header = 'filename chroma_stft rmse spectral_centroid spectral_bandwidth rolloff zero_crossing_rate'\n",
"for i in range(1, 21):\n",
"    header += f' mfcc{i}'\n",
"header += ' label'\n",
"header = header.split()"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "TBlT448pEqR9",
"colab_type": "text"
},
"source": [
"## Writing data to a CSV file\n",
"\n",
"We write one row per track to `data.csv`: the filename, the mean of each feature, and the genre label."
]
},
{
"cell_type": "code",
"metadata": {
"id": "ZsSQmB0PE3Iu",
"colab_type": "code",
"colab": {}
},
"source": [
"genres = 'blues classical country disco hiphop jazz metal pop reggae rock'.split()\n",
"\n",
"# Open the file once: write the header, then one row of mean feature values per track\n",
"with open('data.csv', 'w', newline='') as file:\n",
"    writer = csv.writer(file)\n",
"    writer.writerow(header)\n",
"    for g in genres:\n",
"        for filename in os.listdir(f'./drive/My Drive/Thesis/music_genre/genres/{g}'):\n",
"            songname = f'./drive/My Drive/Thesis/music_genre/genres/{g}/{filename}'\n",
"            y, sr = librosa.load(songname, mono=True, duration=30)\n",
"            chroma_stft = librosa.feature.chroma_stft(y=y, sr=sr)\n",
"            spec_cent = librosa.feature.spectral_centroid(y=y, sr=sr)\n",
"            spec_bw = librosa.feature.spectral_bandwidth(y=y, sr=sr)\n",
"            rolloff = librosa.feature.spectral_rolloff(y=y, sr=sr)\n",
"            zcr = librosa.feature.zero_crossing_rate(y)\n",
"            mfcc = librosa.feature.mfcc(y=y, sr=sr)\n",
"            rmse = librosa.feature.rms(y=y)\n",
"            # Build the row as a list so filenames containing spaces stay in one column\n",
"            row = [filename, np.mean(chroma_stft), np.mean(rmse), np.mean(spec_cent), np.mean(spec_bw), np.mean(rolloff), np.mean(zcr)]\n",
"            # mfcc has one row per coefficient; average each coefficient over time\n",
"            row += [np.mean(e) for e in mfcc]\n",
"            row.append(g)\n",
"            writer.writerow(row)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "fgeCZSKQEp1A",
"colab_type": "text"
},
"source": [
"# Analysing the Data in Pandas"
]
},
{
"cell_type": "code",
"metadata": {
"id": "Kr5_EdpD9dyh",
"colab_type": "code",
"outputId": "edb24adf-8d31-4b0f-c7f3-be61e7f5c18b",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 224
}
},
"source": [
"data = pd.read_csv('data.csv')\n",
"data.head()"
],
"execution_count": 13,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>filename</th>\n",
" <th>chroma_stft</th>\n",
" <th>rmse</th>\n",
" <th>spectral_centroid</th>\n",
" <th>spectral_bandwidth</th>\n",
" <th>rolloff</th>\n",
" <th>zero_crossing_rate</th>\n",
" <th>mfcc1</th>\n",
" <th>mfcc2</th>\n",
" <th>mfcc3</th>\n",
" <th>mfcc4</th>\n",
" <th>mfcc5</th>\n",
" <th>mfcc6</th>\n",
" <th>mfcc7</th>\n",
" <th>mfcc8</th>\n",
" <th>mfcc9</th>\n",
" <th>mfcc10</th>\n",
" <th>mfcc11</th>\n",
" <th>mfcc12</th>\n",
" <th>mfcc13</th>\n",
" <th>mfcc14</th>\n",
" <th>mfcc15</th>\n",
" <th>mfcc16</th>\n",
" <th>mfcc17</th>\n",
" <th>mfcc18</th>\n",
" <th>mfcc19</th>\n",
" <th>mfcc20</th>\n",
" <th>label</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>blues.00005.au</td>\n",
" <td>0.302346</td>\n",
" <td>0.103468</td>\n",
" <td>1831.942368</td>\n",
" <td>1729.483241</td>\n",
" <td>3480.937285</td>\n",
" <td>0.094040</td>\n",
" <td>-177.869048</td>\n",
" <td>118.196916</td>\n",
" <td>-17.550673</td>\n",
" <td>30.758635</td>\n",
" <td>-21.742746</td>\n",
" <td>11.903802</td>\n",
" <td>-20.734249</td>\n",
" <td>3.180597</td>\n",
" <td>-8.583484</td>\n",
" <td>-0.936488</td>\n",
" <td>-11.776273</td>\n",
" <td>-2.420615</td>\n",
" <td>-9.339365</td>\n",
" <td>-9.939325</td>\n",
" <td>-3.909892</td>\n",
" <td>-5.570625</td>\n",
" <td>-1.839023</td>\n",
" <td>-2.778421</td>\n",
" <td>-3.046866</td>\n",
" <td>-8.115809</td>\n",
" <td>blues</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>blues.00004.au</td>\n",
" <td>0.308590</td>\n",
" <td>0.091563</td>\n",
" <td>1835.494603</td>\n",
" <td>1748.362448</td>\n",
" <td>3580.945013</td>\n",
" <td>0.101500</td>\n",
" <td>-160.266031</td>\n",
" <td>126.198800</td>\n",
" <td>-35.605448</td>\n",
" <td>22.153301</td>\n",
" <td>-32.489270</td>\n",
" <td>10.864512</td>\n",
" <td>-23.357929</td>\n",
" <td>0.503117</td>\n",
" <td>-11.805831</td>\n",
" <td>1.206804</td>\n",
" <td>-13.083820</td>\n",
" <td>-2.806385</td>\n",
" <td>-6.934122</td>\n",
" <td>-7.558619</td>\n",
" <td>-9.173552</td>\n",
" <td>-4.512166</td>\n",
" <td>-5.453538</td>\n",
" <td>-0.924162</td>\n",
" <td>-4.409333</td>\n",
" <td>-11.703781</td>\n",
" <td>blues</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>blues.00003.au</td>\n",
" <td>0.404779</td>\n",
" <td>0.141191</td>\n",
" <td>1070.119953</td>\n",
" <td>1596.333948</td>\n",
" <td>2185.028454</td>\n",
" <td>0.033309</td>\n",
" <td>-199.431144</td>\n",
" <td>150.099218</td>\n",
" <td>5.647594</td>\n",
" <td>26.871927</td>\n",
" <td>1.754463</td>\n",
" <td>14.238344</td>\n",
" <td>-4.830882</td>\n",
" <td>9.297965</td>\n",
" <td>-0.757741</td>\n",
" <td>8.149012</td>\n",
" <td>-3.196314</td>\n",
" <td>6.087676</td>\n",
" <td>-2.476420</td>\n",
" <td>-1.073890</td>\n",
" <td>-2.874777</td>\n",
" <td>0.780976</td>\n",
" <td>-3.316932</td>\n",
" <td>0.637981</td>\n",
" <td>-0.619690</td>\n",
" <td>-3.408233</td>\n",
" <td>blues</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>blues.00000.au</td>\n",
" <td>0.349943</td>\n",
" <td>0.130225</td>\n",
" <td>1784.420446</td>\n",
" <td>2002.650192</td>\n",
" <td>3806.485316</td>\n",
" <td>0.083066</td>\n",
" <td>-113.596742</td>\n",
" <td>121.557302</td>\n",
" <td>-19.158825</td>\n",
" <td>42.351029</td>\n",
" <td>-6.376457</td>\n",
" <td>18.618875</td>\n",
" <td>-13.697911</td>\n",
" <td>15.344630</td>\n",
" <td>-12.285266</td>\n",
" <td>10.980491</td>\n",
" <td>-8.324323</td>\n",
" <td>8.810668</td>\n",
" <td>-3.667367</td>\n",
" <td>5.751690</td>\n",
" <td>-5.162761</td>\n",
" <td>0.750947</td>\n",
" <td>-1.691937</td>\n",
" <td>-0.409954</td>\n",
" <td>-2.300208</td>\n",
" <td>1.219928</td>\n",
" <td>blues</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>blues.00001.au</td>\n",
" <td>0.340983</td>\n",
" <td>0.095918</td>\n",
" <td>1529.835316</td>\n",
" <td>2038.617579</td>\n",
" <td>3548.820207</td>\n",
" <td>0.056044</td>\n",
" <td>-207.556796</td>\n",
" <td>124.006717</td>\n",
" <td>8.930562</td>\n",
" <td>35.874684</td>\n",
" <td>2.916038</td>\n",
" <td>21.523725</td>\n",
" <td>-8.554703</td>\n",
" <td>23.358672</td>\n",
" <td>-10.103616</td>\n",
" <td>11.903744</td>\n",
" <td>-5.560387</td>\n",
" <td>5.376802</td>\n",
" <td>-2.239119</td>\n",
" <td>4.216963</td>\n",
" <td>-6.012273</td>\n",
" <td>0.936109</td>\n",
" <td>-0.716537</td>\n",
" <td>0.293875</td>\n",
" <td>-0.287431</td>\n",
" <td>0.531573</td>\n",
" <td>blues</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" filename chroma_stft rmse ... mfcc19 mfcc20 label\n",
"0 blues.00005.au 0.302346 0.103468 ... -3.046866 -8.115809 blues\n",
"1 blues.00004.au 0.308590 0.091563 ... -4.409333 -11.703781 blues\n",
"2 blues.00003.au 0.404779 0.141191 ... -0.619690 -3.408233 blues\n",
"3 blues.00000.au 0.349943 0.130225 ... -2.300208 1.219928 blues\n",
"4 blues.00001.au 0.340983 0.095918 ... -0.287431 0.531573 blues\n",
"\n",
"[5 rows x 28 columns]"
]
},
"metadata": {
"tags": []
},
"execution_count": 13
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "iHrDHCaR9gKR",
"colab_type": "code",
"outputId": "6224651e-390f-4f9b-b6b9-6c3d21eee001",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"data.shape"
],
"execution_count": 14,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"(1000, 28)"
]
},
"metadata": {
"tags": []
},
"execution_count": 14
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "veD5BgX49hZa",
"colab_type": "code",
"colab": {}
},
"source": [
"# Dropping unnecessary columns\n",
"data = data.drop(['filename'],axis=1)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "Nyr0aAAsGXjZ",
"colab_type": "text"
},
"source": [
"## Encoding the Labels"
]
},
{
"cell_type": "code",
"metadata": {
"id": "frI5HH4q-1HS",
"colab_type": "code",
"colab": {}
},
"source": [
"genre_list = data.iloc[:, -1]\n",
"encoder = LabelEncoder()\n",
"y = encoder.fit_transform(genre_list)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "_2n8a02zGfvP",
"colab_type": "text"
},
"source": [
"## Scaling the Feature columns"
]
},
{
"cell_type": "code",
"metadata": {
"id": "uqcqn-nyAofk",
"colab_type": "code",
"colab": {}
},
"source": [
"scaler = StandardScaler()\n",
"X = scaler.fit_transform(np.array(data.iloc[:, :-1], dtype = float))"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "e3VZvbwpGo9R",
"colab_type": "text"
},
"source": [
"## Dividing data into training and testing sets"
]
},
{
"cell_type": "code",
"metadata": {
"id": "F1GW3VvQA7Rj",
"colab_type": "code",
"colab": {}
},
"source": [
"X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "upuczQ-KBHJ5",
"colab_type": "code",
"outputId": "21bc8598-a065-461e-ba81-6f30f5148591",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"len(y_train)"
],
"execution_count": 19,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"800"
]
},
"metadata": {
"tags": []
},
"execution_count": 19
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "LtoE_FqqBzM8",
"colab_type": "code",
"outputId": "b79ac2ac-33d5-497f-adbe-adaffa0f768a",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"len(y_test)"
],
"execution_count": 20,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"200"
]
},
"metadata": {
"tags": []
},
"execution_count": 20
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "ir9XaWgQB0lq",
"colab_type": "code",
"outputId": "dac8f3a3-a818-48dc-d7ef-13f5f2269f0e",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 119
}
},
"source": [
"X_train[10]"
],
"execution_count": 21,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"array([-0.23721292, 0.3985479 , -1.28512259, -0.97500018, -1.2045147 ,\n",
" -1.35000695, 0.04046173, 1.58853069, -0.05235996, 0.14723953,\n",
" 0.22031819, 0.46377532, -0.62299048, 0.58929559, -0.75420259,\n",
" 0.30958517, -0.47323844, 0.03341018, -0.57921517, 0.09900053,\n",
" -0.54194579, -0.60586268, -0.55635453, -1.35375003, -0.06554428,\n",
" -0.18034592])"
]
},
"metadata": {
"tags": []
},
"execution_count": 21
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "Vp2yc5FWG04e",
"colab_type": "text"
},
"source": [
"# Classification with Keras\n",
"\n",
"## Building our Network"
]
},
{
"cell_type": "code",
"metadata": {
"id": "Qj3sc2uFEUMt",
"colab_type": "code",
"colab": {}
},
"source": [
"from keras import models\n",
"from keras import layers\n",
"\n",
"model = models.Sequential()\n",
"model.add(layers.Dense(256, activation='relu', input_shape=(X_train.shape[1],)))\n",
"\n",
"model.add(layers.Dense(128, activation='relu'))\n",
"\n",
"model.add(layers.Dense(64, activation='relu'))\n",
"\n",
"model.add(layers.Dense(10, activation='softmax'))"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "7yrsmpI6EjJ2",
"colab_type": "code",
"colab": {}
},
"source": [
"model.compile(optimizer='adam',\n",
" loss='sparse_categorical_crossentropy',\n",
" metrics=['accuracy'])"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "bP0hVm4aElS7",
"colab_type": "code",
"outputId": "5aa6a516-e4e3-4102-f920-da9b2910e92f",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 697
}
},
"source": [
"history = model.fit(X_train,\n",
" y_train,\n",
" epochs=20,\n",
" batch_size=128)\n",
" "
],
"execution_count": 28,
"outputs": [
{
"output_type": "stream",
"text": [
"Epoch 1/20\n",
"800/800 [==============================] - 0s 359us/step - loss: 2.2210 - acc: 0.1950\n",
"Epoch 2/20\n",
"800/800 [==============================] - 0s 37us/step - loss: 1.9311 - acc: 0.3700\n",
"Epoch 3/20\n",
"800/800 [==============================] - 0s 36us/step - loss: 1.6909 - acc: 0.4062\n",
"Epoch 4/20\n",
"800/800 [==============================] - 0s 38us/step - loss: 1.5122 - acc: 0.4450\n",
"Epoch 5/20\n",
"800/800 [==============================] - 0s 36us/step - loss: 1.3603 - acc: 0.5288\n",
"Epoch 6/20\n",
"800/800 [==============================] - 0s 33us/step - loss: 1.2343 - acc: 0.5837\n",
"Epoch 7/20\n",
"800/800 [==============================] - 0s 33us/step - loss: 1.1415 - acc: 0.6250\n",
"Epoch 8/20\n",
"800/800 [==============================] - 0s 35us/step - loss: 1.0577 - acc: 0.6700\n",
"Epoch 9/20\n",
"800/800 [==============================] - 0s 35us/step - loss: 0.9940 - acc: 0.6737\n",
"Epoch 10/20\n",
"800/800 [==============================] - 0s 37us/step - loss: 0.9418 - acc: 0.7037\n",
"Epoch 11/20\n",
"800/800 [==============================] - 0s 33us/step - loss: 0.8877 - acc: 0.7075\n",
"Epoch 12/20\n",
"800/800 [==============================] - 0s 36us/step - loss: 0.8330 - acc: 0.7275\n",
"Epoch 13/20\n",
"800/800 [==============================] - 0s 32us/step - loss: 0.8033 - acc: 0.7325\n",
"Epoch 14/20\n",
"800/800 [==============================] - 0s 31us/step - loss: 0.7493 - acc: 0.7638\n",
"Epoch 15/20\n",
"800/800 [==============================] - 0s 33us/step - loss: 0.7200 - acc: 0.7700\n",
"Epoch 16/20\n",
"800/800 [==============================] - 0s 35us/step - loss: 0.6877 - acc: 0.7725\n",
"Epoch 17/20\n",
"800/800 [==============================] - 0s 36us/step - loss: 0.6598 - acc: 0.7875\n",
"Epoch 18/20\n",
"800/800 [==============================] - 0s 35us/step - loss: 0.6188 - acc: 0.8025\n",
"Epoch 19/20\n",
"800/800 [==============================] - 0s 34us/step - loss: 0.5913 - acc: 0.8113\n",
"Epoch 20/20\n",
"800/800 [==============================] - 0s 35us/step - loss: 0.5641 - acc: 0.8287\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "0m1J0_wUFK4C",
"colab_type": "code",
"outputId": "8aec7343-2b60-463b-b6fa-061e78e802d9",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"test_loss, test_acc = model.evaluate(X_test,y_test)"
],
"execution_count": 29,
"outputs": [
{
"output_type": "stream",
"text": [
"200/200 [==============================] - 0s 364us/step\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "f6HrjXeUF0Ko",
"colab_type": "code",
"outputId": "890e16ef-706b-4007-895a-fbb5d4e5bf4b",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"print('test_acc: ',test_acc)"
],
"execution_count": 30,
"outputs": [
{
"output_type": "stream",
"text": [
"test_acc: 0.655\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "3yQmP_f5Kq0w",
"colab_type": "text"
},
"source": [
"Test accuracy (0.655) is well below the final training accuracy (0.8287). This hints at overfitting."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "-U2qzRJoHV9O",
"colab_type": "text"
},
"source": [
"## Validating our approach\n",
"Let's set apart 200 samples in our training data to use as a validation set:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "xJNbvYZoF7ZT",
"colab_type": "code",
"colab": {}
},
"source": [
"x_val = X_train[:200]\n",
"partial_x_train = X_train[200:]\n",
"\n",
"y_val = y_train[:200]\n",
"partial_y_train = y_train[200:]"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "L1EkG59EHeEV",
"colab_type": "text"
},
"source": [
"Now let's train a new, deeper network for 30 epochs while monitoring the validation set:"
]
},
{
"cell_type": "code",
"metadata": {
"id": "Dp3G4P3aP4k2",
"colab_type": "code",
"outputId": "a58c38bc-5635-455b-e871-f691da88f47e",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 1000
}
},
"source": [
"model = models.Sequential()\n",
"model.add(layers.Dense(512, activation='relu', input_shape=(X_train.shape[1],)))\n",
"model.add(layers.Dense(256, activation='relu'))\n",
"model.add(layers.Dense(128, activation='relu'))\n",
"model.add(layers.Dense(64, activation='relu'))\n",
"model.add(layers.Dense(10, activation='softmax'))\n",
"\n",
"model.compile(optimizer='adam',\n",
" loss='sparse_categorical_crossentropy',\n",
" metrics=['accuracy'])\n",
"\n",
"model.fit(partial_x_train,\n",
" partial_y_train,\n",
" epochs=30,\n",
" batch_size=512,\n",
" validation_data=(x_val, y_val))\n",
"results = model.evaluate(X_test, y_test)"
],
"execution_count": 32,
"outputs": [
{
"output_type": "stream",
"text": [
"Train on 600 samples, validate on 200 samples\n",
"Epoch 1/30\n",
"600/600 [==============================] - 0s 809us/step - loss: 2.3109 - acc: 0.1333 - val_loss: 2.1663 - val_acc: 0.2750\n",
"Epoch 2/30\n",
"600/600 [==============================] - 0s 56us/step - loss: 2.1620 - acc: 0.2783 - val_loss: 2.0564 - val_acc: 0.3150\n",
"Epoch 3/30\n",
"600/600 [==============================] - 0s 58us/step - loss: 2.0536 - acc: 0.3100 - val_loss: 1.9622 - val_acc: 0.3550\n",
"Epoch 4/30\n",
"600/600 [==============================] - 0s 50us/step - loss: 1.9422 - acc: 0.3583 - val_loss: 1.8722 - val_acc: 0.3600\n",
"Epoch 5/30\n",
"600/600 [==============================] - 0s 54us/step - loss: 1.8299 - acc: 0.3733 - val_loss: 1.7802 - val_acc: 0.3700\n",
"Epoch 6/30\n",
"600/600 [==============================] - 0s 65us/step - loss: 1.7248 - acc: 0.4017 - val_loss: 1.6665 - val_acc: 0.3850\n",
"Epoch 7/30\n",
"600/600 [==============================] - 0s 56us/step - loss: 1.6145 - acc: 0.4200 - val_loss: 1.5434 - val_acc: 0.4300\n",
"Epoch 8/30\n",
"600/600 [==============================] - 0s 54us/step - loss: 1.5144 - acc: 0.4667 - val_loss: 1.4647 - val_acc: 0.4500\n",
"Epoch 9/30\n",
"600/600 [==============================] - 0s 55us/step - loss: 1.4252 - acc: 0.5450 - val_loss: 1.4061 - val_acc: 0.5200\n",
"Epoch 10/30\n",
"600/600 [==============================] - 0s 56us/step - loss: 1.3306 - acc: 0.5817 - val_loss: 1.3682 - val_acc: 0.5350\n",
"Epoch 11/30\n",
"600/600 [==============================] - 0s 52us/step - loss: 1.2527 - acc: 0.5800 - val_loss: 1.3112 - val_acc: 0.5500\n",
"Epoch 12/30\n",
"600/600 [==============================] - 0s 58us/step - loss: 1.1863 - acc: 0.5900 - val_loss: 1.2546 - val_acc: 0.5550\n",
"Epoch 13/30\n",
"600/600 [==============================] - 0s 55us/step - loss: 1.1290 - acc: 0.6050 - val_loss: 1.2003 - val_acc: 0.5750\n",
"Epoch 14/30\n",
"600/600 [==============================] - 0s 56us/step - loss: 1.0640 - acc: 0.6317 - val_loss: 1.1856 - val_acc: 0.5600\n",
"Epoch 15/30\n",
"600/600 [==============================] - 0s 58us/step - loss: 1.0247 - acc: 0.6367 - val_loss: 1.1477 - val_acc: 0.5600\n",
"Epoch 16/30\n",
"600/600 [==============================] - 0s 55us/step - loss: 0.9571 - acc: 0.6600 - val_loss: 1.1560 - val_acc: 0.5550\n",
"Epoch 17/30\n",
"600/600 [==============================] - 0s 58us/step - loss: 0.9401 - acc: 0.6817 - val_loss: 1.0916 - val_acc: 0.6200\n",
"Epoch 18/30\n",
"600/600 [==============================] - 0s 56us/step - loss: 0.8746 - acc: 0.7167 - val_loss: 1.0747 - val_acc: 0.6350\n",
"Epoch 19/30\n",
"600/600 [==============================] - 0s 59us/step - loss: 0.8552 - acc: 0.7000 - val_loss: 1.0450 - val_acc: 0.6350\n",
"Epoch 20/30\n",
"600/600 [==============================] - 0s 58us/step - loss: 0.8123 - acc: 0.7317 - val_loss: 1.0834 - val_acc: 0.5850\n",
"Epoch 21/30\n",
"600/600 [==============================] - 0s 60us/step - loss: 0.7957 - acc: 0.7383 - val_loss: 1.0557 - val_acc: 0.6250\n",
"Epoch 22/30\n",
"600/600 [==============================] - 0s 59us/step - loss: 0.7414 - acc: 0.7533 - val_loss: 1.0487 - val_acc: 0.6400\n",
"Epoch 23/30\n",
"600/600 [==============================] - 0s 60us/step - loss: 0.7086 - acc: 0.7600 - val_loss: 1.0384 - val_acc: 0.6400\n",
"Epoch 24/30\n",
"600/600 [==============================] - 0s 62us/step - loss: 0.7070 - acc: 0.7617 - val_loss: 0.9994 - val_acc: 0.6200\n",
"Epoch 25/30\n",
"600/600 [==============================] - 0s 58us/step - loss: 0.6567 - acc: 0.8000 - val_loss: 1.0234 - val_acc: 0.6400\n",
"Epoch 26/30\n",
"600/600 [==============================] - 0s 60us/step - loss: 0.6259 - acc: 0.7917 - val_loss: 1.0489 - val_acc: 0.6250\n",
"Epoch 27/30\n",
"600/600 [==============================] - 0s 58us/step - loss: 0.6100 - acc: 0.7950 - val_loss: 1.0474 - val_acc: 0.6000\n",
"Epoch 28/30\n",
"600/600 [==============================] - 0s 58us/step - loss: 0.5934 - acc: 0.8017 - val_loss: 1.0269 - val_acc: 0.6200\n",
"Epoch 29/30\n",
"600/600 [==============================] - 0s 65us/step - loss: 0.5636 - acc: 0.8267 - val_loss: 1.0481 - val_acc: 0.6300\n",
"Epoch 30/30\n",
"600/600 [==============================] - 0s 75us/step - loss: 0.5280 - acc: 0.8367 - val_loss: 1.0583 - val_acc: 0.6400\n",
"200/200 [==============================] - 0s 75us/step\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "dljqHfDPI6lH",
"colab_type": "text"
},
"source": [
""
]
},
{
"cell_type": "code",
"metadata": {
"id": "Mvi9it1SI4aR",
"colab_type": "code",
"outputId": "88244c5b-6dfb-4997-d336-7c0717bfb607",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"results"
],
"execution_count": 33,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"[1.0343968296051025, 0.635]"
]
},
"metadata": {
"tags": []
},
"execution_count": 33
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "eNDl2m49RoY4",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 51
},
"outputId": "fc7bebcc-b79a-464e-e545-f467c201fe1d"
},
"source": [
"print(model.evaluate(X_test, y_test))"
],
"execution_count": 36,
"outputs": [
{
"output_type": "stream",
"text": [
"200/200 [==============================] - 0s 97us/step\n",
"[1.0343968296051025, 0.635]\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "r3hb8s1l4rBA",
"colab_type": "text"
},
"source": [
"## Predictions on Test Data"
]
},
{
"cell_type": "code",
"metadata": {
"id": "gudBAhIXJIi2",
"colab_type": "code",
"colab": {}
},
"source": [
"predictions = model.predict(X_test)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "Xb7bVPSwJQF0",
"colab_type": "code",
"outputId": "fa3ece15-682f-41de-f760-6564029248b1",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"predictions[0].shape"
],
"execution_count": 38,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"(10,)"
]
},
"metadata": {
"tags": []
},
"execution_count": 38
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "llusRQV0JRy9",
"colab_type": "code",
"outputId": "a856289d-883a-47cb-c0fb-ec148330a60a",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"np.sum(predictions[0])"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"1.0"
]
},
"metadata": {
"tags": []
},
"execution_count": 27
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "0eoEuSZqJTdU",
"colab_type": "code",
"outputId": "94c17d00-dd7f-40a1-84d2-78d1ebde6103",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
}
},
"source": [
"np.argmax(predictions[0])"
],
"execution_count": 0,
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"8"
]
},
"metadata": {
"tags": []
},
"execution_count": 28
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "Utgt1bXfJVRN",
"colab_type": "code",
"colab": {}
},
"source": [
""
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "PJGkAegAPt3P",
"colab_type": "text"
},
"source": [
"# CNN + LSTM"
]
},
{
"cell_type": "code",
"metadata": {
"id": "42NVw9hoPvyU",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 232
},
"outputId": "32ef4c7d-c1ee-4ef7-bf80-8d5dd76ba4ac"
},
"source": [
"from PIL import Image\n",
"\n",
"genres = 'blues classical country disco hiphop jazz metal pop reggae rock'.split()\n",
"X = []\n",
"Y = []\n",
"\n",
"# Assumes the spectrograms are in ./img_data, where the extraction cell saved them;\n",
"# listing and opening must use the same directory\n",
"img_dir = 'img_data'\n",
"for g in genres:\n",
"    for filename in os.listdir(f'{img_dir}/{g}'):\n",
"        x = np.asarray(Image.open(f'{img_dir}/{g}/{filename}'))\n",
"        X.append(x)\n",
"        Y.append(g)\n",
"    print(f'{g} completed')"
],
"execution_count": 7,
"outputs": [
{
"output_type": "error",
"ename": "FileNotFoundError",
"evalue": "ignored",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mFileNotFoundError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-7-30d0ee010d5a>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 6\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 7\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0mg\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mgenres\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 8\u001b[0;31m \u001b[0;32mfor\u001b[0m \u001b[0mfilename\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mos\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlistdir\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34mf'./drive/My Drive/Thesis/music_genre/007/img_data/{g}'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 9\u001b[0m \u001b[0mx\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0masarray\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mImage\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mopen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34mf'./img_data/{g}/{filename}'\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 10\u001b[0m \u001b[0mX\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mappend\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
"\u001b[0;31mFileNotFoundError\u001b[0m: [Errno 2] No such file or directory: './drive/My Drive/Thesis/music_genre/007/img_data/blues'"
]
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "ErTffoXuTolO",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 51
},
"outputId": "5f098ddc-33a1-487f-e5ae-bf9aa20ee914"
},
"source": [
"X = np.asarray(X)\n",
"\n",
"from sklearn.preprocessing import LabelEncoder\n",
"from sklearn.preprocessing import OneHotEncoder\n",
"# integer encode\n",
"label_encoder = LabelEncoder()\n",
"integer_encoded = label_encoder.fit_transform(Y)\n",
"# print(integer_encoded)\n",
"# binary encode\n",
"onehot_encoder = OneHotEncoder(sparse=False)\n",
"integer_encoded = integer_encoded.reshape(len(integer_encoded), 1)\n",
"onehot_encoded = onehot_encoder.fit_transform(integer_encoded)\n",
"# print(onehot_encoded)\n",
"\n",
"Y= onehot_encoded\n",
"# Y = encoder.fit_transform(Y)\n",
"print(X.shape)\n",
"print(Y.shape)\n"
],
"execution_count": 86,
"outputs": [
{
"output_type": "stream",
"text": [
"(1000, 720, 720, 4)\n",
"(1000, 10)\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "Ip7wR2fpR1jj",
"colab_type": "code",
"colab": {}
},
"source": [
"import keras\n",
"\n",
"c_lstm = keras.models.Sequential([\n",
" keras.layers.Conv2D(64, (3, 3), activation=keras.activations.relu, input_shape=(720, 720, 4)),\n",
" keras.layers.MaxPooling2D(2, 2),\n",
" \n",
" keras.layers.Conv2D(128, (3, 3), activation=keras.activations.relu),\n",
" keras.layers.MaxPooling2D(2, 2),\n",
" \n",
" keras.layers.Flatten(),\n",
" keras.layers.Dense(64, activation=keras.activations.relu),\n",
" keras.layers.Dense(10, activation=keras.activations.softmax),\n",
"])"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "roQYSSE-T_AL",
"colab_type": "code",
"colab": {}
},
"source": [
"c_lstm.compile(optimizer=keras.optimizers.RMSprop(0.001), loss=keras.losses.categorical_crossentropy)"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "5gyXgnyaVHw_",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"outputId": "73e3bd9a-c5d4-4144-fe11-bfa767a787f2"
},
"source": [
"early_stopping = keras.callbacks.EarlyStopping(monitor='loss', patience=4)\n",
"c_lstm.fit(X, Y, epochs=1, callbacks=[early_stopping])"
],
"execution_count": 0,
"outputs": [
{
"output_type": "stream",
"text": [
"Epoch 1/1\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "VirIWafpVzt3",
"colab_type": "code",
"colab": {}
},
"source": [
""
],
"execution_count": 0,
"outputs": []
}
]
}
@nadhifaah

Can I get the authorization code? I need to analyze your code to solve my college task. Thank you
