{"nbformat":4,"nbformat_minor":0,"metadata":{"colab":{"provenance":[],"toc_visible":true},"kernelspec":{"name":"python3","display_name":"Python 3"},"accelerator":"GPU"},"cells":[{"cell_type":"markdown","metadata":{"id":"LpnEVMCYYlnD"},"source":["# CIFAR-100 and Transfer Learning"]},{"cell_type":"code","metadata":{"id":"STXQMBuN3nZ6","executionInfo":{"status":"ok","timestamp":1671004792149,"user_tz":-120,"elapsed":468,"user":{"displayName":"Giorgos Siolas","userId":"10127542075805046236"}}},"source":["from __future__ import absolute_import, division, print_function, unicode_literals # legacy compatibility\n","\n","import tensorflow as tf\n","from tensorflow.keras import datasets, layers, models\n","from tensorflow.keras.preprocessing.image import ImageDataGenerator\n","from tensorflow.keras.utils import to_categorical\n","\n","import numpy as np\n","import pandas as pd\n","import matplotlib.pyplot as plt"],"execution_count":2,"outputs":[]},{"cell_type":"markdown","source":["## CIFAR-100\n","\n","![](https://datarepository.wolframcloud.com/resources/images/69f/69f1e629-81e6-4eaa-998f-f6734fcd2cb3-io-4-o.en.gif)\n","\n","[CIFAR-10 and CIFAR-100](https://www.cs.toronto.edu/~kriz/cifar.html) are, together with MNIST, the best-known datasets in computer vision.\n","\n","They are subsets of the 80 million tiny images dataset. They were collected by Alex Krizhevsky (of AlexNet), Vinod Nair and Geoffrey Hinton.\n","\n","CIFAR-10 has 10 image classes and CIFAR-100 has 100.\n","\n","CIFAR stands for Canadian Institute for Advanced Research.\n","\n","In 2004, Geoffrey Hinton began leading CIFAR's Neural Computation and Adaptive Perception program. Its members include Yoshua Bengio and Yann LeCun, among other neuroscientists, computer scientists, biologists, electrical engineers, physicists and psychologists. \n","\n","Today, the three of them are widely recognized as the pioneers of deep learning. 
\n","\n","In 2019, the Association for Computing Machinery (ACM) named Hinton, Bengio and LeCun recipients of the 2018 ACM A.M. Turing Award for the conceptual and engineering breakthroughs that have made deep neural networks a critical component of computing.\n","\n","In effect, the Turing Award went to CIFAR, which also counts 20 Nobel laureates in its ranks. \n","\n","\n","CIFAR-100 is not an easy dataset. It has many classes, some of them very similar, and the resolution is low, 32x32 pixels."],"metadata":{"id":"M40EutWIR0bY"}},{"cell_type":"markdown","metadata":{"id":"DfEMjsB4Yurm"},"source":["## Loading and overview of the dataset"]},{"cell_type":"code","metadata":{"id":"OCW71UaGzz0Q","colab":{"base_uri":"https://localhost:8080/"},"outputId":"d4e2e97b-eee5-4526-8ef1-e016819c6f81","executionInfo":{"status":"ok","timestamp":1671004906214,"user_tz":-120,"elapsed":17223,"user":{"displayName":"Giorgos Siolas","userId":"10127542075805046236"}}},"source":["# load the entire dataset\n","(x_train_ds, y_train_ds), (x_test_ds, y_test_ds) = tf.keras.datasets.cifar100.load_data(label_mode='fine')"],"execution_count":3,"outputs":[{"output_type":"stream","name":"stdout","text":["Downloading data from https://www.cs.toronto.edu/~kriz/cifar-100-python.tar.gz\n","169001437/169001437 [==============================] - 14s 0us/step\n"]}]},{"cell_type":"code","metadata":{"id":"kGKYHffEE1do","colab":{"base_uri":"https://localhost:8080/"},"outputId":"c1dc6e79-a2f9-497f-9a6c-43aeb0fd19df","executionInfo":{"status":"ok","timestamp":1671004908533,"user_tz":-120,"elapsed":5,"user":{"displayName":"Giorgos Siolas","userId":"10127542075805046236"}}},"source":["print(x_train_ds.shape)"],"execution_count":4,"outputs":[{"output_type":"stream","name":"stdout","text":["(50000, 32, 32, 
3)\n"]}]},{"cell_type":"code","metadata":{"id":"PgIN2h_KuCp_","executionInfo":{"status":"ok","timestamp":1671004920746,"user_tz":-120,"elapsed":615,"user":{"displayName":"Giorgos Siolas","userId":"10127542075805046236"}}},"source":["# read the class names from a text file where we have stored them\n","\n","CIFAR100_LABELS_LIST = pd.read_csv('https://pastebin.com/raw/qgDaNggt', sep=',', header=None).astype(str).values.tolist()[0]"],"execution_count":5,"outputs":[]},{"cell_type":"code","metadata":{"id":"_B4-tvVOQq3j","colab":{"base_uri":"https://localhost:8080/"},"outputId":"3c04b238-506e-4b02-d8a2-ddf906448dfa","executionInfo":{"status":"ok","timestamp":1671004922891,"user_tz":-120,"elapsed":7,"user":{"displayName":"Giorgos Siolas","userId":"10127542075805046236"}}},"source":["# print our classes\n","print(CIFAR100_LABELS_LIST)"],"execution_count":6,"outputs":[{"output_type":"stream","name":"stdout","text":["['apple', ' aquarium_fish', ' baby', ' bear', ' beaver', ' bed', ' bee', ' beetle', ' bicycle', ' bottle', ' bowl', ' boy', ' bridge', ' bus', ' butterfly', ' camel', ' can', ' castle', ' caterpillar', ' cattle', ' chair', ' chimpanzee', ' clock', ' cloud', ' cockroach', ' couch', ' crab', ' crocodile', ' cup', ' dinosaur', ' dolphin', ' elephant', ' flatfish', ' forest', ' fox', ' girl', ' hamster', ' house', ' kangaroo', ' keyboard', ' lamp', ' lawn_mower', ' leopard', ' lion', ' lizard', ' lobster', ' man', ' maple_tree', ' motorcycle', ' mountain', ' mouse', ' mushroom', ' oak_tree', ' orange', ' orchid', ' otter', ' palm_tree', ' pear', ' pickup_truck', ' pine_tree', ' plain', ' plate', ' poppy', ' porcupine', ' possum', ' rabbit', ' raccoon', ' ray', ' road', ' rocket', ' rose', ' sea', ' seal', ' shark', ' shrew', ' skunk', ' skyscraper', ' snail', ' snake', ' spider', ' squirrel', ' streetcar', ' sunflower', ' sweet_pepper', ' table', ' tank', ' telephone', ' television', ' tiger', ' tractor', ' train', ' trout', ' tulip', ' turtle', ' 
wardrobe', ' whale', ' willow_tree', ' wolf', ' woman', ' worm']\n"]}]},{"cell_type":"code","metadata":{"id":"QpGXgTs_5ZCk","colab":{"base_uri":"https://localhost:8080/","height":459},"outputId":"e20e7b8a-f588-442c-91e7-70f972a7c508","executionInfo":{"status":"ok","timestamp":1671004928132,"user_tz":-120,"elapsed":2568,"user":{"displayName":"Giorgos Siolas","userId":"10127542075805046236"}}},"source":["# get (train) dataset dimensions\n","data_size, img_rows, img_cols, img_channels = x_train_ds.shape\n","\n","# set validation set percentage (wrt the training set size)\n","validation_percentage = 0.15\n","val_size = round(validation_percentage * data_size)\n","\n","# Reserve val_size samples for validation and normalize all values\n","x_val = x_train_ds[-val_size:]/255\n","y_val = y_train_ds[-val_size:]\n","x_train = x_train_ds[:-val_size]/255\n","y_train = y_train_ds[:-val_size]\n","x_test = x_test_ds/255\n","y_test = y_test_ds\n","\n","print(len(x_val))\n","\n","# summarize loaded dataset\n","print('Train: X=%s, y=%s' % (x_train.shape, y_train.shape))\n","print('Validation: X=%s, y=%s' % (x_val.shape, y_val.shape))\n","print('Test: X=%s, y=%s' % (x_test.shape, y_test.shape))\n","\n","# get class label from class index\n","def class_label_from_index(fine_category):\n"," return(CIFAR100_LABELS_LIST[fine_category.item(0)])\n","\n","# plot first few images\n","plt.figure(figsize=(6, 6))\n","for i in range(9):\n","\t# define subplot\n"," plt.subplot(330 + 1 + i).set_title(class_label_from_index(y_train[i]))\n","\t# plot raw pixel data\n"," plt.imshow(x_train[i], cmap=plt.get_cmap('gray'))\n"," #show the figure\n","plt.show()"],"execution_count":7,"outputs":[{"output_type":"stream","name":"stdout","text":["7500\n","Train: X=(42500, 32, 32, 3), y=(42500, 1)\n","Validation: X=(7500, 32, 32, 3), y=(7500, 1)\n","Test: X=(10000, 32, 32, 3), y=(10000, 1)\n"]},{"output_type":"display_data","data":{"text/plain":["
"],"image/png":"\n"},"metadata":{"needs_background":"light"}}]},{"cell_type":"markdown","metadata":{"id":"0cniJE8eZQAA"},"source":["## Training functions"]},{"cell_type":"markdown","metadata":{"id":"G3xB9x5JZjSN"},"source":["We will use the prefetch feature of TF2 for better training performance: "]},{"cell_type":"code","metadata":{"id":"KPFYqOmIa5Fr","executionInfo":{"status":"ok","timestamp":1671004956810,"user_tz":-120,"elapsed":5143,"user":{"displayName":"Giorgos Siolas","userId":"10127542075805046236"}}},"source":["# we use prefetch https://www.tensorflow.org/api_docs/python/tf/data/Dataset#prefetch \n","# see also AUTOTUNE\n","# the dataset is now \"infinite\"\n","\n","BATCH_SIZE = 128\n","AUTOTUNE = tf.data.experimental.AUTOTUNE # https://www.tensorflow.org/guide/data_performance\n","\n","def _input_fn(x,y, BATCH_SIZE):\n"," ds = tf.data.Dataset.from_tensor_slices((x,y))\n"," ds = ds.shuffle(buffer_size=data_size)\n"," ds = ds.repeat()\n"," ds = ds.batch(BATCH_SIZE)\n"," ds = ds.prefetch(buffer_size=AUTOTUNE)\n"," return ds\n","\n","train_ds =_input_fn(x_train,y_train, BATCH_SIZE) #PrefetchDataset object\n","validation_ds =_input_fn(x_val,y_val, BATCH_SIZE) #PrefetchDataset object\n","test_ds =_input_fn(x_test,y_test, BATCH_SIZE) #PrefetchDataset object\n","\n","# steps_per_epoch and validation_steps for training and validation: https://www.tensorflow.org/guide/keras/train_and_evaluate\n","\n","def train_model(model, epochs = 10, steps_per_epoch = 2, validation_steps = 1):\n"," history = model.fit(train_ds, epochs=epochs, steps_per_epoch=steps_per_epoch, validation_data=validation_ds, validation_steps=validation_steps)\n"," return(history)"],"execution_count":8,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"onYIggE9Z2f4"},"source":["## Training curves and test set 
performance"]},{"cell_type":"code","metadata":{"id":"vdWPm3zqbRo1","executionInfo":{"status":"ok","timestamp":1671004958313,"user_tz":-120,"elapsed":3,"user":{"displayName":"Giorgos Siolas","userId":"10127542075805046236"}}},"source":["# plot diagnostic learning curves\n","def summarize_diagnostics(history):\n","\tplt.figure(figsize=(8, 8))\n","\tplt.suptitle('Training Curves')\n","\t# plot loss\n","\tplt.subplot(211)\n","\tplt.title('Cross Entropy Loss')\n","\tplt.plot(history.history['loss'], color='blue', label='train')\n","\tplt.plot(history.history['val_loss'], color='orange', label='val')\n","\tplt.legend(loc='upper right')\n","\t# plot accuracy\n","\tplt.subplot(212)\n","\tplt.title('Classification Accuracy')\n","\tplt.plot(history.history['accuracy'], color='blue', label='train')\n","\tplt.plot(history.history['val_accuracy'], color='orange', label='val')\n","\tplt.legend(loc='lower right')\n","\treturn plt\n"," \n","# print test set evaluation metrics\n","def model_evaluation(model, evaluation_steps):\n","\tprint('\\nTest set evaluation metrics')\n","\tloss0,accuracy0 = model.evaluate(test_ds, steps = evaluation_steps)\n","\tprint(\"loss: {:.2f}\".format(loss0))\n","\tprint(\"accuracy: {:.2f}\".format(accuracy0))\n","\n","def model_report(model, history, evaluation_steps = 10):\n","\tplt = summarize_diagnostics(history)\n","\tplt.show()\n","\tmodel_evaluation(model, evaluation_steps)"],"execution_count":9,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"Waa4c40Yay3k"},"source":["## Network models"]},{"cell_type":"markdown","metadata":{"id":"cFTzQtMOa3Rv"},"source":["### A small convolutional network \"from scratch\""]},{"cell_type":"code","metadata":{"id":"LtcgTkHojt0G","executionInfo":{"status":"ok","timestamp":1671004963560,"user_tz":-120,"elapsed":3,"user":{"displayName":"Giorgos Siolas","userId":"10127542075805046236"}}},"source":["# a simple CNN https://www.tensorflow.org/tutorials/images/cnn\n","\n","def init_simple_model(summary):\n"," model = 
models.Sequential()\n"," model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32,32,3)))\n"," model.add(layers.MaxPooling2D((2, 2)))\n"," model.add(layers.Conv2D(64, (3, 3), activation='relu'))\n"," model.add(layers.MaxPooling2D((2, 2)))\n"," model.add(layers.Conv2D(64, (3, 3), activation='relu'))\n"," model.add(layers.Flatten())\n"," model.add(layers.Dense(64, activation='relu'))\n"," model.add(layers.Dense(100, activation='softmax'))\n"," \n"," model.compile(optimizer=tf.optimizers.Adam(learning_rate=0.0001), loss=tf.keras.losses.sparse_categorical_crossentropy, metrics=[\"accuracy\"])\n"," if summary: \n"," model.summary()\n"," return model"],"execution_count":10,"outputs":[]},{"cell_type":"code","metadata":{"id":"dSbtouO9lGvj","colab":{"base_uri":"https://localhost:8080/"},"outputId":"e8573e93-f087-4613-ad70-d80da162fe42","executionInfo":{"status":"ok","timestamp":1671005068592,"user_tz":-120,"elapsed":42856,"user":{"displayName":"Giorgos Siolas","userId":"10127542075805046236"}}},"source":["SIMPLE_MODEL = init_simple_model(summary = True)\n","SIMPLE_MODEL_history = train_model(SIMPLE_MODEL, 50, 30, 5)"],"execution_count":11,"outputs":[{"output_type":"stream","name":"stdout","text":["Model: \"sequential\"\n","_________________________________________________________________\n"," Layer (type) Output Shape Param # \n","=================================================================\n"," conv2d (Conv2D) (None, 30, 30, 32) 896 \n"," \n"," max_pooling2d (MaxPooling2D (None, 15, 15, 32) 0 \n"," ) \n"," \n"," conv2d_1 (Conv2D) (None, 13, 13, 64) 18496 \n"," \n"," max_pooling2d_1 (MaxPooling (None, 6, 6, 64) 0 \n"," 2D) \n"," \n"," conv2d_2 (Conv2D) (None, 4, 4, 64) 36928 \n"," \n"," flatten (Flatten) (None, 1024) 0 \n"," \n"," dense (Dense) (None, 64) 65600 \n"," \n"," dense_1 (Dense) (None, 100) 6500 \n"," \n","=================================================================\n","Total params: 128,420\n","Trainable params: 128,420\n","Non-trainable 
params: 0\n","_________________________________________________________________\n","Epoch 1/50\n","30/30 [==============================] - 9s 16ms/step - loss: 4.6040 - accuracy: 0.0109 - val_loss: 4.6045 - val_accuracy: 0.0172\n","Epoch 2/50\n","30/30 [==============================] - 0s 10ms/step - loss: 4.6026 - accuracy: 0.0107 - val_loss: 4.5965 - val_accuracy: 0.0141\n","Epoch 3/50\n","30/30 [==============================] - 0s 8ms/step - loss: 4.5977 - accuracy: 0.0156 - val_loss: 4.5958 - val_accuracy: 0.0125\n","Epoch 4/50\n","30/30 [==============================] - 0s 10ms/step - loss: 4.5892 - accuracy: 0.0203 - val_loss: 4.5869 - val_accuracy: 0.0156\n","Epoch 5/50\n","30/30 [==============================] - 0s 8ms/step - loss: 4.5793 - accuracy: 0.0161 - val_loss: 4.5750 - val_accuracy: 0.0156\n","Epoch 6/50\n","30/30 [==============================] - 0s 7ms/step - loss: 4.5597 - accuracy: 0.0201 - val_loss: 4.5606 - val_accuracy: 0.0203\n","Epoch 7/50\n","30/30 [==============================] - 0s 8ms/step - loss: 4.5262 - accuracy: 0.0258 - val_loss: 4.5338 - val_accuracy: 0.0203\n","Epoch 8/50\n","30/30 [==============================] - 0s 8ms/step - loss: 4.4898 - accuracy: 0.0253 - val_loss: 4.4744 - val_accuracy: 0.0188\n","Epoch 9/50\n","30/30 [==============================] - 0s 8ms/step - loss: 4.4764 - accuracy: 0.0232 - val_loss: 4.4875 - val_accuracy: 0.0172\n","Epoch 10/50\n","30/30 [==============================] - 0s 8ms/step - loss: 4.4354 - accuracy: 0.0297 - val_loss: 4.4201 - val_accuracy: 0.0328\n","Epoch 11/50\n","30/30 [==============================] - 0s 8ms/step - loss: 4.4150 - accuracy: 0.0276 - val_loss: 4.3638 - val_accuracy: 0.0422\n","Epoch 12/50\n","30/30 [==============================] - 0s 8ms/step - loss: 4.3732 - accuracy: 0.0378 - val_loss: 4.3530 - val_accuracy: 0.0328\n","Epoch 13/50\n","30/30 [==============================] - 0s 8ms/step - loss: 4.3227 - accuracy: 0.0440 - val_loss: 4.3296 - 
val_accuracy: 0.0453\n","Epoch 14/50\n","30/30 [==============================] - 0s 7ms/step - loss: 4.3011 - accuracy: 0.0544 - val_loss: 4.2790 - val_accuracy: 0.0594\n","Epoch 15/50\n","30/30 [==============================] - 0s 8ms/step - loss: 4.2811 - accuracy: 0.0461 - val_loss: 4.2956 - val_accuracy: 0.0484\n","Epoch 16/50\n","30/30 [==============================] - 0s 9ms/step - loss: 4.2068 - accuracy: 0.0617 - val_loss: 4.2259 - val_accuracy: 0.0594\n","Epoch 17/50\n","30/30 [==============================] - 0s 8ms/step - loss: 4.2053 - accuracy: 0.0591 - val_loss: 4.1944 - val_accuracy: 0.0625\n","Epoch 18/50\n","30/30 [==============================] - 0s 8ms/step - loss: 4.1777 - accuracy: 0.0698 - val_loss: 4.1646 - val_accuracy: 0.0688\n","Epoch 19/50\n","30/30 [==============================] - 0s 8ms/step - loss: 4.1536 - accuracy: 0.0688 - val_loss: 4.1479 - val_accuracy: 0.0641\n","Epoch 20/50\n","30/30 [==============================] - 0s 8ms/step - loss: 4.1566 - accuracy: 0.0688 - val_loss: 4.1805 - val_accuracy: 0.0781\n","Epoch 21/50\n","30/30 [==============================] - 0s 8ms/step - loss: 4.1033 - accuracy: 0.0802 - val_loss: 4.1246 - val_accuracy: 0.0750\n","Epoch 22/50\n","30/30 [==============================] - 0s 7ms/step - loss: 4.1058 - accuracy: 0.0833 - val_loss: 4.0961 - val_accuracy: 0.0844\n","Epoch 23/50\n","30/30 [==============================] - 0s 8ms/step - loss: 4.1057 - accuracy: 0.0773 - val_loss: 4.1447 - val_accuracy: 0.0812\n","Epoch 24/50\n","30/30 [==============================] - 0s 8ms/step - loss: 4.0758 - accuracy: 0.0812 - val_loss: 4.0685 - val_accuracy: 0.0766\n","Epoch 25/50\n","30/30 [==============================] - 0s 8ms/step - loss: 4.0446 - accuracy: 0.0951 - val_loss: 4.0530 - val_accuracy: 0.1047\n","Epoch 26/50\n","30/30 [==============================] - 0s 8ms/step - loss: 4.0492 - accuracy: 0.0930 - val_loss: 4.0672 - val_accuracy: 0.0781\n","Epoch 27/50\n","30/30 
[==============================] - 0s 8ms/step - loss: 4.0408 - accuracy: 0.0906 - val_loss: 4.1191 - val_accuracy: 0.0734\n","Epoch 28/50\n","30/30 [==============================] - 0s 8ms/step - loss: 3.9908 - accuracy: 0.1031 - val_loss: 4.0505 - val_accuracy: 0.0906\n","Epoch 29/50\n","30/30 [==============================] - 0s 8ms/step - loss: 4.0043 - accuracy: 0.0987 - val_loss: 4.0430 - val_accuracy: 0.0797\n","Epoch 30/50\n","30/30 [==============================] - 0s 8ms/step - loss: 4.0033 - accuracy: 0.0964 - val_loss: 3.9554 - val_accuracy: 0.1047\n","Epoch 31/50\n","30/30 [==============================] - 0s 8ms/step - loss: 3.9742 - accuracy: 0.1081 - val_loss: 4.0049 - val_accuracy: 0.1125\n","Epoch 32/50\n","30/30 [==============================] - 0s 8ms/step - loss: 3.9675 - accuracy: 0.0987 - val_loss: 4.0033 - val_accuracy: 0.0891\n","Epoch 33/50\n","30/30 [==============================] - 0s 8ms/step - loss: 3.9868 - accuracy: 0.1010 - val_loss: 3.9739 - val_accuracy: 0.1141\n","Epoch 34/50\n","30/30 [==============================] - 0s 8ms/step - loss: 3.9273 - accuracy: 0.1094 - val_loss: 3.9180 - val_accuracy: 0.1141\n","Epoch 35/50\n","30/30 [==============================] - 0s 8ms/step - loss: 3.9448 - accuracy: 0.1055 - val_loss: 4.0075 - val_accuracy: 0.1063\n","Epoch 36/50\n","30/30 [==============================] - 0s 8ms/step - loss: 3.9008 - accuracy: 0.1133 - val_loss: 3.9301 - val_accuracy: 0.1000\n","Epoch 37/50\n","30/30 [==============================] - 0s 8ms/step - loss: 3.8969 - accuracy: 0.1138 - val_loss: 3.9767 - val_accuracy: 0.1016\n","Epoch 38/50\n","30/30 [==============================] - 0s 8ms/step - loss: 3.9266 - accuracy: 0.1141 - val_loss: 4.0029 - val_accuracy: 0.0969\n","Epoch 39/50\n","30/30 [==============================] - 0s 8ms/step - loss: 3.8800 - accuracy: 0.1161 - val_loss: 3.9165 - val_accuracy: 0.1141\n","Epoch 40/50\n","30/30 [==============================] - 0s 9ms/step - loss: 3.8801 
- accuracy: 0.1159 - val_loss: 3.8286 - val_accuracy: 0.1312\n","Epoch 41/50\n","30/30 [==============================] - 0s 8ms/step - loss: 3.8655 - accuracy: 0.1255 - val_loss: 3.9459 - val_accuracy: 0.1172\n","Epoch 42/50\n","30/30 [==============================] - 0s 8ms/step - loss: 3.8472 - accuracy: 0.1273 - val_loss: 3.9387 - val_accuracy: 0.0969\n","Epoch 43/50\n","30/30 [==============================] - 0s 8ms/step - loss: 3.8921 - accuracy: 0.1169 - val_loss: 3.8542 - val_accuracy: 0.1281\n","Epoch 44/50\n","30/30 [==============================] - 0s 8ms/step - loss: 3.8983 - accuracy: 0.1172 - val_loss: 3.8667 - val_accuracy: 0.1063\n","Epoch 45/50\n","30/30 [==============================] - 0s 8ms/step - loss: 3.8460 - accuracy: 0.1289 - val_loss: 3.8271 - val_accuracy: 0.1234\n","Epoch 46/50\n","30/30 [==============================] - 0s 8ms/step - loss: 3.8294 - accuracy: 0.1221 - val_loss: 3.7809 - val_accuracy: 0.1297\n","Epoch 47/50\n","30/30 [==============================] - 0s 8ms/step - loss: 3.8391 - accuracy: 0.1266 - val_loss: 3.9372 - val_accuracy: 0.0922\n","Epoch 48/50\n","30/30 [==============================] - 0s 8ms/step - loss: 3.7936 - accuracy: 0.1289 - val_loss: 3.8227 - val_accuracy: 0.1437\n","Epoch 49/50\n","30/30 [==============================] - 0s 8ms/step - loss: 3.8124 - accuracy: 0.1302 - val_loss: 3.8468 - val_accuracy: 0.1281\n","Epoch 50/50\n","30/30 [==============================] - 0s 8ms/step - loss: 3.8220 - accuracy: 0.1292 - val_loss: 3.8324 - val_accuracy: 0.1141\n"]}]},{"cell_type":"markdown","metadata":{"id":"12z4zsoVbPWR"},"source":["### Transfer learning: VGG16\n","We will use a [VGG16](https://www.tensorflow.org/api_docs/python/tf/keras/applications/VGG16) pretrained on ImageNet, without the classification head."]},{"cell_type":"code","metadata":{"id":"6QJueWvUXUTJ","executionInfo":{"status":"ok","timestamp":1671005726048,"user_tz":-120,"elapsed":457,"user":{"displayName":"Giorgos 
Siolas","userId":"10127542075805046236"}}},"source":["# transfer learning: VGG16 trained on ImageNet without the top layer\n","\n","def init_VGG16_model(summary):\n"," vgg_model=tf.keras.applications.VGG16(input_shape=(32,32,3), include_top=False, weights='imagenet')\n"," \n"," # use the pretrained convolutional base directly as the first block of the new model\n"," VGG16_MODEL=vgg_model\n","\n"," # unfreeze conv layers\n"," VGG16_MODEL.trainable=True\n"," \n"," dropout_layer = tf.keras.layers.Dropout(rate = 0.5)\n"," global_average_layer = tf.keras.layers.GlobalAveragePooling2D()\n","\n"," # add top layer for CIFAR100 classification\n"," prediction_layer = tf.keras.layers.Dense(len(CIFAR100_LABELS_LIST),activation='softmax')\n"," model = tf.keras.Sequential([VGG16_MODEL, dropout_layer, global_average_layer, prediction_layer])\n"," model.compile(optimizer=tf.optimizers.Adam(learning_rate=0.00005), loss=tf.keras.losses.sparse_categorical_crossentropy, metrics=[\"accuracy\"])\n"," if summary: \n"," model.summary()\n"," return model"],"execution_count":14,"outputs":[]},{"cell_type":"code","metadata":{"id":"2bZChKpdh0Cn","colab":{"base_uri":"https://localhost:8080/"},"outputId":"ccd5968c-8fcf-42a6-968d-d9b642d30863","executionInfo":{"status":"ok","timestamp":1671005890743,"user_tz":-120,"elapsed":147219,"user":{"displayName":"Giorgos Siolas","userId":"10127542075805046236"}}},"source":["VGG16_MODEL = init_VGG16_model(True)\n","VGG16_MODEL_history = train_model(VGG16_MODEL, 50, 30, 5)\n","\n","#model_report(VGG16_MODEL, VGG16_MODEL_history, 30)"],"execution_count":15,"outputs":[{"output_type":"stream","name":"stdout","text":["Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5\n","58889256/58889256 [==============================] - 3s 0us/step\n","Model: \"sequential_1\"\n","_________________________________________________________________\n"," Layer (type) Output Shape Param # 
\n","=================================================================\n"," vgg16 (Functional) (None, 1, 1, 512) 14714688 \n"," \n"," dropout (Dropout) (None, 1, 1, 512) 0 \n"," \n"," global_average_pooling2d (G (None, 512) 0 \n"," lobalAveragePooling2D) \n"," \n"," dense_2 (Dense) (None, 100) 51300 \n"," \n","=================================================================\n","Total params: 14,765,988\n","Trainable params: 14,765,988\n","Non-trainable params: 0\n","_________________________________________________________________\n","Epoch 1/50\n","30/30 [==============================] - 4s 65ms/step - loss: 4.7653 - accuracy: 0.0120 - val_loss: 4.5828 - val_accuracy: 0.0312\n","Epoch 2/50\n","30/30 [==============================] - 2s 59ms/step - loss: 4.5938 - accuracy: 0.0128 - val_loss: 4.5713 - val_accuracy: 0.0156\n","Epoch 3/50\n","30/30 [==============================] - 2s 59ms/step - loss: 4.5901 - accuracy: 0.0141 - val_loss: 4.5555 - val_accuracy: 0.0156\n","Epoch 4/50\n","30/30 [==============================] - 2s 59ms/step - loss: 4.5692 - accuracy: 0.0151 - val_loss: 4.5264 - val_accuracy: 0.0250\n","Epoch 5/50\n","30/30 [==============================] - 2s 59ms/step - loss: 4.5299 - accuracy: 0.0240 - val_loss: 4.4326 - val_accuracy: 0.0531\n","Epoch 6/50\n","30/30 [==============================] - 2s 60ms/step - loss: 4.4970 - accuracy: 0.0299 - val_loss: 4.3707 - val_accuracy: 0.0875\n","Epoch 7/50\n","30/30 [==============================] - 2s 60ms/step - loss: 4.3968 - accuracy: 0.0484 - val_loss: 4.2963 - val_accuracy: 0.0828\n","Epoch 8/50\n","30/30 [==============================] - 2s 63ms/step - loss: 4.3287 - accuracy: 0.0625 - val_loss: 4.1478 - val_accuracy: 0.0906\n","Epoch 9/50\n","30/30 [==============================] - 2s 61ms/step - loss: 4.2562 - accuracy: 0.0802 - val_loss: 4.0623 - val_accuracy: 0.1000\n","Epoch 10/50\n","30/30 [==============================] - 2s 61ms/step - loss: 4.1256 - accuracy: 0.0956 - val_loss: 
3.9631 - val_accuracy: 0.1297\n","Epoch 11/50\n","30/30 [==============================] - 2s 60ms/step - loss: 3.9432 - accuracy: 0.1260 - val_loss: 3.8004 - val_accuracy: 0.1703\n","Epoch 12/50\n","30/30 [==============================] - 2s 61ms/step - loss: 3.8514 - accuracy: 0.1471 - val_loss: 3.5655 - val_accuracy: 0.2109\n","Epoch 13/50\n","30/30 [==============================] - 2s 61ms/step - loss: 3.7256 - accuracy: 0.1651 - val_loss: 3.3720 - val_accuracy: 0.2516\n","Epoch 14/50\n","30/30 [==============================] - 2s 61ms/step - loss: 3.6062 - accuracy: 0.1927 - val_loss: 3.3615 - val_accuracy: 0.2234\n","Epoch 15/50\n","30/30 [==============================] - 2s 60ms/step - loss: 3.5270 - accuracy: 0.1940 - val_loss: 3.1731 - val_accuracy: 0.2609\n","Epoch 16/50\n","30/30 [==============================] - 2s 61ms/step - loss: 3.3376 - accuracy: 0.2289 - val_loss: 3.0568 - val_accuracy: 0.2969\n","Epoch 17/50\n","30/30 [==============================] - 2s 61ms/step - loss: 3.2549 - accuracy: 0.2352 - val_loss: 2.9977 - val_accuracy: 0.2969\n","Epoch 18/50\n","30/30 [==============================] - 2s 61ms/step - loss: 3.2195 - accuracy: 0.2521 - val_loss: 2.9309 - val_accuracy: 0.2875\n","Epoch 19/50\n","30/30 [==============================] - 2s 61ms/step - loss: 3.0773 - accuracy: 0.2721 - val_loss: 2.8319 - val_accuracy: 0.3266\n","Epoch 20/50\n","30/30 [==============================] - 2s 61ms/step - loss: 2.9823 - accuracy: 0.3008 - val_loss: 2.7604 - val_accuracy: 0.3469\n","Epoch 21/50\n","30/30 [==============================] - 2s 62ms/step - loss: 2.9090 - accuracy: 0.3016 - val_loss: 2.7093 - val_accuracy: 0.3609\n","Epoch 22/50\n","30/30 [==============================] - 2s 61ms/step - loss: 2.8642 - accuracy: 0.3143 - val_loss: 2.5988 - val_accuracy: 0.3484\n","Epoch 23/50\n","30/30 [==============================] - 2s 62ms/step - loss: 2.6695 - accuracy: 0.3432 - val_loss: 2.4390 - val_accuracy: 0.3688\n","Epoch 
24/50\n","30/30 [==============================] - 2s 61ms/step - loss: 2.6098 - accuracy: 0.3589 - val_loss: 2.4339 - val_accuracy: 0.3594\n","Epoch 25/50\n","30/30 [==============================] - 2s 61ms/step - loss: 2.6127 - accuracy: 0.3508 - val_loss: 2.3084 - val_accuracy: 0.4109\n","Epoch 26/50\n","30/30 [==============================] - 2s 62ms/step - loss: 2.4964 - accuracy: 0.3708 - val_loss: 2.3636 - val_accuracy: 0.3734\n","Epoch 27/50\n","30/30 [==============================] - 2s 62ms/step - loss: 2.4971 - accuracy: 0.3815 - val_loss: 2.3976 - val_accuracy: 0.4031\n","Epoch 28/50\n","30/30 [==============================] - 2s 60ms/step - loss: 2.4456 - accuracy: 0.3935 - val_loss: 2.2308 - val_accuracy: 0.4516\n","Epoch 29/50\n","30/30 [==============================] - 2s 61ms/step - loss: 2.4140 - accuracy: 0.3917 - val_loss: 2.2595 - val_accuracy: 0.4344\n","Epoch 30/50\n","30/30 [==============================] - 2s 61ms/step - loss: 2.3683 - accuracy: 0.4031 - val_loss: 2.1995 - val_accuracy: 0.4453\n","Epoch 31/50\n","30/30 [==============================] - 2s 62ms/step - loss: 2.3262 - accuracy: 0.4102 - val_loss: 2.1314 - val_accuracy: 0.4516\n","Epoch 32/50\n","30/30 [==============================] - 2s 61ms/step - loss: 2.2991 - accuracy: 0.4120 - val_loss: 2.0713 - val_accuracy: 0.4688\n","Epoch 33/50\n","30/30 [==============================] - 2s 61ms/step - loss: 2.2715 - accuracy: 0.4122 - val_loss: 2.0959 - val_accuracy: 0.4609\n","Epoch 34/50\n","30/30 [==============================] - 2s 61ms/step - loss: 2.1794 - accuracy: 0.4479 - val_loss: 2.0017 - val_accuracy: 0.4688\n","Epoch 35/50\n","30/30 [==============================] - 2s 61ms/step - loss: 2.0622 - accuracy: 0.4641 - val_loss: 2.0641 - val_accuracy: 0.4656\n","Epoch 36/50\n","30/30 [==============================] - 2s 61ms/step - loss: 2.0812 - accuracy: 0.4656 - val_loss: 2.0374 - val_accuracy: 0.4578\n","Epoch 37/50\n","30/30 [==============================] 
- 2s 61ms/step - loss: 2.0821 - accuracy: 0.4656 - val_loss: 1.9930 - val_accuracy: 0.4609\n","Epoch 38/50\n","30/30 [==============================] - 2s 61ms/step - loss: 2.0199 - accuracy: 0.4794 - val_loss: 2.0539 - val_accuracy: 0.4750\n","Epoch 39/50\n","30/30 [==============================] - 2s 60ms/step - loss: 1.9943 - accuracy: 0.4857 - val_loss: 1.9744 - val_accuracy: 0.4781\n","Epoch 40/50\n","30/30 [==============================] - 2s 61ms/step - loss: 2.0240 - accuracy: 0.4734 - val_loss: 1.9476 - val_accuracy: 0.4781\n","Epoch 41/50\n","30/30 [==============================] - 2s 62ms/step - loss: 1.9945 - accuracy: 0.4826 - val_loss: 1.9304 - val_accuracy: 0.4906\n","Epoch 42/50\n","30/30 [==============================] - 2s 61ms/step - loss: 1.9549 - accuracy: 0.4977 - val_loss: 1.9342 - val_accuracy: 0.4875\n","Epoch 43/50\n","30/30 [==============================] - 2s 61ms/step - loss: 1.9270 - accuracy: 0.5036 - val_loss: 1.8346 - val_accuracy: 0.5281\n","Epoch 44/50\n","30/30 [==============================] - 2s 62ms/step - loss: 1.9759 - accuracy: 0.4859 - val_loss: 1.9668 - val_accuracy: 0.4766\n","Epoch 45/50\n","30/30 [==============================] - 2s 61ms/step - loss: 1.7994 - accuracy: 0.5273 - val_loss: 1.9548 - val_accuracy: 0.4734\n","Epoch 46/50\n","30/30 [==============================] - 2s 61ms/step - loss: 1.8003 - accuracy: 0.5143 - val_loss: 1.7756 - val_accuracy: 0.5266\n","Epoch 47/50\n","30/30 [==============================] - 2s 60ms/step - loss: 1.7769 - accuracy: 0.5320 - val_loss: 1.9514 - val_accuracy: 0.4922\n","Epoch 48/50\n","30/30 [==============================] - 2s 60ms/step - loss: 1.7765 - accuracy: 0.5271 - val_loss: 1.8205 - val_accuracy: 0.5063\n","Epoch 49/50\n","30/30 [==============================] - 2s 60ms/step - loss: 1.6827 - accuracy: 0.5594 - val_loss: 1.7065 - val_accuracy: 0.5656\n","Epoch 50/50\n","30/30 [==============================] - 2s 60ms/step - loss: 1.6864 - accuracy: 0.5516 
- val_loss: 1.8616 - val_accuracy: 0.5203\n"]}]},{"cell_type":"markdown","metadata":{"id":"OadPASXYOllM"},"source":["### Transfer learning: ResNet152V2\n","We will use a [ResNet152V2](https://www.tensorflow.org/api_docs/python/tf/keras/applications/ResNet152V2) pretrained on ImageNet, without its classification head."]},{"cell_type":"code","metadata":{"id":"-6WG_6YJM4T7"},"source":["# transfer learning: ResNet152V2 trained on ImageNet without the top layer\n","\n","\n","def init_ResNet152V2_model(summary):\n","    # convolutional base pretrained on ImageNet, without the classification head\n","    base_model = tf.keras.applications.ResNet152V2(input_shape=(32,32,3), include_top=False, weights='imagenet')\n","\n","    # unfreeze conv layers\n","    base_model.trainable = True\n","\n","    # dropout_layer = tf.keras.layers.Dropout(rate = 0.5)\n","    # global_average_layer = tf.keras.layers.GlobalAveragePooling2D()\n","\n","    # add top layer for CIFAR100 classification\n","    prediction_layer = tf.keras.layers.Dense(len(CIFAR100_LABELS_LIST),activation='softmax')\n","    model = tf.keras.Sequential([base_model, prediction_layer])\n","    # model = tf.keras.Sequential([base_model, dropout_layer, global_average_layer, prediction_layer])\n","    model.compile(optimizer=tf.optimizers.Adam(learning_rate=0.00005), loss=tf.keras.losses.sparse_categorical_crossentropy, metrics=[\"accuracy\"])\n","    if summary:\n","        model.summary()\n","    return model"],"execution_count":null,"outputs":[]},{"cell_type":"code","metadata":{"id":"C3TznZ7ENl30"},"source":["ResNet152V2_MODEL = init_ResNet152V2_model(True)\n","ResNet152V2_MODEL_history = train_model(ResNet152V2_MODEL, 50, 30, 5)"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"HUfob-kNfaSD"},"source":["# Improving performance through experiments\n","\n","With the small number of epochs we set, the three networks above suggest that \"bigger is better\" does not hold here, since VGG16 appears to be the best suited to the task.\n","\n","- Can you offer some explanations for the performance of the three models?\n","\n","- Can you find a method (model and training procedure) that improves performance on CIFAR-100?\n","\n","A word of wisdom on this topic: the right model depends on three things: *how much time we have, what computational resources are available, and what performance we consider good.*\n","\n","Next, we look at some of the options available for improving the performance of our networks through experiments."]},{"cell_type":"markdown","metadata":{"id":"FiVKTH5rmhWn"},"source":["## Trying different models\n","\n","You can either try models \"from scratch\", where you define the network architecture however you like, or use transfer learning.\n"]},{"cell_type":"markdown","metadata":{"id":"pAbqI8hapbX9"},"source":["\n","### Models \"from scratch\"\n","\n","You can modify or replace the small convolutional network of the initial example. You can consult\n","- the [literature from the CIFAR-100 leaderboard](https://benchmarks.ai/cifar-100) for network architectures and parameters\n","- and/or take ideas [from a relevant Google Scholar search](https://scholar.google.gr/scholar?hl=en&as_sdt=0%2C5&q=cifar+100+cnn&oq=cifa)"]},{"cell_type":"markdown","metadata":{"id":"8LZIFj-AmlPv"},"source":["### Transfer learning\n","\n","Alternatively, you can use [transfer learning in tf2](https://www.tensorflow.org/tutorials/images/transfer_learning). Unlike \"from scratch\" models, transfer learning gives us ready-made models with a predefined architecture, to which we can generally only add layers; these are usually limited to fully connected layers specialized for the particular classification task at hand. "]},{"cell_type":"markdown","metadata":{"id":"GrCxOjJ7ush3"},"source":["#### Training the weights\n","\n","Along with the architecture, transfer learning also imports the knowledge the model has acquired, that is, the values of its weights as they resulted from training, usually on the (huge) ImageNet. When we import a model for transfer learning, we have three options for training:\n","- freeze the convolutional base and train only the classification head. This amounts to using the convolutional base for feature extraction; set the flag trainable = False.\n","- continue training all layers of the network; set the flag trainable = True.\n","- train only a fraction of the layers, those closest to the network's output. Here the trainable flags must be set per layer.\n"]},{"cell_type":"markdown","metadata":{"id":"6WFvmWr9xEUz"},"source":["\n","#### Models available for transfer learning in tf2\n","\n","1. tf.keras.applications. The simplest way to do transfer learning in tf2 is through [tf.keras.applications](https://www.tensorflow.org/api_docs/python/tf/keras/applications), which provides pretrained Keras models, specifically the networks DenseNet, Inception-ResNet V2, Inception V3, MobileNet v1, MobileNet v2, NASNet-A, ResNet, ResNet v2, VGG16, VGG19 and Xception V1. The models are imported in much the same way as shown above for VGG16.\n","\n","2. TensorFlow Hub. You can also use models available in the [TensorFlow Hub](https://tfhub.dev/s?fine-tunable=yes&module-type=image-augmentation,image-classification,image-feature-vector,image-generator,image-object-detection,image-others,image-style-transfer,image-rnn-agent&tf-version=tf2) repository, which includes more than 100 pretrained models.\n","\n","3. Saved models from third-party sources. 
You can also do transfer learning from third-party sources, importing either the whole network (architecture and weights) or only the architecture or only the weights. The model must have been saved in one of two formats: the Keras HDF5 format (.h5 or .keras) or the SavedModel format mentioned in the introduction. The weights can also be imported on their own as Checkpoints. For more, read the guides [\"Save and load models\"](https://www.tensorflow.org/tutorials/keras/save_and_load), [\"Save and serialize\"](https://www.tensorflow.org/guide/keras/save_and_serialize), [\"Using the SavedModel format\"](https://www.tensorflow.org/guide/saved_model), and see, for example, how to do transfer learning from the state-of-the-art EfficientNets ([1](https://www.dlology.com/blog/transfer-learning-with-efficientnet/), [2](https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet)).\n","\n","Note that many models require larger input dimensions than those of CIFAR-100, so the data must be [resized](https://www.tensorflow.org/api_docs/python/tf/image/resize). Be careful, however, with the memory requirements when these transformations are applied directly to in-memory variables (see the section on memory management below). \n"]},{"cell_type":"markdown","metadata":{"id":"qTOi1CQsPOTT"},"source":["#### Zoos and Gardens\n","\n","![](https://camo.githubusercontent.com/ffc27b8ea00af45c0e71785dda0d04658e964a1f8df583634637c61123add357/68747470733a2f2f73746f726167652e676f6f676c65617069732e636f6d2f6d6f64656c5f67617264656e5f6172746966616374732f54465f4d6f64656c5f47617264656e2e706e67)\n","\n","Various libraries, not only TensorFlow, call their collection of trained models a \"Zoo\" or a \"Garden\". 
One example is the [TensorFlow Model Garden](https://github.com/tensorflow/models)."]},{"cell_type":"markdown","metadata":{"id":"m9J7yIJrr4Ku"},"source":["### Data augmentation\n","\n","One technique that can give you good results is data augmentation. It lets us create greater variety in the data by applying random but realistic transformations to the images, such as rotation.\n","\n","We can do data augmentation in two ways: with Keras preprocessing layers, or with tf.image. See [here](https://www.tensorflow.org/tutorials/images/data_augmentation) the relevant TensorFlow documentation and [here](https://stepup.ai/train_data_augmentation_keras/) a practical example on CIFAR-10."]},{"cell_type":"markdown","metadata":{"id":"SRI_3XhBQ7sb"},"source":["## Notes on optimization"]},{"cell_type":"markdown","metadata":{"id":"mtVx8MsZRQrn"},"source":["### Memory management (TFRecord)\n","\n","Loading the data the way we did above in the simple example implementation is very convenient, but it is not at all efficient in terms of memory management. Specifically, the data are stored directly in variables, which together occupy the RAM of the CPU or the GPU; this makes it impractical to handle large datasets or to transform the data, as we do in data augmentation.\n","\n","To get around this problem, the data can be serialized and stored in medium-sized files (a few MB each) that can be read linearly. TFRecord is a format for storing a sequence of binary records. 
Read the relevant guides [TFRecord and tf.Example](https://www.tensorflow.org/tutorials/load_data/tfrecord) and [tf.data: Build TensorFlow input pipelines](https://www.tensorflow.org/guide/data). \n","\n","Note that with this method we need to import `tensorflow_datasets` and use `tfds.load`, so that the dataset is stored in tfrecord files on disk (see [here](https://colab.research.google.com/github/tensorflow/datasets/blob/master/docs/overview.ipynb) for an example). Of course, we can also convert a dataset's raw data, such as jpg files, to the tfrecord format, as shown [here](https://towardsdatascience.com/working-with-tfrecords-and-tf-train-example-36d111b3ff4d).\n","\n"]},{"cell_type":"markdown","metadata":{"id":"yypACH_oZx_i"},"source":["### Overfitting\n","\n","You can experiment with controlling overfitting in various ways. Among them are the following:\n","- Early stopping. A method that stops training when the monitored performance metric stops improving. [tf.keras.callbacks.EarlyStopping](https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/EarlyStopping)\n","- Dropout. Another technique for reducing overfitting is Dropout. It is a form of regularization: when dropout is applied to a layer of the network, a fraction of that layer's output units is randomly set to zero during training. [Dropout](https://www.tensorflow.org/tutorials/images/classification#dropout)\n","- Data augmentation. Overfitting usually occurs when we have few and/or very similar training examples. One way to address this problem is to augment the data. 
Data augmentation creates new training data from the existing data by applying random transformations that yield believable-looking images. [Data augmentation](https://www.tensorflow.org/tutorials/images/classification#data_augmentation), [ImageDataGenerator](https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator#class_imagedatagenerator)\n","\n","See also [Image classification](https://www.tensorflow.org/tutorials/images/classification)."]},{"cell_type":"markdown","metadata":{"id":"MBLBjFFFmsUP"},"source":["### Training time\n","\n","TensorFlow 2 introduces or improves several mechanisms for optimizing training. Among them are the following:\n","- Data prefetching (we used it above)\n","- Data reading parallelization \n","- Map transformation parallelization\n","- Caching\n","- Reducing memory footprint\n","\n","For the above, consult [Better performance with the tf.data API](https://www.tensorflow.org/guide/data_performance)"]},{"cell_type":"markdown","metadata":{"id":"s3ddu1ECoCGQ"},"source":["### High-level tools\n","\n","Among TensorFlow's high-level optimization tools are the following:\n","\n","- [TensorBoard](https://www.tensorflow.org/tensorboard/get_started) and [What-If Tool](https://www.tensorflow.org/tensorboard/what_if_tool): auxiliary visualization/analysis tools for experimenting with training\n","- [tf-explain](https://tf-explain.readthedocs.io/en/latest/): provides explainability methods for tf2\n","- [Keras Tuner](https://github.com/keras-team/keras-tuner): hyperparameter optimization for Keras in TensorFlow 2.0\n","- [AutoAugment](https://github.com/tensorflow/models/tree/master/research/autoaugment): learns the augmentation policy from the data"]},{"cell_type":"markdown","metadata":{"id":"VNHFNS981Qh0"},"source":["# For the Christmas holidays\n"]},{"cell_type":"markdown","source":["\n","## What's next for Deep Learning\n","\n","The godfathers of AI and winners of the 2018 ACM Turing Award, Geoffrey Hinton, Yann LeCun and Yoshua Bengio, shared the stage in New York on the evening of Sunday, February 9, at an event organized by the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2020). \n","\n","The trio of researchers made deep neural networks a critical component of computing; in individual talks and a panel discussion, they shared their views on the current challenges facing deep learning and where it should be headed."],"metadata":{"id":"uzoS4hnnY1pN"}},{"cell_type":"code","metadata":{"id":"ECLw-wy_wGvm"},"source":["from IPython.display import IFrame\n","IFrame(src='https://www.youtube.com/embed/UX8OubxsY8w', width=640, height=480)"],"execution_count":null,"outputs":[]},{"cell_type":"markdown","metadata":{"id":"mwhiplJkzR1W"},"source":["## Counterpoint\n"]},{"cell_type":"markdown","source":["\n","### Schmidhuber's critique of LBH\n","\n","(Schmidhuber and Hochreiter [introduced LSTM in 1997](https://www.bioinf.jku.at/publications/older/2604.pdf))\n","\n","*Machine learning is the science of credit assignment. The machine learning community itself profits from proper credit assignment to its members. The inventor of an important method should get credit for inventing it. She may not always be the one who popularizes it. Then the popularizer should get credit for popularizing it (but not for inventing it). Relatively young research areas such as machine learning should adopt the honor code of mature fields such as mathematics: if you have a new theorem, but use a proof technique similar to somebody else's, you must make this very clear. 
If you \"re-invent\" something that was already known, and only later become aware of this, you must at least make it clear later.*\n","\n","*As a case in point, let me now comment on a recent article in [Nature (2015) about \"deep learning\"](http://www.nature.com/nature/journal/v521/n7553/full/nature14539.html) in artificial neural networks (NNs), by LeCun & Bengio & Hinton (LBH for short), three CIFAR-funded collaborators who call themselves the \"deep learning conspiracy\" (e.g., LeCun, 2015). They heavily cite each other. Unfortunately, however, they fail to credit the pioneers of the field, which originated half a century ago. All references below are taken from the recent [deep learning overview](http://www.idsia.ch/~juergen/deep-learning-overview.html) (Schmidhuber, 2015), except for a few papers listed beneath this critique focusing on nine items.*\n","\n","[Full text](http://people.idsia.ch/~juergen/deep-learning-conspiracy.html)\n","\n","\n","In his first claim, Schmidhuber writes:\n","\n","LBH's survey does not even mention the father of deep learning, Alexey Grigorevich Ivakhnenko, who published the first general, working learning algorithms for deep networks (e.g., Ivakhnenko and Lapa, 1965). \n","\n","We tracked down the work in question, \"Ivakhnenko, A. G. and Lapa, V. G. (1965). Cybernetic Predicting Devices. CCM Information Corporation.\", in its 1966 translation, [here](https://drive.google.com/file/d/1cSKixS3_kaVghwETTvReQcIpFisJ-H6Y/view?usp=sharing).\n","\n","![](https://i.imgur.com/bghR7pk.png)\n","\n","Look at chapter 4, page 148 of the text (page 155 of the PDF), to see whether Schmidhuber is right!\n"],"metadata":{"id":"j5jOycHJzT4F"}}]}