Introduction to Machine Learning with Python:

The book:
https://www.amazon.com/Introduction-Machine-Learning-Python-Scientists/dp/1449369413
The book's code:
https://github.com/amueller/introduction_to_ml_with_python

Setup environment

cd <environments directory>
which python3
virtualenv -p /usr/bin/python3 ml
source ml/bin/activate
pip install -U numpy scipy matplotlib ipython scikit-learn wheel pandas opencv-python imutils Pillow keras tensorflow requests
sudo apt-get install python3-tk
sudo apt-get install -y libsm6 libxext6 libxrender-dev libglib2.0-0

Steps:
Simplest Prediction
Learn With Strengths
Complex Prediction
Survey ML

Existing Datasets
MNIST to PNG: https://github.com/myleott/mnist_png

Create NumPy Dataset
Working with arrays: https://www.hackerearth.com/practice/machine-learning/data-manipulation-visualisation-r-python/tutorial-data-manipulation-numpy-pandas-python/tutorial/
Images dataset: https://www.youtube.com/watch?v=2OUMaahyxic
https://medium.com/@muskulpesent/create-numpy-array-of-images-fecb4e514c4b
https://www.kaggle.com/lgmoneda/from-image-files-to-numpy-arrays
https://www.researchgate.net/post/How_to_create_MNIST_type_database_from_images
https://www.geeksforgeeks.org/ml-label-encoding-of-datasets-in-python/
Search term: "From image files to NumPy arrays"
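
The links above share one pattern: read each image file, convert it to a fixed-size array, stack the arrays, and encode the class names as integers. The following is a minimal sketch of that pattern, not code taken from any of the linked pages. It assumes one sub-folder per class (the layout mnist_png produces); the helper name load_image_folder and the default 28x28 size are assumptions made for illustration.

```python
import os

import numpy as np
from PIL import Image


def load_image_folder(root, size=(28, 28)):
    """Load root/<class_name>/<image files> into arrays X and y.

    Hypothetical helper: assumes every sub-folder of root is one class
    and contains only image files.
    """
    class_names = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    images, labels = [], []
    for label, name in enumerate(class_names):
        class_dir = os.path.join(root, name)
        for filename in sorted(os.listdir(class_dir)):
            path = os.path.join(class_dir, filename)
            with Image.open(path) as img:
                # 8-bit grayscale, resized so every array has the same shape.
                # Note: size is (width, height); the array is (height, width).
                gray = img.convert("L").resize(size)
                images.append(np.asarray(gray, dtype=np.uint8))
            labels.append(label)  # integer label = index into class_names
    X = np.stack(images)  # shape: (n_images, height, width)
    y = np.array(labels)
    return X, y, class_names
```

For scikit-learn estimators, which expect one row per sample, the images can be flattened and scaled with X.reshape(len(X), -1) / 255.0.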

Upload Dataset
https://github.com/jaberg/skdata/wiki/How-to-Create-a-New-Dataset-Module

Choosing the right estimator

SKLearn Classifiers

Try out all Classifiers:
Grid Search and Pipelines
https://www.kdnuggets.com/2018/01/managing-machine-learning-workflows-scikit-learn-pipelines-part-3.html
https://stackoverflow.com/questions/23045318/scikit-grid-search-over-multiple-classifiers
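
As a rough illustration of the idea in the two links above, a single GridSearchCV can compare several classifiers when the pipeline step itself is listed as a parameter. This is a sketch under assumptions: the iris dataset, the three classifiers, and the parameter values are arbitrary choices for the example, not recommendations from the linked pages.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, random_state=0, stratify=y
)

# The "clf" step is a placeholder; the grid below swaps it out.
pipe = Pipeline([("scale", StandardScaler()), ("clf", SVC())])

# One dict per classifier, each with that classifier's own parameters.
param_grid = [
    {"clf": [SVC()], "clf__C": [0.1, 1, 10], "clf__kernel": ["linear", "rbf"]},
    {"clf": [LogisticRegression(max_iter=1000)], "clf__C": [0.1, 1, 10]},
    {"clf": [KNeighborsClassifier()], "clf__n_neighbors": [3, 5, 7]},
]

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_train, y_train)

print("best classifier:", search.best_params_["clf"])
print("cross-validation accuracy:", round(search.best_score_, 3))
print("test accuracy:", round(search.score(X_test, y_test), 3))
```

Keeping the scaler inside the pipeline means it is refit on each cross-validation fold, so no information from the validation fold leaks into preprocessing.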

Extract Text:
https://github.com/factful/ocr_testing
https://medium.com/@winston.smith.spb/python-ocr-for-pdf-or-compare-textract-pytesseract-and-pyocr-acb19122f38c
https://source.opennews.org/articles/so-many-ocr-options/

Clean text:
https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_morphological_ops/py_morphological_ops.html

Sentences that are the same:
https://github.com/seatgeek/fuzzywuzzy
https://bergvca.github.io/2017/10/14/super-fast-string-matching.html
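
A minimal sketch of near-duplicate sentence detection. It deliberately uses difflib from the standard library as a stand-in for fuzzywuzzy, so nothing extra needs installing. The function names and the threshold of 90 are assumptions for illustration, and the scores will not necessarily match fuzzywuzzy's.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0-100 similarity score, ignoring case and extra whitespace."""
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return round(100 * SequenceMatcher(None, norm_a, norm_b).ratio())


def near_duplicates(sentences, threshold=90):
    """Return (i, j, score) for every pair scoring at or above threshold."""
    pairs = []
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            score = similarity(sentences[i], sentences[j])
            if score >= threshold:
                pairs.append((i, j, score))
    return pairs


sentences = [
    "The invoice total is 100 dollars.",
    "The invoice total is 100 dollars",
    "Completely different text here",
]
print(near_duplicates(sentences))
```

Comparing every pair is quadratic in the number of sentences, which is fine for small lists; for large ones a vectorized approach such as the one in the second link is likely a better fit.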

Image Object Stitching:
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4081273/

Follow Technologies:
https://www.ibm.com/cloud/watson-studio

https://stackoverflow.com/questions/48866205/detect-whether-the-checkbox-is-checked-using-opencv

Cool ML Stuff

Other Useful Apps

opencv

https://greydanus.github.io/2016/08/21/handwriting/
https://towardsdatascience.com/build-a-handwritten-text-recognition-system-using-tensorflow-2326a3487cd5
https://arxiv.org/abs/1904.08095
https://medium.com/@pechyonkin/part-iv-capsnet-architecture-6a64422f7dce
https://www.researchgate.net/publication/37434879_Cursive_Character_Challenge_a_New_Database_for_Machine_Learning_and_Pattern_Recognition
https://www.google.com/amp/s/amp.reddit.com/r/MachineLearning/comments/42tuhx/offline_cursive_handwriting_recognition_in_python/
https://github.com/githubharald/SimpleHTR
https://link.springer.com/chapter/10.1007/978-3-540-76280-5_10
http://publications.idiap.ch/index.php/publications/show/364
https://distill.pub/2016/handwriting/

