An introduction to machine learning

A brief overview of the area of machine learning

Jason J Pulikkottil
6 min read · Jul 19, 2022

Machine learning is the study of how computers learn from data, and it is regarded as a branch of artificial intelligence. Machine learning algorithms are used in a variety of areas, including computer vision and healthcare. Some implementations use neural networks, which are loosely modelled on the way the brain operates.

Machine learning is a field that uses a range of approaches to teach computers to perform tasks by learning from available data. For simple tasks, it is easy to write algorithms that tell the computer exactly how to carry out every step needed to solve the problem at hand. For more complex tasks, writing those steps by hand becomes impractical, and it is often more effective to let the computer work out the procedure from examples.

Arthur Samuel, an IBM employee, coined the term machine learning in 1959. Around that time, the term self-teaching computers was also in use.
Modern machine learning has two goals: to classify data and to predict future outcomes.

In the early days of artificial intelligence, researchers were interested in having machines learn from data. By 1980, expert systems had come to dominate AI, and statistics had fallen out of favour. Machine learning, reorganised as a separate field, began to grow in the 1990s. It shifted its focus from creating artificial intelligence to solving practical problems. ML learns and predicts from passive observations, whereas AI requires an agent to interact with the environment in order to learn.

Machine learning and data mining frequently use the same methods and overlap substantially. Data mining is concerned with discovering previously unknown properties of data. In machine learning, performance is usually measured by the ability to reproduce known knowledge; in data mining, the main aim is to uncover knowledge that was not known before.

Loss functions express the discrepancy between the model's predictions and the actual values in the training examples. Models are trained to minimise a loss function on a training set of examples, which is why machine learning is closely tied to optimization.
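To make the link between loss functions and optimization concrete, here is a minimal sketch in plain Python. It assumes a one-parameter model `y_hat = w * x`, a mean squared error loss, and plain gradient descent; the data, the learning rate, and the function names are illustrative choices, not something prescribed by any library.

```python
def mse_loss(w, xs, ys):
    """Mean squared error of the one-parameter model y_hat = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def mse_gradient(w, xs, ys):
    """Derivative of the mean squared error with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


# Toy training set (illustrative): the targets follow y = 2x exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = 0.0               # initial guess for the weight
learning_rate = 0.05  # step size, picked by hand for this toy data

for _ in range(200):
    # Step against the gradient to reduce the loss.
    w -= learning_rate * mse_gradient(w, xs, ys)

print(round(w, 3), round(mse_loss(w, xs, ys), 6))
```

Each pass nudges `w` in the direction that lowers the loss, so on this toy data the weight settles very close to 2 and the loss approaches zero.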

Characterizing the generalisation of various learning algorithms is a current research subject, particularly for deep-learning algorithms.

Machine learning and statistics are closely related fields with distinct goals.
Statistics infers population trends from a sample, whereas machine learning discovers generalizable prediction patterns.
Some statisticians have integrated machine learning approaches, resulting in a hybrid area known as statistical learning.

A learner’s main goal is to generalise from its experience.
Generalization refers to a learning machine’s capacity to perform accurately on new, previously unseen examples/tasks.
The computational analysis of machine learning algorithms and their performance is a branch of theoretical computer science.

Classification of machine learning methods:

1. Supervised learning:

The goal is for the computer to learn a general rule that maps inputs to their desired outputs. Supervised learning algorithms construct a mathematical model of a set of data that includes both the inputs and the desired outputs.
If the learned function is optimal, the algorithm will be able to accurately predict the output for inputs that were not part of the training data.
Similarity learning is a supervised machine learning subfield that is closely connected to regression and classification.

2. Unsupervised learning:

No labels are supplied to the computer, leaving it to find structure in its data on its own, either as a goal in itself or as a means to an end. Unsupervised learning algorithms take a data set that contains only inputs and find structure in it, such as groupings or clusters of data points. They learn from data that has not been labelled, classified, or categorised. Cluster analysis is the division of observations into subsets, known as clusters, based on predetermined criteria.

3. Reinforcement learning:

Reinforcement learning is an area of machine learning that deals with how software agents should act in a given environment so as to maximise a cumulative reward. Many other fields, including game theory and operations research, also study the topic. Reinforcement learning algorithms are employed in self-driving cars and in learning to play a game against a human opponent.
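As a rough illustration of an agent learning from rewards, the sketch below runs tabular Q-learning on a made-up environment: five states in a row, where the agent starts at the left end and receives a reward of 1 only when it reaches the right end. The environment, the constants, and the purely random exploration are all assumptions chosen to keep the example small.

```python
import random

N_STATES = 5          # states 0..4 in a row; state 4 is the goal
ACTIONS = [0, 1]      # 0 = move left, 1 = move right
ALPHA, GAMMA = 0.5, 0.9   # learning rate and discount factor


def step(state, action):
    """Deterministic toy environment: reward 1 only on reaching the goal."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


random.seed(0)
q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action]

for episode in range(500):
    state = 0
    for t in range(100):                     # cap on steps per episode
        action = random.choice(ACTIONS)      # purely exploratory behaviour
        next_state, reward = step(state, action)
        # Q-learning update: move the estimate towards
        # reward + discounted value of the best next action.
        target = reward + GAMMA * max(q[next_state])
        q[state][action] += ALPHA * (target - q[state][action])
        state = next_state
        if state == N_STATES - 1:
            break

# Greedy policy read off the learned table (1 means "move right").
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

Even though the agent behaves randomly while learning, the table it builds favours moving right in every state, which is the shortest route to the reward.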

Top ten most used machine learning algorithms

1. Linear regression

Linear regression analysis predicts the value of one variable based on the value of another variable. It estimates the coefficients of a linear equation involving one or more independent variables. Linear regression fits a straight line or surface that minimises the difference between predicted and actual values.
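For a single input variable, the best-fitting line can be computed directly. The sketch below uses the ordinary least squares formulas (slope as covariance over variance); the function name `fit_line` and the study-hours data are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one input variable)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative data: exam score as a function of hours studied.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]

slope, intercept = fit_line(hours, scores)
predicted = slope * 6.0 + intercept   # prediction for an unseen input
print(slope, intercept, predicted)
```

With more than one independent variable the same idea applies, but the coefficients are usually found with matrix methods from a library such as numpy or scikit-learn.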

2. Logistic regression

Logistic regression is a statistical model that is frequently used for classification and predictive analytics. Based on a given dataset of independent variables, it estimates the probability of an event occurring, such as voting or not voting. Because the outcome is a probability, the dependent variable is bounded between 0 and 1.
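The reason the output stays between 0 and 1 is the logistic (sigmoid) function, which squashes any real-valued score into that interval. The sketch below shows only the prediction step; the weights and bias are hand-picked, hypothetical numbers rather than coefficients fitted to real data.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(features, weights, bias):
    """Probability of the positive class for one example."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical, hand-picked coefficients (not fitted to any data set).
weights = [0.8, -0.5]
bias = -1.0

p = predict_probability([3.0, 1.0], weights, bias)
label = 1 if p >= 0.5 else 0   # threshold the probability to get a class
print(round(p, 3), label)
```

In practice the weights are learned by minimising a loss on labelled data, and the 0.5 threshold can be moved depending on which kind of error is more costly.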

3. Decision tree

The Decision Tree algorithm belongs to the family of supervised learning algorithms. It can be used to solve both regression and classification problems.
The purpose of employing a Decision Tree is to build a model that can predict the class or value of the target variable by learning simple decision rules from the training data.
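A tree is built by repeatedly choosing the split that separates the classes most cleanly. The sketch below shows one such step for a single numeric feature, using Gini impurity as the measure of how mixed a group is. This is one common criterion among several, and the data and function names are illustrative.

```python
def gini(labels):
    """Gini impurity: 0 for a pure group, higher for mixed groups."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Return the threshold on one feature with the lowest weighted Gini impurity."""
    best_threshold, best_score = None, float("inf")
    candidates = sorted(set(values))
    # Try thresholds halfway between consecutive distinct values.
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Illustrative data: one numeric feature and a class label.
ages = [22, 25, 30, 35, 40, 45]
buys = ["no", "no", "no", "yes", "yes", "yes"]

threshold, impurity = best_split(ages, buys)
print(threshold, impurity)
```

A full decision tree applies the same search recursively to each resulting group until the groups are pure or a depth limit is reached.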

4. SVM algorithm

Support Vector Machine, or SVM, is a popular supervised learning approach for classification and regression problems. The SVM algorithm’s purpose is to find the best line or decision boundary (a hyperplane) that separates n-dimensional space into classes.
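What makes one boundary better than another is its margin: the distance from the boundary to the closest training point. The sketch below only measures the margin of two hand-picked candidate lines; it does not perform the optimisation that an actual SVM solver carries out, and all names and numbers are illustrative.

```python
import math


def geometric_margin(w, b, points, labels):
    """Smallest signed distance of any point to the hyperplane w.x + b = 0.

    Labels are +1 or -1. A positive result means every point lies on its
    correct side; a larger value means a wider margin.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    )


points = [(1.0, 1.0), (2.0, 1.0), (4.0, 4.0), (5.0, 4.0)]
labels = [-1, -1, 1, 1]

# Two hypothetical separating lines, chosen by hand.
margin_a = geometric_margin((1.0, 1.0), -5.5, points, labels)   # x + y = 5.5
margin_b = geometric_margin((1.0, 0.0), -2.5, points, labels)   # x = 2.5
print(round(margin_a, 3), round(margin_b, 3))
```

Both lines separate the two classes, but the first leaves more room on either side. SVM training searches for the boundary that makes this margin as large as possible.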

5. Naive Bayes algorithm

Bayes’ theorem is the foundation of the Naive Bayes algorithm, which is used in a broad range of classification problems.
It has been used effectively for a variety of applications, but it excels at natural language processing problems.
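To show how this works on text, the sketch below implements a small multinomial Naive Bayes classifier with Laplace (add-one) smoothing. The four-message corpus is made up, and skipping words that never appeared in training is a simplifying assumption.

```python
import math
from collections import Counter, defaultdict


def train(documents):
    """documents: list of (text, label). Returns the counts needed for prediction."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in documents:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary


def predict(text, class_counts, word_counts, vocabulary):
    """Pick the class with the highest log posterior (Laplace-smoothed)."""
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)   # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue   # ignore words never seen in training
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Tiny made-up corpus.
training = [
    ("win a free prize now", "spam"),
    ("free money offer", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("project report attached", "ham"),
]
model = train(training)
print(predict("free prize offer", *model))
print(predict("project meeting tomorrow", *model))
```

The "naive" part is the assumption that words are independent given the class, which is why the per-word log likelihoods can simply be added together.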

6. KNN algorithm

K-Nearest Neighbour is one of the simplest machine learning algorithms.
The KNN algorithm stores all available data and classifies new data points based on their similarity to the stored ones. It can be used for both regression and classification, but it is typically employed for classification problems.
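Because KNN has no training step beyond storing the data, the whole algorithm fits in a few lines. The sketch below assumes Euclidean distance and a simple majority vote among the `k` closest points; the points and labels are made up.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.0), (6.5, 8.0)]
labels = ["small", "small", "small", "large", "large", "large"]

print(knn_predict(points, labels, (2.0, 2.0)))
print(knn_predict(points, labels, (6.0, 7.0)))
```

For regression, the vote would be replaced by the average of the neighbours' numeric values.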

7. K-means cluster algorithm

The goal of k-means clustering is to divide n observations into k clusters.
Each observation belongs to the cluster with the nearest mean (the cluster centroid). As a result, the data space is divided into Voronoi cells. The problem is computationally difficult (NP-hard), yet efficient heuristic algorithms converge quickly to a local optimum.
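The usual heuristic, often called Lloyd's algorithm, alternates between assigning points to the nearest centroid and moving each centroid to the mean of its points. The sketch below is a plain-Python version with hand-picked starting centroids; real implementations usually pick the starting points randomly or with a smarter seeding scheme.

```python
import math


def k_means(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate between assigning points and updating means."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(tuple(sum(c) / len(cluster) for c in zip(*cluster)))
            else:
                new_centroids.append(old)   # keep an empty cluster's centroid unchanged
        if new_centroids == centroids:
            break   # converged: the assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
initial = [(1.0, 1.0), (2.0, 1.0)]   # deliberately poor starting centroids
centroids, clusters = k_means(points, initial)
print(centroids)
```

Even from a poor start, the two centroids drift apart until one sits in each group. With less well-separated data, a different start can lead to a different local optimum, which is why the algorithm is often run several times.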

8. Random forest algorithm

Random forest is a supervised machine learning algorithm that is commonly used for classification and regression problems. It builds decision trees on different samples of the data and combines them, taking the majority vote for classification and the average for regression. It can work with data sets that contain both continuous and categorical variables.
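The two ingredients described here, sampling the data for each tree and combining the trees' answers, can be sketched without building the trees themselves. In the example below the tree outputs are hypothetical values typed in by hand, and the function names are illustrative.

```python
import random
from collections import Counter
from statistics import mean


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement, as is done once per tree."""
    return [rng.choice(rows) for _ in rows]


def aggregate_classification(tree_predictions):
    """Majority vote over the class labels predicted by the individual trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


def aggregate_regression(tree_predictions):
    """Average of the numeric values predicted by the individual trees."""
    return mean(tree_predictions)


rows = ["row1", "row2", "row3", "row4", "row5"]
sample = bootstrap_sample(rows, random.Random(42))

# Hypothetical outputs of five already-trained trees for one input.
class_votes = ["cat", "dog", "cat", "cat", "dog"]
value_votes = [3.0, 2.5, 3.5, 3.0, 3.0]

print(aggregate_classification(class_votes))
print(aggregate_regression(value_votes))
```

Because each tree sees a slightly different sample, their individual mistakes tend to differ, and combining them usually gives a steadier prediction than any single tree.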

9. Dimensionality reduction algorithms

Machine learning classification problems frequently involve too many features. The more features there are, the harder it becomes to visualise the training set and work with it. This is where dimensionality reduction techniques come in. Dimensionality reduction is the process of reducing the number of random variables under consideration by obtaining a set of principal variables.
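One widely used technique is principal component analysis (PCA), which is named here only as an example of the idea. The sketch below reduces 2-D points to a single coordinate along the direction of greatest variance, found with power iteration on the covariance matrix. It assumes the data actually varies along the x-axis (otherwise the chosen starting direction would fail), and the data is made up.

```python
import math


def first_principal_component(points, iterations=100):
    """Direction of greatest variance for 2-D points, via power iteration."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centred = [(p[0] - mean_x, p[1] - mean_y) for p in points]
    # Entries of the 2x2 covariance matrix [[cxx, cxy], [cxy, cyy]].
    cxx = sum(x * x for x, _ in centred) / (n - 1)
    cyy = sum(y * y for _, y in centred) / (n - 1)
    cxy = sum(x * y for x, y in centred) / (n - 1)
    vx, vy = 1.0, 0.0   # arbitrary starting direction
    for _ in range(iterations):
        # Multiply by the covariance matrix, then renormalise to unit length.
        vx, vy = cxx * vx + cxy * vy, cxy * vx + cyy * vy
        length = math.hypot(vx, vy)
        vx, vy = vx / length, vy / length
    return (vx, vy), (mean_x, mean_y)


def project(point, direction, mean):
    """Reduce a 2-D point to a single coordinate along the component."""
    return (point[0] - mean[0]) * direction[0] + (point[1] - mean[1]) * direction[1]


# Illustrative data lying close to the line y = 2x.
points = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
direction, mean = first_principal_component(points)
reduced = [project(p, direction, mean) for p in points]
print(direction, reduced)
```

Each point is now described by one number instead of two, with little information lost because the points were nearly on a line to begin with.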

10. Boosting algorithms

Boosting is a machine learning strategy for reducing errors in predictive data analysis. Data scientists use labelled data to train machine learning models to make educated guesses about unlabelled data, but a single model may still make prediction errors. Boosting attempts to address this issue by training many models sequentially, each one correcting the errors of those before it, in order to increase the overall accuracy of the system.
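The sequential idea can be sketched with one particular variant, gradient boosting with squared error, where every new model is a one-split "stump" fitted to the errors (residuals) left by the models before it. Other boosting algorithms such as AdaBoost reweight the examples instead. The data, the number of rounds, and the learning rate below are illustrative choices.

```python
def fit_stump(xs, residuals):
    """Best single-threshold regressor: predicts the mean residual on each side."""
    best = None
    candidates = sorted(set(xs))
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, rounds=20, learning_rate=0.5):
    """Train stumps one after another, each on the errors left by the previous ones."""
    base = sum(ys) / len(ys)          # start from a constant prediction
    stumps = []
    predictions = [base] * len(xs)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x) for x, p in zip(xs, predictions)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


def mse(model, xs, ys):
    """Mean squared error of a model on a data set."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 4.0, 9.0, 16.0, 25.0, 36.0]

baseline = lambda x: sum(ys) / len(ys)   # always predicts the average
model = boost(xs, ys)
print(mse(baseline, xs, ys), mse(model, xs, ys))
```

Each stump on its own is a very weak model, but adding them up one after another drives the training error well below that of the constant baseline.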

Top ten python libraries for machine learning

Python has become a preferred language for data science, machine learning, artificial intelligence, and data analytics. Its platform independence and simplicity make it a favourite among machine learning engineers and data scientists. Python’s flexibility also lets machine learning practitioners choose the programming style that best suits the task at hand.

The following is a list of the top ten most popular Python libraries for machine learning.

  1. numpy
  2. pandas
  3. matplotlib
  4. scikit-learn
  5. tensorflow
  6. keras
  7. theano
  8. pytorch
  9. scipy
  10. seaborn

Top ten free courses on machine learning

  1. Practical Deep Learning for Coders - Fastai
  2. Machine Learning Crash Course - Google
  3. Making Friends With Machine Learning - YouTube
  4. Sci-kit Learn in 3 Hours - YouTube
  5. Deep Learning with PyTorch - YouTube
  6. Machine Learning Full Course - YouTube
  7. Machine Learning with Python - IBM Cognitive Class
  8. Machine Learning - Dimensionality Reduction - IBM Cognitive Class
  9. Machine Learning with R - IBM Cognitive Class
  10. Machine learning with Apache SystemML - IBM Cognitive Class

I appreciate you reading my Medium article. Please leave your comments with your ideas and feedback. Follow me if you want to read more.

Written by Jason J Pulikkottil

Web Developer | Subject-Matter Expert | Digital Creator | https://linktr.ee/pjjason