Machine Learning: A Concise Tutorial
Machine Learning - Getting Started
Machine learning has become an increasingly important topic in recent years as the amount of data generated by businesses and individuals continues to grow at an exponential rate. From self-driving cars to personalized recommendations on streaming platforms, machine learning algorithms are now used in a wide range of applications.
Let’s explore what exactly machine learning is.
What is Machine Learning?
Machine learning is a subset of Artificial Intelligence; as the name suggests, it is defined as the capability of a machine to learn to exhibit "intelligent behavior" like humans. Machine learning uses algorithms that are trained on datasets to understand patterns in the data and to create self-learning models that are capable of predicting outcomes.
Types of Machine Learning
We can categorize the machine learning algorithms into three different types - supervised, unsupervised, and reinforcement learning. Let’s discuss these three types in detail −
Supervised Learning
Supervised learning uses labeled datasets to train algorithms to recognize patterns in data and predict outcomes. For example, sorting incoming mail into the inbox or the spam folder.
Supervised learning can be further classified into two types − classification and regression.
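As a minimal sketch of classification on labeled data, the snippet below assigns a new message the label of its closest training example (a 1-nearest-neighbor rule, chosen here only for brevity). The two features, the values, and the `predict` helper are assumptions made up for illustration.

```python
import math

# Hypothetical labeled data: (number of links, number of ALL-CAPS words) -> label
training_data = [
    ((0, 1), "inbox"),
    ((1, 0), "inbox"),
    ((1, 2), "inbox"),
    ((6, 8), "spam"),
    ((7, 5), "spam"),
    ((9, 9), "spam"),
]

def predict(features):
    """Return the label of the training example closest to `features`."""
    nearest = min(training_data, key=lambda item: math.dist(item[0], features))
    return nearest[1]

print(predict((0, 0)))  # a message with no links and no shouting
print(predict((8, 7)))  # a message full of links and capitals
```

The labels in `training_data` are what make this supervised: the algorithm never decides what "spam" means, it only generalizes from examples that were labeled in advance.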
There are different supervised learning algorithms that are widely used −
Linear Discriminant Analysis
Unsupervised Learning
Unsupervised learning is a type of machine learning that uses unlabeled datasets to discover patterns without any explicit guidance or instruction. For example, customer segmentation, i.e., dividing a company’s customers into groups of similar customers.
Unsupervised learning algorithms can be further classified into three types − clustering, association, and dimensionality reduction.
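To illustrate clustering on the customer segmentation example, here is a bare-bones k-means sketch in plain Python. The customer values, the `k_means` helper, and the naive choice of the first k points as starting centroids are assumptions for illustration; library implementations use more careful initialization.

```python
import math

def k_means(points, k, iterations=10):
    """A bare-bones k-means: returns (centroids, cluster index of each point)."""
    centroids = list(points[:k])  # naive initialization: the first k points
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, assignments

# Hypothetical customers: (visits per month, average spend)
customers = [(1, 10), (2, 12), (1, 14), (9, 90), (10, 95), (8, 85)]
centroids, segments = k_means(customers, k=2)
print(segments)
```

No labels are supplied anywhere: the two segments emerge only from how close the customers are to one another.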
Following are some commonly used unsupervised learning algorithms −
Restricted Boltzmann machine (RBM)
Reinforcement Learning
Reinforcement learning algorithms learn to make decisions by interacting with an environment: an agent tries actions, receives rewards or penalties, and gradually improves its behavior through trial and error. For example, robotics.
Following are some common reinforcement learning algorithms −
Markov Decision Process (MDP)
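The following sketch shows the reinforcement learning idea on a toy problem: an agent in a five-cell corridor is rewarded only when it reaches the last cell, and tabular Q-learning (one common algorithm, picked here as an example) learns which way to move. The corridor, the reward of 1.0, and all hyperparameter values are assumptions for illustration.

```python
import random

N_STATES = 5          # cells 0..4 in a corridor; reaching cell 4 ends the episode
ACTIONS = [-1, +1]    # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # learning rate, discount, exploration rate

rng = random.Random(0)                     # fixed seed so the run is repeatable
q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action index]

for _ in range(200):                       # 200 training episodes
    state = 0
    while state != N_STATES - 1:
        # Explore with probability EPSILON, otherwise take the best-known action.
        if rng.random() < EPSILON:
            a = rng.randrange(2)
        else:
            a = 0 if q[state][0] > q[state][1] else 1
        next_state = min(max(state + ACTIONS[a], 0), N_STATES - 1)
        reward = 1.0 if next_state == N_STATES - 1 else 0.0
        # Q-learning update: nudge the estimate toward reward + discounted future value.
        q[state][a] += ALPHA * (reward + GAMMA * max(q[next_state]) - q[state][a])
        state = next_state

policy = ["left" if q[s][0] > q[s][1] else "right" for s in range(N_STATES - 1)]
print(policy)
```

The agent is never told that moving right is correct; the preference appears in the `q` table only because moves toward the goal end up with higher estimated value.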
Use Cases of Machine Learning
Let’s discuss some important real-life use cases of different types of machine learning algorithms.
Supervised Learning
Following are some real-life use cases of supervised learning −
Image Classification
Spam Filtering
House Price Prediction
Signature Recognition
Weather Forecasting
Stock Price Prediction
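House price prediction from the list above is a regression task: the output is a number rather than a category. A minimal sketch using one-feature least squares is shown below; the sizes and prices are invented, and they are deliberately chosen to lie on a straight line so the fitted result is easy to check by hand.

```python
# Hypothetical training data: house size in square metres -> price in thousands
sizes = [50, 80, 100, 120, 150]
prices = [150, 240, 300, 360, 450]

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(sizes, prices)
predicted = slope * 110 + intercept   # estimate for an unseen 110 m² house
print(round(predicted, 2))
```

Real housing data is noisy and depends on many features, so a practical model would use more inputs and would be evaluated on data it was not trained on.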
Prerequisites to Get Started
To get started with machine learning, you should have some basic understanding of computer science fundamentals. Along with basic computer science, you should be familiar with the following −
Programming languages
Libraries and Packages
Mathematics and statistics
Let’s discuss the above three prerequisites one by one.
Programming Languages: Python or R
There are many programming languages, such as C++, Java, Python, R, and Julia, that are used for machine learning development. You can start with any language of your choice; Python is widely used for machine learning and data science.
In this machine learning tutorial, we will be using Python and/or R programming to implement the example programs.
Following are some basic topics to cover before starting this tutorial −
Variables, basic data types
Data structures: lists, sets, dictionaries
Loops and conditional statements
String formatting
Classes and Objects
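If you want a quick self-check on the topics above, the short sketch below touches each of them once in Python. The `Dataset` class and its contents are hypothetical and exist only to exercise the language features.

```python
class Dataset:
    """A tiny class wrapping a list of numeric samples."""

    def __init__(self, name, samples):
        self.name = name                # variable holding a string
        self.samples = list(samples)    # list: ordered, allows duplicates

    def summary(self):
        unique = set(self.samples)      # set: keeps unique values only
        counts = {}                     # dictionary: value -> number of occurrences
        for value in self.samples:      # loop
            counts[value] = counts.get(value, 0) + 1
        if len(unique) < len(self.samples):   # conditional statement
            note = "has duplicates"
        else:
            note = "all values unique"
        # String formatting with an f-string
        text = f"{self.name}: {len(self.samples)} samples, {len(unique)} unique, {note}"
        return text, counts

text, counts = Dataset("ratings", [5, 3, 5, 4]).summary()
print(text)
print(counts)
```

If every line here reads naturally to you, you have enough Python to follow the example programs.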
Libraries and Packages
To get started with this machine learning tutorial, we recommend getting familiar with some libraries, packages, and modules such as NumPy, Pandas, Matplotlib, etc.
As we are using Python programming in this tutorial, you should have some basic understanding of the following libraries/packages/modules −
NumPy − for numeric computations.
Pandas − for data manipulation and preprocessing.
Scikit-learn − implements many widely used machine learning algorithms, such as linear regression, logistic regression, k-means clustering, and k-nearest neighbors.
Matplotlib − for data visualization.
Mathematics and Statistics
Mathematics and statistics play an important role in developing machine learning and data science applications. Advanced mathematics is not required to get started, but it helps you understand machine learning concepts in greater depth.
It is generally recommended to be familiar with the following topics before starting a machine learning tutorial −
Variables, coefficients, functions
Linear equations, logarithms and logarithmic equations, sigmoid function
Vectors and matrices, matrix multiplication, dot product
Tensors and tensor ranks
Mean, median, mode, outliers, and standard deviation
Ability to read a histogram
Probability, conditional probability, Bayes' rule
Concept of a derivative, gradient, or slope
Partial derivatives
Chain rule
Hyperbolic functions (especially tanh) used in activation functions
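Several of the topics above meet in even the smallest model. The sketch below, using only the Python standard library, computes a dot product, applies the sigmoid function, checks a chain-rule derivative against a numerical estimate, and shows how an outlier affects the mean but not the median. The weights, inputs, and sample data are made-up values.

```python
import math
import statistics

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# A single "neuron": weighted sum followed by the sigmoid activation.
weights, inputs = [0.5, -1.0], [2.0, 1.0]
z = dot(weights, inputs)      # 0.5*2.0 + (-1.0)*1.0 = 0.0
output = sigmoid(z)           # sigmoid(0) = 0.5

# Chain rule: d(output)/d(w0) = sigmoid'(z) * dz/dw0 = output*(1 - output) * inputs[0]
analytic = output * (1 - output) * inputs[0]

# Numerical estimate of the same partial derivative with a small step h.
h = 1e-6
numeric = (sigmoid(dot([weights[0] + h, weights[1]], inputs)) - output) / h

data = [2, 4, 4, 5, 100]      # 100 is an outlier
print(output, analytic, round(numeric, 4))
print(statistics.mean(data), statistics.median(data), statistics.mode(data))
print(math.tanh(0.0))         # tanh, like sigmoid, is a common activation function
```

The agreement between `analytic` and `numeric` is the kind of check used to confirm that a gradient has been derived correctly.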
Getting Started with Machine Learning
You might wonder whether machine learning is hard to learn. It does not have to be, but you will need a solid grounding in mathematics, computer science, and coding, and you should keep up with AI trends. Many enthusiasts want to excel at machine learning but do not know where to start, so here are a few steps to help you get started.
Step 1 − Learn Prerequisites
There are a few prerequisites that lay the foundation to understand how algorithms and machine learning models work. Start by learning the basics of:
Any programming language like Python or R.
Libraries and Packages
Mathematics and Statistics (such as calculus and linear algebra)
Step 2 − Learn Machine Learning Fundamentals
Before diving into machine learning, it’s important to have a solid understanding of the fundamentals. This includes learning about different types of Machine Learning methods such as regression, classification, clustering, dimensionality reduction, etc.
In this Machine Learning tutorial, we have covered machine learning concepts from basic to advanced, along with their implementations. Work through the tutorial chapter by chapter and keep practicing the programming examples.
Step 3 − Explore Machine Learning Algorithms
Algorithms form the foundation of machine learning, allowing computers to observe data patterns and predict output. Explore and understand essential algorithms like Naive Bayes, Random Forest, Decision Tree, etc. This will help you understand the workflow of an algorithm.
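To give a feel for how one of these algorithms works internally, here is a small Naive Bayes text classifier written from scratch. It is a sketch under stated assumptions: the four training messages are invented, words are split on whitespace, and add-one (Laplace) smoothing is used so unseen words do not zero out a score.

```python
import math
from collections import Counter

# Hypothetical labeled messages
training = [
    ("win money now", "spam"),
    ("free money offer", "spam"),
    ("meeting at noon", "ham"),
    ("lunch at noon tomorrow", "ham"),
]

word_counts = {"spam": Counter(), "ham": Counter()}
label_counts = Counter()
for text, label in training:
    label_counts[label] += 1
    word_counts[label].update(text.split())

vocabulary = {word for counts in word_counts.values() for word in counts}

def log_score(text, label):
    """log P(label) + sum of log P(word | label), with add-one smoothing."""
    total_words = sum(word_counts[label].values())
    score = math.log(label_counts[label] / sum(label_counts.values()))
    for word in text.split():
        score += math.log((word_counts[label][word] + 1) / (total_words + len(vocabulary)))
    return score

def classify(text):
    """Pick the label with the highest log score."""
    return max(label_counts, key=lambda label: log_score(text, label))

print(classify("free money"))
print(classify("meeting tomorrow at noon"))
```

The workflow is the same one most algorithms follow: collect statistics from training data, then use them to score new inputs. The "naive" part is the assumption that words are independent given the label, which is why the per-word probabilities can simply be multiplied (added, in log space).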
Step 4 − Choose a Machine Learning Framework/ Library
There are different tools, frameworks, software, and platforms for machine learning. The challenging task is to select the best tool for your model. Mastering machine learning tools enables you to work with data, train your model, discover new methods, and create algorithms. Some commonly used machine learning tools are Scikit-learn, TensorFlow, PyTorch, and many more.
In addition to tools and algorithms, a good grasp of libraries such as NumPy, SciPy, and Matplotlib will serve you well in your machine learning journey.
Step 5 − Practice with Real Data
A dataset is the backbone of any machine learning algorithm: a collection of data, often large, grouped together for a task. Datasets are used to train and test algorithms, analyze patterns, and gain insights.
There are many websites, such as Kaggle and Google Dataset Search, that provide publicly available datasets.
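Because datasets are used both to train and to test, a common first step is to hold back part of the data for testing. The sketch below shows one simple way to do that in plain Python; `split_dataset` is a hypothetical helper written for this example, and the 25% test fraction and the seed are arbitrary choices.

```python
import random

def split_dataset(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of `rows` and split it into (train_rows, test_rows)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)   # seeded so the split is repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(20))   # stand-in for 20 dataset records
train_rows, test_rows = split_dataset(rows)
print(len(train_rows), len(test_rows))
```

Keeping the test rows out of training is what makes the test score an honest estimate of how the model behaves on data it has not seen.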
Step 6 − Build Your Own Projects
After mastering the basics, it’s time to create your own project with a problem statement that you choose. This will help you apply what you have learned so far and will develop your skills further.
You can start with simple tasks, such as classification or a recommendation system built on a pre-processed dataset, then move on to more complex algorithms once you are comfortable.
Step 7 − Participate in Machine Learning Communities
Joining machine learning communities, such as those on GitHub, is a great way to connect with people who share your interests. Through these communities, you will get a chance to learn from others, share experiences, and get feedback on your projects. This helps you stay motivated to learn and grow.