Logistic Regression in Python: A Concise Tutorial
Logistic Regression in Python - Case Study
Suppose a bank asks you to develop a machine learning application that helps it identify potential clients who are likely to open a Term Deposit (called a Fixed Deposit by some banks) with it. The bank regularly conducts a survey, by telephone or through web forms, to collect information about potential clients. The survey is general in nature and reaches a very large audience, many of whom may not be interested in dealing with this bank at all. Of the rest, only a few may be interested in opening a Term Deposit (TD); others may be interested in other services the bank offers. The survey is therefore not designed specifically to identify customers who will open TDs. Your task is to identify, from the large volume of survey data the bank is going to share with you, all the customers with a high probability of opening a TD.
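To make "high probability of opening a TD" concrete, here is a minimal sketch of how a logistic regression model turns a client's features into a probability and how clients could then be shortlisted. The feature values, the weights, the bias, the 0.5 threshold, and the helper names (`predict_probability`, `likely_subscribers`) are all hypothetical and chosen only for illustration; they are not fitted to the bank's data.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(features, weights, bias):
    """Logistic-regression probability for one client."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(score)


def likely_subscribers(clients, weights, bias, threshold=0.5):
    """Return (client_id, probability) pairs at or above the threshold,
    sorted from most to least likely."""
    scored = [
        (client_id, predict_probability(features, weights, bias))
        for client_id, features in clients.items()
    ]
    selected = [(cid, p) for cid, p in scored if p >= threshold]
    return sorted(selected, key=lambda pair: pair[1], reverse=True)


# Hypothetical, hand-picked values for illustration only
# (assumptions, not coefficients learned from the survey data).
WEIGHTS = [0.8, -1.2]
BIAS = -0.5
CLIENTS = {
    "client_a": [2.0, 0.0],
    "client_b": [0.0, 1.0],
    "client_c": [1.0, 0.0],
}

if __name__ == "__main__":
    for client_id, probability in likely_subscribers(CLIENTS, WEIGHTS, BIAS):
        print(f"{client_id}: {probability:.3f}")
```

In a real application the weights and bias would be learned from the labeled survey data rather than written by hand, and the threshold would be chosen to suit the bank's needs.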
Fortunately, data of this kind is publicly available for those who want to develop machine learning models. This data was prepared by some students at UC Irvine with external funding. The database is part of the UCI Machine Learning Repository and is widely used by students, educators, and researchers all over the world. The data can be downloaded from the UCI Machine Learning Repository.
In the next chapters, we will develop the application using this data.