Apache Spark Tutorial
Apache Spark is a fast, general-purpose cluster computing framework. It extends the MapReduce model popularized by Hadoop to support more types of computation efficiently, including interactive queries and stream processing. Spark is not built on top of Hadoop MapReduce: it has its own execution engine, although it can use Hadoop components such as HDFS for storage and YARN for cluster management. This brief tutorial explains the basics of Spark Core programming.
Audience
This tutorial is intended for professionals who want to learn the basics of big data analytics with the Spark framework and become Spark developers. It is also useful for analytics professionals and ETL developers.