Spark Introduction Apache Spark is a cluster computing platform designed to be fast, expressive, high-level, general-purpose, fault-tolerant and compatible with Hadoop (Spark can work directly with HDFS, S3 and so on). Spark can also be defined as a framework … Continue reading
February 25, 2015
by Javier (@jbbarquero)
Getting started with Hadoop
Hadoop Introduction Hadoop is an open source framework for distributed, fault-tolerant data storage and batch processing. It allows you to write applications that process very large data sets across clusters of computers using a simple programming model, with linear scalability on … Continue reading