Big data has been a buzzword in the IT industry since 2008. The amount of data generated in the social networking, manufacturing, retail, equity, telecommunications, insurance, banking, and healthcare industries is far beyond our imagination. Before Hadoop, storing and processing big data was a major challenge. Now that Hadoop is available, companies recognize the business implications of big data and how understanding this data can drive growth. For example:
• The banking sector has better opportunities to understand loyal customers, loan defaulters, and fraudulent transactions.
• The retail sector now has enough data to forecast demand.
• Manufacturing departments no longer have to rely on costly mechanisms for quality testing; capturing and analyzing sensor data reveals many patterns.
• E-commerce and social networks can personalize pages based on customer interests.
• The stock market produces vast amounts of data, and correlating it sometimes reveals beautiful insights.
Big data has many useful and insightful applications, and Hadoop is the right answer for processing it. The Hadoop ecosystem is a combination of technologies that together can solve business problems. Understanding the components of the Hadoop ecosystem helps you build the right solution for your specific business problem.
Hadoop ecosystem:
Hadoop ecosystem infographic
Core Hadoop:
HDFS:
HDFS stands for Hadoop Distributed File System and manages large, fast-growing, and diverse big data sets. HDFS follows a master-slave architecture: the master is the NameNode and the slaves are the DataNodes.
Features:
• Scalable
• Reliable
• Runs on commodity hardware
HDFS is well known for big data storage; a minimal Java sketch of writing and reading a file through HDFS is given at the end of this section.
MapReduce:
MapReduce is a programming model designed to process large amounts of distributed data. The framework is implemented in Java, which provides robust exception handling. MapReduce includes two daemons, the JobTracker and the TaskTracker.
Features:
• Functional programming model.
• Works very well with big data.
• Can handle very large datasets.
MapReduce is the key component known for processing big data; the classic word-count job sketched below shows the model in action.
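To make the HDFS description concrete, here is a minimal sketch that writes a small file into HDFS and reads it back through Hadoop's FileSystem Java API. The NameNode address (hdfs://localhost:9000) and the path /user/demo/hello.txt are illustrative assumptions, not values from the article.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsExample {
  public static void main(String[] args) throws Exception {
    // Assumed NameNode address for a local single-node cluster
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://localhost:9000");
    FileSystem fs = FileSystem.get(conf);

    Path file = new Path("/user/demo/hello.txt"); // hypothetical path

    // Write a small file; HDFS splits data into blocks and replicates them across DataNodes
    try (FSDataOutputStream out = fs.create(file, true)) {
      out.write("Hello, HDFS!".getBytes(StandardCharsets.UTF_8));
    }

    // Read the file back through the NameNode/DataNode pipeline
    try (BufferedReader reader =
        new BufferedReader(new InputStreamReader(fs.open(file), StandardCharsets.UTF_8))) {
      System.out.println(reader.readLine());
    }

    fs.close();
  }
}
```

The same operations are also available from the command line, for example via hdfs dfs -put and hdfs dfs -cat.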
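The MapReduce model is easiest to see in the classic word-count job: the map phase emits (word, 1) pairs and the reduce phase sums them per word. This is a sketch against the standard org.apache.hadoop.mapreduce API; the class names and paths are illustrative, not taken from the article.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: emit (word, 1) for every token in the input split
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce phase: sum the counts for each word
  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Packaged into a jar, such a job would typically be launched with something like hadoop jar wordcount.jar WordCount /input /output, where /input and /output are HDFS paths.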