R on Hadoop with Tessera RHipe

This tutorial shows how to run parallel computations in R on a Hadoop cluster using Rhipe, with CSV files as input and output.

The universe of the computer game EVE Online consists of 5201 solar systems for players to explore and conquer in virtual spaceships. Each solar system is connected to an average of 2.6 other systems by jump gates, which allow instantaneous travel between them.

Converting HDF5 to CSV

Hierarchical Data Format 5 (HDF5) is a popular file format for storing and managing large amounts of data. NASA uses it for the ACOS and OCO-2 data products, both of which contain (among other things) column-averaged CO2 in units of dry-air mole fraction (Xco2). This tutorial demonstrates how to extract the average daily Xco2 value, along with the total reading count per day, from the HDF5 files into a CSV file for analysis in Excel or Gnumeric.
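The aggregation step described above can be sketched in a few lines of Python. This is a minimal illustration, not the tutorial's own code: it assumes the readings have already been pulled out of the HDF5 files as (date, Xco2) pairs, and the function names and CSV column names are hypothetical. Reading the HDF5 files themselves is left out, since the dataset layout is not described here.

```python
import csv
import io
from collections import defaultdict
from datetime import date


def daily_xco2_summary(readings):
    """Group (day, xco2) pairs by day; return sorted (day, mean, count) rows."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for day, xco2 in readings:
        totals[day] += xco2
        counts[day] += 1
    return [(day, totals[day] / counts[day], counts[day]) for day in sorted(totals)]


def summary_to_csv(rows):
    """Render the summary rows as CSV text (column names are hypothetical)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["date", "mean_xco2", "reading_count"])
    for day, mean, count in rows:
        writer.writerow([day.isoformat(), f"{mean:.6e}", count])
    return buffer.getvalue()
```

Calling `summary_to_csv(daily_xco2_summary(readings))` yields one line per day, which can be saved to a file and opened in Excel or Gnumeric.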
