- Time: Friday, June 12, 2009 9:00 ~ 18:00
- Location: Ewha Womans University, Ewha Campus Center (ECC) B146
- Target audiences: Anyone who is interested in Hadoop and platform technologies at companies, universities, and research centers
- Free and open event (limited to 200 attendees)
- Host: NexR, Korea Hadoop Community
Overview
Description
PlatformDay is a conference on Apache Hadoop and other distributed platform technologies in South Korea, covering technical issues, application cases, and research issues. It brings together Hadoop engineers, users, and researchers from companies, universities, and research centers. It is similar to the Hadoop Summit hosted by Yahoo!. The topics covered at PlatformDay are listed below.
- Hadoop introduction and development status
- Case studies of applications built on Hadoop
- Research on distributed computing platforms using Hadoop
- Cloud Computing based on Hadoop
- MapReduce applications
This year's conference is the third. PlatformDay 2008 was a success: more than 250 people attended, and major Korean portal service providers presented their platform technologies. In addition, the Korea Hadoop Community was organized after the conference. PlatformDay is now recognized as the main conference on Hadoop and platform technology in South Korea.
Goals
- Sharing information about Hadoop and its case studies
- Introducing Hadoop and expanding the Hadoop user base
- Encouraging cooperation between academic and industrial research groups
- Encouraging participation in the Hadoop open source project
Program
- Hadoop Tutorial (Junho Cho, NexR)
- Building Business Intelligence Platform Using Hadoop and OpenSource Tools (Youngwoo Kim, Daum Communications)
- Cloud Computing Packages using Hadoop: VC3, MR.Flow, Archiving, HadoopAppliance (Jason Han, NexR)
- GAIA & Neptune: Distributed Data Service and Store (Joon Kim Neptune, YK Kwon Gruter)
- Force.com: The Next Generation Cloud Computing Platform of Salesforce.com (Parksa Kim, Daou)
- MapReduce Use Cases in Scientific Applications (Donghun Choi, KISTI)
- SNS Analysis using DHT-based Storages and Cloud Computing Services (Dongwoo Lee, OikoLab)
- IRIS: Distributed DBMS based on Grid Computing (Taesoo Kim, Mobigen)
- Business Intelligence and Hadoop (Takkil Shim, NHN)
- Analyzing the Search Ads Performance Using Hadoop (Bill Kim, Freelancer)
This tutorial will introduce Hadoop and its architecture. As a new distributed programming paradigm, Hadoop MapReduce will be presented with the programming example of WordCount.
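The WordCount example mentioned in this abstract can be sketched outside Hadoop as follows. This is a plain Python illustration of the map, shuffle, and reduce steps, not the tutorial's actual code and not the Hadoop API; all function names here are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Mapper (hypothetical name): emit (word, 1) for each whitespace-separated token."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group values by key, standing in for the framework's shuffle step."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reducer (hypothetical name): sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in a single process."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    print(word_count(["hello hadoop", "hello world"]))
```

In a real Hadoop job the mapper and reducer run on different nodes and the framework performs the grouping; here everything runs in one process only to show the data flow.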
With the volume of data exploding and the need for business intelligence more important than ever, Hadoop is recognized as the fastest growing project for data analysis on large-scale clusters. This talk introduces Hive (Facebook), CloudBase (Business.com), and open source BI/DW tools, and shares many of the lessons learned while building a data warehouse using Hadoop and open source tools.
Hadoop is a massive data storage and processing platform and can serve as a Cloud Computing platform.
Conversely, Cloud Computing can support Hadoop as a flexible infrastructure for processing massive data.
NexR has developed several services and tools for Cloud Computing and massive data processing based on Hadoop.
This talk presents the following NexR Cloud Computing packages:
1. VC3 (Virtual Cloud Computing Center) is a Cloud Computing service providing a compute cloud and a storage cloud.
It also integrates the Amazon EC2 and S3 services, so that customers can choose the best cloud service for their needs.
2. MR.Flow is a data processing platform that supports web-based, drag-and-drop workflows for Hadoop MapReduce.
It enables data analysts to create data processing workflows without complicated MapReduce programming.
3. Massive Email Archive is a massive email archiving solution based on Hadoop and Lucene.
4. HadoopAppliance is a HW & SW package optimized for Hadoop. It supports OS and application provisioning (including Hadoop), host and Hadoop monitoring and management, and Hadoop HA.
GAIA (http://gaia.gruter.com) is a cloud data storage and search service that enables you to manage and search billions of records. We will introduce the GAIA service and explain how open source platforms such as Neptune (http://www.openneptune.com) and Hadoop are integrated in GAIA.
How has a platform designed for Cloud Computing evolved to offer the flexible customization needed to meet customers' constantly changing requirements? And what specific features does the platform provide? This session tells the full story of Force.com, the Cloud platform of Salesforce.com, and of how Salesforce.com has grown into one of the most successful leading IT companies by transforming itself from SaaS to a Cloud Computing service. Those who are keen on this new paradigm will find out what a real service platform in the Cloud Computing sector, now one of the top issues in the IT industry, is all about.
This talk covers storing ad performance log data in HDFS and analyzing it with MapReduce programs.
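As a rough illustration of this kind of analysis, the sketch below computes a click-through rate per ad in a map/reduce style. It is not the speaker's method: the tab-separated log format (`ad_id<TAB>event`), the event names, and all function names are assumptions made for the example, and the code runs in a single Python process rather than on Hadoop or HDFS.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Assumed log line format (hypothetical): "<ad_id>\t<event>",
# where event is either "impression" or "click".


def map_log_line(line: str) -> Iterator[Tuple[str, Tuple[int, int]]]:
    """Mapper: emit (ad_id, (impressions, clicks)) for one log line."""
    parts = line.strip().split("\t")
    if len(parts) != 2:
        return  # skip malformed lines
    ad_id, event = parts
    if event == "impression":
        yield ad_id, (1, 0)
    elif event == "click":
        yield ad_id, (0, 1)


def reduce_ad(values: List[Tuple[int, int]]) -> Tuple[int, int, float]:
    """Reducer: total impressions and clicks, plus click-through rate."""
    impressions = sum(v[0] for v in values)
    clicks = sum(v[1] for v in values)
    ctr = clicks / impressions if impressions else 0.0
    return impressions, clicks, ctr


def analyze(lines: Iterable[str]) -> Dict[str, Tuple[int, int, float]]:
    """Group mapper output by ad_id, then reduce each group."""
    grouped: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for line in lines:
        for ad_id, value in map_log_line(line):
            grouped[ad_id].append(value)
    return {ad_id: reduce_ad(values) for ad_id, values in grouped.items()}
```

Emitting a pair of counters from the mapper keeps the reducer a simple element-wise sum, which is also the property that would allow a combiner to pre-aggregate on each node in a real MapReduce job.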
Related events
- PlatformDay 2008
http://www.platformday.com/2008/
- PlatformDay 2007
http://www.web2hub.com/wiki/index.php/20070302_PlatformDay
- Hadoop Summit 2009
http://developer.yahoo.com/events/hadoopsummit09/
- Hadoop Summit 2008
http://developer.yahoo.com/hadoop/summit/



