A SURVEY ON HADOOP PIG SYSTEM

Authors

  • Nikita Bhojwani B.V.M. Engineering College, V.V. Nagar, Gujarat, India
  • Asst Prof. Vatsal Shah B.V.M. Engineering College, V.V. Nagar, Gujarat, India

Keywords:

Pig, Hadoop, SQL, Hive, MapReduce, Pig Latin

Abstract

Pig is a platform for analyzing large datasets using a high-level, expressive language called Pig Latin,
which enables users to describe data-processing steps. It combines aspects of SQL and MapReduce. With Pig,
programming becomes easier: a few lines of code can handle intricate data, including unstructured
data. It is used in real-time applications and for data mining. Pig resembles Hive, but there are some significant
differences: Hive supports a declarative, SQL-like language, whereas Pig supports a data-flow language suited
to expressing data-processing steps. Pig provides data structures such as relations, which are similar to database tables
and consist of rows. It supports operators such as LOAD, FILTER, JOIN, FOREACH, GROUP, and STORE, and it also
provides extensible support for user-defined functions (UDFs) as a way to add custom processing. Pig's semantics are
comparatively simpler than those of MapReduce. Its language, Pig Latin, can be compared to Python. Additionally, Apache
Pig processes network flow data in large datasets and can be used in dynamic environments.
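
To make the data-flow idea concrete, the following is a minimal Python sketch of how a FILTER, GROUP, and FOREACH pipeline over a relation might behave. It is not Pig Latin and does not use any Pig or Hadoop API; the sample records and the helper names (pig_filter, pig_group, pig_foreach) are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical in-memory "relation": each row is a tuple (user, url, bytes).
# In Pig this data would typically come from a LOAD statement.
records = [
    ("alice", "/home", 120),
    ("bob", "/home", 300),
    ("alice", "/cart", 80),
    ("bob", "/cart", 50),
    ("carol", "/home", 500),
]


def pig_filter(relation, predicate):
    """FILTER analogue: keep only the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def pig_group(relation, key):
    """GROUP analogue: collect rows into one bag (list) per key value."""
    groups = defaultdict(list)
    for row in relation:
        groups[key(row)].append(row)
    return dict(groups)


def pig_foreach(groups, generate):
    """FOREACH ... GENERATE analogue: emit one output row per group."""
    return [generate(k, bag) for k, bag in groups.items()]


# Each step names an intermediate relation, as in a data-flow script.
large = pig_filter(records, lambda r: r[2] >= 80)
by_user = pig_group(large, lambda r: r[0])
totals = pig_foreach(by_user, lambda user, bag: (user, sum(r[2] for r in bag)))

# STORE analogue: here the result is simply sorted and printed.
result = sorted(totals)
print(result)
```

Each intermediate name (large, by_user, totals) corresponds to one step of the flow, which is the contrast the abstract draws with a single declarative SQL-like query.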

Published

2016-02-25

How to Cite

Nikita Bhojwani, & Asst Prof. Vatsal Shah. (2016). A SURVEY ON HADOOP PIG SYSTEM. International Journal of Advance Research in Engineering, Science & Technology, 3(2), 8–19. Retrieved from https://ijarest.org/index.php/ijarest/article/view/409