Apache

THE WORLD'S LARGEST OPEN-SOURCE FOUNDATION
~ apache.org 

Apache Projects:

  • Apache Spark
  • Apache Hadoop
  • Apache Hive
    • Data warehouse and ETL tool with a SQL-like interface.
    • Hive helps run SQL queries over distributed data.
    • Hive is built on Apache Hadoop.
    • Two types of tables
      • Internal/Managed Hive Table
        • Hive owns the data
        • Default table type
        • Both metadata and data are deleted on the DROP command
      • External Hive Table
        • Hive manages only the metadata
        • Created with the EXTERNAL keyword
        • Only metadata is deleted on the DROP command
  • Apache NiFi
  • Apache Kafka
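
The difference between the two Hive table types listed above can be sketched in a few lines. This is only an illustration, not an official Hive client: the helper functions, the table names, the columns, and the `/data/sales` path are all hypothetical, and the script only builds HiveQL strings rather than connecting to a Hive server.

```python
"""Illustrative sketch: HiveQL DDL for managed vs. external tables."""


def create_table_ddl(name, columns, external=False, location=None):
    """Return a HiveQL CREATE TABLE statement as a string.

    All arguments are hypothetical example values supplied by the caller.
    """
    cols = ", ".join(f"{col} {typ}" for col, typ in columns)
    # The EXTERNAL keyword is what marks a table as external.
    keyword = "EXTERNAL TABLE" if external else "TABLE"
    ddl = f"CREATE {keyword} {name} ({cols})"
    if location is not None:
        # LOCATION points the table at a directory of data files.
        ddl += f" LOCATION '{location}'"
    return ddl


def dropped_on_drop_table(external):
    """Model what DROP TABLE removes, following the notes above."""
    return {"metadata"} if external else {"metadata", "data"}


columns = [("id", "INT"), ("name", "STRING")]  # hypothetical schema

managed_ddl = create_table_ddl("sales_managed", columns)
external_ddl = create_table_ddl(
    "sales_external", columns, external=True, location="/data/sales"
)

print(managed_ddl)
print(external_ddl)
print("DROP on managed removes:", sorted(dropped_on_drop_table(external=False)))
print("DROP on external removes:", sorted(dropped_on_drop_table(external=True)))
```

The generated statements could be pasted into a Hive session. After `DROP TABLE sales_external`, the files under the table's location would be expected to remain, while dropping `sales_managed` would remove its data as well.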


