What is an algorithm? Informally, an algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some v
Background: A Database Management System (DBMS) stores data in the form of tables, uses the ER model, and aims to guarantee ACID properties. For example, a DBMS of college h
A data-warehouse is a heterogeneous collection of different data sources organised under a unified schema. There are 2 approaches for constructing data-warehous
Prerequisite – Data Warehousing Data warehouse can be controlled when the user has a shared way of explaining the trends that are introduced as specific subje
Prerequisite – DBMS | File Organization – Set 1, File Organization – Set 2 B+ Tree File Organization – B+ Tree, as the name suggests, uses a tree-like st
In general terms, “Mining” is the process of extracting some valuable material from the earth, e.g. coal mining, diamond mining, etc. In the context of com
Prerequisites – Data Warehousing, Data Warehouse Architecture, Characteristics and Functions of Data warehouse Here are some of the difficulties of Implementi
Data Mining – Knowledge Discovery in Databases (KDD). Why do we need data mining? The volume of information we have to handle is increasing every day, from business tr
Data warehouse and data mart are both storage components of HDFS. A data mart is a storage component that is concerned with a specific department of an organi
In this post, we will discuss the different sources of data that are used in the data mining process. The data from multiple sources is integrated into a comm
Prerequisite – Introduction to Hadoop, Apache Hive The major components of Hive and their interaction with Hadoop are demonstrated in the figure below and al
With the phenomenal growth in digital data, particularly that generated from multimedia and other enterprise applications, the need for high-performance storage solut
ODBMS, an abbreviation for object-oriented database management system, is a data model in which data is stored in the form of objects, which are insta
Prerequisites – Introduction to Hadoop, Computing Platforms and Technologies Apache Hive is a data warehouse and an ETL tool which provides an SQL-like interf
Prerequisites – Introduction to Hadoop, Apache HBase HBase architecture has 3 main components: HMaster, Region Server, Zookeeper. Figure – Architecture of H
Prerequisite – Introduction to Hadoop HBase is a data model that is similar to Google’s Bigtable. It is an open-source, distributed database developed by A
Hive: Hive is a data warehousing package built on top of Hadoop. It is mainly used for data analysis. It is generally targeted at users already comfortable w
A system in which each server is an autonomous, centralized DBMS that has its own local users. The term Federated Database System, or FDS for short, is basically u
Relational Database Management System (RDBMS) – RDBMS is the basis for SQL and for all modern database systems like MS SQL Server, IBM DB2, Oracle, MySQL, and Microsof
Given the vast increase in the volume and speed of threats to databases and many information assets, research efforts need to consider the following issues s
A distributed database is basically a database that is not limited to one system; it is spread over different sites, i.e., on multiple computers or over a networ
Inverted Index: a data structure that stores a mapping from words to documents or sets of documents, i.e., it directs you from a word to a document. Steps to build Inv
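The word-to-document mapping described in the inverted index entry above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function name `build_inverted_index`, the sample `docs` dictionary, and the choice to tokenize by lowercasing and splitting on whitespace are all hypothetical and are not taken from the original article.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase word to the sorted list of document ids that contain it.

    `docs` is assumed to be a dict of {document id: document text}.
    Tokenization here is a plain whitespace split (an assumption; real
    systems usually also strip punctuation and remove stop words).
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    # Convert the sets to sorted lists so lookups return a stable order.
    return {word: sorted(ids) for word, ids in index.items()}


# Hypothetical sample documents, for illustration only.
docs = {
    1: "the quick brown fox",
    2: "the lazy dog",
    3: "quick dog",
}
index = build_inverted_index(docs)
```

Looking up `index["quick"]` then gives the ids of the documents containing that word, which is the "word to document" direction the entry describes.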
The following questions were asked in the GATE 2009 CS exam. 1) Consider two transactions T1 and T2, and four schedules S1, S2, S3, S4 of T1 and T2 as given below:
The following questions were asked in the GATE 2005 CS exam. 1) Which one of the following statements about normal forms is FALSE? (a) BCNF is stricter than 3NF (b
There are many characteristics of biological data. All these characteristics make the management of biological information a particularly challenging problem. H