PACIFIC: New data warehouse technology, all-round data coverage
The data ecosystem is a very large industry. Driven by the popularity of
big data, the global market is estimated at $100-200 billion. The ecosystem includes
data sources, underlying systems, and the various big data analysis
applications built on top of them. Data is first generated at a source, for example
a transactional system such as Oracle or MySQL. Data is also generated
elsewhere, on mobile phones, iPads, web servers, and so on. Once
generated, it is loaded into the data warehouse through ETL or another
collection process. With the development of big data, artificial intelligence, and the
Internet of Things, and with the emergence of technologies such as blockchain,
data volumes keep growing and the demands placed on data warehouses
keep rising. As a result, the data warehouse is the part of the ecosystem
seeing the most technological innovation.
The evolution of data warehouse
From the 1970s and 1980s to the present, the evolution of data
warehouses can be roughly divided into three generations. The first generation
was built on traditional transactional database technology,
such as Oracle, running on shared storage, typically high-end storage from EMC
or IBM. Its disadvantage is that it scales only to about a dozen nodes.
Beyond that point the shared storage becomes a bottleneck, and the hardware is
relatively expensive.
Massively parallel processing (MPP) systems appeared in the 1980s and form the second
generation of data warehouses. The first productized MPP system was Teradata. Its
hardware was based on mainframes, minicomputers, and some proprietary
components. Later, in the 2000s, start-up companies appeared, the best known
being Greenplum and Vertica. Their MPP systems are based on the x86
architecture. These startups were
eventually acquired by giants: Greenplum was acquired by EMC, and
Vertica was acquired by HP.
Second-generation systems solve some of the scalability
problems and can generally reach a scale of about 100 nodes, but scaling
beyond that is difficult.
In recent years, third-generation systems have emerged, such as
SQL-on-Hadoop systems and SQL systems in the cloud. These are what we call the new
generation of data warehouses.
PACIFIC - New Data Warehouse Technology
As database technology has been applied and developed, practitioners
have tried to reprocess the data held in databases into an integrated,
analysis-oriented environment that better supports decision analysis. This gave rise to
data warehouse technology (Data Warehousing, DW for short). As a
Decision Support System (DSS), a data warehouse system includes:
① Data warehouse technology (DW);
② On-Line Analytical Processing (OLAP);
③ Data mining technology (Data Mining, DM for short).
The data warehouse makes up for the shortcomings of the original
database and turns a data environment centered on a single
database into a new, integrated system environment.
PACIFIC - Comprehensive Data Coverage
PACIFIC is a new data warehouse technology based on big data
technology. It relies on the distributed storage and computing of big
data and adds SQL support on top, so its architecture is completely
different from that of traditional data warehouses. It provides
storage and computing services for massive data. A
distributed architecture solves the data storage problem and gives the system strong
scalability. For data processing, in order to avoid the huge overhead
caused by moving massive data, the data warehouse moves the
computation to the data: the data is stored across multiple nodes, computing
tasks are distributed to those data nodes, each node executes its task
concurrently on the data it holds, and the partial results are finally
aggregated to obtain the final result.
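The compute-to-data pattern described above can be illustrated with a minimal Python sketch. It is not PACIFIC's implementation: the partition contents, the function names (local_aggregate, merge_partials, run_query), and the use of threads to stand in for data nodes are all assumptions made for illustration. The query it mimics is roughly SELECT region, SUM(amount), COUNT(*), AVG(amount) ... GROUP BY region.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical partitions: each list stands in for the rows held by one data node.
PARTITIONS = [
    [("east", 10.0), ("west", 4.0)],
    [("east", 6.0), ("north", 8.0)],
    [("west", 2.0), ("north", 2.0), ("east", 2.0)],
]


def local_aggregate(rows):
    """Runs "on the data node": compute per-region (sum, count) over local rows only."""
    partial = defaultdict(lambda: [0.0, 0])
    for region, amount in rows:
        partial[region][0] += amount
        partial[region][1] += 1
    return {region: tuple(acc) for region, acc in partial.items()}


def merge_partials(partials):
    """Runs on the coordinator: combine the small partial results into the final answer."""
    totals = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for region, (subtotal, count) in partial.items():
            totals[region][0] += subtotal
            totals[region][1] += count
    # The average is derived from merged sums and counts, not from per-node averages.
    return {
        region: {"sum": total, "count": count, "avg": total / count}
        for region, (total, count) in totals.items()
    }


def run_query(partitions):
    """Send the same task to every node, run the tasks concurrently, then aggregate."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(local_aggregate, partitions))
    return merge_partials(partials)


if __name__ == "__main__":
    for region, stats in sorted(run_query(PARTITIONS).items()):
        print(region, stats)
```

Only the per-region partial results travel to the coordinator, never the raw rows, which is the point of moving the computation rather than the data. Each node returns a sum and a count instead of an average because averages of averages are incorrect when partitions hold different numbers of rows.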
Epilogue
The PACIFIC global ecological community is committed to
building a diversified ecosystem through "decentralized" autonomy and
governance, creating a fair and open environment and experience
for participants.