
The Best Practices of Enterprise-level Data Center Construction

2017-07-28 16:27:09 | Source: 中培企業IT培訓網

At present, most data centers used by engineering enterprises are built with traditional technology, which has several disadvantages: high construction cost, weak scalability, and limited capacity for calculation and analysis. To meet the needs of big-data storage, processing, analysis and application, enterprise-level data centers combine technologies such as parallel computing, large-scale data analysis, linear expansion and support for all data types. They can thereby effectively centralize the integration and analysis of data resources across all businesses, levels and types.

The data centers built by most enterprises in the engineering industry have accumulated large amounts of structured data, unstructured data, geographic information data and massive real-time data. Most of them use centralized server architectures (such as Oracle RAC), which scale poorly and cannot meet the growing need for data storage. Data processing is mainly based on single-point models and lacks real-time parallel computing, so it cannot process massive data in real time. Data storage and processing can only cope with structured data: they cannot effectively store, process or analyze unstructured data, cannot provide storage and processing services for all data types in a big-data environment, and cannot support in-depth data analysis.

The overall structure of a big-data-based enterprise-level data center in the engineering industry is shown in Figure 1. It is divided into seven layers: the data source layer, data integration layer, data storage layer, analysis/service layer, business application layer, front-end access layer and the overall data management platform.

 

Figure 1. The Overall Structure of Enterprise-level Data Centers in Engineering Industry Based on Big Data

Through interface tables, interface files, data reception services and message reception, the data center acquires structured, unstructured and real-time data to meet different timeliness requirements. In the data storage layer, it contains data repository platforms, distributed data platforms and streaming data platforms, which store data with different characteristics and provide the related data services. The data center delivers integrated result data through batch push and real-time data services, and meets data sharing and data application requirements through asynchronous data push. It also provides comprehensive information display and analysis and decision-making functions, and presents all kinds of analysis results through integrated display on various front ends (such as PC terminals, large-screen terminals and mobile terminals). Finally, the data center provides data resource management, that is, management of its metadata, data quality, data standards, data models and data resources.

Data Integration Layer:

[including data acquisition and job scheduling]

Data acquisition

Data acquisition collects structured data, unstructured data and real-time data from the source systems and delivers them to the data center. It covers interface table processing, message reception processing, data reception processing, real-time data acquisition processing and unstructured file processing.

Job scheduling

Job scheduling provides unified, centralized scheduling of three kinds of jobs: the acquisition of structured, unstructured and real-time data; internal data processing in the data center (including ETL, MapReduce, Sqoop, etc.); and the pushing of data to each target system. It implements a scheduling engine, provides automatic and manual adjustment modes for jobs, and controls the execution order of jobs based on the configured job dependencies. It also controls job concurrency and records the results and logs of each job run.
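The dependency-ordered, concurrency-limited execution described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the scheduling engine the article refers to: the function name `run_jobs`, the job names, and the rule that jobs downstream of a failed job are skipped are all hypothetical choices made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_jobs(jobs, dependencies, max_workers=2):
    """Run jobs in dependency order with bounded concurrency.

    jobs: dict mapping a job name to a zero-argument callable (hypothetical jobs).
    dependencies: dict mapping a job name to the set of jobs that must finish first.
    Returns a log of (job name, status, result) tuples.
    """
    referenced = set(dependencies)
    for deps in dependencies.values():
        referenced |= set(deps)
    unknown = referenced - set(jobs)
    if unknown:
        raise ValueError(f"unknown jobs in dependency configuration: {sorted(unknown)}")

    sorter = TopologicalSorter({name: dependencies.get(name, ()) for name in jobs})
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies

    log = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            ready = sorter.get_ready()
            if not ready:
                break  # remaining jobs are blocked by a failed job
            # Jobs in the same batch have no mutual dependencies, so they may
            # run concurrently; max_workers caps how many run at once.
            futures = {name: pool.submit(jobs[name]) for name in ready}
            for name, future in futures.items():
                try:
                    log.append((name, "ok", future.result()))
                    sorter.done(name)  # unblocks dependent jobs
                except Exception as exc:
                    log.append((name, "failed", repr(exc)))

    finished = {name for name, _, _ in log}
    for name in jobs:
        if name not in finished:
            log.append((name, "skipped", None))
    return log
```

A caller would pass, for example, an extract job, a load job that depends on it, and a push job that depends on the load, then inspect the returned log for each job's status. Running each batch to completion before starting the next keeps the sketch short; a production scheduler would start a job as soon as its own dependencies finish.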

Data Storage Layer:

[It contains a traditional data repository platform based on relational databases, as well as a distributed data platform and a streaming data platform based on the Hadoop ecosystem, which store different kinds of data and provide different data services.]

The data repository platform

The data repository platform uses a hierarchical design and is divided into a buffer layer, an integration layer, a summary layer and a data market layer.

The buffer layer stores the data that the data center collects from source systems. It relieves the source systems of the pressure of distributing data in bulk and in real time, and avoids the problems caused by fetching the same data repeatedly: performance pressure, time lags between versions, repeated development and redundant storage. As a data source in its own right, it also shields the integration layer and summary layer from changes in the original systems (such as data structures and time windows).

The integration layer holds the business data after cleaning, conversion and integration; it is the core data layer of the data center.

The summary layer holds statistical and aggregated enterprise data organized by subject dimension, and can also produce aggregates as required by subject reports. Aggregated data is stored by main body and is calculated from business data along the dimensions of data, main body and processing type.

The data market layer holds analytical data sets for specific business units (such as business departments). Its data is derived mainly from the integration layer and summary layer, and it also contains specific analytical data that supports targets.
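The flow through the four layers can be illustrated with a small in-memory Python sketch. It assumes hypothetical sample rows and field names (`project`, `dept`, `cost`) and stands in for what would normally be tables and ETL jobs in a relational database; it is not the article's implementation.

```python
from collections import defaultdict

# Buffer layer: raw rows exactly as received from source systems
# (hypothetical sample data).
buffer_layer = [
    {"project": " P-001 ", "dept": "civil", "cost": "120.5"},
    {"project": "P-001", "dept": "civil", "cost": "79.5"},
    {"project": "P-002", "dept": "rail", "cost": "300"},
    {"project": "P-003", "dept": "rail", "cost": None},  # incomplete row
]


def integrate(rows):
    """Integration layer: clean and convert raw rows into typed records."""
    cleaned = []
    for row in rows:
        if row["cost"] is None:
            continue  # cleaning step: drop rows without a cost
        cleaned.append(
            {
                "project": row["project"].strip(),
                "dept": row["dept"],
                "cost": float(row["cost"]),
            }
        )
    return cleaned


def summarize(rows, dimension):
    """Summary layer: total cost aggregated along one dimension."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row["cost"]
    return dict(totals)


def build_mart(rows, dept):
    """Data market layer: per-project totals for one business department."""
    return summarize([row for row in rows if row["dept"] == dept], "project")


integration_layer = integrate(buffer_layer)
summary_by_dept = summarize(integration_layer, "dept")
civil_mart = build_mart(integration_layer, "civil")
```

Because the raw rows stay untouched in the buffer layer, the later layers can be rebuilt at any time without going back to the source systems, which is the decoupling the buffer layer is meant to provide.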

The distributed data platform

The distributed data platform mainly stores data that is difficult to hold in a traditional relational database: massive structured data, unstructured file data, and dumps of streaming data and relational databases. According to the storage requirements and the characteristics of the distributed platform's technology components, the platform is divided into an HBase-based data storage area and a Hive-based data storage area.

The unstructured data layer stores unstructured data from all source systems, including office documents, design drawings, text files, image files, etc.

The massive structured data layer stores massive structured data from all structured systems.

The streaming data dumping layer stores periodic dumps from the streaming data platform, helping it achieve persistent storage of real-time data.

The streaming data platform

The streaming data platform includes a real-time data integration layer, a real-time data summary layer and a business data buffer layer.

In the real-time data integration layer, the entry points of all source systems uniformly communicate over sockets, which avoids inconsistency among data sources. The data center listens on the sockets of the source systems. When a source system has data, the listener obtains it and writes the source information of the received data to the corresponding message queue.

The real-time data summary layer uses Storm to process the source data from the integration layer's message queues as a stream. It aggregates, calculates and stores data according to business requirements.

In the business data buffer layer, once Storm has finished its calculation on the streaming data, the result data is produced according to the specific business logic. (The corresponding architecture is shown in Figure 2.)
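The three streaming layers can be mimicked with the Python standard library. This is a rough sketch under several assumptions: a local `socket.socketpair()` stands in for a source system's socket, a `queue.Queue` stands in for the message queue, and a plain function stands in for the Storm computation. The message format (newline-delimited JSON with `sensor` and `value` fields) and the source name `site-A` are hypothetical.

```python
import json
import queue
import socket
from collections import defaultdict


def listen(conn, message_queue, source_name):
    """Integration layer: read newline-delimited JSON from a socket and
    enqueue each message together with the name of its source system."""
    with conn.makefile("r", encoding="utf-8") as stream:
        for line in stream:
            if line.strip():
                message_queue.put({"source": source_name, "payload": json.loads(line)})


def summarize(message_queue):
    """Summary layer (stand-in for a Storm topology): drain the queue and
    total the readings per sensor."""
    totals = defaultdict(float)
    while True:
        try:
            message = message_queue.get_nowait()
        except queue.Empty:
            break
        payload = message["payload"]
        totals[payload["sensor"]] += payload["value"]
    return dict(totals)


# Simulate a source system writing three readings to its socket.
source_end, center_end = socket.socketpair()
for reading in (
    {"sensor": "pump-1", "value": 1.5},
    {"sensor": "pump-2", "value": 4.0},
    {"sensor": "pump-1", "value": 2.5},
):
    source_end.sendall((json.dumps(reading) + "\n").encode("utf-8"))
source_end.close()  # end of stream lets the listener finish

message_queue = queue.Queue()
listen(center_end, message_queue, "site-A")
center_end.close()

# Business data buffer layer: the computed results, ready for applications.
business_buffer = summarize(message_queue)
```

The sketch runs the stages one after another for clarity. In a real deployment the listener and the stream computation run continuously and in parallel, with the message queue decoupling their speeds.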

 

Analysis/Service Layer:

It includes a comprehensive information display platform, an intelligent analysis and decision-making platform, and data services (shown in Figure 3).

Comprehensive information display platform

The comprehensive information display platform, built on the data storage layer, is an application that includes report query and comprehensive analysis. It supports dynamic configuration of the page content, layout, components, CCTV, linkage relations, etc. of an analysis.

Intelligent analysis and decision-making platform

The intelligent analysis and decision-making platform includes several modules, such as data loading, data preprocessing, data mining algorithms, analysis model management and model operation scheduling. It provides technical support for data understanding, data preprocessing, algorithm modeling, model evaluation, model application, etc. To meet the requirements of big data analysis, it also provides a mining algorithm library for big data with three types of mining algorithms: descriptive mining algorithms such as clustering analysis and correlation analysis; predictive mining algorithms such as classification analysis, evolution analysis and heterogeneous analysis; and dedicated data analysis algorithms such as text analysis, speech analysis, image analysis and video analysis.

Data services

Data services mainly provide real-time data services, subscription, publishing, batch data services, etc. They also provide a cache function to enhance the overall performance of the system.
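As a rough illustration of how subscription, publishing and caching fit together, the following Python sketch keeps the latest record per topic in a cache and pushes each newly published record to subscribers. The class name `DataService`, its methods, and the `loader` callback (a stand-in for a read from the storage layer) are assumptions made for the example, not part of the article's system.

```python
from collections import defaultdict


class DataService:
    """Minimal publish/subscribe data service with a read cache."""

    def __init__(self, loader):
        self._loader = loader  # hypothetical function: topic -> record
        self._cache = {}
        self._subscribers = defaultdict(list)
        self.loads = 0  # number of times the storage layer was actually read

    def get(self, topic):
        """Return the latest record for a topic, reading storage only on a cache miss."""
        if topic not in self._cache:
            self._cache[topic] = self._loader(topic)
            self.loads += 1
        return self._cache[topic]

    def subscribe(self, topic, callback):
        """Register a callback to receive every record published to a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, record):
        """Refresh the cache and push the record to all subscribers.

        Returns the number of subscribers notified.
        """
        self._cache[topic] = record
        for callback in self._subscribers[topic]:
            callback(record)
        return len(self._subscribers[topic])
```

Repeated `get` calls for the same topic are served from memory, so `loads` stays at one until the cache entry is replaced by a `publish`. This sketch never evicts entries; a real service would bound the cache size and expire stale records.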

 

Data Management Layer:

It includes metadata management, data quality management, master data management, data standard management, and centralized job scheduling and monitoring (shown in Figure 4).

Metadata management

It enables rapid search, acquisition, use and sharing of metadata in the data center. It also provides metadata support for data sharing and exchange, multidimensional analysis, decision support, data mining, etc.

Data quality management

It performs standardized quality audits of data in the data center and ensures the timeliness, completeness and compliance of the data received from business systems.
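A quality audit of this kind can be pictured as a set of rules applied to each received record. The Python sketch below checks the three properties named above; the required fields, the one-hour delay limit and the non-negative amount rule are hypothetical examples rather than rules taken from the article.

```python
from datetime import datetime, timedelta

# Hypothetical completeness rule: fields every record must carry.
REQUIRED_FIELDS = ("project_id", "amount", "received_at")


def audit_record(record, now, max_delay=timedelta(hours=1)):
    """Return a list of quality issues found in one record (empty if clean)."""
    issues = []

    # Completeness: every required field must be present and non-empty.
    missing = [f for f in REQUIRED_FIELDS if record.get(f) in (None, "")]
    if missing:
        issues.append("incomplete: missing " + ", ".join(missing))

    # Timeliness: the record must have arrived within the allowed delay.
    received_at = record.get("received_at")
    if isinstance(received_at, datetime) and now - received_at > max_delay:
        issues.append("late: received more than %s ago" % max_delay)

    # Compliance: an example business rule that amounts must not be negative.
    amount = record.get("amount")
    if isinstance(amount, (int, float)) and amount < 0:
        issues.append("non-compliant: negative amount")

    return issues
```

Passing `now` explicitly instead of calling `datetime.now()` inside the function keeps the audit reproducible, which matters when the same batch is re-audited later.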

Master data management

It provides unified management, application and maintenance of master data such as materials, projects and contracts, ensuring that master data stays consistent and stable when modified.

Data standard management

It provides unified management of standard documents in the data center.

Centralized job scheduling and monitoring

It provides unified dispatching, management and monitoring of ETL interface jobs and big data jobs.

 

As informatization in the engineering industry advances, information systems have been fully integrated into all aspects of enterprise production and management, and have accumulated large amounts of structured data, unstructured data, geographic information data and massive real-time data. Big-data-based enterprise-level data centers can make up for the disadvantages of traditional technology: they address weak scalability, high construction cost and limited capacity for calculation, analysis and mining, and meet the requirements of storing, processing, analyzing and applying all types of data in a big-data environment.

