A data management platform built around a multidimensional analysis engine was constructed using distributed big-data storage, parallel computing, and data warehouse modeling. The platform provides optimized management and fast querying of oil and gas production performance data that were previously stored in separate locations; it centrally manages the production data of more than 36×10⁴ oil, gas, and water wells and responds to queries within seconds. Multidimensional analysis subject models of oil, gas, and water well production were established, and the data were preprocessed. At the level of China National Petroleum Corporation (CNPC), this enables fast, efficient response for analysis applications such as tracking production operations in oil regions, production early warning for key oilfields, the status of low-production and long-term shut-in wells, and the development patterns of classified reservoirs, with processing time cut from 1 d to 5 s. The basic unit of oil and gas production analysis has been refined from the oilfield to the single well, allowing finer-grained production management. Analysis results can be traced level by level through the group company, oil and gas field company, oil and gas field, block, and single well, so that the production performance of each basic unit can be tracked in real time. 8 figures, 6 tables, 25 references.
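The level-by-level tracing from group company down to single well can be illustrated with a minimal roll-up and drill-down sketch. This is not the platform's implementation, which the abstract does not describe in detail; the record layout, the field names (`oil_t_per_d` and the level keys), and all well names and rates below are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hierarchy levels from coarsest to finest, following the abstract:
# group company -> oil and gas field company -> field -> block -> single well.
LEVELS = ("group", "company", "field", "block", "well")

# Hypothetical single-well records; names and rates are made up.
RECORDS = [
    {"group": "CNPC", "company": "Company A", "field": "Field 1",
     "block": "B1", "well": "W-001", "oil_t_per_d": 5.0},
    {"group": "CNPC", "company": "Company A", "field": "Field 1",
     "block": "B1", "well": "W-002", "oil_t_per_d": 0.5},
    {"group": "CNPC", "company": "Company A", "field": "Field 1",
     "block": "B2", "well": "W-003", "oil_t_per_d": 3.0},
    {"group": "CNPC", "company": "Company B", "field": "Field 2",
     "block": "B3", "well": "W-004", "oil_t_per_d": 8.0},
]


def roll_up(records, level):
    """Sum the daily oil rate for every node at the given hierarchy level."""
    depth = LEVELS.index(level) + 1
    totals = defaultdict(float)
    for rec in records:
        # The key is the full path from the top of the hierarchy to `level`.
        key = tuple(rec[name] for name in LEVELS[:depth])
        totals[key] += rec["oil_t_per_d"]
    return dict(totals)


def drill_down(records, path):
    """Return the totals of the direct children of the node at `path`."""
    path = tuple(path)
    depth = len(path)
    if depth >= len(LEVELS):
        raise ValueError("a single well has no children")
    children = [
        rec for rec in records
        if tuple(rec[name] for name in LEVELS[:depth]) == path
    ]
    return roll_up(children, LEVELS[depth])


if __name__ == "__main__":
    print(roll_up(RECORDS, "company"))
    print(drill_down(RECORDS, ("CNPC", "Company A", "Field 1")))
```

Because every record is stored at single-well granularity, any coarser view is an aggregation over a prefix of the hierarchy path. A production engine would typically precompute such aggregates rather than scan the records on each query.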