National Preprint Platform
Design of a Query System Based on Stream Computing

Abstract

With the development of big data technologies and non-relational databases, fast querying of big data has become an important problem. Most big-data queries today are implemented through Hive. Because Hive itself relies on MapReduce, it is inefficient for small, real-time queries. This paper proposes a new architecture and analyzes in detail the key technologies it uses. For query statement parsing, it analyzes the open-source syntax tree generation tool JavaCC and designs and implements a query statement parser. For the operators of computing tasks, it analyzes the open-source stream computing framework Storm and designs and implements general-purpose operators that run on Storm. For underlying data storage, it analyzes the open-source non-relational database Hypertable and combines it with the designed query engine to provide a complete big-data query solution. Finally, the feasibility and ease of use of the system are verified by simulating an actual user query.

Authors: Cong Hanting (丛汉廷), Pan Weimin (潘维民)

Subject: Computing technology, computer technology

Keywords: stream computing; query system; Storm; Hypertable

Cong Hanting, Pan Weimin. Design of a Query System Based on Stream Computing [EB/OL]. (2014-12-01)[2025-08-23]. http://www.paper.edu.cn/releasepaper/content/201412-19.
