Abstract:
Big data is no longer a new term in the Information Technology sector. With the emergence of big data arose the need to analyze large volumes of data consisting of trillions of records. Big data has also penetrated multiple areas of data analytics, and different technological solutions have therefore been developed to handle its complexities. However, even after decades, contemporary solutions still cannot address several complex issues or overcome their limitations.
The lack of a common communication standard has caused many issues in big data analytics. At present, big data solution companies rely on their own in-house, ad hoc communication methods to perform analytics. Unfortunately, this limits the integration and reusability of the solutions they build. To overcome such problems, Microsoft introduced XML for Analysis (XMLA), an industry standard for accessing data in analytical systems, namely online analytical processing (OLAP) systems. XMLA is well standardized and well designed for accessing data through Multidimensional Expressions (MDX). However, the development of tailor-made query languages for accessing and analyzing the stack of scattered data stores has led to the creation of different standards. As a result, almost all big data services offer their own proprietary query languages and APIs for data analysis.
This research proposes a methodology for addressing the ad hoc integration of these big data analytics endpoints through a JSON-based specification that reuses XMLA structures. The research has two components: publishing a communication model as a JSON specification, and proposing the adoption of this standard by existing data stores. The solution will enable frontend tools to be fully independent of the backend storage model. It will also allow existing JSON-standardized frontend tools to integrate easily with big data analytics by eliminating the need for a frontend tool built specifically for each data store.
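
As a rough illustration of what reusing XMLA structures in JSON could look like, the sketch below wraps an MDX statement in a JSON envelope modeled on the Command and Properties elements of the XMLA Execute method. The JSON field layout, the helper `build_execute_request`, the cube name `[Sales]`, and the catalog name `ExampleCatalog` are assumptions made for illustration only; they are not the specification proposed in this research.

```python
import json


def build_execute_request(mdx_statement, catalog, response_format="Multidimensional"):
    """Build a hypothetical JSON request that mirrors XMLA's Execute method.

    The nesting (Execute -> Command/Properties -> PropertyList) follows the
    element names used by XMLA; expressing them as JSON keys is an assumption.
    """
    return {
        "Execute": {
            "Command": {"Statement": mdx_statement},
            "Properties": {
                "PropertyList": {
                    "Catalog": catalog,         # target database (hypothetical name)
                    "Format": response_format,  # XMLA result format property
                }
            },
        }
    }


# Hypothetical MDX query against an assumed cube named [Sales].
request = build_execute_request(
    "SELECT [Measures].MEMBERS ON COLUMNS FROM [Sales]",
    catalog="ExampleCatalog",
)

# Serialize the envelope so that any JSON-capable frontend could send it.
payload = json.dumps(request, indent=2)
print(payload)
```

Under this sketch, a frontend tool only constructs and parses the JSON envelope, so in principle the same request could be sent to any backend store that implements the specification.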