Bounded Computing Theory

——Making Big Data Smaller

Most computations do not require accessing all of the data; the desired answers can often be obtained by retrieving only a small subset. Based on bounded query models and theories, big data computations are constrained to processing such small subsets. This research was recognized with the Royal Society Wolfson Research Merit Award in 2018.


Leveraging bounded computing theory, the newly developed database YashanDB has been shown in testing that 91% of query tasks can use bounded computing, improving conventional database query speeds by 250,000 to 1,000,000 times and significantly saving computational resources.
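The core idea can be sketched in a few lines: with an auxiliary access structure built ahead of time, a query touches only a small, bounded number of rows instead of scanning the whole dataset. This is a generic illustration, not YashanDB's actual implementation; all names here are hypothetical.

```python
# Illustrative sketch of bounded query evaluation: an access structure
# (here, a simple equality index) is built offline; at query time only a
# bounded number of matching rows are retrieved, never the full dataset.
from collections import defaultdict

def build_index(rows, key):
    """Offline step: map each value of `key` to the positions of matching rows."""
    index = defaultdict(list)
    for pos, row in enumerate(rows):
        index[row[key]].append(pos)
    return index

def bounded_query(rows, index, key_value, limit=100):
    """Answer an equality query by retrieving at most `limit` rows."""
    positions = index.get(key_value, [])[:limit]  # bounded data access
    return [rows[pos] for pos in positions]

rows = [{"id": i, "city": "A" if i % 2 else "B"} for i in range(100_000)]
idx = build_index(rows, "city")
result = bounded_query(rows, idx, "A", limit=10)
print(len(result))  # 10 — only 10 rows touched out of 100,000
```

The query cost depends on `limit`, not on the dataset size, which is what makes the approach attractive at PB scale.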

Approximate Computing Theory

——Approximate Data Query

Breaking through the bottleneck of traditional approximate computing, a theory of data-driven approximation was developed. It enables accurate and efficient data queries even with limited hardware investment, achieving high-precision data analysis and providing real-time analysis in scenarios where big data resources are constrained.
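To make the trade-off concrete, here is a minimal, generic sampling estimator: an aggregate is estimated from a uniform sample with a quantifiable error bound, at a fraction of the cost of a full scan. This is a textbook sketch, not the specific data-driven approximation algorithm described above.

```python
# Minimal sketch of approximate querying: estimate an aggregate from a
# uniform sample instead of scanning all data, trading a small,
# quantifiable error for a large reduction in computation.
import random
import statistics

def approximate_mean(data, sample_size, seed=0):
    """Estimate the mean from a sample; return (estimate, standard error)."""
    rng = random.Random(seed)
    sample = rng.sample(data, sample_size)
    est = statistics.fmean(sample)
    # The standard error of the mean shrinks as 1/sqrt(sample_size).
    se = statistics.stdev(sample) / sample_size ** 0.5
    return est, se

data = list(range(1_000_000))          # true mean: 499999.5
est, se = approximate_mean(data, 10_000)
print(f"estimate={est:.1f} +/- {1.96 * se:.1f} (95% CI)")
```

Here 1% of the data yields an estimate whose 95% confidence interval is within roughly 1% of the true mean.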

Adaptive Asynchronous Parallelism Theory

——A New Model for Computing Resource Scheduling

To address the challenge that database systems struggle to maintain efficiency as the number of computational cores increases, an adaptive asynchronous parallel task scheduling mechanism was proposed. This mechanism replaces the traditional partitioning method with a new scheduling approach, significantly reducing conflicts and coordination overhead among multiple cores.


Compared to synchronous scheduling and asynchronous scheduling, the adaptive asynchronous scheduling mechanism improves performance by 14.7 times and 4.8 times, respectively.
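The two modes being compared can be illustrated on a small iterative computation, single-source shortest paths: a synchronous scheduler relaxes all edges and then waits at a barrier between rounds, while an asynchronous scheduler processes whichever update is ready, with no rounds. The adaptive policy in the text would switch between such modes at runtime; this sketch shows only the two extremes and is not the proposed mechanism itself.

```python
# Synchronous (barrier-per-round) vs asynchronous (work-queue, no
# barriers) scheduling of an iterative fixpoint computation: single-source
# shortest paths by edge relaxation.
import heapq

graph = {0: [(1, 4), (2, 1)], 1: [(3, 1)], 2: [(1, 2), (3, 5)], 3: []}

def synchronous_sssp(graph, source):
    """Bellman-Ford style: each round relaxes all edges, then a barrier."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    for _ in range(len(graph) - 1):          # implicit barrier between rounds
        for u in graph:
            for v, w in graph[u]:
                dist[v] = min(dist[v], dist[u] + w)
    return dist

def asynchronous_sssp(graph, source):
    """Work-queue style: process whichever update is ready, no rounds."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    queue = [(0, source)]
    while queue:
        d, u = heapq.heappop(queue)
        if d > dist[u]:                       # stale update, skip
            continue
        for v, w in graph[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(queue, (dist[v], v))
    return dist

print(synchronous_sssp(graph, 0) == asynchronous_sssp(graph, 0))  # True
```

Both reach the same fixpoint, but the asynchronous variant never forces fast workers to wait for slow ones, which is where the coordination savings come from.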

Cross-Modal Fusion Theory

——Replacing Approximate Computing with Direct Computing

With the rise of the Internet and the application of knowledge graphs across products, the value of unstructured data is rapidly increasing. To address the real-time limitations of "approximate computing" models, a theoretical framework for enhanced data object computing was proposed. This framework defines cross-modal data correlations between entity models and relationship models, including properties such as NPspace-completeness and coNPspace-completeness, and presents theories of cross-modal data linking and hierarchical cross-modal correlation.
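As a toy illustration of cross-modal data linking, the sketch below connects a structured entity (a relational tuple) to unstructured text mentions by matching shared attribute values. Real cross-modal correlation models are far richer; every name and threshold here is illustrative only.

```python
# Toy sketch of cross-modal linking: a relational record is linked to
# text documents that mention enough of its attribute values.
def link_entity(record, documents, threshold=2):
    """Link `record` to documents sharing at least `threshold` attribute values."""
    values = {str(v).lower() for v in record.values()}
    links = []
    for doc_id, text in documents.items():
        words = set(text.lower().split())
        if len(values & words) >= threshold:
            links.append(doc_id)
    return links

record = {"name": "alice", "city": "paris", "year": 2017}
documents = {
    "d1": "Alice moved to Paris to study databases",
    "d2": "Bob lives in Berlin",
}
print(link_entity(record, documents))  # ['d1']
```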


Compared to extrapolation and machine learning methods, cross-modal correlation recognition accuracy improves by 91.3% and 80.9%, respectively, with speed improvements of 40.4 times and 24.4 times.

"The integration of unstructured and semi-structured data analysis addresses challenges that had remained unresolved in international academic research for years."


---PODS 2017

Parallel Transaction Scheduling Theory

——Efficient Parallel Task Execution

Based on a transaction cost model, a parallel transaction scheduling method and a transaction-management process for multi-core architectures are proposed. This approach effectively reduces the overhead and retry costs of Optimistic Concurrency Control (OCC). Experimental results show that it can improve throughput and reduce the number of retries by 137% and 42.5%, respectively, with maximum improvements of 321% and 58.1%.
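The retry cost being reduced comes from OCC's validate-and-retry loop, sketched minimally below: a transaction executes against a snapshot, then validates its read set at commit and retries on conflict. The conflict-aware scheduling described above would order transactions to avoid these retries; this generic sketch shows only the OCC core, and all class and method names are illustrative.

```python
# Minimal sketch of Optimistic Concurrency Control (OCC): read versioned
# values, then validate the read set at commit time and retry on conflict.
class Store:
    def __init__(self):
        self.data = {}      # key -> value
        self.version = {}   # key -> version number

    def read(self, key):
        return self.data.get(key, 0), self.version.get(key, 0)

    def commit(self, read_set, write_set):
        """Validate: fail if any read key changed since it was read."""
        for key, seen_version in read_set.items():
            if self.version.get(key, 0) != seen_version:
                return False                  # conflict -> caller must retry
        for key, value in write_set.items():
            self.data[key] = value
            self.version[key] = self.version.get(key, 0) + 1
        return True

def transfer(store, src, dst, amount):
    """Move `amount` from src to dst, retrying until validation succeeds."""
    retries = 0
    while True:
        (a, va), (b, vb) = store.read(src), store.read(dst)
        ok = store.commit({src: va, dst: vb},
                          {src: a - amount, dst: b + amount})
        if ok:
            return retries
        retries += 1

store = Store()
store.commit({}, {"x": 100, "y": 0})   # seed initial balances
transfer(store, "x", "y", 30)
print(store.data)  # {'x': 70, 'y': 30}
```

Every failed validation forces a full re-execution, which is why scheduling transactions to avoid conflicting interleavings pays off directly in throughput.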

New-Generation Database YashanDB


In a big data environment, we need to handle data at PB to EB scale, with query scales reaching 10^15 to 10^18, resulting in extremely high computational costs that put big data computing beyond the reach of most small and medium-sized enterprises. We are committed to rebuilding the big data query and processing framework under limited resources, making big data smaller so that resource-constrained small and medium-sized enterprises can also enjoy the benefits of big data.

Big Data Computational Complexity Theory

——The Foundation of Big Data Computing

Traditional computational complexity theory is not applicable to big data computing. For the first time globally, a Big Data Computational Complexity Theory was proposed, defining the constraints on and solutions to tractable problems in big data computing. This theory revolutionizes traditional computational complexity theory and establishes the foundation for the field of study.

"It revolutionized traditional complexity theory and laid the foundation for the field of study."


---VLDB 2013