Haoran Qiu, Weichao Mao, et al.
ASPLOS 2024
Cloud infrastructures promise to provide highperformance and cost-effective solutions to large-scale data processing problems. In this paper, we identify a common class of data-intensive applications for which data transfer latency for uploading data into the cloud in advance of its processing may hinder the linear scalability advantage of the cloud. For such applications, we propose a "stream-as-you-go" approach for incrementally accessing and processing data based on a stream data management architecture. We describe our approach in the context of a DNA sequence analysis use case and compare it against the state of the art in MapReduce-based DNA sequence analysis and incremental MapReduce frameworks. We provide experimental results over an implementation of our approach based on the IBM InfoSphere Streams computing platform deployed on Amazon EC2, showing an order of magnitude improvement in total processing time over the state of the art. © 2012 IEEE.
Haoran Qiu, Weichao Mao, et al.
ASPLOS 2024
Jose Manuel Bernabe' Murcia, Eduardo Canovas Martinez, et al.
MobiSec 2024
Sahil Suneja, Yufan Zhuang, et al.
ACM TOSEM
Guo-Jun Qi, Charu C. Aggarwal, et al.
ICDE 2012