High-Performance Telemetry Pipelines for Cloud Architectures: Optimization and Scalability Strategies

Authors

  • Aarthi Anbalagan, Microsoft Corporation, USA
  • Manish Tomar, Citibank, USA
  • Vincent Kanka, Transunion, USA

Keywords:

Telemetry pipelines, cloud architectures, Apache Kafka

Abstract

Modern cloud architectures must handle very large volumes of telemetry. Telemetry pipelines have to collect, analyze, and store data from many heterogeneous systems under demanding cloud workloads. As cloud-based applications and services proliferate, traditional data ingestion and processing techniques struggle with scalability, efficiency, and reliability. This research studies how pairing Apache Kafka with a time-series database (TSDB) can improve the processing and storage of large volumes of cloud telemetry.

The paper examines telemetry pipelines that scale in cloud environments. Apache Kafka is an open-source distributed event streaming platform suited to high-throughput data streams and to decoupling producers from consumers. The research explores how Kafka can stream real-time data from virtual machines, containers, microservices, and cloud-native applications.
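
The producer-consumer decoupling mentioned above can be illustrated with a minimal sketch. It is not taken from the paper and does not use any Kafka client API: `TopicLog`, `produce`, `poll`, the group names, and the sample records are all hypothetical, and the class is only a toy in-memory stand-in for a single-partition topic. The point it tries to show is that producers append to a log without knowing who reads it, while each consumer group tracks its own read offset.

```python
from collections import defaultdict


class TopicLog:
    """Toy in-memory stand-in for a single-partition topic (hypothetical, not a Kafka API)."""

    def __init__(self):
        self._records = []                 # append-only log of telemetry records
        self._offsets = defaultdict(int)   # next offset to read, per consumer group

    def produce(self, record):
        """Append a record and return its offset; the producer never waits on consumers."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, group, max_records=10):
        """Return up to max_records unread records for this group and advance its offset."""
        start = self._offsets[group]
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch


# Hypothetical telemetry sources: a VM, a container, and a microservice.
log = TopicLog()
for source in ("vm-1", "container-7", "orders-service"):
    log.produce({"source": source, "cpu_pct": 42.0})

# Two independent consumer groups read the same log at their own pace.
alerts_batch = log.poll("alerting", max_records=2)
storage_batch = log.poll("tsdb-writer")
```

Because each group keeps its own offset, a slow reader (here the assumed `alerting` group) does not hold back a faster one such as `tsdb-writer`, and neither affects the producers. A real deployment would add partitioning, persistence, and replication, none of which this sketch models.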

Published

16-04-2021

How to Cite

[1]
Aarthi Anbalagan, Manish Tomar, and Vincent Kanka, “High-Performance Telemetry Pipelines for Cloud Architectures: Optimization and Scalability Strategies”, Aus. J. of Machine Learning Res. & App., vol. 1, no. 1, pp. 426–469, Apr. 2021, Accessed: Mar. 14, 2025. [Online]. Available: https://ajmlra.org/index.php/publication/article/view/9