A Big Data Analytical Approach for Analyzing Temperature Dataset using Machine Learning Techniques
Keywords:
Big Data, Machine Learning, Hadoop, Spark, Linear Regression, Gradient Boosting Tree
Abstract
Machine learning algorithms are widely used for predictive analytics. In this work, such algorithms are applied to temperature data, and the Spark framework is used to capture these data. Machine learning applications are increasingly deployed not only to serve predictions using static models, but also as tightly integrated components of feedback loops involving dynamic, real-time decision making. These applications pose a new set of requirements, none of which is difficult to achieve in isolation, but the combination of which creates a challenge for existing distributed execution frameworks: computation with millisecond latency at high throughput, adaptive construction of arbitrary task graphs, and execution of heterogeneous kernels over diverse sets of resources. We assert that a new distributed execution framework is needed for such ML applications. This paper describes an evaluation of these algorithms on Hadoop, an open-source platform, using a Spark implementation. The proposed methodology uses a temperature data set to analyze the machine learning algorithms on Spark DataFrames.
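As a rough illustration of what the linear regression baseline named in the keywords computes, the following minimal sketch fits a least-squares line to a handful of temperature readings. It deliberately uses only the Python standard library rather than Spark, so it is a stand-in for the idea and not the paper's implementation. The readings, the `fit_line` and `rmse` helpers, and the choice of hour-of-day as the single feature are all assumptions made for this example.

```python
import math

# Hypothetical (hour of day, temperature in degrees Celsius) readings.
# These values are invented for illustration and are not from the paper.
readings = [(0, 10.0), (1, 12.0), (2, 14.0), (3, 16.0), (4, 18.0)]


def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Covariance-like and variance-like sums (unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rmse(points, slope, intercept):
    """Root mean squared error of the fitted line on the given points."""
    squared = [(y - (slope * x + intercept)) ** 2 for x, y in points]
    return math.sqrt(sum(squared) / len(points))


slope, intercept = fit_line(readings)
error = rmse(readings, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, rmse={error:.2f}")
```

On a cluster, the same fit would typically be expressed against a Spark DataFrame so that the sums are computed in parallel across partitions; the arithmetic being distributed is the same as in this sketch.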
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors contributing to this journal agree to publish their articles under the Creative Commons Attribution 4.0 International License, allowing third parties to share their work (copy, distribute, transmit) and to adapt it, under the condition that the authors are given credit and that in the event of reuse or distribution, the terms of this license are made clear.