A Big Data Analytical Approach for Analyzing Temperature Dataset using Machine Learning Techniques

Authors

  • J.V.N. Lakshmi, Dept. of IT and MCA, Acharya Institutes of Management and Sciences, Peenya, India
  • Ananthi Sheshasaayee, PG & Research Department of Computer Science, Quaid-E-Millath Govt College for Women, Chennai, India

Keywords:

Big Data, Machine Learning, Hadoop, Spark, Linear Regression, Gradient Boosting Tree

Abstract

Machine learning algorithms are used for predictive analytics. In this work, these algorithms are applied to temperature data, and the Spark framework is used to capture that data. Machine learning applications are increasingly deployed not only to serve predictions using static models, but also as tightly integrated components of feedback loops involving dynamic, real-time decision making. These applications pose a new set of requirements, none of which is difficult to achieve in isolation, but the combination of which creates a challenge for existing distributed execution frameworks: computation with millisecond latency at high throughput, adaptive construction of arbitrary task graphs, and execution of heterogeneous kernels over diverse sets of resources. We assert that a new distributed execution framework is needed for such ML applications. This paper describes an evaluation of these algorithms on Hadoop, an open-source platform for the Spark implementation. The proposed methodology uses a temperature data set to analyze the machine learning algorithms on Spark data frames.

 

Published

2017-06-30

How to Cite

[1]
J. Lakshmi and A. Sheshasaayee, “A Big Data Analytical Approach for Analyzing Temperature Dataset using Machine Learning Techniques”, Int. J. Sci. Res. Comp. Sci. Eng., vol. 5, no. 3, pp. 92–97, Jun. 2017.

Section

Research Article