ANALYZING BIG DATA WITH MICROSOFT R
| Duration | : | 3 Days (09.00 – 16.00) |
| | | |
| Description | : | The open-source programming language R has long been popular, particularly in academia, for data processing and statistical analysis. Among R's strengths are that it is a succinct programming language and that it has an extensive repository of third-party libraries for performing all kinds of analyses. Together, these two features make it possible for a data scientist to go very quickly from raw data to summaries, charts, and even full-blown reports. However, one deficiency of R is that it traditionally uses a lot of memory, both because it needs to load a copy of the data in its entirety as a data.frame object and because processing the data often involves making further copies (a behavior sometimes referred to as copy-on-modify). |
| | | |
| Objectives | : | You will learn how to use Microsoft R to read, process, and analyze large datasets, including how to: · Read data from flat files into R’s data frame object, investigate the structure of the dataset and make corrections, and store prepared datasets for later use · Prepare and transform the data · Calculate essential summary statistics, do cross-tabulation, write your own summary functions, and visualize data with the ggplot2 package · Build predictive models, evaluate and compare models, and generate predictions on new data |
| | | |
| Participants | : | - Backend Staff - Data Scientist - Big Data Developer |
| | | |
| Prerequisites | : | Familiarity with big data technologies |
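
The copy-on-modify behavior mentioned in the description can be illustrated with a minimal base R sketch. The toy data and the names `original` and `modified` are hypothetical and are not taken from the course material.

```r
# Hypothetical toy data frame; real datasets in this course would be far larger.
original <- data.frame(id = 1:3, value = c(10, 20, 30))

# Plain assignment does not duplicate the data yet:
# both names refer to the same underlying object.
modified <- original

# Modifying one of them makes R copy the affected data first,
# so the other name still sees the old values.
modified$value[1] <- 99

print(original$value)  # [1] 10 20 30
print(modified$value)  # [1] 99 20 30
```

With a small data frame the extra copy is negligible, but the same mechanism applied to a dataset that already fills most of the available memory is what makes large-scale processing in plain R costly.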