Explore and Prepare Data (Data Mining)
Graded Assignment: Explore and Prepare Data
You work for a hypothetical university as an entry-level data analyst, and your supervisor has tasked you with learning more about the data mining process associated with problem definitions, data exploration, and data preparation by completing the steps below:
- In this week's discussion, you were asked to install RapidMiner. To get started, your supervisor has asked you to prepare feedback based on at least two RapidMiner data samples. Another data set, which is used in the free RapidMiner video tutorials, can be downloaded from https://rapidminer.com/training/videos/.
- Important Reminder: In support of this feedback and assignment, everyone should go through all introductory and data preparation video tutorials at https://rapidminer.com/training/videos/. Additional learning videos can be found at www.youtube.com using keyword searches like “RapidMiner Tutorials.” For example, check out the resource found below:
- The feedback needs to be a minimum of five body pages of written content, not including illustrations, and must be supported by at least three academic research sources. Furthermore, the feedback needs to be professionally formatted in APA style, including an APA cover page, abstract, body pages, and reference page. The feedback needs to address the following:
- Problem Definitions: When looking at the data sets, think about, develop, and discuss some potential problem definitions for them. In other words, what questions could these data sets be used to answer, and how might the data be worked with and handled?
- Data Exploration: While exploring the data sets further, discuss and reflect on their quality, using some of the basic statistical output and charts provided with RapidMiner. Also note any potential data problems you see, such as missing values, outliers, or inconsistent entries.
- Data Preparation: After exploring the data, discuss, reflect on, and apply any ideas to cleanse or otherwise improve the data for data analysis and modeling efforts.
- Remember to be illustrative: embed any charts or other screen captures that verify the work completed to explore and prepare the data sets.
- For the conclusions of this feedback, no modeling has yet been accomplished; however, use the basic statistical and chart options to draw initial conclusions about these data sets, assuming there were no option to go further and create models. In other words, what types of decisions could be made about these data sets after data exploration and data preparation are conducted?
- Complete and submit this assignment for grading on or before the due date. Remember, it is not a good idea to complete, or attempt to complete, work late. See the course syllabus and the associated late policy.
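The data exploration and data preparation steps above are carried out in RapidMiner for this assignment, but the underlying checks can be sketched in a few lines of Python to make clear what the statistics view and a cleansing step actually compute. This is only a minimal sketch under stated assumptions: the `records` sample, its `age` and `gpa` columns, and the `summarize` and `impute_mean` helpers are all hypothetical and do not come from the assignment or from any RapidMiner data set. Mean imputation is just one possible cleansing choice.

```python
from statistics import mean, median, stdev

# Hypothetical sample rows (not a RapidMiner data set); None marks a missing value.
records = [
    {"age": 21, "gpa": 3.4},
    {"age": 19, "gpa": None},
    {"age": 23, "gpa": 2.8},
    {"age": None, "gpa": 3.9},
    {"age": 20, "gpa": 3.1},
]


def summarize(rows, column):
    """Data exploration: basic statistics and the missing-value count for one column."""
    present = [r[column] for r in rows if r[column] is not None]
    return {
        "count": len(present),
        "missing": len(rows) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": mean(present),
        "median": median(present),
        "stdev": stdev(present) if len(present) > 1 else 0.0,
    }


def impute_mean(rows, column):
    """Data preparation: return new rows with missing values replaced by the column mean."""
    fill = mean(r[column] for r in rows if r[column] is not None)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


age_summary = summarize(records, "age")
cleaned = impute_mean(impute_mean(records, "age"), "gpa")

print(age_summary)
print(cleaned)
```

Because `impute_mean` returns new rows instead of modifying `records`, the original data stays available for a before-and-after comparison, which is the kind of evidence the written feedback is expected to illustrate with screen captures.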
Student included an APA cover page (Page 1)
Student included an abstract (Page 2)
Student included a minimum of five body pages of content supported by three academic research sources, addressing theory associated with problem definitions, data exploration, and data preparation as applied in RapidMiner Studio.
Student included data visualizations/illustrations to correlate or support written content.
Student included in-text citations with a complete reference page properly formatted using APA.
Student included a completed paper free of grammar and spelling issues.
Total earned points