Welcome To The Data Mine!
We are excited to have you here!
When Dr. Mark Daniel Ward launched Purdue University’s Data Mine initiative in 2018, he worked with fewer than 100 students from various academic backgrounds who wanted to learn about data science and how to apply it in their careers.
- Fast forward to today: Dr. Ward, the Executive Director of The Data Mine, coordinates real-world projects with many companies in Indiana and beyond.
- He offers data science training to more than 1,700 Purdue undergraduate and graduate students this year, with plans to reach more than 2,200 students in the year ahead.
A Message From Dr. Ward
"We endeavor to expand this model to many more companies and a similarly broad range of college and university profiles. I firmly believe in the value of students learning how to work not only on statistical modeling but also on cloud computing, full stack development, containerized environments, digital twins, large language models, and predictive analytics. Moreover, I love for students to learn directly from domain experts about how data science concepts are used in practice".
The Executive Director of The Data Mine
How To Participate:
The mission of The Data Mine is to foster, implement, and maintain data-driven collaborations across academic and corporate sectors. By building enduring communities of diverse individuals, our aim is to ensure that data science experiences are accessible and beneficial to everyone.
What this means is that we want people from diverse disciplines and backgrounds to be able to learn, apply, and benefit from data science skills and knowledge.
- The Data Mine Seminar/General Cohort is a supportive environment for students in any major from any background who want to learn data science skills.
- Students will have hands-on experience with computational tools for representing, extracting, manipulating, interpreting, transforming, and visualizing data, especially big, real-world data sets.
- Seminar is a year-long, 1-credit, project-based, learn-by-doing, lecture-free course where students:
- Expect 1 project per week, requiring 1 to 3 hours of student work.
- Design efficient search strategies and algorithms for research questions posed by stakeholders using data science while acquiring new technical and professional skills.
- Seminar Courses are offered at four levels to build data science knowledge and experience:
- For example, Level 1 (TDM 101/102):
- The Fall Semester focuses on R.
- The Spring Semester focuses on Python.
- Additional topics in higher levels include: UNIX, Bash, SQL, XML, Data Visualization, Machine Learning, and Deep Learning.
- This program is especially well suited for students who would like to participate in The Data Mine but do not have space in their schedule for the Corporate Partners Program.
- The Corporate Partners Program is an experiential learning course featuring data-driven projects.
- Over 80 data-driven projects in partnership with 60 corporate partners (2024 Corporate Partners Symposium)
- Students in the Corporate Partners Program will:
- Utilize data science tools and Purdue University's computing resources to manage data sets from partners in the industry by researching, cleaning, processing, analyzing, and visualizing data.
- Develop skills in data science, data modeling, data visualization, data analysis, and data engineering.
- Employ Agile project management to plan tasks and decisions, collaborate with scrum teams during 2-week sprints, review the product backlog, and reflect on successes and improvements.
- Work with peers to identify and overcome complex data science challenges.
- Communicate technical research findings through detailed documentation and team presentations.
- Engage in professional development opportunities.
- Projects span the entire academic year with weekly guidance from a corporate mentor.
- Commitment: 2 meetings per week plus project work, totaling 10 to 13 hours per week.
- National Data Mine Network (NDMN)
- Indiana Data Mine Network (IDMN)
The National Data Mine Network (NDMN)
The National Data Mine Network (NDMN) is a collaborative initiative between Purdue University and the American Statistical Association aimed at providing undergraduate students at minority-serving institutions (MSIs) with hands-on data science training.
Details about the National Data Mine Network
- An NSF-funded grant in collaboration with the American Statistical Association to enable MSIs' undergraduates to learn data science through research or industry projects.
- Provides $4,500 in research stipends, paid monthly ($500/month over 9 months), plus up to $500 for conference travel, to 100 students annually.
- Projects run throughout the 9-month academic year (August-April) with access to support, training, materials/tools, and high-performance computing from Purdue.
- Students will participate in Seminar: a project-based, learn-by-doing, lecture-free course where students:
- Expect 1 project per week, requiring 1 to 3 hours of student work.
- Design efficient search strategies and algorithms for research questions posed by stakeholders using data science while acquiring new technical and professional skills.
- Students work on corporate partner projects with research mentors or Industry Partners through the Corporate Partners Program. Currently, there are 80+ corporate partner projects with plans to expand.
- Research stipends are provided directly by the American Statistical Association. (Faculty participation is free for those at MSIs.)
- Must be a U.S. citizen, U.S. national, or permanent resident of the U.S.
- Must have undergraduate status at any Minority Serving Institution (MSI), including Historically Black Colleges and Universities (HBCUs), Hispanic Serving Institutions, Tribal Colleges and Universities, or colleges serving Blind or Deaf learners.
- A list of many MSIs is given here: Minority Institutions List (but please inquire if there is any doubt about such classifications or eligibility)
- Faculty participate onsite to help mentor the students. Such faculty do not need to have data science experience to mentor a team but should have an interest in working closely with students on a data science project.
- Participating faculty will have access to a rich collection of resources and faculty development opportunities
All questions are welcome! For questions about this opportunity, please reply to: datamine@purdue.edu
The Indiana Data Mine Network (IDMN)
The Indiana Data Mine Network (IDMN) is an initiative launched by Purdue University with the support of a $10 million grant from Lilly Endowment Inc. This program aims to expand the Data Mine concept across Indiana.
Details about the Indiana Data Mine Network
- Thanks to a $10 million grant to the Purdue Research Foundation from Lilly Endowment Inc.'s Charting the Future for Indiana’s Colleges and Universities initiative, Purdue will launch The Indiana Data Mine, an initiative that will take the Data Mine concept beyond the Purdue West Lafayette campus.
- Purdue will leverage its presence throughout the state to energize and prepare communities, employers and high school and college students for jobs of the future.
- These 'hubs' will provide immersive engagement opportunities for students with Indiana-based companies, potentially leading to careers within the state and boosting Indiana’s tech sector.
- Students involved with The Indiana Data Mine will learn data science skills through immersive engagement with Indiana-based companies that will potentially lead to careers in Indiana, enhancing the state’s surging tech sector.
- Participating students will have access to a rich collection of resources and faculty development opportunities
- Faculty participate onsite to help mentor the students. Such faculty do not need to have data science experience to mentor a team but should have an interest in working closely with students on a data science project.
- Participating faculty will have access to a rich collection of resources and faculty development opportunities
All questions are welcome! For questions about this opportunity, please reply to: datamine@purdue.edu