DATA SCIENCE CORE
Take the following courses:
CS-110 Computer Science I
An introductory study of computer science software development concepts. Python is used to introduce a disciplined approach to problem-solving methods, algorithm development, software design, coding, debugging, testing, and documentation in the object-oriented paradigm. This is the first course in the study of computer science.
3 Credits. N, CTGES, CTGIS. Recommended programming experience or IT110 or IT100, IT111 or IM110 or MA103, but not necessary.
MA-116 Discrete Structures
Introduces mathematical structures and concepts such as functions, relations, logic, induction, counting, and graph theory. Their application to Computer Science is emphasized.
4 Credits. N, Q. Prerequisite: high school algebra.
DS-110 Intro to Data Science
This course introduces the student to the emerging field of data science through the presentation of basic math and statistics principles, an introduction to the computer tools and software commonly used to perform the data analytics, and a general overview of the machine learning techniques commonly applied to datasets for knowledge discovery. The students will identify a dataset for a final project that will require them to perform preparation, cleaning, simple visualization and analysis of the data with such tools as Excel and R. Understanding the varied nature of data, their acquisition and preliminary analysis provides the requisite skills to succeed in further study and application of the data science field. Prerequisite: comfort with pre-calculus topics and use of computers.
3 Credits. N.
MA-130 Calculus I
An introduction to calculus including differentiation and integration of elementary functions of a single variable, limits, tangents, rates of change, maxima and minima, area, volume, and other applications. Integrates the use of computer algebra systems, and graphical, algebraic and numerical thinking.
4 Credits. N, QM.
MA-160 Linear Algebra
An introduction to systems of linear equations, matrices, determinants, vector spaces, linear transformations, eigenvalues, and applications.
3 Credits. N, QM. Prerequisites: MA130.
CS-370 Database Management Systems
Focuses on concepts and structures necessary to design and implement a database management system. Various modern data models, data security and integrity, and concurrency are discussed. An SQL database system is designed and implemented as a group project.
3 Credits. N, CTGIS. Prerequisites: CS110.
DS-210 Data Acquisition
Students will understand how to access various data types and sources, from flat file formats to databases to big storage data architecture. Students will perform transformations, cleaning, and merging of datasets in preparation for data mining and analysis.
3 Credits. N. Prerequisites: CS 110 and DS 110.
IM-242 Info Visualization
This course considers the various aspects of presenting digital information for public consumption visually. Data formats from binary, text, various file types, to relational databases and web sites are covered to understand the framework of information retrieval for use in visualization tools. Visualization and graphical analyses of data are considered in the context of the human visual system for appropriate information presentation. Various open-source and commercial digital tools are considered for development of visualization projects.
3 Credits. N, CTDH, CTGES. Prerequisite: IT 110, IT 111, IM 110, DS 110, or CS 110 or permission.
MA-321 Multivariate Statistics
A class in multivariate statistical techniques including non-parametric methods, multiple regression, logistic regression, multiple testing, and principal component analysis.
3 Credits. N, QS. Prerequisites: MA-130 or MA 160; an introductory statistics course from the following list: BI-305, EB-211, ESS-230, ESS-309, MA-205, MA-220, PY-366, or SW-215.
MA-325 Statistical Consulting
Participating students will receive training during the semester in consulting on statistical problems and will assist in collaborative efforts with faculty and/or staff on pre-determined, client-partnered projects. The semester-long project provides the student with both real work experience in the field of statistics and a project-based learning experience in partnership with the client. May be taken multiple times for credit.
3 Credits. N, QS, CW, SW-LE. Prerequisite: Take one course from this list: BI-305, EB-211, ESS-230, ESS-309, MA-205, MA-220, PY-361, PY-366, SW-215. Also take FYC-101 or EN-110 or EN-109.
STATISTICS CORE
Take one of the following courses:
MA-220 Introduction to Probability & Statistics
An introduction to the basic ideas and techniques of probability theory and to selected topics in statistics, such as sampling theory, confidence intervals, and linear regression.
4 Credits. N, QS, CTGES. Prerequisite: MA130.
MA-205 Elementary Statistics
Introduction to traditional statistical concepts including descriptive statistics, binomial and normal probability models, confidence intervals, tests of hypotheses, linear correlation and regression, two-way contingency tables, and one-way analysis of variance.
4 Credits. N, QS, WK-SP. Prerequisite: FYC-101 or EN-110 or EN-109.
EB-211 Business Statistics
This course covers basic descriptive and inferential statistics, normal curve and z-score computations, and addresses hypothesis testing using Chi-Square, T-Test, ANOVA, and linear regression modelling.
3 Credits. QS, S.
BI-305 Biostatistics
This course deals centrally with quantitative and statistical methodology in the biological sciences. It includes experimental design and the conventions of generating, analyzing, interpreting and presenting biological data. Counts as a math course for graduate and professional school requirements.
4 Credits. N, QS, CTGES. Prerequisites: BI106 or ESS100.
ESS-230 Environmetrics
This course is a survey of the various visual, statistical, and modeling approaches commonly used in the analysis of environmental data. The course covers: (1) visual literacy from exploratory data inquisition to poster creation; (2) elementary group comparison such as t-test and ANOVA and their non-parametric analogs; (3) basic systems modeling; and (4) regression modeling techniques based on the generalized linear model framework.
3 Credits. N, QS, CTGES, CTGIS. Prerequisites: Sophomore standing and permission of the instructor.
ESS-309 Econometrics
A first course in econometrics with forays into regression, optimization, and modeling.
2 Credits. N, Q. Prerequisites: Introductory economics course.
PY-366 Research Methods & Statistics
Introduces the methodological skills necessary for conducting research and for becoming a better consumer of psychological science. Students will learn to think critically about claims and accurately summarize primary source articles about behavior. Students will learn statistical concepts commonly used to evaluate data, how to effectively communicate research, and make ethical judgments informed by APA ethical standards.
4 Credits. Prerequisite: PY-101.
SW-215 Integrated Research Methods & Stats II
The second part of an integrated course sequence applying the scientific process to the fields of Social Work and Sociology, emphasizing key research concepts, commonly used quantitative and qualitative methods, and the ability to communicate effectively about research with written and verbal skills. The course teaches students not only to conduct research but also to consume and utilize research.
3 Credits. S.
ELECTIVES
Take at least 8 credits from the following courses:
BI-314 Talk Nerdy to Me
Talk Nerdy To Me is a course designed for anyone interested in more effectively communicating scientific ideas to non-specialists. Students will write short popular science articles, illustrate comics, create video explanations, and refine oral presentation skills to present recent research advances or their own research data. All student output will be produced for public consumption and outreach online and in public formats. There will be a strong emphasis on peer evaluation and review.
3 Credits.
BI-405 Bioinformatics Fundamentals
Bioinformatics is the science of collecting and analyzing complex biological data. It is an interdisciplinary field that develops and applies methods and software tools for understanding biological data.
4 Credits. N, CTGES. Prerequisites: BI-101 or BI-105, BI-102 or BI-106, CH-142, CH-143, CH-144, CH-145.
CS-315 Algorithms and Analysis
The study and analysis of algorithms, their complexity and supporting data structures. Topics include searching, sorting, mathematical algorithms, tree and graph algorithms, the classes of P and NP, NP-complete and intractable problems, and parallel algorithms.
4 Credits. CW, N. Prerequisites: CS240 and MA116.
CS-341 Scientific Computing
This course begins with an introduction to fundamental concepts in Scientific Computing and concludes with domain-specific projects in areas like Bioinformatics, Data Science, Physical Systems, and Numerical Analysis. The common content will include command-line interfaces (Linux), programming languages (Jupyter/Python), numerical and graphical libraries (NumPy and Matplotlib), version control (Git/GitHub), and relational databases (SQL).
3 Credits. N. Prerequisite: CS-110.
DS-352 Machine Learning
This course considers the use of machine learning (ML) and data mining (DM) algorithms for the data scientist to discover information embedded in datasets, from simple tables through complex and big data sets. Topics include ML and DM techniques such as classification, clustering, predictive and statistical modeling using tools such as R, Matlab, Weka and others. Simple visualization and data exploration will be covered in support of the DM. Software techniques implemented on emerging storage and hardware structures are introduced for handling big data.
3 Credits. N. Prerequisite: CS-110, DS-110, and an approved statistics course from this list: MA-205, MA-220, BI-305, PY-214, PY-260, PY-366, or EB-211.
DS-375 Big Data
This course considers the management and processing of large data sets, structured, semi-structured, and unstructured. The course focuses on modern, big data platforms such as Hadoop and NoSQL frameworks. Students will gain experience using a variety of programming tools and paradigms for manipulating big data sets on local servers and cloud platforms.
3 Credits. N. Prerequisites: DS 110 Intro to Data Science and CS 370 Database Management Systems.
DS-485 Data Science Research
Under the direction of their advisor, students will complete an original, independent research project in Data Science. A written report and oral presentation summarizing their research experience and results will be prepared. This course is a requirement for students who are candidates for distinction in Data Science. Instructor permission required.
1-5 Credits.
ESS-330 Geographical Information Systems
This course is an introduction to a Geographical Information System (GIS), and the course objective is that students gain a basic, partial understanding of GIS concepts, technical issues, and applications using ArcView GIS. It encourages thinking in spatial context. A diverse array of hands-on computer applications and projects are used to understand how geographical data can be analyzed spatially. Students explore analysis techniques in a problem-based learning approach using small team projects.
4 Credits. CTGIS. Note: A special course fee is assessed. Prerequisite: ESS100.
ESS-335 Quantitative Ecology
The goal of the course is to advance student understanding of a broad range of numerical and graphical techniques used to analyze complex data sets encountered in the environmental sciences. Students will learn the context to properly apply these techniques to address research questions. The purview is ecological, but is applicable to all other quantitative endeavors. The course emphasizes conceptual understanding, relevant applications, and proper interpretation rather than gory, though interesting, statistical theory. Students will apply the R language and environment for statistical computing to tailor analyses to specific circumstances.
4 Credits. QS. (Lec/Lab; 4 cr hr; Spring years; pre-req ESS 110, ESS 230-Environmetrics, or consent)
IT-307 Project Management
This course reviews and applies project management processes and techniques such as project life cycle, project selection methods, work breakdown instructions, network diagrams, cost estimates, and more.
3 Credits. S, CW, CS, SW-LE. Prerequisites: IT210 and Jr or Sr standing or permission of the instructor. Corequisite: IT308.
IT-308 Innovations for Industry I
This lab will require a team of students to function as a project development team for an IT-related business. The students will be exposed to many aspects of systems analysis, design, development and implementation, as well as project management tools and techniques. Students will be required to learn in a just-in-time mode using on-demand educational resources.
1 Credit. S. Prerequisites: IT210 and Jr or Sr standing or by permission of the instructor. Corequisite: IT307. Note: This course will have appointed class times for projects other than the times listed on the schedule.
MA-341 Scientific Computing
This course begins with an introduction to fundamental concepts in Scientific Computing and concludes with domain-specific projects in areas like Bioinformatics, Data Science, Physical Systems, and Numerical Analysis. The common content will include command-line interfaces (Linux), programming languages (Jupyter/Python), numerical and graphical libraries (NumPy and Matplotlib), version control (Git/GitHub), and relational databases (SQL).
3 Credits. N. Prerequisite: CS-110.
COGNATE AREA
Take 12 credits, 3 of which must be at the 300 level or higher. The cognate area should be a coherent set of courses outside the areas of Data Science, Math and Computer Science.
CAPSTONE
Take the following course:
DS-420 Data Science Capstone
This course is a capstone experience for Data Science POE students and must be completed as part of a student's final 30 credits. It represents the summation of a student's Juniata experience and serves as a bridge to their future goals. Students will have the opportunity to both apply their previous data science skills and develop new skills through a data analysis project.
1 Credit. Prerequisite: DS-110, CS-110, and one course from this list: MA-220 or MA-205 or EB-211 or BI-305 or ESS-230 or ESS-309 or PY-361 or SW-215.
What should you expect?
Students in the data science program will be prepared for jobs dealing with data in whatever fields interest them. With an emphasis on practical skills for the organization, analysis, visualization, and presentation of actionable information gathered from widely varied data sources, the program has students work with real-world data. Students will take a variety of courses in data science, computer science, statistics, and in a cognate area of their choice.
As part of the POE in data science you can participate in internships at locations such as Mutual Benefit Corporation or Juniata’s Office of Advancement.
What your four years in the Data Science Program at Juniata College might look like:
First Year
Take Introduction to Data Science (DS 110), Discrete Structures (MA 116), Computer Science I (CS 110), and Calculus (MA 130). Begin exploring other fields such as business, biology, environmental science, psychology, or history as a possible cognate area in which to apply your data analysis skills.
Sophomore Year
Take Data Acquisition (DS 210), Linear Algebra (MA 160), and Introduction to Probability and Statistics (MA 220). Start taking courses in your chosen cognate area.
Junior Year
Take upper-level courses in data science, computer science, and statistics. Continue taking cognate area courses. Consider studying abroad at the Mathematical Sciences Semesters in Guanajuato, Mexico. Look into internships. Participate in DataFest.
Senior Year
Take Data Science Consulting (DS 325) as a capstone in data science built around a real-life data analysis project. Continue taking upper-level courses and finish your cognate area courses. Complete an internship. Participate in DataFest.
POE Credit Total = 56-60
Students must complete at least 18 credits at the 300/400-level. Any course exception must be approved by the advisor and/or department chair.