This course is an introduction to data science and machine learning, with an emphasis on practical, hands-on application of methods that run well on standard laptop and desktop machines.

This short course begins with an introduction to Python computing in science and engineering using a notebook environment. We will build a foundation in numpy, scipy, and matplotlib by solving nonlinear optimization problems in which a function is minimized. The functions being minimized will then become error functions, so that finding the parameters that minimize the error leads us into nonlinear regression. From there, we will build progressively more sophisticated models, ending with a small neural network.
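As a rough illustration of the step from optimization to regression described above, the sketch below fits a two-parameter model by minimizing a sum of squared errors with scipy. The model form, the synthetic data, and the names `model` and `sse` are assumptions made for this example and are not taken from the course materials.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical, noise-free data generated from y = a * exp(-b * x)
a_true, b_true = 2.0, 0.5
x = np.linspace(0, 5, 20)
y = a_true * np.exp(-b_true * x)


def model(params, x):
    """Illustrative nonlinear model with parameters a and b."""
    a, b = params
    return a * np.exp(-b * x)


def sse(params):
    """Error function: sum of squared differences between data and model."""
    return np.sum((y - model(params, x)) ** 2)


# Minimizing the error function over the parameters is nonlinear regression
result = minimize(sse, x0=[1.0, 1.0])
a_fit, b_fit = result.x
print(f"a = {a_fit:.3f}, b = {b_fit:.3f}")
```

The same pattern carries over to larger models: only `model` changes, while the error function and the call to the optimizer stay the same.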

In the afternoon, we will revisit some regression tasks using more complex data structures to see why the Pandas data science library exists and how it simplifies common data science tasks. Then, we will introduce the sklearn package, which provides a consistent data-driven modeling interface across linear, nonlinear, and tree-based models. If there is time, we will discuss how to write custom estimators for sklearn.
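To give a sense of what a consistent modeling interface looks like, the sketch below holds a small data set in a Pandas DataFrame and fits a linear model and a tree-based model through the same `fit` and `score` calls. The column names and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

# Hypothetical data set; the column names are illustrative only
df = pd.DataFrame({"temperature": np.linspace(300, 400, 50)})
df["rate"] = 0.02 * df["temperature"] + 1.0

X = df[["temperature"]]  # 2D feature table
y = df["rate"]           # 1D target

models = {
    "linear": LinearRegression(),
    "tree": DecisionTreeRegressor(max_depth=3, random_state=0),
}

# Every sklearn estimator exposes the same fit / predict / score methods
scores = {}
for name, estimator in models.items():
    estimator.fit(X, y)
    scores[name] = estimator.score(X, y)  # R^2 on the training data

print(scores)
```

Because the estimators share one interface, swapping in a different model means changing one entry in `models` while the loop stays the same.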

Instructor: John Kitchin

8:30 a.m. - 10:00 a.m.

  • Introduction to Python computing in science and engineering
  • Numpy, scipy, and matplotlib to solve nonlinear optimization problems

10:15 a.m. - 12:00 p.m.

  • Python for nonlinear regression
  • Data-driven models and progression to training neural networks

12:00 p.m. - 1:00 p.m.

  • Break

1:00 p.m. - 3:00 p.m.

  • Advanced regression, complex data structures, and Pandas
  • Sklearn for data-driven modeling of linear, nonlinear, and tree-based models

3:15 p.m. - 4:30 p.m.

  • Further discussion of sklearn for data-driven modeling
  • Use and implementation of custom estimators for regression

(All times ET)
