Astronomy with scikit-learn
Release Scipy2012
Jacob VanderPlas
http://astroML.github.com/sklearn_tutorial/
November 16, 2012

Contents
1 Tutorial Setup and Installation
1.1 Python Prerequisites
1.2 Tutorial Files
1.3 Download the datasets
2 Machine Learning 101: General Concepts
2.1 Features and feature extraction
2.2 Supervised Learning, Unsupervised Learning, and scikit-learn syntax
2.3 Supervised Learning: model.fit(X, y)
2.4 Unsupervised Learning: model.fit(X)
2.5 Linearly separable data
2.6 Hyperparameters, training set, test set and overfitting
2.7 Key takeaway points
3 Machine Learning 102: Practical Advice
3.1 Bias, Variance, Over-fitting, and Under-fitting
3.2 Cross-Validation and Testing
3.3 Learning Curves
3.4 Summary
4 Classification: Learning Labels of Astronomical Sources
4.1 Motivation: Why is this Important?
4.2 Star-Quasar Classification: Naive Bayes
5 Regression: Photometric Redshifts of Galaxies
5.1 Motivation: Dark Energy, Dark Matter, and the Fate of the Universe
5.2 A Simple Method: Decision Tree Regression
6 Dimensionality Reduction of Astronomical Spectra
6.1 SDSS Spectral Data
6.2 Principal Component Analysis
6.3 References
7 Exercises: Taking it a step further
7.1 Exercise 1: Photometric Classification with GMM
7.2 Exercise 2: Photometric redshifts with Decision Trees
7.3 Exercise 3: Dimensionality Reduction of Spectra
8 Code examples
8.1 Tutorial Diagrams
8.2 Bias and Variance
8.3 Linear Model Example
8.4 Iris Projections
8.5 Basic numerics and plotting with Python
8.6 SDSS Filters
8.7 SDSS Images
8.8 SDSS Photometric Redshifts
8.9 SDSS Spectra Plots
8.10 SGD: Maximum margin separating hyperplane
8.11 Libsvm GUI
9 People
10 Citing the scikit-learn

For more information on machine learning for Astronomy, see the astroML code and examples
(http://astroML.github.com).
Machine Learning for Astronomy with scikit-learn
This tutorial offers a brief introduction to the fields of machine learning and statistical data analysis, and their
application to several problems in the field of astronomy. These learning tasks are enabled by the tools available
in the open-source package scikit-learn [a].
scikit-learn [b] is a Python module integrating classic machine learning algorithms in the tightly-knit world of
scientific Python packages (numpy [c], scipy [d], matplotlib [e]). It aims to provide simple and efficient solutions to
learning problems that are accessible to everybody and reusable in various contexts: machine-learning as a
versatile tool for science and engineering.
Many of the examples and exercises in this tutorial require the IPython notebook [f], a tool which provides an
intuitive web-based interactive environment for scientific Python. Some of the material in the notebooks is
duplicated in the following pages, but the IPython notebook is required for some parts. For information on how to
download the associated notebooks, see the Tutorial Setup and Installation chapter.
[a] http://www.scikit-learn.org
[b] http://www.scikit-learn.org
[c] http://numpy.scipy.org
[d] http://www.scipy.org
[e] http://matplotlib.sourceforge.net
[f] http://ipython.org/ipython-doc/stable/interactive/htmlnotebook.html
Note: This document is meant to be used with scikit-learn version 0.11+. Find the latest version here.
CHAPTER 1
Tutorial Setup and Installation
Objectives
At the end of this section, you will:
1. Have scikit-learn and all the prerequisites and dependencies for this tutorial installed on your machine.
2. Have downloaded the source files and data required for this tutorial.
1.1 Python Prerequisites
This tutorial is based on scikit-learn, which has the following dependencies:
• numpy [1]: this is a Python module which has powerful tools for the creation and manipulation of arrays. It is the
foundation of most scientific computing packages in Python.
• scipy [2]: this is a Python module which builds on numpy and provides fast implementations of many basic
scientific algorithms.
• matplotlib [3]: this is a powerful package for generating plots, figures, and diagrams. Our main form of visual
interaction with data and results depends on matplotlib.
We will also make extensive use of IPython [4], an interactive Python interpreter. In particular, much of the interactive
material requires IPython notebook [5] functionality, which was introduced in IPython version 0.12.
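One way to confirm that the packages listed above are present is to try importing each one and report its version. The following is a minimal sketch rather than part of the original tutorial: the helper names `version_tuple` and `check_prerequisites` are hypothetical, and the only minimum versions it checks are the two stated in this document (scikit-learn 0.11 and IPython 0.12).

```python
import importlib

# Minimum versions named in this tutorial; packages mapped to None have
# no minimum stated here. (This table is an illustrative assumption.)
REQUIRED = {
    "numpy": None,
    "scipy": None,
    "matplotlib": None,
    "sklearn": "0.11",   # scikit-learn is imported under the name `sklearn`
    "IPython": "0.12",   # release that introduced the notebook
}


def version_tuple(version):
    """Turn '0.12.1' into (0, 12, 1), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_prerequisites(required=REQUIRED):
    """Return {module name: status string} for each required package."""
    report = {}
    for name, minimum in required.items():
        try:
            module = importlib.import_module(name)
        except Exception:
            # Any failure at import time means the package is not usable.
            report[name] = "missing"
            continue
        found = getattr(module, "__version__", "unknown")
        too_old = (
            minimum is not None
            and found != "unknown"
            and version_tuple(found) < version_tuple(minimum)
        )
        if too_old:
            report[name] = "too old (%s < %s)" % (found, minimum)
        else:
            report[name] = "ok (%s)" % found
    return report


if __name__ == "__main__":
    for name, status in sorted(check_prerequisites().items()):
        print("%-12s %s" % (name, status))
```

Running the script prints one line per package; any entry marked "missing" or "too old" should be installed or upgraded before continuing.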
1.1.1 Installing scikit-learn and Dependencies
Please refer to the install page [6] for per-system instructions on installing scikit-learn. In addition to numpy, scipy,
and scikit-learn, this tutorial will assume that you have matplotlib and ipython installed as well.
• Under Debian or Ubuntu Linux you should use:
[1] http://numpy.scipy.org
[2] http://www.scipy.org
[3] http://matplotlib.sourceforge.net/
[4] http://ipython.org
[5] http://ipython.org/ipython-doc/stable/interactive/htmlnotebook.html
[6] http://www.scikit-learn.org/stable/install.html#installing-an-official-release