Data mining : practical machine learning tools and techniques with Java implementations / Ian H. Witten, Eibe Frank.

Published
  • San Francisco, Calif. : Morgan Kaufmann 2000
Physical description
xxv, 371 pages ; 24 cm
ISBN
  • 1558605525
  • 9781558605527
Notes
  • Both authors are from Waikato University.
  • Includes bibliographical references (pages 339-349) and index.
Contents
  • 1 What's it all about? 1 -- 1.1 Data mining and machine learning 2 -- Describing structural patterns 4 -- Machine learning 5 -- Data mining 7 -- 1.2 Simple examples: The weather problem and others 8 -- Weather problem 8 -- Contact lenses: An idealized problem 11 -- Irises: A classic numeric dataset 13 -- CPU performance: Introducing numeric prediction 15 -- Labor negotiations: A more realistic example 16 -- Soybean classification: A classic machine learning success 17 -- 1.3 Fielded applications 20 -- Decisions involving judgment 21 -- Screening images 22 -- Load forecasting 23 -- Diagnosis 24 -- Marketing and sales 25 -- 1.4 Machine learning and statistics 26 -- 1.5 Generalization as search 27 -- Enumerating the concept space 28 -- Bias 29 -- 1.6 Data mining and ethics 32
  • 2 Input: Concepts, instances, attributes 37 -- 2.1 What's a concept? 38 -- 2.2 What's in an example? 41 -- 2.3 What's in an attribute? 45 -- 2.4 Preparing the input 48 -- Gathering the data together 48 -- ARFF format 49 -- Attribute types 51 -- Missing values 52 -- Inaccurate values 53 -- Getting to know your data 54
  • 3 Output: Knowledge representation 57 -- 3.1 Decision tables 58 -- 3.2 Decision trees 58 -- 3.3 Classification rules 59 -- 3.4 Association rules 63 -- 3.5 Rules with exceptions 64 -- 3.6 Rules involving relations 67 -- 3.7 Trees for numeric prediction 70 -- 3.8 Instance-based representation 72 -- 3.9 Clusters 75
  • 4 Algorithms: The basic methods 77 -- 4.1 Inferring rudimentary rules 78 -- Missing values and numeric attributes 80 -- Discussion 81 -- 4.2 Statistical modeling 82 -- Missing values and numeric attributes 85 -- Discussion 88 -- 4.3 Divide and conquer: Constructing decision trees 89 -- Calculating information 93 -- Highly branching attributes 94 -- Discussion 97 -- 4.4 Covering algorithms: Constructing rules 97 -- Rules versus trees 98 -- A simple covering algorithm 98 -- Rules versus decision lists 103 -- 4.5 Mining association rules 104 -- Item sets 105 -- Association rules 105 -- Generating rules efficiently 108 -- Discussion 111 -- 4.6 Linear models 112 -- Numeric prediction 112 -- Classification 113 -- Discussion 113 -- 4.7 Instance-based learning 114 -- Distance function 114 -- Discussion 115
  • 5 Credibility: Evaluating what's been learned 119 -- 5.1 Training and testing 120 -- 5.2 Predicting performance 123 -- 5.3 Cross-validation 125 -- 5.4 Other estimates 127 -- Leave-one-out 127 -- Bootstrap 128 -- 5.5 Comparing data mining schemes 129 -- 5.6 Predicting probabilities 133 -- Quadratic loss function 134 -- Informational loss function 135 -- Discussion 136 -- 5.7 Counting the cost 137 -- Lift charts 139 -- ROC curves 141 -- Cost-sensitive learning 144 -- Discussion 145 -- 5.8 Evaluating numeric prediction 147 -- 5.9 Minimum description length principle 150 -- 5.10 Applying MDL to clustering 154
  • 6 Implementations: Real machine learning schemes 157 -- 6.1 Decision trees 159 -- Numeric attributes 159 -- Missing values 161 -- Pruning 162 -- Estimating error rates 164 -- Complexity of decision tree induction 167 -- From trees to rules 168 -- C4.5: Choices and options 169 -- Discussion 169 -- 6.2 Classification rules 170 -- Criteria for choosing tests 171 -- Missing values, numeric attributes 172 -- Good rules and bad rules 173 -- Generating good rules 174 -- Generating good decision lists 175 -- Probability measure for rule evaluation 177 -- Evaluating rules using a test set 178 -- Obtaining rules from partial decision trees 181 -- Rules with exceptions 184 -- Discussion 187 -- 6.3 Extending linear classification: Support vector machines 188 -- Maximum margin hyperplane 189 -- Nonlinear class boundaries 191 -- Discussion 193 -- 6.4 Instance-based learning 193 -- Reducing the number of exemplars 194 -- Pruning noisy exemplars 194 -- Weighting attributes 195 -- Generalizing exemplars 196 -- Distance functions for generalized exemplars 197 -- Generalized distance functions 199 -- Discussion 200 -- 6.5 Numeric prediction 201 -- Model trees 202 -- Building the tree 202 -- Pruning the tree 203 -- Nominal attributes 204 -- Missing values 204 -- Pseudo-code for model tree induction 205 -- Locally weighted linear regression 208 -- Discussion 209 -- 6.6 Clustering 210 -- Iterative distance-based clustering 211 -- Incremental clustering 212 -- Category utility 217 -- Probability-based clustering 218 -- EM algorithm 221 -- Extending the mixture model 223 -- Bayesian clustering 225 -- Discussion 226
  • 7 Moving on: Engineering the input and output 229 -- 7.1 Attribute selection 232 -- Scheme-independent selection 233 -- Searching the attribute space 235 -- Scheme-specific selection 236 -- 7.2 Discretizing numeric attributes 238 -- Unsupervised discretization 239 -- Entropy-based discretization 240 -- Other discretization methods 243 -- Entropy-based versus error-based discretization 244 -- Converting discrete to numeric attributes 246 -- 7.3 Automatic data cleansing 247 -- Improving decision trees 247 -- Robust regression 248 -- Detecting anomalies 249 -- 7.4 Combining multiple models 250 -- Bagging 251 -- Boosting 254 -- Stacking 258 -- Error-correcting output codes 260
  • 8 Nuts and bolts: Machine learning algorithms in Java 265 -- 8.2 Javadoc and the class library 271 -- Classes, instances, and packages 272 -- weka.core package 272 -- weka.classifiers package 274 -- Other packages 276 -- 8.3 Processing datasets using the machine learning programs 277 -- Using M5' 277 -- Generic options 279 -- Scheme-specific options 282 -- Classifiers 283 -- Meta-learning schemes 286 -- Filters 289 -- Association rules 294 -- Clustering 296 -- 8.4 Embedded machine learning 297 -- A simple message classifier 299 -- 8.5 Writing new learning schemes 306 -- An example classifier 306 -- Conventions for implementing classifiers 314 -- Writing filters 314 -- An example filter 316 -- Conventions for writing filters 317
  • 9 Looking forward 321 -- 9.1 Learning from massive datasets 322 -- 9.2 Visualizing machine learning 325 -- Visualizing the input 325 -- Visualizing the output 327 -- 9.3 Incorporating domain knowledge 329 -- 9.4 Text mining 331 -- Finding key phrases for documents 331 -- Finding information in running text 333 -- Soft parsing 334 -- 9.5 Mining the World Wide Web 335.
Genre
  • text
Language
  • English


Location of copy Shelfmark Online location Holdings Notes
University of Aberdeen Libraries: Sir Duncan Rice Library, Floor 7 006.312 WIT
Aberystwyth University Library: Physical Sciences Library QA76.9.D343.W8
University of Birmingham Libraries: Main Library, Collection QA76.9.D343 W58
University of Bradford: J B Priestley Library - Floor 2 006.3 WIT
University of Bristol Libraries Table of contents Online location
University of Bristol Libraries Publisher description Online location
University of Bristol Libraries Online location Table of contents and supplements
University of Bristol Libraries: Queen's Building Library QA76.9.D343 WIT 7 day loan: vacation loan
British Library: Lending Collection m00/10875
University of Cambridge Libraries: University Library: Order in West Room (Not borrowable) 2000.9.3376
University of Cambridge Libraries: Jesus College: Quincentenary Library AY2 XS Wit
University of East Anglia Library: Main Library: Main shelves QA76.9.D343 WIT
University of Edinburgh Libraries: Main Library (STANDARD LOAN) - 3rd floor QA76.9.D343 Wit.
University of Edinburgh Libraries: Main Library (SHORT LOAN) - 3rd floor QA76.9.D343 Wit.
University of Essex: Request from store -- Store 2 QA 76.9.D343W5
University of Exeter Library: Forum Library 001.535 WIT
Glasgow Caledonian University: Open Shelves - Level 4 006.3 WIT
University of Hull: BJL Teaching Reserve QA 76.9 D343 W8
The Institution of Engineering and Technology: Maxwell Library (ML) 681.3:002 WIT On Shelf
King's College London Library: Maughan Library ; [Science books] QA76.9.D343 WIT One week
University of Liverpool Library: Sydney Jones Library (Store) 518.561.W82 22
University of Manchester Library: Main Library: Blue Area Floor 1 006.312
National Library of Scotland: General Reading Room, Edinburgh (stored offsite) SP3.200.1175
University of Oxford Libraries: St Anne's College Library: Tim Gardam Building 519.8 WIT:Dat
University of Oxford Libraries: Computer Science Library: Books 95H 28 WIT
University of Oxford Libraries: Computer Science Library: Books WIT 1 R/S
University of Oxford Libraries: Computer Science Library: Books WIT 2 R/S
University of Oxford Libraries: Computer Science Library: Books WIT 3 R/S
University of Oxford Libraries: Computer Science Library: Books WIT 4 R/S
University of Oxford Libraries: St Hugh's College Library 006.312 WIT
University of Oxford Libraries: Mansfield College Library: Main Library 7705 WIT
University of Oxford Libraries: Radcliffe Science Library: Vere Harmsworth Library Stack QA 76.9.D343 WIT
University of Oxford Libraries: Radcliffe Science Library QA 76.9.D343 WIT
University of Oxford Libraries: Radcliffe Science Library: Vere Harmsworth Library Open Shelves QA 76.9.D343 WIT
University of Oxford Libraries: Somerville College Library 521.9 WIT 1
University of Oxford Libraries: St John's College Library ENGIN / 60 / JAV / WIT
Royal Holloway, University of London: Royal Holloway library Davison General Three Week 001.535 WIT
Sheffield Hallam University Library: Adsetts Main Collection 006.3 WI (LEVEL 2)
University of Sheffield Library: Western Bank Library Main sequence 006.312 (W)
University of Southampton Library: Hartley Library QA 76.9.D343 WIT
University of Strathclyde Library: Standard Loan MLD D 006.3 WIT
Swansea University Libraries: Bay Library : Main QA76.9.D343 W58 2000
Trinity College Dublin Library: Santry Book Repository PL-354-310
University of Warwick Library: Main Library QA76.9.D343 W58
University of Westminster: Cavendish Library, Books 006.31
University of York Libraries: University Library: Morrell - Ordinary SK 90 WIT
