By J. Ross Quinlan (Auth.)

Classifier systems play a major role in machine learning and knowledge-based systems, and Ross Quinlan's work on ID3 and C4.5 is widely acknowledged to have made some of the most significant contributions to their development. This book is a complete guide to the C4.5 system as implemented in C for the UNIX environment. It contains a comprehensive guide to the system's use, the source code (about 8,800 lines), and implementation notes. The source code and sample datasets are also available for download (see below).

C4.5 starts with large sets of cases belonging to known classes. The cases, described by any mixture of nominal and numeric properties, are scrutinized for patterns that allow the classes to be reliably discriminated. These patterns are then expressed as models, in the form of decision trees or sets of if-then rules, that can be used to classify new cases, with emphasis on making the models understandable as well as accurate. The system has been applied successfully to tasks involving tens of thousands of cases described by hundreds of properties. The book starts from simple core learning methods and shows how they can be elaborated and extended to deal with typical problems such as missing data and overfitting. Advantages and disadvantages of the C4.5 approach are discussed and illustrated with several case studies.

This book and software should be of interest to developers of classification-based intelligent systems and to students in machine learning and expert systems courses.

Extra info for C4.5. Programs for Machine Learning

Sample text

This process leads to a production rule classifier that is usually about as accurate as a pruned tree, but more easily understood by people.

CHAPTER 6 Windowing

When I started work on ID3 in the late 1970s, computers with virtual memory were uncommon and programs were usually subject to size restrictions. The training sets of those early experiments were quite large (one had 30,000 cases described by 24 attributes) and exceeded my memory allowance on the machine I was using. Consequently, there was a need to explore indirect methods of growing trees from large datasets.

[...] The upper and lower limits are symmetrical, so that the probability that the real [...] is CF/2. [...] Since the existing subtree has a higher number of predicted errors, it is pruned to a leaf. [...] The predicted error rate for the leaf again is lower than that for the subtree, so this subtree is also pruned to a leaf.

4 Estimating error rates for trees

The numbers (N/E) at the leaves of the pruned tree in Figure 4-1 can now be explained. As before, N is the number of training cases covered by the leaf.

When N training cases are covered by a leaf, E of them incorrectly, the resubstitution error rate for this leaf is E/N. [...] this somewhat naively as observing E "events" in N trials. If this set of N training cases could be regarded as a sample (which, of course, it is not), we could ask what this result tells us about the probability of an event (error) over the entire population of cases covered by this leaf. The probability of error cannot be determined exactly, but has itself a (posterior) probability distribution that is usually summarized by a pair of confidence limits.
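The idea of treating E errors in N cases as binomial "events" and taking an upper confidence limit can be sketched in Python as follows. This is a minimal illustration, not the book's C code: the function names (`binomial_cdf`, `upper_limit`, `predicted_errors`, `should_prune`), the default confidence level of 0.25, the exact-binomial bisection, and the tie-breaking rule in `should_prune` are all assumptions made here for illustration. The actual C4.5 source may compute the limit differently, for example with an approximation.

```python
from math import comb


def binomial_cdf(errors: int, n: int, p: float) -> float:
    """Probability of observing at most `errors` events in `n` trials."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(errors + 1))


def upper_limit(errors: int, n: int, cf: float = 0.25, tol: float = 1e-10) -> float:
    """Upper confidence limit on the error probability (exact binomial, by bisection).

    Finds p such that P(at most `errors` events in `n` trials) equals `cf`.
    """
    if errors >= n:
        return 1.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        # The CDF decreases as p grows, so a CDF above cf means p is still too small.
        if binomial_cdf(errors, n, mid) > cf:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def predicted_errors(errors: int, n: int, cf: float = 0.25) -> float:
    """Pessimistic error count for a leaf covering n cases, `errors` of them wrongly."""
    return n * upper_limit(errors, n, cf)


def should_prune(subtree_leaves, leaf_n, leaf_errors, cf=0.25):
    """True if one replacement leaf predicts no more errors than the subtree's leaves.

    `subtree_leaves` is a list of (N, E) pairs. Pruning on a tie is an assumption.
    """
    subtree_total = sum(predicted_errors(e, n, cf) for n, e in subtree_leaves)
    return predicted_errors(leaf_errors, leaf_n, cf) <= subtree_total


if __name__ == "__main__":
    # Illustrative counts only: three error-free leaves versus one leaf
    # covering the same 16 cases with a single error.
    leaves = [(6, 0), (9, 0), (1, 0)]
    print(sum(predicted_errors(e, n) for n, e in leaves))
    print(predicted_errors(1, 16))
    print(should_prune(leaves, 16, 1))
```

Because the limit is pessimistic, a leaf with no training errors still receives a nonzero predicted error rate, and small leaves are penalized more heavily than large ones. That is what allows a single leaf with one training error to compare favorably against a subtree whose leaves are all error-free but small.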
