RapidMiner Studio documentation for version 10.2
Smote Upsampling (Operator Toolbox)
Synopsis
This operator implements the Synthetic Minority Over-sampling Technique (SMOTE) as proposed by Chawla et al., Journal of Artificial Intelligence Research 16 (2002), 321-357.
Description
In the first step, the ExampleSet is filtered so that only examples of the minority class are considered. Next, the k nearest neighbours of each of these examples are determined. The algorithm then selects a random example and a random one of its nearest neighbours, and creates a new example that lies on the line between the two.
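The steps described above can be sketched in a few lines of Python. This is only an illustration of the general SMOTE idea under stated assumptions, not the operator's actual implementation: the function names (k_nearest, smote_sketch), the Euclidean distance, the uniformly random position on the line, and the toy data are all assumptions made for this example. It handles numerical attributes only.

```python
import math
import random


def k_nearest(points, index, k):
    """Return the indices of the k points closest (Euclidean) to points[index]."""
    others = [i for i in range(len(points)) if i != index]
    others.sort(key=lambda i: math.dist(points[index], points[i]))
    return others[:k]


def smote_sketch(examples, labels, minority_label, n_new, k=5, seed=0):
    """Create n_new synthetic minority examples by interpolating between neighbours."""
    rng = random.Random(seed)
    # Step 1: keep only the examples of the minority class.
    minority = [x for x, y in zip(examples, labels) if y == minority_label]
    if len(minority) < 2:
        raise ValueError("need at least two minority examples to interpolate")
    k = min(k, len(minority) - 1)
    # Step 2: find the k nearest neighbours of every minority example.
    neighbours = [k_nearest(minority, i, k) for i in range(len(minority))]
    synthetic = []
    for _ in range(n_new):
        # Step 3: pick a random example and a random one of its neighbours.
        i = rng.randrange(len(minority))
        j = rng.choice(neighbours[i])
        # Step 4: place the new example on the line between the two
        # (assumption: the position on the line is drawn uniformly).
        gap = rng.random()
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(minority[i], minority[j]))
        )
    return synthetic


# Hypothetical toy data: three "rock" examples (minority), four "mine" examples.
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
     (5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0)]
y = ["rock", "rock", "rock", "mine", "mine", "mine", "mine"]
new_points = smote_sketch(X, y, minority_label="rock", n_new=4, k=2)
```

Because every synthetic example is interpolated between two existing minority examples, all new points stay inside the region already covered by the minority class.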
Input
 exa (Data Table) - ExampleSet you want to upsample.
Output
 ups (Data Table) - The original ExampleSet with the synthetic examples appended.
 ori (Data Table) - The original ExampleSet.
Parameters
- number_of_neighbours The number of nearest neighbours (k) that SMOTE considers for each example.
- normalize If checked, a range transformation to [0,1] is applied before distances are calculated, so that attributes on larger scales do not dominate the distance.
- equalize_classes If activated, as many new examples as are needed to balance the classes are created.
- upsampling_size The number of synthetic examples you want to create.
- auto_detect_minority_class If activated, the class with the fewest occurrences is upsampled.
- minority_class The class you want to upsample.
- round_integers Rounds the values of integer attributes to the next integer.
- nominal_change_rate The probability of changing a nominal value to the nominal value of its nearest neighbour.
- use_local_random_seed Indicates whether a local random seed should be used.
- local_random_seed If use_local_random_seed is checked, this parameter determines the local random seed.
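The effect of several of these parameters can be illustrated with a short Python sketch. The helper names below are hypothetical, and how the operator computes these values internally is not stated in this documentation. In particular, the sketch assumes that balancing means bringing the minority class up to the size of the largest class, and that the nominal change is a single random draw per value.

```python
import random
from collections import Counter


def detect_minority_class(labels):
    """auto_detect_minority_class: the class with the fewest occurrences."""
    counts = Counter(labels)
    return min(counts, key=counts.get)


def examples_needed_to_balance(labels, minority_label):
    """equalize_classes (assumption): fill the gap to the largest class."""
    counts = Counter(labels)
    return max(counts.values()) - counts[minority_label]


def range_transform(column):
    """normalize: rescale one numerical attribute to the interval [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant attribute carries no distance information.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def pick_nominal(own_value, neighbour_value, change_rate, rng):
    """nominal_change_rate: take the neighbour's value with this probability."""
    return neighbour_value if rng.random() < change_rate else own_value


# Hypothetical label column: 8 examples of one class, 3 of the other.
labels = ["Mine"] * 8 + ["Rock"] * 3
minority = detect_minority_class(labels)
needed = examples_needed_to_balance(labels, minority)
scaled = range_transform([2, 4, 6])
rng = random.Random(1992)  # plays the role of local_random_seed
kept = pick_nominal("a", "b", 0.0, rng)
changed = pick_nominal("a", "b", 1.0, rng)
```

With a nominal_change_rate of 0 the synthetic example always keeps the value of the selected example, and with a rate of 1 it always takes the neighbour's value.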
Tutorial Processes
Use SMOTE on imbalanced Sonar
In this tutorial we unbalance the Sonar data set with a Sample operator and then create synthetic examples to restore the class balance.
