Supervised ranking in the weka environment
- a Department of Applied Mathematics and Computer Science, Ghent University, Krijgslaan 281 S9, B-9000 Gent, Belgium
- b Department of Applied Mathematics, Biometrics and Process Control, Ghent University, Coupure links 653, B-9000 Gent, Belgium
Research highlights
- The supervised ranking problem has its own – very important – distinct characteristics. In this article, we focus on differences with regard to the traditional classification problem.
- We have given four propositions that all provide solutions to the supervised ranking problem (for more information, we refer to [14, 15, 16]).
- A software implementation of proposed solutions can be used inside the WEKA environment.
- We have shown how to get started using this software, and we have given snapshots of WEKA outputs that show that indeed the implementation fits into the framework.
Abstract
Software for solving the supervised ranking problem is presented. Four variants of the Ordinal Stochastic Dominance Learner (OSDL) are given, together with the space and time complexity of their implementations. It is shown that the described software, which includes two further algorithms for supervised ranking, fits seamlessly into the weka environment.
1. The supervised ranking problem
1.1. Problem description
The supervised ranking problem, which is sometimes called the supervised ordered sorting problem, undeniably bears some resemblance to the traditional supervised classification problem, but has its own – very important – distinct characteristics. Given a collection of learning examples or instances, the goal is also, just as in traditional supervised classification, to attach a label to (unseen and/or unlabelled) instances in a way that extends the learning examples in the ‘most reasonable way’.
In traditional classification algorithms, instances are mostly described by feature vectors or attribute vectors belonging to a data space, i.e. an object x from the object space Ω is described by an attribute vector. The measurement scale of these features is quite arbitrary, and may range from nominal to numerical. The first important difference is that in the ranking problem, instances are described by true criteria [7], i.e. each ‘feature’ is measured on (at least) an ordinal scale, and has an associated meaning (semantics) of preference. This is important as it now makes sense to compare two objects a and b from the object space Ω on the basis of a single criterion ci. One says that object a is strictly preferred to object b w.r.t. ci if ci(a) > ci(b). A second difference concerns the measurement scale of the labels: in traditional classification problems, the set of labels constitutes a nominal scale, while in the ranking problem, the set of labels is a finite chain ℓ1 < ℓ2 < ⋯ < ℓk, again with associated preference semantics. It thus also makes sense to compare objects on the basis of their labels.
These two differences w.r.t. the traditional classification problem give rise, in a very natural way, to a basic monotonicity constraint. Indeed, it would be very unnatural that an object a that, on each criterion, scores at least as well as an object b, would get a global evaluation (label) that is lower than that of b. Mathematically speaking this means the following: the product ordering turns the data space into a partially ordered set, while the label set is a chain, and we are looking for an order-preserving mapping λ from the former to the latter. More explicitly, it must hold for all elements x and y of the data space that x ≤ y implies λ(x) ≤ λ(y) (1). Sometimes however, a classification algorithm does not return a single label, but a probability mass function (PMF) over the set of labels. For the ranking problem, the set of all probability mass functions over the label set has to be equipped with a partial order supporting the idea that one PMF assigns more weight to the higher labels than another PMF. The weak (first order) stochastic dominance relation ⪯ is a suitable ordering [9]. If fX and fY are two PMFs, with associated cumulative probability distribution functions (CDFs) FX and FY, then fX ⪯ fY ⟺ FY(ℓ) ≤ FX(ℓ) for all labels ℓ (2), and we say that fX is (weakly) dominated by fY. The order ⪯ naturally carries over from the set of PMFs to the set of CDFs, as there is a bijection between the two sets. This concept of stochastic dominance is of primordial importance in the framework described in the next section.
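To make the dominance relation concrete, a small Python sketch (illustrative only; the software described later is written in Java) checks whether one PMF is weakly dominated by another by comparing their CDFs pointwise:

```python
def dominates_weakly(f_x, f_y):
    """True if PMF f_x is weakly dominated by PMF f_y, i.e. the CDF of
    f_x lies pointwise above the CDF of f_y (f_y favours higher labels)."""
    cdf = lambda f: [sum(f[:i + 1]) for i in range(len(f))]
    return all(a >= b - 1e-12 for a, b in zip(cdf(f_x), cdf(f_y)))

f_x = [0.5, 0.3, 0.2]  # more mass on the lower labels
f_y = [0.1, 0.3, 0.6]  # more mass on the higher labels
print(dominates_weakly(f_x, f_y))  # True
print(dominates_weakly(f_y, f_x))  # False
```

Note that the relation is only a partial order: two PMFs whose CDFs cross are incomparable.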
In a stochastic context, the supervised ranking problem can be stated as follows: find a learning algorithm, i.e. a mapping that associates with each element x of the data space a PMF fx (with CDF Fx), satisfying the stochastic monotonicity constraint: x ≤ y implies Fy(ℓ) ≤ Fx(ℓ) for all labels ℓ (3). Of course, we must be able to select a single label from a PMF computed by the algorithm in a way that does not break the required monotonicity. In [10], it is shown that the Bayesian decision (extracting the label with maximal probability) might break the monotonicity, while taking the mean (and rounding) is valid on numerical scales [12]. Taking the “median” is an operation that is valid on ordinal scales, while preserving the monotonicity property [10]. In general, one has a set of medians for a PMF, and choosing a label consistently (e.g. left end point, right end point or midpoint) from this set, which is actually an interval, preserves the monotonicity property. In the rest of this article, we will use the midpoint (or the biggest of the two midpoints) of the set of medians to select labels from PMFs.
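The median-based selection can be sketched as follows (a Python illustration under our reading of the text: the set of medians of a PMF over a chain is an interval of labels, and its midpoint is picked, taking the bigger one when there are two):

```python
def median_set(f):
    """Indices of all medians of a PMF f over an ordered label set:
    label i is a median if P(X <= i) >= 1/2 and P(X >= i) >= 1/2."""
    n = len(f)
    cdf = [sum(f[:i + 1]) for i in range(n)]
    return [i for i in range(n)
            if cdf[i] >= 0.5 - 1e-12
            and 1 - (cdf[i - 1] if i > 0 else 0.0) >= 0.5 - 1e-12]

def median_label(f):
    """Midpoint of the set of medians (an interval of labels);
    with an even number of medians, the bigger midpoint is taken."""
    m = median_set(f)
    return m[len(m) // 2]

print(median_set([0.2, 0.5, 0.3]))    # [1]
print(median_set([0.5, 0.0, 0.5]))    # [0, 1, 2]
print(median_label([0.5, 0.0, 0.5]))  # 1
```

The second example shows why the medians form an interval: with the mass split evenly between the extreme labels, every label in between is a median as well.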
At this point, we would also like to stress that the collection of learning examples is a multiset, meaning that in general a learning example can occur more than once. In the setting of traditional supervised classification algorithms, doubt may occur, i.e. there may be examples which have an identical feature vector, but with a different label assigned to them. In the context of supervised ranking algorithms a second kind of inconsistency, related to the monotonicity constraint, may be present: there may be feature vectors x and y with x ≤ y for which the assigned labels satisfy d(x) > d(y). One says that there is reversed preference between x and y. The algorithms presented here can handle data containing both doubt and reversed preferences, which is a prerequisite for supervised ranking algorithms to be useful in a real-life setting, as most real-life data sets are pervaded with noise, resulting in doubt and reversed preferences.
The outline of this paper is as follows: in Section 2, we briefly recall the theory developed in [10] which serves as the basis for instance-based supervised ranking algorithms. We concentrate on those parts of the theory that have immediate practical consequences and state four propositions each of which gives rise to a different OSDL variant. Section 3 gives pseudo code of the implemented algorithms and we state the time and space complexity of building the classifier and of classifying an arbitrary element. Three of these algorithms have one degree of freedom and we describe how one can ‘tune’ this parameter. The following section then gives some examples of how to use the software from within the weka environment [4]. We have opted for weka as it is an increasingly popular collection of machine learning algorithms for data mining tasks, issued as open source software under the GNU General Public License. It contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well suited for developing new machine learning schemes.
2. Methods for solving the supervised ranking problem
In [10], a probabilistic framework for dealing with the supervised ranking problem is developed. This general framework begins with describing two particular CDFs, the so-called minimal and maximal extension Fmin and Fmax, that obey the stochastic monotonicity constraint (3); the framework then gives the necessary and sufficient conditions that any ‘interpolation scheme’ between Fmin and Fmax has to satisfy in order to give rise to a supervised ranking algorithm satisfying the stochastic monotonicity constraint (3). Here, we give the definition of the minimal and maximal extensions Fmin and Fmax, and then state some propositions, which are in fact corollaries of the main theorem of [10], without proof.
Let D be a collection of learning examples, and let a(D) be the image of D under the mapping a which associates each object with its feature vector. For each x ∈ a(D), let F̂x denote a CDF that is estimated from the collection of learning examples. For the main theorem in [10], the way in which this estimation is done is immaterial, but for the propositions, we will assume that F̂x is in fact the maximum likelihood estimation: F̂x(ℓ) = |{z ∈ D : a(z) = x and d(z) ≤ ℓ}| / |{z ∈ D : a(z) = x}| (4). The set {(x, F̂x) : x ∈ a(D)} is called the stochastic training data set. It is natural to say that the stochastic training data set is monotone – compare with the monotonicity constraint (3) – if for all elements x and y of a(D) it holds that x ≤ y implies F̂y(ℓ) ≤ F̂x(ℓ) for all labels ℓ. Given a stochastic training data set, we define the minimal and maximal extension as F_min^x(ℓ) = min{F̂y(ℓ) : y ∈ a(D), y ≤ x} (5) and F_max^x(ℓ) = max{F̂y(ℓ) : y ∈ a(D), y ≥ x} (6). When no y ∈ a(D) satisfies y ≤ x, the PMF fmin is defined as that PMF that assigns probability 1 to the smallest label; this is the PMF that is dominated by all other PMFs over the label set. Analogously, when no y ∈ a(D) satisfies y ≥ x, the PMF fmax assigns probability 1 to the largest label; fmax is then the PMF dominating all other PMFs.
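The minimal and maximal extensions (5) and (6) – pointwise minima and maxima of estimated CDFs over comparable training vectors, with the degenerate CDFs as fallbacks – can be sketched in Python as follows; the dictionary-based representation is our own simplification:

```python
def leq(x, y):
    """Product (componentwise) order on feature vectors."""
    return all(a <= b for a, b in zip(x, y))

def extensions(train, x, k):
    """CDFs (length k) of the minimal and maximal extension at x,
    given train = {feature vector: estimated CDF}."""
    f_min = [1.0] * k                 # CDF of all mass on the lowest label
    f_max = [0.0] * (k - 1) + [1.0]   # CDF of all mass on the highest label
    for y, F in train.items():
        if leq(y, x):                 # y <= x constrains x from below
            f_min = [min(a, b) for a, b in zip(f_min, F)]
        if leq(x, y):                 # y >= x constrains x from above
            f_max = [max(a, b) for a, b in zip(f_max, F)]
    return f_min, f_max

train = {(1, 1): [1.0, 1.0, 1.0],   # label 0 with certainty
         (3, 3): [0.0, 0.0, 1.0]}   # label 2 with certainty
print(extensions(train, (2, 2), 3))  # ([1.0, 1.0, 1.0], [0.0, 0.0, 1.0])
```

For the element (2, 2) lying strictly between the two training vectors, the minimal extension predicts the lowest label and the maximal extension the highest, leaving room for interpolation.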
One can easily prove some important properties of these minimal and maximal extensions, the first one being that for each element x of the data space, F_min^x is indeed a CDF over the label set, and for all x and y one has that x ≤ y implies F_min^y(ℓ) ≤ F_min^x(ℓ) for all ℓ (and likewise for Fmax). A second important property is that when the stochastic training data set is monotone, any monotone extension of the stochastic training data set from a(D) to the whole data space is, for each element x, sandwiched between the minimal and the maximal extension w.r.t. the stochastic dominance relation. (Note that this also explains the names minimal and maximal extension.) Thirdly, when the stochastic training data set is not monotone, and when x and y, with x ≤ y, do not satisfy the stochastic monotonicity constraint, i.e. (x, y) is a so-called couple of reversed preference, then for each z between x and y there exists at least one label ℓ such that F_max^z(ℓ) > F_min^z(ℓ). We often refer to the situation where F_max^x(ℓ) ≤ F_min^x(ℓ) for all labels ℓ as the monotone situation, while we term the opposite case the reversed preference situation.
From the first property given in the previous paragraph, it is clear that both the minimal and maximal extension yield a solution to the supervised ranking problem. Next, we will give some ‘interpolation’ schemes for which one can show that they all are valid solutions to the supervised ranking problem. We start with a very simple interpolation scheme, which is especially useful when the stochastic training data set is monotone.
Proposition 2.1 (OSDL). Let D be a collection of learning examples with associated stochastic training data set. For an arbitrary but fixed real number S from the interval [0, 1], define the mapping x ↦ F_S^x as follows: F_S^x(ℓ) = S F_min^x(ℓ) + (1 − S) F_max^x(ℓ) (7). For all elements x of the data space, F_S^x is an element of the set of CDFs over the label set. Moreover, the partial mapping x ↦ F_S^x is order preserving.
The only degree of freedom left in this proposition is the choice of the interpolation parameter S. We call the algorithm resulting from this proposition the Ordinal Stochastic Dominance Learner, or OSDL for short.
Let x denote an element of the data space for which one would like to make a prediction. When F_min^x(ℓ) is very small (≈0), this means that the minimal extension predicts a label strictly greater than ℓ for x. On the other hand, when F_max^x(ℓ) approaches 1, the maximal extension predicts a label at most ℓ for x. When both situations occur simultaneously, we are likely to end up in the reversed preference situation. When this is the case, the idea is to weigh Fmin and Fmax by the number of learning examples that support their respective cases. We define the following mappings: Nmin(x, ℓ) = |{z ∈ D : a(z) ≤ x and d(z) > ℓ}|, counting the number of instances that indicate that x should receive a label strictly greater than ℓ, and Nmax(x, ℓ) = |{z ∈ D : a(z) ≥ x and d(z) ≤ ℓ}|, which counts the number of instances indicating that x should receive a label at most ℓ. We have the following proposition.
Proposition 2.2 (B-OSDL). Proposition 2.1 is still valid if one replaces F_S^x by: F^x(ℓ) = S F_min^x(ℓ) + (1 − S) F_max^x(ℓ) in the monotone situation, and F^x(ℓ) = [Nmin(x, ℓ) F_min^x(ℓ) + Nmax(x, ℓ) F_max^x(ℓ)] / [Nmin(x, ℓ) + Nmax(x, ℓ)] in the reversed preference situation (8). In (8), we distinguish between the monotone situation, for which the interpolation parameter S is used, and the reversed preference situation, for which the counting argument is triggered. The algorithm based on this second proposition is called Balanced OSDL, or B-OSDL for short. Note that this algorithm reduces to OSDL when the stochastic training data set is monotone, since in this case the reversed preference situation never occurs.
It is possible to devise a counting argument for the monotone situation as well. We define the weight of Fmin in the monotone situation as in (9), while the weight for Fmin in the reversed preference situation is defined as in the reversed preference situation of (8).
We now have the following proposition, with a resulting algorithm called B2-OSDL, or doubly balanced OSDL.
Proposition 2.3 (B2-OSDL). Proposition 2.1 is still valid if one replaces F_S^x by the mapping (10) in which the monotone and reversed preference weights are defined in (9) and (8), respectively.
Finally, let us mention that sometimes the monotonicity constraints (1) and (3) are relaxed somewhat, in the sense that they are only required for elements x and y that are separated by a learning example. Stated otherwise, one only requires quasi-monotonicity w.r.t. the set of learning examples [11,13]. If one does not want to make a distinction between the monotone and the reversed preference situation, the weights for Fmin and Fmax should only depend on x (and not on ℓ). A simple way of achieving this is by counting the number of learning examples that are smaller or bigger than x: we define nmin(x) as the number of learning examples smaller than or equal to x, and nmax(x) as the number of learning examples bigger than or equal to x. However, to avoid division by zero, we add, if not already present, the smallest element of the data space, labelled with the smallest label, and the largest element, labelled with the largest label, to the collection of learning examples. These elements are very natural boundary conditions when trying to predict or reconstruct an order-preserving mapping from the data space to the label set (and this is exactly what the supervised ranking problem is). One now has the following proposition.
Proposition 2.4. Let D be a collection of learning examples to which, if not already present, the boundary elements just described are added, and consider the associated stochastic training data set. The mapping x ↦ F^x is defined as follows: F^x(ℓ) = [nmin(x) F_min^x(ℓ) + nmax(x) F_max^x(ℓ)] / [nmin(x) + nmax(x)] (11), where nmin(x) and nmax(x) count the learning examples smaller, respectively bigger, than x. For all elements x of the data space, F^x belongs to the set of CDFs over the label set. Moreover, the partial mapping x ↦ F^x is quasi-monotone w.r.t. the set of learning examples.
3. Implementation of the propositions
3.1. Pseudo code of the two phases
Given a collection of learning examples, the first phase in any classification algorithm involves building the classifier. In the case of the various OSDL variants, this is not too difficult as it simply involves the estimation of the CDFs F̂x. As the propositions assume the maximum likelihood estimation, this is what we will use in the implementation as well. The pseudo code of this first phase is given as the method buildClassifier in the top part of Algorithm 1. The fact that this first phase is quite simple explains why we have previously termed algorithms resulting from the aforementioned framework instance-based.
In this code, a Map represents an implementation of an associative array as a hash table. An associative array maps or binds keys to values. In this case, the keys are elements of the data space, while the values are e.g. frequency counts of the labels, or CDFs over the label set. We assume that the implementation of the map is such that retrieval and insertion of key/value pairs both have amortised constant time complexity (ignoring the time to compute the hash code).
In practice, we must attach a hash code to each element of the data space. An arbitrary element x consists of a vector of criterion values, i.e. x = (c1(x), …, cn(x)), where each ci(x) is measured on an ordinal scale Xi. Let si be the unique increasing mapping that identifies Xi with the first |Xi| natural numbers. For simplicity, we denote si(ci(x)) as xi. We now interpret x as a number represented in a mixed radix system [8, Section 4.1]: h(x) = x1 + x2|X1| + x3|X1||X2| + ⋯ + xn|X1|⋯|Xn−1|. This yields an increasing bijection between the data space and the set of natural numbers smaller than |X1|⋯|Xn|, and for each element of the data space the corresponding hash code is calculated in O(n) time, where n denotes the number of criteria.
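The mixed-radix encoding can be illustrated as follows (a Python sketch; the function name is ours):

```python
def mixed_radix_code(x, radices):
    """Map a vector x, with x[i] in {0, ..., radices[i]-1}, to a unique
    natural number below the product of the radices (mixed radix system)."""
    code, weight = 0, 1
    for xi, r in zip(x, radices):
        assert 0 <= xi < r
        code += xi * weight
        weight *= r   # the next digit is worth the product of the earlier radices
    return code

# a data space with |X1| = 3, |X2| = 2, |X3| = 4 has exactly 24 distinct codes
radices = [3, 2, 4]
codes = {mixed_radix_code((a, b, c), radices)
         for a in range(3) for b in range(2) for c in range(4)}
print(len(codes))  # 24 -- the encoding is a bijection
```

Since all weights are positive, the encoding is also increasing w.r.t. the product order, as claimed in the text.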
We can now state the complexity of buildClassifier easily as O(|D|n). Indeed, the loop starting on the second line traverses the |D| learning examples, the get and replace methods each take O(n) time (for the computation of the hash code), while the update method is either constant (when freqCount already exists) or of order O(n) (otherwise). The loop starting on line 7 considers the |a(D)| distinct feature vectors, and the conversion of a frequency count to a CDF takes O(k) time, with k the number of labels, while putting the CDF in the map takes again O(n) time. As |a(D)| ≤ |D|, the total time complexity of buildClassifier is O(|D|(n + k)). Any classifier which makes use of all learning examples presented has at least this time complexity, as one has to consider each learning example at least once. So in this sense, the time to build the classifier is optimal. As for the space complexity, this is easily seen to be of order O(|a(D)|(n + k)) (ignoring the space required for storing the initial learning examples).
Once buildClassifier has been executed, a user generally wants to compute the PMF associated with some element x; this is the second phase of the algorithm. For the OSDL variant described in Proposition 2.1, the pseudo code for computing this PMF is given in the second part of Algorithm 1. One simply iterates over the map containing the estimated CDFs; if the element considered (i.e. the key of the map) is incomparable with x, one simply ignores it, otherwise one updates the CDFs Fmin and/or Fmax depending on whether the element considered is smaller and/or bigger than x. An upper bound for the time complexity of classifying an instance is easily seen to be O(|a(D)|(n + k)). (Here, we assume that comparing two elements takes O(n) time, and that the methods takeMin and takeMax each take O(k) time.)
We do not give explicit pseudo code for the other, slightly more involved, versions of the OSDL algorithm, as this code would be equally straightforward as the one just given. All that is needed are some extra counting arguments in the appropriate places, e.g. to calculate Nmin and Nmax; this is easily achieved without increasing the order of growth of the running time complexity.

Algorithm 1. The Ordinal Stochastic Dominance Learner

buildClassifier(D)
Require: a collection of learning examples D
Ensure: a Map containing pairs (x, F̂x) for the elements x of a(D)
1: map ← ∅; map_cdf ← ∅
2: for all z ∈ D do
3:   freqCount ← map.get(a(z))
4:   freqCount ← freqCount.update(d(z))
5:   map.replace(a(z), freqCount)
6: end for
7: for all x in the keys of map do
8:   cdf ← map.get(x).toCDF()
9:   map_cdf.put(x, cdf)
10: end for
11: return map_cdf

distributionForInstance(x)
Require: an element x of the data space and S ∈ [0, 1]
Ensure: a PMF over the label set for x
1: Fmin ← (1, 1, …, 1) // CDF of the minimal extension
2: Fmax ← (0, 0, …, 1) // CDF of the maximal extension
3: for all (y, F̂y) in map_cdf do
4:   if y ≤ x then
5:     Fmin ← takeMin(Fmin, F̂y)
6:   end if
7:   if x ≤ y then
8:     Fmax ← takeMax(Fmax, F̂y)
9:   end if
10: end for
11: F ← S·Fmin + (1 − S)·Fmax
12: return F.toProbabilityMassFunction()
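For illustration, the following is a compact Python transcription of Algorithm 1 (our own sketch, not the weka Java implementation; a plain dictionary stands in for the hash map, and the label-selection step is omitted):

```python
def build_classifier(examples, k):
    """examples: list of (feature_vector, label) pairs, labels in 0..k-1.
    Returns {feature_vector: maximum-likelihood CDF over the k labels}."""
    counts = {}
    for x, label in examples:
        freq = counts.setdefault(tuple(x), [0] * k)
        freq[label] += 1
    cdfs = {}
    for x, freq in counts.items():
        total, running, cdf = sum(freq), 0, []
        for c in freq:
            running += c
            cdf.append(running / total)
        cdfs[x] = cdf
    return cdfs

def distribution_for_instance(cdfs, x, k, s=0.5):
    """OSDL: interpolate between the minimal and maximal extension at x."""
    f_min = [1.0] * k                 # CDF of the minimal extension
    f_max = [0.0] * (k - 1) + [1.0]   # CDF of the maximal extension
    for y, F in cdfs.items():
        if all(a <= b for a, b in zip(y, x)):   # y <= x in the product order
            f_min = [min(a, b) for a, b in zip(f_min, F)]
        if all(a <= b for a, b in zip(x, y)):   # x <= y in the product order
            f_max = [max(a, b) for a, b in zip(f_max, F)]
    F = [s * a + (1 - s) * b for a, b in zip(f_min, f_max)]
    # convert the interpolated CDF back to a PMF
    return [F[0]] + [F[i] - F[i - 1] for i in range(1, k)]

examples = [((1, 1), 0), ((2, 2), 1), ((3, 3), 2)]
cdfs = build_classifier(examples, 3)
print(distribution_for_instance(cdfs, (2, 2), 3))  # [0.0, 1.0, 0.0]
```

On this tiny monotone training set, the element (2, 2) is itself a training vector, so minimal and maximal extension coincide there and the PMF is degenerate.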
3.2. Tuning the interpolation parameter S
The attentive reader will have noticed that nowhere in the preceding text have we given a hint as to how to choose the interpolation parameter S that turns up in the various algorithms. In order to lift the burden of making a (more or less arbitrary) choice for the parameter S from the shoulders of the user of the program, we have made an attempt to let the program devise a reasonable value for S. Of course, we did not want this feature to increase the running time of the program dramatically. To this end, we use a simple scheme that resembles the way in which weka determines the number of neighbours to use in its implementation of the nearest neighbour algorithm, the IBk class. This scheme is based on leave-one-out cross validation [17, Section 5.4]. That is, for each possible value of S under consideration, we determine the performance of the algorithm for a specified loss function, typically the 0–1 loss function, using leave-one-out cross validation.
This is not as bad as it sounds at first sight, since it can be done rather efficiently, as is shown in Algorithm 2. Basically, one considers the learning examples one by one and classifies the learning example under consideration on the basis of the remaining learning examples. One does this for N + 1 values of S that are equally spaced in the interval [Sl, Su]. The time complexity of this procedure is determined as follows: building the classifier takes O(|D|n) time. The operation that has to be done for each element takes O(|a(D)|(n + k) + Nk) time: basically, this involves two calls to distributionForInstance and a loop calculating the loss between d(x) and the median prediction, where computing the median takes O(k) time. So, if we ignore the time to build the classifier (this has to be done anyhow), the time complexity of tuning the interpolation parameter is O(|D|(|a(D)|(n + k) + Nk)).

Algorithm 2. Tuning the interpolation parameter S

Require: (1) a collection of learning examples D; (2) a lower bound Sl and an upper bound Su for S; (3) a natural number N determining the subdivision of [Sl, Su]; (4) a loss function L
Ensure: the tuned value for S
1: map ← buildClassifier(D)
2: step ← (Su − Sl)/N
3: loss ← (0, 0, …, 0) // length N + 1
4: for all x ∈ D do
5:   temporarily remove x from map
6:   fmin ← PMF of the minimal extension of a(x)
7:   fmax ← PMF of the maximal extension of a(x)
8:   S ← Sl
9:   for i = 0 to N do
10:    loss[i] ← loss[i] + L(d(x), median(S·fmin + (1 − S)·fmax))
11:    S ← S + step
12:  end for
13:  put x back into map
14: end for
15: i ← findIndexOfMin(loss)
16: return Sl + i × step
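The leave-one-out grid search of Algorithm 2 can be sketched as follows (a Python illustration; the helper functions passed in, as well as the toy example at the bottom, are our own constructions):

```python
def tune_s(examples, fmin_fmax, median, loss, s_low=0.0, s_up=1.0, n=10):
    """Leave-one-out grid search over n + 1 equally spaced values of S.
    fmin_fmax(rest, x) returns the minimal- and maximal-extension PMFs
    of x estimated from the examples in `rest`."""
    step = (s_up - s_low) / n
    losses = [0.0] * (n + 1)
    for i, (x, label) in enumerate(examples):
        rest = examples[:i] + examples[i + 1:]     # leave one example out
        f_min, f_max = fmin_fmax(rest, x)          # computed once per example
        for j in range(n + 1):
            s = s_low + j * step
            f = [s * a + (1 - s) * b for a, b in zip(f_min, f_max)]
            losses[j] += loss(label, median(f))
    best = min(range(n + 1), key=lambda j: losses[j])
    return s_low + best * step

# toy ingredients: two labels, 0-1 loss, leftmost median
zero_one = lambda a, b: 0 if a == b else 1
def median(f):
    run = 0.0
    for i, p in enumerate(f):
        run += p
        if run >= 0.5:
            return i

# f_min always votes for label 0, f_max for label 1; all examples carry
# label 1, so the tuned S should put all weight on f_max, i.e. S = 0
examples = [((0,), 1), ((1,), 1)]
print(tune_s(examples, lambda rest, x: ([1.0, 0.0], [0.0, 1.0]),
             median, zero_one))  # 0.0
```

Only the inner loop over the grid of S values is repeated per example; the expensive extension computation happens once per left-out example, which is what makes the scheme affordable.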
4. Use of the software
The described algorithms are available as a Java implementation that is distributed under the GNU General Public License [1]. To facilitate the usage and dissemination of the algorithms, we have chosen to develop an implementation that fits seamlessly into the weka environment. The weka workbench [17], also distributed under the terms of the GNU GPL, is a very comprehensive collection of machine learning algorithms and data preprocessing tools, written in the Java programming language. A major advantage of using Java is that it is platform independent (one only needs a JVM for the target platform), and the fact that weka is open source allows one to learn how actual classifiers are implemented (and the provided sources may serve as a basis for implementing your own classifiers).
For a user only interested in employing machine learning algorithms, weka offers various graphical user interfaces, such as the “Explorer” and the “Knowledge Flow” interface, both designed for working conveniently with filters and classifiers, and the “Experimenter” interface, designed for conducting experiments regarding the performance of different classifiers on a range of data sets in a quite automated manner. Since our implementation of the OSDL variants adheres to all the conventions imposed on weka classifiers, it is very easy to make them available as choices in the various menus of these interfaces, as is shown in Fig. 1. One can also use the various algorithms (including the OSDL algorithm) from the command line.
Fig. 1: Snapshot of weka showing the available algorithms for supervised ranking.
As can be seen in Fig. 1, two other classes also reside in the weka.classifiers.monotone package, namely MinMaxExtension and OLM. These classes implement other solutions to the supervised ranking problem. As the name indicates, MinMaxExtension implements the minimal and maximal extension, two very simple solutions to the supervised ranking problem, which are in fact non-stochastic analogues of the extensions Fmin and Fmax introduced earlier. The class OLM is an implementation of (some variants of) the Ordinal Learning Method [6,5], the first algorithm devised especially for supervised ranking.
After obtaining and installing weka and the provided implementation of the OSDL algorithm [2], a typical first usage using the command line interface would be as follows:
java weka.classifiers.monotone.OSDL
This will display an error message because no training file was given, but, more interestingly, it will also show the various options available when using the OSDL class. We ignore the options common to all weka-classifiers, but we do give an overview of the options specific to the OSDL class together with their meaning. These can be found in Table 1.
Table 1: Options specific for the OSDL class.
| Option | Meaning |
| -C 〈classification type〉 | Sets the classification type to be used (default: MED). Valid classification types are: REG, WSUM, MAX, MED, RMED |
| -B | Use the balanced version of the Ordinal Stochastic Dominance Learner |
| -W | Use the weighted version of the Ordinal Stochastic Dominance Learner |
| -S 〈value〉 | Sets the value of the interpolation parameter (default: 0.5) |
| -L 〈lower bound〉 | Lower bound for the interpolation parameter (default: 0) |
| -U 〈upper bound〉 | Upper bound for the interpolation parameter (default: 1) |
| -P 〈number of parts〉 | Determines the step size for tuning the interpolation parameter, namely (U − L)/P |
| -D | Outputs debugging info (default: false) |
The -C option sets the way in which a single label is deduced from the PMF generated by the algorithm. The value MED is the most important one, as this uses the median, which is valid on ordinal scales and on top of that ensures monotonicity. In particular, MED chooses the midpoint (or the biggest of the two midpoints) of the set of medians. The value RMED does the same, but does not “round” the result to the closest label, so “halfway” labels are possible, i.e. when the set of medians consists of an even number of labels, an intermediate label is produced. When one chooses MAX, the Bayesian decision is made, which is valid on ordinal scales, but which may break the monotonicity property. Hence, choosing MAX does not guarantee a valid solution to the supervised ranking problem! Finally, REG and WSUM both use the mean to deduce a label from the PMF; the first choice does no rounding, while the second choice does. The scores used for the calculation of the mean are equidistant, as they are the internal weka values for the labels. Although ensuring monotonicity, these last two choices are not semantically valid on intrinsically ordinal scales.
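To fix intuitions, a Python sketch of the deduction rules behind the -C option (our own illustration; label “scores” are simply the indices 0, …, k − 1, mirroring weka's equidistant internal values):

```python
def deduce_label(f, ctype="MED"):
    """Deduce a single label (index) from a PMF f, per the -C option."""
    k = len(f)
    cdf = [sum(f[:i + 1]) for i in range(k)]
    if ctype == "MAX":               # Bayesian decision: may break monotonicity
        return max(range(k), key=lambda i: f[i])
    if ctype in ("REG", "WSUM"):     # mean of equidistant scores
        mean = sum(i * p for i, p in enumerate(f))
        return mean if ctype == "REG" else round(mean)
    # MED / RMED: the set of medians is an interval of labels
    medians = [i for i in range(k)
               if cdf[i] >= 0.5 - 1e-12
               and 1 - (cdf[i - 1] if i else 0.0) >= 0.5 - 1e-12]
    if ctype == "MED":               # midpoint, biggest of two if even
        return medians[len(medians) // 2]
    return (medians[0] + medians[-1]) / 2   # RMED: halfway labels allowed

f = [0.1, 0.4, 0.5]
print(deduce_label(f, "MED"))   # 2
print(deduce_label(f, "RMED"))  # 1.5
print(deduce_label(f, "MAX"))   # 2
print(deduce_label(f, "WSUM"))  # 1
```

The example PMF has two medians (labels 1 and 2), so MED takes the bigger midpoint while RMED returns the halfway label 1.5.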
When the switch -B is set, one uses B-OSDL; when the switch -W is set, the quasi-monotone version of OSDL, described in Proposition 2.4 is triggered. When both are set simultaneously, the doubly balanced OSDL algorithm, B2-OSDL, is used.
The meaning of the -S option should be clear; it simply sets the interpolation parameter S that is used in the OSDL variants (if the quasi-monotone variant is used, this option is simply ignored). The three options -L, -U and -P set the values of Sl, Su and N in Algorithm 2, respectively. Finally, the switch -D does as its description says; when it is set, some extra debugging information is shown on the console.
A basic use is thus the following:
java weka.classifiers.monotone.OSDL -t car.arff -C MED -S 0.5
This applies the OSDL algorithm to the data set car.arff, which is the “Car Evaluation Database” from the UCI Machine Learning Repository [3] put into arff format, with interpolation parameter 1/2 and with the median as selection criterion. In Fig. 2, the output of this command is shown. There, one sees that the resubstitution error on the original data is 1.2153%. This means that this data set is quite monotone, but that there are indeed couples of reversed preference present in the data. (By construction, the data set does not contain doubt, and a property of OSDL is that on a data set containing neither doubt nor reversed preference, the resubstitution error is zero.) An estimate for the error on an independent test set was made using 10-fold stratified cross validation; in this case it is 3.8194%. Also, some other statistics are shown, together with two confusion matrices.
Fig. 2: Output from the command java weka.classifiers.monotone.OSDL -t car.arff -C MED -S 0.5.
As a last example, Fig. 3 shows a snapshot of the “Analyse” tab of the Experimenter interface; it is in fact the result of running the experiment of Fig. 1. It shows the average percentage of correctly classified instances over 10 runs of 10-fold cross validation. We used six instantiations of the algorithms given by the four propositions, and for the first two propositions, we also used the option that allows tuning of the interpolation parameter. One sees that the six instantiations give slightly different results, some of them being statistically different at a significance level of 5%.
Fig. 3: Screen shot showing the results of running six variants of the OSDL algorithm on the dataset car.arff.
5. Conclusion
In this article, we have stated the supervised ranking problem, focusing on the differences w.r.t. the traditional classification problem. We have given, without proof, four propositions that all provide solutions to the supervised ranking problem. Three of the propositions distinguish between the monotone situation and the reversed preference situation. An alternative approach is to relabel the data set so that it becomes monotone; in that case only the basic Ordinal Stochastic Dominance Learner is of use. For more information, we refer to [14,15,16].
Our main objective, however, was to introduce a software implementation of these solutions that can be used inside the weka environment. We have shown how to get started using this software, and we have given snapshots of weka outputs that show that indeed the implementation fits into the framework. The software was a key to a successful application in group decision making [18,19,20].
In the near future, we plan to undertake an experimental study that compares the newly developed algorithms for supervised ranking with existing ones and with traditional classification algorithms that do not guarantee the monotonicity of their solutions. The weka environment will be used to carry out this experimental study.
References
- [1] The GNU General Public License. <http://www.gnu.org>.
- [2] Software for supervised ranking. <http://www.kermit.ugent.be>.
- [3] UCI repository of machine learning databases. <http://www.ics.uci.edu/mlearn/MLRepository.html>.
- [4] Weka 3: Data Mining Software in Java. <http://www.cs.waikato.ac.nz/ml/weka>.
- [5] Automatic generation of symbolic multiattribute ordinal knowledge-based DSSs: methodology and applications, Decision Sciences 23 (1992) 1357–1372.
- [6] Learning and classification of monotonic ordinal concepts, Computational Intelligence 5 (1989) 45–49.
- [7] Building criteria: a prerequisite for MCDA, in: Readings in Multiple Criteria Decision Aid, 1990, pp. 58–80.
- [8] Seminumerical Algorithms, The Art of Computer Programming, vol. 2, 1969.
- [9] Stochastic Dominance, Kluwer Academic Publishers, MA, 1998.
- [10] A probabilistic framework for the design of instance-based supervised ranking algorithms in an ordinal setting, Annals of Operations Research 80 (2008) 105–136.
- [11] Data analysis by positive decision trees, IEICE Transactions on Information and Systems E82-D (1999) 76–88.
- [12] Dual stochastic dominance and related mean-risk models, SIAM Journal on Optimization 13 (2002) 60–78.
- [13] R. Potharst, J.C. Bioch, Quasi-monotone decision trees for ordinal classification, in: F. Verdenius, W. van den Broek (Eds.), Proceedings of the Eighth Belgian-Dutch Conference on Machine Learning, 1998, pp. 122–131.
- [14] M. Rademaker, B. De Baets, Optimal restoration of stochastic monotonicity w.r.t. cumulative label frequency loss functions, Information Sciences, submitted for publication.
- [15] Loss optimal monotone relabelling of noisy multi-criteria data sets, Information Sciences 179 (2009) 4089–4096.
- [16] M. Rademaker, B. De Baets, H. De Meyer, Optimal monotone relabelling of partially non-monotone ordinal data, Optimization Methods and Software, submitted for publication, doi:10.1080/10556788.2010.507272.
- [17] Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, Morgan Kaufmann, 2005.
- [18] Qualitative valuation of environmental criteria through a group consensus based on stochastic dominance, Ecological Economics 67 (2008) 253–264.
- [19] Improving tractability of group decision making on environmental problems through the use of social intensities of preferences, Environmental Modelling and Software 24 (2009) 1457–1466.
- [20] Environmental decision making with conflicting social groups: a case study of the Lar rangeland in Iran, Journal of Arid Environments 74 (2010) 394–402.