Friday, March 23, 2012
MCDBA
Can anyone recommend a good book/set of books for the core exams...?
I am quite competent with designing and administering databases from
Enterprise Manager, but need to fill in all of the T-SQL-sized gaps... And
can't afford to do a tutor-led course...
Thanks
Nick
did you hear "NH" <NH@.discussions.microsoft.com> say in
news:B9657604-9F1D-4F51-806D-2641A6FA0D61@.microsoft.com:
> can recommend
> a good book/set of books
you mean other than Books On Line? ;)
I liked the Sybex books, YMMV
Neil MacMurchy
http://spaces.msn.com/members/neilmacmurchy
http://spaces.msn.com/members/mctblogs
Wednesday, March 21, 2012
May I have my attributes discretized based on my own expression?
Hi, all here.
I have a question about the discretization of continuous attribute values. The discretization methods currently available in the SQL Server 2005 data mining engine are these three:
· automatic
· equal areas
· clusters
How does each of these three methods work? For example, how does the clusters method discretize continuous values?
More importantly, can we discretize based on our own expression? For example, if I have a column with values ranging from 1 to 10, can I discretize it into ranges such as 1-3, 4-6, 7-10?
Thanks a lot for any guidance.
|||User-defined ranges are not supported.
Here are descriptions of the supported discretization methods:
· Clusters: This finds buckets by performing single-dimensional clustering on the input values using the K-Means algorithm. It uses Gaussian distributions.
· EqualAreas: This examines the distribution of values across the population and creates bucket ranges such that the total population is distributed equally across the buckets. In other words, if the distribution of continuous values were plotted as a curve, the areas under the curve covered by each bucket range would be equal. This is useful when there are a large number of duplicate values.
· Automatic: If this is selected, we try obtaining the requested number of buckets by applying the above discretization methods in the following order: Clusters, EqualAreas. We use the first method that gets closest to the number of requested buckets.
The Clusters method uses random sampling (with a sample size of 1000), so EqualAreas may be used in situations where sampling is not desirable.
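As a rough illustration of how the first two methods differ, here is a small Python sketch. Python is used only because it is easy to run; this is not how the SQL Server engine is implemented, and it does not model the Gaussian distributions mentioned above. `equal_area_edges` cuts the sorted values at equal row counts, and `cluster_edges` runs a plain one-dimensional K-Means on a random sample and places boundaries halfway between neighbouring centers. The function names, the starting centers, the iteration count and the midpoint rule are all assumptions made for the sketch; only the sample size of 1000 comes from the description above.

```python
import random
from bisect import bisect_right


def equal_area_edges(values, buckets):
    """Boundaries that put roughly the same number of rows in each bucket."""
    ordered = sorted(values)
    n = len(ordered)
    # Cut the sorted values at every (n / buckets)-th position.
    return [ordered[(i * n) // buckets] for i in range(1, buckets)]


def cluster_edges(values, buckets, sample_size=1000, iterations=20, seed=0):
    """Plain 1-D K-Means on a random sample (illustrative assumptions only)."""
    rng = random.Random(seed)
    values = list(values)
    if len(values) > sample_size:
        values = rng.sample(values, sample_size)
    sample = sorted(values)
    # Assumed starting point: centers spread evenly through the sorted sample.
    centers = [sample[(2 * i + 1) * len(sample) // (2 * buckets)]
               for i in range(buckets)]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in sample:
            nearest = min(range(len(centers)), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    centers.sort()
    # Assumed boundary rule: midpoint between neighbouring centers.
    return [(a + b) / 2 for a, b in zip(centers, centers[1:])]


def bucket_of(value, edges):
    """Index of the bucket a value falls into, given sorted boundaries."""
    return bisect_right(edges, value)


data = [1, 2, 3, 10, 11, 12, 20, 21, 22]
print(equal_area_edges(data, 3))  # [10, 20]
print(cluster_edges(data, 3))     # [6.5, 16.0]
```

On this small data set both approaches separate the same three groups, but the boundaries differ: the equal-count version cuts at actual data values, while the clustering version cuts in the gaps between groups. With heavily duplicated values, the equal-count cut points can repeat, so a real implementation would need to merge such buckets.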
|||Hi, Thanks a lot.
|||However, you can always add a calculated column to do your own discretization. For example you can add a column "AgeDisc" with the expression
CASE WHEN [Age] < 20 THEN 'Under 20'
     WHEN [Age] <= 30 THEN 'Between 20 and 30'
     ELSE 'Over 30'
END
Of course, you will have to map any input data to these values for predictions.
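To make that mapping step concrete, here is a minimal Python sketch of the idea: one function holds the cut points and labels, and the same function is applied to training rows and to prediction input so the two never drift apart. The names `age_bucket` and `prepare_row` and the dictionary-shaped rows are hypothetical; in practice the mapping would live wherever your input data is prepared.

```python
def age_bucket(age):
    """Same cut points and labels as the AgeDisc CASE expression.

    Missing ages (None) are not handled in this sketch.
    """
    if age < 20:
        return "Under 20"
    if age <= 30:
        return "Between 20 and 30"
    return "Over 30"


def prepare_row(row):
    """Return a copy of the row with the derived AgeDisc value added."""
    prepared = dict(row)
    prepared["AgeDisc"] = age_bucket(row["Age"])
    return prepared


# The same preparation is used for training data and for prediction input.
training_rows = [prepare_row(r) for r in ({"Age": 18}, {"Age": 25}, {"Age": 47})]
prediction_input = prepare_row({"Age": 30})
print(prediction_input)  # {'Age': 30, 'AgeDisc': 'Between 20 and 30'}
```

The same pattern would cover the ranges from the original question (1-3, 4-6, 7-10) by changing the cut points and labels in one place.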
|||Jamie, thanks a lot. Very helpful.