Showing posts with label based. Show all posts
Showing posts with label based. Show all posts

Wednesday, March 21, 2012

May I have my attributes discretized based on my own expression?

Hi, all here.

I am just having one question about discretization of continous attributes values. Cos the current discretization methods available in SQL Server 2005 data mining engine are these 3 ones:

.......................................................................................

automatic;

equal areas;

clusters.

..........................................................................................

So how these 3 methods work respectively? I mean like clusters method, how dose it discretize the continous values?

More importantly, can we have a discretization based on our own expression? like when i have one column with values ranging from 1 to 10, may we discretize this column based on expression like: 1-3,4-6,7-10?

Thanks a lot for any guidance.

User-defined ranges are not supported.

Here are descriptions of the supported discretization methods:

· Clusters: This finds buckets by performing single-dimensional clustering on the input values using the K-Means algorithm. It uses Gaussian distributions.

· EqualAreas: This examines the distribution of values across the population and creates bucket ranges such that that the total population is distributed equally across the buckets. In other words, if the distribution of continuous values were plotted as a curve, the areas under the curve covered by each bucket range would be equal. This is useful when there are a large number of duplicate values.

· Automatic: If this is selected, we try obtaining the requested number of buckets by applying the above discretization methods in the following order: Clusters, EqualAreas. We use the first method that gets closest to the number of requested buckets.

The Clusters method use random sampling (with a sample size of 1000) so EqualAreas may be used in situations where sampling is not desirable.

|||Hi, Thanks a lot.|||

However, you can always add a calculated column to do your own discretization. For example you can add a column "AgeDisc" with the expression

CASE WHEN [Age]<20 THEN 'Under 20'
WHEN [Age] <= 30 THEN 'Between 20 and 30'
ELSE 'Over 30'
END

Of course, you will have to map any input data to these values for predictions.

|||Jamie, thanks a lot. Very helpful.

Friday, March 9, 2012

Maximum height for table

How do I set the maximum vertical size of a table? I would like to set it based on total vertical size in inches or a maximum number of rows returned.

Thanks for any help.

Brad

Click View -> Properties window. The dimensions for a table/matrix can be changed in this window.

|||

Hi Greg,

Do you know how does this height create any difference, what i mean is usually this doesnt restrict the table from expanding or so.

|||So you're saying that you want to cut off the rest of your data if it goes past the height you specify?

|||

Hi Greg -

I don't think I stated my question clearly. What I'm trying to do is limit the number of rows returned in my table. So for example, if the dataset connected to my table contained 30 rows, is it possible to cap the number that are returned to 20? If not, is it possible to set the "canGrow" property to a maximum height?

Thanks.

Brad

|||

I would do this in your SQL query.

SELECT TOP 20 * FROM .....

|||

Yes, but unfortunately I am in a situation where I cannot control the results of the stored procedure. I am only able to use pre-existing SPs and build the report.

Brad

|||

Please try visibility condition of that section, use =iif(rownumber(nothing)>=20,true,false).

Priyank

|||

That's a good thought but unfortunately does not do exactly what I need. Using my example from earlier, that code returns 20 rows of data, and then 10 blank lines. My table is still expanded to the full 30 rows however.

Any other ideas?

Thanks.

Brad

|||

Are you using it in the visibility of the section and not for the indivisual textbox. What i mean to say is select the complete detail section of the table and then in the visibility condition of the section write this condition, It shold supress the complete sectiona dn shold not give you blank rows.

|||

I had selected the textbox and the full table, but not the detail portion of the table. Smile

Putting the code in the detail portion worked like a charm, thank you!

Saturday, February 25, 2012

max(tag) for pair of matched rows (was "Need help on query")

I've got a table of transactions which are linked up in pairs based on the column 'Ref'. 'F's are the identifiers of each transaction while 'S's points to the 'F' of its matching pair. I need to select all the transactions with the larger 'Tag' for each pair, can someone point me in the right direction? :confused:

Tag Ref Type
-- -- --
1 200 F
1 201 S
2 201 F
2 200 S
3 202 F
3 203 S
4 203 F
4 202 S
5 204 F
5 205 S
6 205 F
6 204 SIs this what you are after?

select t.tag, t.ref, t.type
from t
join (select max(tag)as tag, ref from t
group by ref) b
on t.tag = b.tag
and t.ref = b.ref

giving you the result set

6 205 F
6 204 S
4 203 F
4 202 S
2 201 F
2 200 S|||Yes it is, thanks!