TSQl-Accmulations

#1 07-01-2007, 09:49 AM

Batool

What version are you using?
How big is the table?
What is the indexes defined on the table?

"Batool" wrote in message
news:ewbSFX%23uHHA.4908-at-TK2MSFTNGP03.phx.gbl...
>I need an appropriate select statement for my case, the case is a table
>with two columns (row id ,col1) I want to retrieve a three columns
> RowId, col1 ,accumulation of col1
>
> I wrote this select statement, but the performance is going down is there
> an
> alternative way for writing this SQL statement
>
> SELECT RowID ,Col1, (SELECT SUM(Col1) from tablehh where RowID
> <=aa.RowId)
> AS Col2 FROM tablehh as aa ORDER BY Col1,Col2
> The output should be like this:
>
> RowId Col1 Col2
> 1 10 10
> 2 100 110
> 3 15 125
> 4 20 145]
>
> Col2 is the accumulation of col1
>
> thx;
> batool.
>
>
>

#2 07-01-2007, 10:19 AM

Hi Batool,

The algorithmic complexity of the set-based solution you have is N^2; even
with a nice covering index on RowID, Col1, the run time will be roughly n +
(n + n^2)/2, where n is the time it takes to scan n rows in the index. If
you run a benchmark showing the run time of the query in respect to the
number of rows in the table, you will get a nicely shaped parabola.
I'm afraid that beyond a few dozens of rows in the table, a cursor-based
solution will be faster in this case despite the record-by-record overhead
since the cursor-based solution will do a single pass over the data.
This, I'm afraid, will be true until SQL Server will add a more complete
support for the ANSI OVER clause, and I don't know when/if this will happen.
Per ANSI SQL, you can achieve this with the following set-based solution
(following is not supported in SQL Server):

SELECT RowID ,Col1,
SUM(Col1) OVER(ORDER BY RowId ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT
ROW) AS Col2
FROM tablehh
ORDER BY RowId;

If implemented, the cost of this query would most likely simply be n
provided an index on RowId, Col1.

For details about the cursor solution you can use now and why it's faster in
this case, see sample chapter 03 - Cursors from Inside T-SQL Programming:

http://www.insidetsql.com/sample_chapters.htm

For details about the OVER clause, see:

http://www.insidetsql.com/OVER_Claus...lculations.doc

--
BG, SQL Server MVP
www.SolidQ.com
www.InsideTSQL.com

"Batool" wrote in message
news:ewbSFX%23uHHA.4908-at-TK2MSFTNGP03.phx.gbl...
>I need an appropriate select statement for my case, the case is a table
>with two columns (row id ,col1) I want to retrieve a three columns
> RowId, col1 ,accumulation of col1
>
> I wrote this select statement, but the performance is going down is there
> an
> alternative way for writing this SQL statement
>
> SELECT RowID ,Col1, (SELECT SUM(Col1) from tablehh where RowID
> <=aa.RowId)
> AS Col2 FROM tablehh as aa ORDER BY Col1,Col2
> The output should be like this:
>
> RowId Col1 Col2
> 1 10 10
> 2 100 110
> 3 15 125
> 4 20 145]
>
> Col2 is the accumulation of col1
>
> thx;
> batool.
>
>
>

#3 07-01-2007, 10:45 AM

I need an appropriate select statement for my case, the case is a table with
two columns (row id ,col1) I want to retrieve a three columns
RowId, col1 ,accumulation of col1

I wrote this select statement, but the performance is going down is there an
alternative way for writing this SQL statement

SELECT RowID ,Col1, (SELECT SUM(Col1) from tablehh where RowID <=aa.RowId)
AS Col2 FROM tablehh as aa ORDER BY Col1,Col2
The output should be like this:

RowId Col1 Col2
1 10 10
2 100 110
3 15 125
4 20 145]

Col2 is the accumulation of col1

thx;
batool.

#4 07-01-2007, 02:44 PM

Hi Itzik,

I hope Microsoft folks were monitoring this ng. Your suggestions (and Sujata
Mehta) are based on real necessities and not fancy T-SQL.

One more thing, Am I posting in the Israeli SQL Server ng? :-)

AMB

"Itzik Ben-Gan" wrote:

> Hi Batool,
>
> The algorithmic complexity of the set-based solution you have is N^2; even
> with a nice covering index on RowID, Col1, the run time will be roughly n +
> (n + n^2)/2, where n is the time it takes to scan n rows in the index. If
> you run a benchmark showing the run time of the query in respect to the
> number of rows in the table, you will get a nicely shaped parabola.
> I'm afraid that beyond a few dozens of rows in the table, a cursor-based
> solution will be faster in this case despite the record-by-record overhead
> since the cursor-based solution will do a single pass over the data.
> This, I'm afraid, will be true until SQL Server will add a more complete
> support for the ANSI OVER clause, and I don't know when/if this will happen.
> Per ANSI SQL, you can achieve this with the following set-based solution
> (following is not supported in SQL Server):
>
> SELECT RowID ,Col1,
> SUM(Col1) OVER(ORDER BY RowId ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT
> ROW) AS Col2
> FROM tablehh
> ORDER BY RowId;
>
> If implemented, the cost of this query would most likely simply be n
> provided an index on RowId, Col1.
>
> For details about the cursor solution you can use now and why it's faster in
> this case, see sample chapter 03 - Cursors from Inside T-SQL Programming:
>
> http://www.insidetsql.com/sample_chapters.htm
>
> For details about the OVER clause, see:
>
> http://www.insidetsql.com/OVER_Claus...lculations.doc
>
> --
> BG, SQL Server MVP
> www.SolidQ.com
> www.InsideTSQL.com
>
>
> "Batool" wrote in message
> news:ewbSFX%23uHHA.4908-at-TK2MSFTNGP03.phx.gbl...
> >I need an appropriate select statement for my case, the case is a table
> >with two columns (row id ,col1) I want to retrieve a three columns
> > RowId, col1 ,accumulation of col1
> >
> > I wrote this select statement, but the performance is going down is there
> > an
> > alternative way for writing this SQL statement
> >
> > SELECT RowID ,Col1, (SELECT SUM(Col1) from tablehh where RowID
> > <=aa.RowId)
> > AS Col2 FROM tablehh as aa ORDER BY Col1,Col2
> > The output should be like this:
> >
> > RowId Col1 Col2
> > 1 10 10
> > 2 100 110
> > 3 15 125
> > 4 20 145]
> >
> > Col2 is the accumulation of col1
> >
> > thx;
> > batool.
> >
> >
> >
>

#5 07-01-2007, 05:01 PM

"Itzik Ben-Gan" wrote in message
news:F90CA3E5-68C7-4675-89A7-B9F2DC257982-at-microsoft.com...
> This, I'm afraid, will be true until SQL Server will add a more complete
> support for the ANSI OVER clause, and I don't know when/if this will
> happen.

I'm shocked, shocked that posting to MS connect seems to have
made no impact.
The sad fact is that most MS users have no idea of what your
talking about. This may even be true of the development team.
And/or the company has lost the ability to feel a sense of
shame. What else could explain offering something that is by
definition only one half complete. One has to wonder about
those that hold the company to a high intellectual standard
in the hope that they will rise to the occasion. For those
that yard stick must be a heavy cross to bear.

www.beyondsql.blogspot.com

#6 07-01-2007, 05:22 PM

> I'm shocked, shocked that posting to MS connect seems to have
> made no impact.

How quickly do you think they can change such a feature? This isn't a
one-man shop with no QA we're talking about here...

A

#7 07-01-2007, 05:45 PM

Giving MS the benefit of doubt, they must (or should) have been
aware of the incomplete nature of over() when they were first
working on S2005. If anyone argues that this came as a surprise
when it was posted in Jan then MS needs to raid its competitors
for people who know what's going on their industry. But this is
not a QA issue but probably one of sleep. As in who feel asleep
at the switch -

best,
steve

"Aaron Bertrand [SQL Server MVP]" wrote in message
news:eaTrVWCvHHA.3376-at-TK2MSFTNGP04.phx.gbl...
>> I'm shocked, shocked that posting to MS connect seems to have
>> made no impact.
>
> How quickly do you think they can change such a feature? This isn't a
> one-man shop with no QA we're talking about here...
>
> A
>

#8 07-01-2007, 06:13 PM

> Giving MS the benefit of doubt, they must (or should) have been
> aware of the incomplete nature of over() when they were first
> working on S2005.

So, let me know what you think would be a better alternative:

(a) not have OVER() at all
(b) still be waiting for SQL Server 2005
(c) have SQL Server cost double what it costs now because they had to hire n
extra engineers to make OVER() complete in time

#9 07-01-2007, 07:20 PM

"Aaron Bertrand [SQL Server MVP]" wrote in message
news:ebB4CzCvHHA.3444-at-TK2MSFTNGP04.phx.gbl...
> > Giving MS the benefit of doubt, they must (or should) have been
> > aware of the incomplete nature of over() when they were first
> > working on S2005.
>
> So, let me know what you think would be a better alternative:
>
> (a) not have OVER() at all
> (b) still be waiting for SQL Server 2005
> (c) have SQL Server cost double what it costs now because they had to hire
n
> extra engineers to make OVER() complete in time

Simply to do a better job of it and spare you from making excuses for
them -

#10 07-01-2007, 07:23 PM

>> So, let me know what you think would be a better alternative:
>>
>> (a) not have OVER() at all
>> (b) still be waiting for SQL Server 2005
>> (c) have SQL Server cost double what it costs now because they had to
>> hire
> n
>> extra engineers to make OVER() complete in time
>
> Simply to do a better job of it and spare you from making excuses for
> them -

I'm not making excuses for anyone, Steve. I work in the software industry,
and I realize that every choice has a trade-off. Expecting perfection even
from Microsoft is a little unrealistic and naive. But I've debated with you
about that before and I realize that I should cut my losses because it is
hopeless.

A

Thread Tools
Show Printable Version Email this Page
Display Modes
Linear Mode Switch to Hybrid Mode Switch to Threaded Mode