MIN(DISTINCT X)

What is doing the MIN(DISTINCT X) call? Basically, every distinct value of X is passed to the MIN function. Well, it is probably of very little interest as the MIN function is very fast and processing less rows than MIN(X) should not boost the performance because of the overhead of sorting distinct values.

However, if you write your own aggregate, distinct may be interesting!

create type myudag_type as object
(
myudag INTEGER,
static function ODCIAggregateInitialize(
sctx IN OUT myudag_type)
return number,
member function ODCIAggregateIterate(
self IN OUT myudag_type,
value IN INTEGER)
return number,
member function ODCIAggregateTerminate(
self IN myudag_type,
returnValue OUT INTEGER,
flags IN number)
return number,
member function ODCIAggregateMerge(
self IN OUT myudag_type,
ctx2 IN myudag_type)
return number
);
/

create or replace type body myudag_type is
static function ODCIAggregateInitialize(
sctx IN OUT myudag_type)
return number is
begin
sctx := myudag_type(0);
return ODCIConst.Success;
end;
member function ODCIAggregateIterate(
self IN OUT myudag_type,
value IN INTEGER)
return number is
begin
-- doing nothing will cost you a lot !!!
for i in 1..1000000 loop null; end loop;
return ODCIConst.Success;
end;
member function ODCIAggregateTerminate(
self IN myudag_type,
returnValue OUT INTEGER,
flags IN number) return number is
begin
returnValue := self.myudag;
return ODCIConst.Success;
end;
member function ODCIAggregateMerge(
self IN OUT myudag_type,
ctx2 IN myudag_type)
return number is
begin
return ODCIConst.Success;
end;
end;
/

CREATE FUNCTION myudag (
input INTEGER)
RETURN INTEGER
AGGREGATE USING myudag_type;
/

SQL> select myudag(deptno) from emp;

MYUDAG(DEPTNO)
--------------
0

Elapsed: 00:00:00.57
SQL> select myudag(distinct deptno) from emp;

MYUDAG(DISTINCTDEPTNO)
----------------------
0

Elapsed: 00:00:00.13

2 thoughts on “MIN(DISTINCT X)

  1. Gary

    Whether it is appropriate to your aggregation function is a different matter.
    MIN is unaffected, but AVG(DISTINCT col) and AVG(col) would return different results.

  2. Laurent Schneider Post author

    of course Gary, thanks for making this clear, my point was using DISTINCT to achieve a better performance when ALL and DISTINCT are equivalent, as for min, MIN(ALL X) is always the same as MIN(DISTINCT X)

    You may then wonder why MIN and MAX accept DISTINCT but MEDIAN does not…

Comments are closed.