MIN(DISTINCT X) – Laurent Schneider

What is doing the MIN(DISTINCT X) call? Basically, every distinct value of X is passed to the MIN function. Well, it is probably of very little interest as the MIN function is very fast and processing less rows than MIN(X) should not boost the performance because of the overhead of sorting distinct values.

However, if you write your own aggregate, distinct may be interesting!

create type myudag_type as object ( myudag INTEGER, static function ODCIAggregateInitialize( sctx IN OUT myudag_type) return number, member function ODCIAggregateIterate( self IN OUT myudag_type, value IN INTEGER) return number, member function ODCIAggregateTerminate( self IN myudag_type, returnValue OUT INTEGER, flags IN number) return number, member function ODCIAggregateMerge( self IN OUT myudag_type, ctx2 IN myudag_type) return number ); /


create or replace type body myudag_type is

static function ODCIAggregateInitialize(

  sctx IN OUT myudag_type)

  return number is

begin

  sctx := myudag_type(0);

  return ODCIConst.Success;

end;

member function ODCIAggregateIterate(

  self IN OUT myudag_type,

  value IN INTEGER)

  return number is

begin

  -- doing nothing will cost you a lot !!!

  for i in 1..1000000 loop null; end loop;

  return ODCIConst.Success;

end;

member function ODCIAggregateTerminate(

  self IN myudag_type,

  returnValue OUT INTEGER,

  flags IN number) return number is

begin

  returnValue := self.myudag;

  return ODCIConst.Success;

end;

member function ODCIAggregateMerge(

  self IN OUT myudag_type,

  ctx2 IN myudag_type)

  return number is

begin

  return ODCIConst.Success;

end;

end;

/
CREATE FUNCTION myudag (

  input INTEGER)

  RETURN INTEGER

  AGGREGATE USING myudag_type;

/
SQL> select myudag(deptno) from emp;
MYUDAG(DEPTNO)

--------------

             0
Elapsed: 00:00:00.57

SQL> select myudag(distinct deptno) from emp;
MYUDAG(DISTINCTDEPTNO)

----------------------

                     0
Elapsed: 00:00:00.13

2 thoughts on “MIN(DISTINCT X)”

Gary October 30, 2007 at 00:04

Whether it is appropriate to your aggregation function is a different matter.
MIN is unaffected, but AVG(DISTINCT col) and AVG(col) would return different results.
Laurent Schneider Post authorOctober 30, 2007 at 09:24

of course Gary, thanks for making this clear, my point was using DISTINCT to achieve a better performance when ALL and DISTINCT are equivalent, as for min, MIN(ALL X) is always the same as MIN(DISTINCT X)

You may then wonder why MIN and MAX accept DISTINCT but MEDIAN does not…

Comments are closed.