MIN(DISTINCT X)

What is doing the MIN(DISTINCT X) call? Basically, every distinct value of X is passed to the MIN function. Well, it is probably of very little interest as the MIN function is very fast and processing less rows than MIN(X) should not boost the performance because of the overhead of sorting distinct values.

However, if you write your own aggregate, distinct may be interesting!

create type myudag_type as object
(
  myudag INTEGER,
  static function ODCIAggregateInitialize(
    sctx IN OUT myudag_type)
    return number,
  member function ODCIAggregateIterate(
    self IN OUT myudag_type,
    value IN INTEGER)
    return number,
  member function ODCIAggregateTerminate(
    self IN myudag_type,
    returnValue OUT INTEGER,
    flags IN number)
    return number,
  member function ODCIAggregateMerge(
    self IN OUT myudag_type,
    ctx2 IN myudag_type)
    return number
);
/

create or replace type body myudag_type is
static function ODCIAggregateInitialize(
  sctx IN OUT myudag_type)
  return number is
begin
  sctx := myudag_type(0);
  return ODCIConst.Success;
end;
member function ODCIAggregateIterate(
  self IN OUT myudag_type,
  value IN INTEGER)
  return number is
begin
  -- doing nothing will cost you a lot !!!
  for i in 1..1000000 loop null; end loop; 
  return ODCIConst.Success;
end;
member function ODCIAggregateTerminate(
  self IN myudag_type,
  returnValue OUT INTEGER,
  flags IN number) return number is
begin
  returnValue := self.myudag;
  return ODCIConst.Success;
end;
member function ODCIAggregateMerge(
  self IN OUT myudag_type,
  ctx2 IN myudag_type)
  return number is
begin
  return ODCIConst.Success;
end;
end;
/

CREATE FUNCTION myudag (
  input INTEGER)
  RETURN INTEGER
  AGGREGATE USING myudag_type;
/

SQL> select myudag(deptno) from emp;

MYUDAG(DEPTNO)
--------------
             0

Elapsed: 00:00:00.57
SQL> select myudag(distinct deptno) from emp;

MYUDAG(DISTINCTDEPTNO)
----------------------
                     0

Elapsed: 00:00:00.13

2 thoughts on “MIN(DISTINCT X)”

  1. Whether it is appropriate to your aggregation function is a different matter.
    MIN is unaffected, but AVG(DISTINCT col) and AVG(col) would return different results.

  2. of course Gary, thanks for making this clear, my point was using DISTINCT to achieve a better performance when ALL and DISTINCT are equivalent, as for min, MIN(ALL X) is always the same as MIN(DISTINCT X)

    You may then wonder why MIN and MAX accept DISTINCT but MEDIAN does not…

Leave a Reply

Your email address will not be published.


*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>