Friday, 28 March 2008

Using ROWNUM in Oracle

ROWNUM is an Oracle pseudo column which numbers the rows in a result set.
SELECT rownum, table_name
FROM user_tables;
ROWNUM        TABLE_NAME                     
------------- -----------------
            1 EMP
            2 DEPT
            3 BONUS
            4 SALGRADE
            5 DUMMY
5 rows selected
Here is a summary of how ROWNUM can be used.

Limiting Rows

ROWNUM can be used to limit the number of rows returned by a query, in a similar way to LIMIT in PostgreSQL and MySQL, TOP in SQL Server and FETCH FIRST in DB2.
SELECT rownum, table_name
FROM user_tables
WHERE rownum <=3;
ROWNUM        TABLE_NAME                     
------------- -----------------
            1 EMP
            2 DEPT
            3 BONUS
3 rows selected
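
For comparison, this is roughly how the same three-row limit looks in those other databases. A sketch for illustration only; tables_view is a hypothetical stand-in for each system's own catalogue view.
-- PostgreSQL and MySQL
SELECT table_name FROM tables_view LIMIT 3;

-- SQL Server
SELECT TOP 3 table_name FROM tables_view;

-- DB2
SELECT table_name FROM tables_view FETCH FIRST 3 ROWS ONLY;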

ROWNUM with DML

The use of ROWNUM is not restricted to SELECT statements. It can be used with DML statements that update the database too.
CREATE TABLE o AS
SELECT *
FROM all_objects
WHERE rownum <= 1000;
Table created

UPDATE o
SET object_id = rownum,
created = created + INTERVAL '1' MINUTE * rownum
WHERE rownum <= 100;
100 rows updated

DELETE FROM o
WHERE OWNER = 'SYS'
AND rownum = 1;
1 row deleted
ROWNUM is particularly useful, when used in conjunction with the CONNECT BY LEVEL clause, for creating arbitrary rows in the database. See the article on generating rows in Oracle for more details.

Offsetting Rows

Rows can also be skipped at the beginning of a result set using ROWNUM.
SELECT rnum, table_name
FROM
(SELECT rownum rnum, table_name
FROM user_tables)
WHERE rnum > 2;
RNUM     TABLE_NAME                     
-------- ----------------
       3 SALGRADE
       4 DUMMY
       5 DEPT
3 rows selected
You will notice that an inline view has been introduced to transform the ROWNUM pseudo column into a 'real' column before we do the comparison.

It is tempting to write the above SQL as follows.
SELECT table_name
FROM user_tables
WHERE rownum > 2;
TABLE_NAME                     
------------------------------
0 rows selected
However, this query will always return zero rows, regardless of the number of rows in the table.

To explain this behaviour, we need to understand how Oracle processes ROWNUM. When assigning ROWNUM to a row, Oracle starts at 1 and only increments the value when a row is selected; that is, when all conditions in the WHERE clause are met. Since our condition requires that ROWNUM is greater than 2, no rows are selected and ROWNUM is never incremented beyond 1.

The bottom line is that conditions such as the following will work as expected.

.. WHERE rownum = 1;

.. WHERE rownum <= 10;



However, queries with conditions such as these will always return zero rows.

.. WHERE rownum = 2;

.. WHERE rownum > 10;
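
If we genuinely need just the second row, we can reuse the inline view technique from the Offsetting Rows section to turn ROWNUM into a real column first. A minimal sketch:
SELECT table_name
FROM
(SELECT rownum rnum, table_name
FROM user_tables)
WHERE rnum = 2;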

Top-n Query

Typically, a top-n query sorts data into the required sequence and then limits the output to a subset of rows.

For example, suppose we wish to retrieve the top three earners from our employee table.
SELECT ename, sal
FROM (
SELECT ename, sal
FROM emp
ORDER BY sal DESC)
WHERE rownum <=3;
ENAME      SAL                    
---------- ---------
KING            5000
SCOTT           3000
FORD            3000
3 rows selected
The inline view (the inner select) sorts the rows and passes the result up to the outer select. The outer select then limits the output to three rows.

It may seem more natural to use the following SQL.
SELECT ename, sal
FROM emp
WHERE rownum <=3
ORDER BY sal DESC;
ENAME      SAL                    
---------- ----------------------
ALLEN           1600
WARD            1250
SMITH            800
3 rows selected
However, this does not give us the result we want because Oracle assigns the ROWNUM values to the rows before it does the sort.

In this example, Oracle will retrieve three rows from the table, any three rows, and sort only these three rows. We really need Oracle to sort all the rows and then return the first three. The inline view will ensure that this will happen.

Sort Performance

Limiting rows on a sorted result set using ROWNUM can also provide an added performance benefit. Rather than physically sorting all the rows to retrieve just the top few, Oracle maintains an array which contains just the highest or the lowest values (depending on whether we specified ASC or DESC in the ORDER BY clause). The size of the array will be the number of rows we wish to return. As rows are processed, only the highest (or lowest) values are retained in the array. All other rows are discarded.
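
We can see this optimization at work in the execution plan for our top-n query. The following is only a sketch of the plan shape; exact output varies between Oracle versions.
SELECT STATEMENT
  COUNT (STOPKEY)
    VIEW
      SORT (ORDER BY STOPKEY)
        TABLE ACCESS (FULL) OF 'EMP'
The SORT (ORDER BY STOPKEY) operation indicates that Oracle keeps only the top rows while it sorts, and COUNT (STOPKEY) shows that it stops fetching rows as soon as the ROWNUM limit is reached.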

Pagination

Next, we will see how ROWNUM is used to select a range of rows from within a result set. This is useful if we are to provide pagination on a web screen, for example.

Suppose we are paging through the employee table in name order and we wish to display rows six to ten inclusive.
SELECT rnum, ename, job
FROM
(SELECT /*+ FIRST_ROWS(10) */ rownum rnum, ename, job
FROM
(SELECT ename, job
FROM emp
ORDER BY ename)
WHERE rownum <= 10
)
WHERE rnum > 5;
RNUM     ENAME      JOB       
-------- ---------- ---------
       6 JAMES      CLERK
       7 JONES      MANAGER
       8 KING       PRESIDENT
       9 MARTIN     SALESMAN
      10 MILLER     CLERK
5 rows selected
We use nested inline views to retrieve and sort the data and then apply the range check using ROWNUM. We have split the upper and lower bound check, which allows Oracle to use COUNT(STOPKEY) in the execution plan when checking for ROWNUM <= 10. This is a performance optimization which, along with the sorting optimization described earlier, will ensure that our query runs efficiently as the table grows.

The FIRST_ROWS(n) hint also tells Oracle to optimize the query so that the first n rows are returned as quickly as possible.
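
The same pattern can be parameterized with bind variables so that a single statement serves every page. A sketch, assuming :lo and :hi are bound to the exclusive lower and inclusive upper row numbers of the required page:
SELECT rnum, ename, job
FROM
(SELECT /*+ FIRST_ROWS(10) */ rownum rnum, ename, job
FROM
(SELECT ename, job
FROM emp
ORDER BY ename)
WHERE rownum <= :hi
)
WHERE rnum > :lo;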

Summary

ROWNUM provides a mechanism for returning a subset or range of rows from a query. It can be misleading at first if not properly understood but, once mastered, is invaluable for limiting result set output for pagination and top-n style queries.

For more information on ROWNUM, see Tom Kyte's article on OTN.

For more information on Oracle, visit Level Up.

Saturday, 22 March 2008

How to Monitor Oracle Index Usage

We can use the following command to determine whether a particular index is being used by Oracle.
ALTER INDEX <index_name> MONITORING USAGE;
If the specified index is used before monitoring is disabled, the USED column of the relevant V$OBJECT_USAGE row is set to YES.

Let's see this in action.

First, we create a table using the CONNECT BY LEVEL clause to generate a thousand rows, with an index on the ID column.
CREATE TABLE index_usage AS
SELECT rownum id, 'xyz' || rownum name
FROM dual
CONNECT BY LEVEL <= 1000;
Table created.

CREATE INDEX index_usage_ix1 ON INDEX_USAGE (id);
Index created.
Then we start monitoring the usage of the index and display the results from the V$OBJECT_USAGE view.
ALTER INDEX index_usage_ix1 MONITORING USAGE;
Index altered.

SELECT monitoring, used, start_monitoring, end_monitoring
FROM v$object_usage
WHERE index_name = 'INDEX_USAGE_IX1';
MONITORING USED START_MONITORING    END_MONITORING      
---------- ---- ------------------- -------------------
YES        NO   03/22/2008 11:44:20
1 row selected
Next, we access the table using the index, disable the monitoring and display the results again.
SELECT *
FROM index_usage
WHERE id = 123;
ID                     NAME                                        
---------------------- -------------------
                   123 xyz123
1 row selected
Execution Plan
----------------------------------------------------------
0      SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 Card=1 Bytes=10)
1    0   TABLE ACCESS (BY INDEX ROWID) OF 'INDEX_USAGE' (TABLE) (Cost=2 Card=1 Bytes=10)
2    1     INDEX (RANGE SCAN) OF 'INDEX_USAGE_IX1' (INDEX) (Cost=1 Card=1)

ALTER INDEX index_usage_ix1 NOMONITORING USAGE;
Index altered.

SELECT monitoring, used,start_monitoring, end_monitoring
FROM v$object_usage
WHERE index_name = 'INDEX_USAGE_IX1';
MONITORING USED START_MONITORING    END_MONITORING      
---------- ---- ------------------- -------------------
NO         YES  03/22/2008 11:44:20 03/22/2008 11:47:35
1 row selected
This shows us that the index was used during our monitoring period.

This feature can help us to identify unused indexes which are candidates for removal. Removing unnecessary indexes can improve performance by reducing overhead during updates.
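
For example, once monitoring has been in place for a representative period, a query along these lines lists the monitored indexes that have never been used:
SELECT index_name, table_name, start_monitoring
FROM v$object_usage
WHERE monitoring = 'YES'
AND used = 'NO';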

Saturday, 15 March 2008

JDBC Fetch Size

Fetch size is a performance hint which tells the JDBC driver how many rows should be fetched from the database when more rows are needed by the client. Increasing this value can improve overall query performance, especially with large result sets.

The required fetch size can be set on a Statement or a ResultSet object using the setFetchSize method. If it is set on a Statement, all ResultSets returned by that Statement will have the same fetch size. The default for the Oracle driver is ten.
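
For example, here is a minimal sketch, reusing the one_million_rows table from the example below and assuming con is an open Connection:
// Statement-level hint: every ResultSet from this Statement inherits it.
Statement stmt = con.createStatement();
stmt.setFetchSize(100);
ResultSet rs = stmt.executeQuery("SELECT * FROM one_million_rows");
// The value can still be overridden on an individual ResultSet.
rs.setFetchSize(500);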

Setting a JDBC fetch size that is too large or too small can degrade performance. Usually, a fetch size of one half or one quarter of the total expected result size is optimal.

Example

To see the effect of using different fetch sizes with Oracle, we first create a table containing a million rows using the CONNECT BY LEVEL clause.

CREATE TABLE one_million_rows AS
SELECT rownum id, rpad('x', 20, 'x') || rownum a, sysdate b
FROM dual
CONNECT BY LEVEL <= 1000000;
Table created.

Next, we call the following method several times with different fetch sizes and record the elapsed time taken to retrieve the data. SQL tracing is also enabled, allowing us to create a TKPROF report for each run.

public void testFetchSize(int fetchSize) throws SQLException {
    Connection con = ConnectionFactory.getNewConnection();
    Statement stmt = con.createStatement();

    // Tag the trace file so each run is easy to identify.
    stmt.execute(
        "ALTER SESSION SET tracefile_identifier = " +
        "FETCH_SIZE_" + fetchSize);

    // Enable SQL tracing with wait events (event 10046, level 12).
    stmt.execute(
        "ALTER SESSION SET events " +
        "'10046 trace name context forever, level 12'");

    ResultSet rs = stmt.executeQuery(
        "SELECT /* FETCH SIZE " + fetchSize + " */ * " +
        "FROM one_million_rows");
    rs.setFetchSize(fetchSize);

    // Fetch every row and time the retrieval.
    long start = System.currentTimeMillis();
    while (rs.next());
    System.out.printf("Fetch size %d took %4.2f seconds\n",
        fetchSize,
        (System.currentTimeMillis() - start) / 1000f);

    stmt.execute(
        "ALTER SESSION SET events '10046 trace name context off'");

    rs.close();
    stmt.close();
    con.close();
}

For this exercise we have used event 10046 to enable SQL tracing (rather than sql_trace) because we need to capture the wait events in addition to the execution statistics. We will see why later.

Note: Starting with Oracle 10g release 2, the preferred method of enabling SQL tracing is with the DBMS_SESSION package. The ALTER SESSION SET EVENTS command shown above can now be enabled with the following procedure call.

DBMS_SESSION.SESSION_TRACE_ENABLE(waits=>TRUE, binds=>TRUE);
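
The corresponding call to switch tracing off again is:
DBMS_SESSION.SESSION_TRACE_DISABLE;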

Results

Here is a summary of the results.

Fetch Size   Elapsed Time (secs)
----------   -------------------
        10                 93.78
        50                 29.48
       100                 16.76
       200                  9.58
       500                  6.22
      1000                  4.86
      5000                  3.91
     10000                  3.79
     50000                  3.93
    100000                  3.98

Clearly, the fetch size has a considerable impact on the overall performance of our query. The difference between the fastest, which was under four seconds, and the slowest, which was over a minute and a half, is certainly not trivial.

We can also see that performance starts to degrade slightly with a fetch size of somewhere between ten thousand and fifty thousand. From these results, it would appear that ten thousand is a reasonable value for our fetch size.

TKPROF

We can also see the difference in performance with the TKPROF reports produced during the tests. Here, we compare a fetch size of ten to a fetch size of ten thousand.
SELECT /* FETCH SIZE 10 */ * 
FROM
one_million_rows

call     count       cpu    elapsed      query       rows
------- ------  -------- ---------- ---------- ----------
Parse        1      0.00       0.00          0          0
Execute      1      0.00       0.00          0          0
Fetch   100001      7.40       6.77     101220    1000000
------- ------  -------- ---------- ---------- ----------
total   100003      7.40       6.77     101220    1000000

Elapsed times include waiting on following events:
  Event waited on                   Times   Max. Wait  Total Waited
  --------------------------------  Waited  ---------- ------------
  SQL*Net message to client        100001        0.00          0.37
  SQL*Net message from client      100001        0.05         65.21
************************************************************************

SELECT /* FETCH SIZE 10000 */ *
FROM
one_million_rows

call     count       cpu    elapsed      query       rows
------- ------  -------- ---------- ---------- ----------
Parse        1      0.00       0.00          0          0
Execute      1      0.00       0.00          0          0
Fetch      101      1.98       3.03      11573    1000000
------- ------  -------- ---------- ---------- ----------
total      103      1.98       3.03      11573    1000000

Elapsed times include waiting on following events:
  Event waited on                   Times   Max. Wait  Total Waited
  --------------------------------  Waited  ---------- ------------
  SQL*Net message to client            101        0.00          0.00
  SQL*Net message from client          101        0.11          0.77
  SQL*Net more data to client        17740        0.07          1.35
************************************************************************

Note: The report has been condensed to fit on the page.

Increasing the fetch size to ten thousand has more than halved the CPU and elapsed times and reduced consistent (logical) reads by almost ninety percent. However, the most significant gain is in the reduction of time spent waiting for a response from the client, from over a minute to less than a second.

Sunday, 9 March 2008

Oracle External Tables

An external table enables you to access data in a flat file as if it were in a table in the database.

External tables can be used to read the contents of a log file or spreadsheet export, for example, or to import information into a data warehouse as part of an ETL operation.

A Simple Example

Suppose we have a file containing city populations in our /tmp directory and we wish to access this data from within our database.
Tokyo,Japan,33600000
Seoul,South Korea,23400000
Mexico City,Mexico,22400000
New York,USA,21900000
Bombay,India,21600000
Delhi,India,21500000
Sao Paulo,Brazil,20600000
Los Angeles,USA,18000000
Shanghai,China,17500000
Osaka,Japan,16700000

First, we create a DIRECTORY object which will reference the /tmp directory in our file system.

CREATE OR REPLACE DIRECTORY tmp_dir AS '/tmp';
Directory created.

Note: This command format works on both Windows and Linux.

Now, we create an external table by specifying the name of the file we wish to query and the directory where it can be located. We will rely on the default values of a comma for the field separator and a newline for the record delimiter to describe our data.

CREATE TABLE pop_ext (
city VARCHAR2(30),
country VARCHAR2(30),
population NUMBER)
ORGANIZATION EXTERNAL
(DEFAULT DIRECTORY tmp_dir
LOCATION('pop.txt'));
Table created.

The file can now be queried as if it were a table in the database.

SELECT * 
FROM pop_ext;
CITY              COUNTRY          POPULATION             
----------------- ---------------- ------------------
Tokyo             Japan                33600000
Seoul             South Korea          23400000
Mexico City       Mexico               22400000
New York          USA                  21900000
Bombay            India                21600000
Delhi             India                21500000
Sao Paulo         Brazil               20600000
Los Angeles       USA                  18000000
Shanghai          China                17500000
Osaka             Japan                16700000
10 rows selected

External tables are read-only. An attempt to update the table will result in the following error message, complete with some useful advice on the appropriate action to take.

UPDATE pop_ext 
SET population = 0;
ORA-30657: operation not supported on external organized table
*Cause:  User attempted an operation on an external table which is
         not supported.
*Action: Don't do that!

We could, of course, create a 'real' table containing the city populations if we wish to maintain the data.

CREATE TABLE pop AS
SELECT * FROM pop_ext;
Table created.

A More Detailed Example

In this example we have a file which contains bank holidays in the UK for 2007 to 2009.
"Holiday"       2007            2008            2009
"New Year" 01/01/07 01/01/08 01/01/09
"Good Friday" 06/04/07 21/03/08 10/04/09
"Easter Monday" 09/04/07 24/03/08 13/04/09
"Early May" 07/05/07 05/05/08 04/05/09
"(First Monday in May)"
"Spring Bank" 28/05/07 26/05/08 25/05/09
"(Last Monday in May)"
"Summer" 27/08/07 25/08/08 31/08/09
"(Last Monday in August)"
"Christmas Day" 25/12/07 25/12/08 25/12/09
"Boxing Day" 26/12/07 26/12/08 28/12/09

This time, we will include a few more options in the table definition to describe the format of the data and to exclude the comment lines (the text surrounded by brackets).

CREATE OR REPLACE DIRECTORY log_dir AS '/logs';
Directory created.

CREATE TABLE bank_hol_ext(
hol_desc VARCHAR2(50),
hol_2007 DATE,
hol_2008 DATE,
hol_2009 DATE
)
ORGANIZATION EXTERNAL
(TYPE oracle_loader
DEFAULT DIRECTORY tmp_dir
ACCESS PARAMETERS(
records delimited by newline
load when (hol_2007 != BLANKS)
badfile log_dir:'bank_holidays.bad'
logfile log_dir:'bank_holidays.log'
discardfile log_dir:'bank_holidays.dsc'
skip 1
fields terminated by '\t'
optionally enclosed by '"'
missing field values are null
(
hol_desc,
hol_2007 DATE "dd/mm/yy",
hol_2008 DATE "dd/mm/yy",
hol_2009 DATE "dd/mm/yy"
)
)
LOCATION('hols.txt'))
REJECT LIMIT UNLIMITED;
Table created.

SELECT *
FROM bank_hol_ext;

HOL_DESC                      HOL_2007       HOL_2008       HOL_2009
----------------------------- -------------- -------------- ------------
New Year's Day                01-jan-2007    01-jan-2008    01-jan-2009
Good Friday                   06-apr-2007    21-mar-2008    10-apr-2009
Easter Monday                 09-apr-2007    24-mar-2008    13-apr-2009
Early May Bank Holiday        07-may-2007    05-may-2008    04-may-2009
Spring Bank Holiday           28-may-2007    26-may-2008    25-may-2009
Summer Bank Holiday           27-aug-2007    25-aug-2008    31-aug-2009
Christmas Day                 25-dec-2007    25-dec-2008    25-dec-2009
Boxing Day                    26-dec-2007    26-dec-2008    28-dec-2009
8 rows selected

Here, we have specified that the first row (the title) should be skipped, that fields may be surrounded by quotes and that a tab is used as the field delimiter. We also tell Oracle the format of the date fields and that we don't wish to impose a reject limit for bad records.

The logfile, badfile and discardfile clauses specify the location and file names for these file types. The logfile contains messages generated by the external tables utility while it is accessing data in the data file. The badfile contains the rows that could not be loaded due to errors (for example, a date was not valid for the format model) and the discardfile contains the rows excluded by the condition in the LOAD WHEN clause.

The default behaviour is to create these files in the same directory as the data file. However, if we only have read access to this directory, we can use these clauses to create the files in another location. In our example, we have chosen to write the files to the /logs directory.

If we wish, we can create another external table to view the contents of these files. Here, we define a table to query the log file, which contains a single column with a fixed width of four thousand bytes.

CREATE TABLE bank_hol_rej_ext (
rec VARCHAR2(4000))
ORGANIZATION EXTERNAL
(DEFAULT DIRECTORY log_dir
ACCESS PARAMETERS(
records delimited by newline
fields
missing field values are null
(
rec position(1:4000)
)
)
LOCATION('bank_holidays.log'));
Table created.

SELECT *
FROM bank_hol_rej_ext
WHERE rownum <=10
AND rec IS NOT NULL;
REC
---------------------------------------------------------------
LOG file opened at 03/04/08 21:23:35
Field Definitions for table BANK_HOL_EXT
Record format DELIMITED BY NEWLINE
Data in file has same endianness as the platform
Rows with all null fields are accepted
Load when (HOL_2007 != BLANKS)
Fields in Data Source:
HOL_DESC CHAR (255)
Terminated by " "
Enclosed by """ and """
10 rows selected

We can change the table definition to reference a different file using the ALTER TABLE command.

ALTER TABLE bank_hol_rej_ext LOCATION ('bank_holidays.dsc');
Table altered.
This allows us to query the discard file with the same external table.
SELECT * 
FROM bank_hol_rej_ext;
REC
----------------------------------
"(First Monday in May)"
"(Last Monday in May)"
"(Last Monday in August)"
3 rows selected.

Data Pump

Oracle 10g introduced the option to create an external table using data constructed from within the database.
CREATE TABLE emp_ext
ORGANIZATION EXTERNAL
(
TYPE ORACLE_DATAPUMP
DEFAULT DIRECTORY tmp_dir
LOCATION ('emp.dp')
)
AS
SELECT empno, ename, job, sal, comm, dname
FROM emp JOIN dept USING (deptno);
Table created.

This SQL will create a flat file in the /tmp directory containing the results of the SELECT statement. Unfortunately, the file is in a proprietary binary format which can only be read by Data Pump. It would be useful to have the option to produce a plain CSV file which could easily be read by other tools and utilities. However, the file is readable across platforms, which allows us to move it between Windows and Linux and create another external table on the target machine to read the data.

CREATE TABLE emp_ext (
empno NUMBER(4,0),
ename VARCHAR2(10),
job VARCHAR2(9),
sal NUMBER(7,2),
comm NUMBER(7,2),
dname VARCHAR2(14)
)
ORGANIZATION EXTERNAL
(
TYPE ORACLE_DATAPUMP
DEFAULT DIRECTORY tmp_dir
LOCATION ('emp.dp')
);
Table created.

This can be useful for transferring arbitrary data from one database to another where a database link is not available.

Saturday, 1 March 2008

How to Generate Rows in Oracle


Here is a simple and efficient way to generate arbitrary rows in Oracle.
SELECT <columns>
FROM dual
CONNECT BY LEVEL <= n;
where n is the number of rows you wish to produce.

For example.
SELECT rownum 
FROM dual
CONNECT BY LEVEL <= 5;
    ROWNUM
----------
         1
         2
         3
         4
         5
5 rows selected

Test Data

One use for this technique is to quickly generate the contents of a table. The following example uses a CTAS (CREATE TABLE ... AS SELECT) statement to create a table and populate it with test data.

CREATE TABLE test_data(
id PRIMARY KEY,
group_id NOT NULL,
created_at NOT NULL,
text NOT NULL) AS
SELECT rownum,
MOD(rownum, 5),
TO_DATE('1-jan-07', 'dd-mon-yy') + INTERVAL '1' MINUTE * rownum,
CAST ('xyz' || rownum AS VARCHAR2(50))
FROM dual
CONNECT BY LEVEL <= 10000;
Table created.

SELECT * 
FROM test_data
WHERE id <= 10;
        ID   GROUP_ID  CREATED_AT          TEXT
---------- ---------- ------------------- -----------
         1          1 01-jan-07 00:01:00  xyz1
         2          2 01-jan-07 00:02:00  xyz2
         3          3 01-jan-07 00:03:00  xyz3
         4          4 01-jan-07 00:04:00  xyz4
         5          0 01-jan-07 00:05:00  xyz5
         6          1 01-jan-07 00:06:00  xyz6
         7          2 01-jan-07 00:07:00  xyz7
         8          3 01-jan-07 00:08:00  xyz8
         9          4 01-jan-07 00:09:00  xyz9
        10          0 01-jan-07 00:10:00  xyz10
10 rows selected.

We now have ten thousand rows of test data with a sequential number for the primary key, a group item which contains five distinct values, a date field which increments by one minute for each row and a text field containing unique values.

Sparse Data

Another use for row generation is when dealing with sparse data.

Suppose we wish to use the data from the previous example to show the total number of rows by day.

SELECT TRUNC(created_at, 'dd') created_on, count(*)
FROM test_data
GROUP BY TRUNC(created_at, 'dd');
CREATED_ON   COUNT(*)
---------- ----------
01-jan-07        1439
02-jan-07        1440
03-jan-07        1440
04-jan-07        1440
05-jan-07        1440
06-jan-07        1440
07-jan-07        1361
7 rows selected.

However, let us assume that we must display a row for every day within a specified date range, even if the row count is zero. This is often the case if the data is used to produce a graph, for example.

It just so happens that our query produced a row per day between the 1st and 7th of January, but suppose the required date range is the 1st to the 10th of January and our table has a couple of days with no data. We will delete the rows for the 3rd and 5th of January to simulate this.

DELETE FROM test_data 
WHERE TRUNC(created_at, 'dd') IN (TO_DATE('3-jan-2007', 'dd-mon-yyyy'),
TO_DATE('5-jan-2007', 'dd-mon-yyyy'));
2880 rows deleted.

SELECT TRUNC(created_at, 'dd') created_on, count(*)
FROM test_data
GROUP BY TRUNC(created_at, 'dd');
CREATED_ON   COUNT(*)
---------- ----------
01-jan-07        1439
02-jan-07        1440
04-jan-07        1440
06-jan-07        1440
07-jan-07        1361
5 rows selected.

Because we are dealing with sparse data, row counts for only five days are returned by our query. However, the requirement is to produce a row for each of the ten days in our range. We will use the CONNECT BY LEVEL clause, inline views and an outer join to provide the solution.

First, we construct a query to return a row for each day in a specified range.

SELECT TO_DATE('1-jan-2007', 'dd-mon-yyyy') + (rownum - 1) created_on
FROM dual
CONNECT BY LEVEL <= TO_DATE('10-jan-2007', 'dd-mon-yyyy') -
TO_DATE('1-jan-2007', 'dd-mon-yyyy') + 1;
CREATED_ON
----------
01-jan-07
02-jan-07
03-jan-07
04-jan-07
05-jan-07
06-jan-07
07-jan-07
08-jan-07
09-jan-07
10-jan-07
10 rows selected.
This query is then joined with an aggregate query on the test data using an outer join.
SELECT created_on,
NVL(qty, 0) qty
FROM
(SELECT TO_DATE('1-jan-2007', 'dd-mon-yyyy') + (rownum - 1) created_on
FROM dual
CONNECT BY LEVEL <= TO_DATE('10-jan-2007', 'dd-mon-yyyy') -
TO_DATE('1-jan-2007', 'dd-mon-yyyy') + 1)
LEFT OUTER JOIN
(SELECT TRUNC(created_at, 'dd') created_on,
COUNT(*) qty
FROM test_data
GROUP BY TRUNC(created_at, 'dd'))
USING(created_on)
ORDER BY created_on;
CREATED_ON        QTY
---------- ----------
01-jan-07        1439
02-jan-07        1440
03-jan-07           0
04-jan-07        1440
05-jan-07           0
06-jan-07        1440
07-jan-07        1361
08-jan-07           0
09-jan-07           0
10-jan-07           0
10 rows selected.

Now we are guaranteed to return a row for every day in our range, even for the days with no data.
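
The date-generating query can also be parameterized with bind variables, so the same statement covers any range. A sketch, assuming :start_date and :end_date are bound to DATE values:
SELECT :start_date + (rownum - 1) created_on
FROM dual
CONNECT BY LEVEL <= :end_date - :start_date + 1;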