Oracle SQL Model Clause
参考官方文档,学习了一下Oracle SQL Model语法。
摘自Pro Oracle SQL:
With the MODEL clause, you build matrixes (or a model) of data with a variable number of dimensions. The model
uses a subset of the available columns from the tables in your FROM clause and has to contain at least one dimension,
at least one measure, and, optionally, one or more partitions. You can think of a model as a spreadsheet file containing
separate worksheets for each calculated value (measures). A worksheet has an x- and a y-axis (two dimensions), and
you can imagine having your worksheets split up in several identical areas, each for a different attribute (partition).
官方文档地址:http://www.oracle.com/webfolder/technetwork/tutorials/obe/db/10g/r2/prod/bidw/sqlmodel/sqlmodel_otn.htm
现将过程记录在这里:
1. 准备过程, 连接到Oracle自带schema SH(sales history),创建数据源:
CREATE VIEW sales_view AS
SELECT country_name country, prod_name prod, calendar_year year,
SUM(amount_sold) sale, COUNT(amount_sold) cnt
FROM sales, times, customers, countries, products
WHERE sales.time_id = times.time_id AND
sales.prod_id = products.prod_id
AND sales.cust_id = customers.cust_id
AND customers.country_id = countries.country_id
GROUP BY country_name, prod_name, calendar_year
2. Model语法分为3个部分,partition, dimension, measures。
Partition与分析函数的partition类似,将结果集分成了若干逻辑块。Model的rules被应用于每个partition的cells。
原文: Partitions define logical blocks of the result set in a way similar to the partitions of the analytical functions (described in the chapter titled "SQL for Analysis in Data Warehouses" in the Data Warehousing Guide ). MODEL rules are applied to the cells of each partition.
Dimension用于在每一个partition内区分每个measure的单元格。有点类似于excel中的行与列。如(A1, B1)就定义了一个单元格
原文: Dimensions identify each measure cell within a partition. These columns identify characteristics such as date, region, and product name.
Measures: Measures近似于星形模型中的事实表。它们典型包含数值,如销售单位或成本。每一个单元格都通过指定全部的维度在它的partition内访问。
原文: Measures are analogous to the measures of a fact table in a star schema. They typically contain numeric values such as sales units or cost. Each cell is accessed within its partition by specifying its full combination of dimensions.
3. 实例:
例1: SELECT SUBSTR(country,1,20) country,
SUBSTR(prod,1,15) prod, year, sales
FROM sales_view
WHERE country IN ('Italy','Japan')
MODEL RETURN UPDATED ROWS
PARTITION BY (country)
DIMENSION BY (prod, year)
MEASURES (sale sales)
RULES (
sales['Bounce', 2002] = sales['Bounce', 2001] + sales['Bounce', 2000],
sales['Y Box', 2002] = sales['Y Box', 2001],
sales['2_Products', 2002] = sales['Bounce', 2002] + sales['Y Box', 2002])
ORDER BY country, prod, year;
这个SQL创建了几个新行,比如sales['Y Box', 2002]使用的是 sales['Y Box ', 2001 ]的销售金额, sales['2_Products ', 2002 ] 是另外两个产品2002年的销售金额之和
例2:
SELECT SUBSTR(country,1,20) country,
SUBSTR(prod,1,15) prod, year, sales
FROM sales_view
WHERE country IN ('Italy','Japan')
MODEL RETURN UPDATED ROWS
PARTITION BY (country)
DIMENSION BY (prod, year)
MEASURES (sale sales)
RULES (
sales['Bounce', 2002] = sales['Bounce', 2001] + sales['Bounce', 2000],
sales['Y Box', 2002] = sales['Y Box', 2001],
sales['Bounce', 2003] = sum(sales)['Bounce',year <= 2002],
sales['Y Box', 2003] = sum(sales)['Y Box',year <= 2002],
sales['2_Products', 2002] = sales['Bounce', 2002] + sales['Y Box', 2002])
ORDER BY country, prod, year;
这个例子中,引入了新的rules, sales['Bounce', 2003] = sum(sales)['Bounce',year <= 2002]表示 产品Bounce在2003年的收入金额是2002年以及2002年前的销售金额之和
实例3:算累加
with t as (
select rownum rn from dual connect by rownum <= 100
)
select rn, total from t
model return updated rows
dimension by (rn)
measures(0 total)
rules(
total[rn] = cv(rn)+ nvl(total[cv(rn) - 1], 0)
)
求出1到100的和, 这里用了函数CV(current value)去引用其他单元格数据
另一个求累加的:
with t as (
select 1 rn from dual
)
select rn,total from t
model return updated rows
dimension by (rn)
measures(0 total)
rules iterate(100)(
total[1]= nvl(total[1], 0) + (ITERATION_NUMBER +1)
)
rules iterate(100)代表的是将规则迭代100次。 ITERATION_NUMBER是循环计数,注意是从0开始
参考:
http://www.itpub.net/thread-1904347-4-1.html