重新創建了具有存儲方式、壓縮級別、分布方式和分布列的測試數據集后,重新測試系統性能。
1.記錄各表的存儲使用情況。
使用pg_size_pretty函數查詢每張表使用的磁盤空間,并將結果記錄到基準表中。
SELECT T_NAME, PG_SIZE_PRETTY(PG_RELATION_SIZE(t_name)) FROM (VALUES('store_sales'),('date_dim'),('store'),('item'),('time_dim'),('promotion'),('customer_demographics'),('customer_address'),('household_demographics'),('customer'),('income_band')) AS names1(t_name);
t_name | pg_size_pretty
------------------------+----------------
store_sales | 14 GB
date_dim | 27 MB
store | 4352 kB
item | 259 MB
time_dim | 14 MB
promotion | 3200 kB
customer_demographics | 11 MB
customer_address | 27 MB
household_demographics | 1280 kB
customer | 111 MB
income_band | 896 kB
(11 rows)
2.測試查詢性能,并將性能數據錄入基準表中。
再次運行如下三個查詢,并記錄每個查詢的耗費時間。
\timing on
SELECT * FROM (SELECT COUNT(*)
FROM store_sales
,household_demographics
,time_dim, store
WHERE ss_sold_time_sk = time_dim.t_time_sk
AND ss_hdemo_sk = household_demographics.hd_demo_sk
AND ss_store_sk = s_store_sk
AND time_dim.t_hour = 8
AND time_dim.t_minute >= 30
AND household_demographics.hd_dep_count = 5
AND store.s_store_name = 'ese'
ORDER BY COUNT(*)
) LIMIT 100;
SELECT * FROM (SELECT i_brand_id brand_id, i_brand brand, i_manufact_id, i_manufact,
SUM(ss_ext_sales_price) ext_price
FROM date_dim, store_sales, item,customer,customer_address,store
WHERE d_date_sk = ss_sold_date_sk
AND ss_item_sk = i_item_sk
AND i_manager_id=8
AND d_moy=11
AND d_year=1999
AND ss_customer_sk = c_customer_sk
AND c_current_addr_sk = ca_address_sk
AND substr(ca_zip,1,5) <> substr(s_zip,1,5)
AND ss_store_sk = s_store_sk
GROUP BY i_brand
,i_brand_id
,i_manufact_id
,i_manufact
ORDER BY ext_price desc
,i_brand
,i_brand_id
,i_manufact_id
,i_manufact
) LIMIT 100;
SELECT * FROM (SELECT s_store_name, s_store_id,
SUM(CASE WHEN (d_day_name='Sunday') THEN ss_sales_price ELSE null END) sun_sales,
SUM(CASE WHEN (d_day_name='Monday') THEN ss_sales_price ELSE null END) mon_sales,
SUM(CASE WHEN (d_day_name='Tuesday') THEN ss_sales_price ELSE null END) tue_sales,
SUM(CASE WHEN (d_day_name='Wednesday') THEN ss_sales_price ELSE null END) wed_sales,
SUM(CASE WHEN (d_day_name='Thursday') THEN ss_sales_price ELSE null END) thu_sales,
SUM(CASE WHEN (d_day_name='Friday') THEN ss_sales_price ELSE null END) fri_sales,
SUM(CASE WHEN (d_day_name='Saturday') THEN ss_sales_price ELSE null END) sat_sales
FROM date_dim, store_sales, store
WHERE d_date_sk = ss_sold_date_sk AND
s_store_sk = ss_store_sk AND
s_gmt_offset = -5 AND
d_year = 2000
GROUP BY s_store_name, s_store_id
ORDER BY s_store_name, s_store_id,sun_sales,mon_sales,tue_sales,wed_sales,thu_sales,fri_sales,sat_sales
) LIMIT 100;
下面的基準表顯示了本次實踐中所用集群的驗證結果。您的結果可能會因多方面的原因而有所變化,但規律性應該相差不大。考慮到操作系統緩存的影響,相同表結構的同一查詢在每次執行時耗時會有不同屬正常現象,建議多測試幾次,取一組平均值。
基準 優化前 優化后 加載時間(11張表)
341584ms
257241ms
占用存儲
Store_Sales
42GB
14GB
Date_Dim
11MB
27MB
Store
232kB
4352kB
Item
110MB
259MB
Time_Dim
11MB
14MB
Promotion
256kB
3200kB
Customer_Demographics
171MB
11MB
Customer_Address
170MB
27MB
Household_Demographics
504kB
1280kB
Customer
441MB
111MB
Income_Band
88kB
896kB
總存儲空間
42GB
15GB
查詢執行時間
查詢1
14552.05ms
1783.353ms
查詢2
27952.36ms
14247.803ms
查詢3
17721.15ms
11441.659ms
總執行時間
60225.56ms
27472.815ms
3.如果對表設計后的性能還有更高期望,可以運行EXPLAIN PERFORMANCE以查看執行計劃進行調優。
關于執行計劃的更詳細介紹及查詢優化請參考《開發指南》中的“SQL執行計劃介紹”及“優化查詢性能概述” 。