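Chapter 17: Data Presentation

Zhu Shiwu, School of Economics and Management, Tsinghua University
Zhushw@

Data can be presented in two ways:
- as listings;
- as graphs.

This chapter covers:
- printing a listing that shows a data set in detail with the PRINT procedure;
- producing summary reports with the TABULATE procedure;
- plotting a data set with the GPLOT procedure;
- producing high-resolution charts with the GCHART procedure.

The listing procedure

- The PRINT procedure prints a listing that shows a data set in detail.
- The PRINT procedure is the simplest way to display the contents of a data set.

Syntax of the listing procedure

PROC PRINT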
The DATA= option:

    proc print data=ResDat.class;
    run;

    /* Without DATA=, PROC PRINT prints the most recently created data set. */
    proc print;
    run;

Result in the OUTPUT window:

    Obs    Name       Sex    Age    Height    Weight
      1    Alice       F      13     56.5      84.0
      2    Barbara     F      13     65.3      98.0
      3    Carol       F      14     62.8     102.5
      4    Jane        F      12     59.8      84.5
      5    Janet       F      15     62.5     112.5
      6    Joyce       F      11     51.3      50.5
      7    Judy        F      14     64.3      90.0
      8    Louise      F      12     56.3      77.0
      9    Mary        F      15     66.5     112.0
     10    Alfred      M      14     69.0     112.5
    ...

The NOOBS option:

    proc print data=ResDat.class noobs;
    run;

    /* SPLIT='#' breaks the column label at each '#' character. */
    proc print data=ResDat.class split='#';
        var height;
        label height='This is a Label # for height';
    run;

Output:

           This is a Label
    Obs      for height
      1         56.5
      2         65.3
      3         62.8
      4         59.8
      5         62.5
      6         51.3
      7         64.3
      8         56.3
    ...

    proc print data=ResDat.class (obs=3) noobs;
        var height weight;
        title2 'Height and Weight';
        id name;
    run;

Output:

    Height and Weight

    Name       Height    Weight
    Alice       56.5      84.0
    Barbara     65.3      98.0
    Carol       62.8     102.5

Examples

Example 17.2: Selecting the variables to print.
    options nodate pageno=1 linesize=70 pagesize=60;
    proc print data=ResDat.exprev double;
        var month state expenses;
        title 'Monthly Expenses for Offices in Each State';
    run;

Example 17.3: Customizing column headings.

    options nodate pageno=1 linesize=70 pagesize=60;
    proc print data=ResDat.exprev split='*' n
               obs='Observation*Number*===========';
        var month state expenses;
        label month='Month**====='
              state='State**====='
              expenses='Expenses**========';
        format expenses comma10.;
        title 'Monthly Expenses for Offices in Each State';
    run;

Example 17.4: Creating a report by group.
    options pagesize=60 pageno=1 nodate linesize=70;
    proc sort data=ResDat.exprev;
        by region state month;
    run;
    proc print data=ResDat.exprev
               n='Number of observations for the state: '
               noobs label;
        var month expenses revenues;
        by region state;
        pageby region;
        label region='Sales Region';
        format revenues expenses comma10.;
        title 'Sales Figures Grouped by Region and State';
    run;

Example 17.5: Summing numeric variables within BY groups.

    options nodate pageno=1 linesize=70 pagesize=60 nobyline;
    proc sort data=ResDat.exprev;
        by region;
    run;
    proc print data=ResDat.exprev noobs
               n='Number of observations for the state: ';
        sum expenses revenues;
        by region;
        format revenues expenses comma10.;
        title 'Revenue and Expense Totals for the #byval(region) Region';
    run;
    options byline;

Example 17.9: Customizing output with BY groups and an ID variable.
    options nodate pageno=1 linesize=64 pagesize=60;
    proc sort data=ResDat.empdata out=tempemp;
        by jobcode gender;
    run;
    proc print data=tempemp split='*';
        id jobcode;
        by jobcode;
        var gender salary;
        sum salary;
        label jobcode='Job Code*========'
              gender='Gender*======'
              salary='Annual Salary*=============';
        format salary dollar11.2;
        where jobcode contains 'FA' or jobcode contains 'ME';
        title 'Expenses Incurred for';
        title2 'Salaries for Flight Attendants and Mechanics';
    run;

The tabulation procedure

- The TABULATE procedure produces summary reports for presenting data.
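- The TABULATE procedure both computes frequency counts and common descriptive statistics and offers powerful facilities for presenting data in tables.

Syntax of the tabulation procedure

PROC TABULATE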
- The layout of a report is designed with the TABLE statement.
- The options of the TABLE statement are very complex; consult the SAS System help when using them. The examples here illustrate how some of the options are used.

Examples

Example 17.11: Creating a two-dimensional report.

    proc format;
        value regfmt 1='Northeast'
                     2='South'
                     3='Midwest'
                     4='West';
        value divfmt 1='New England'
                     2='Middle Atlantic'
                     3='Mountain'
                     4='Pacific';
        value usetype 1='Residential Customers'
                      2='Business Customers';
    run;
    options nodate pageno=1 linesize=80 pagesize=60;
    proc tabulate data=ResDat.energy format=dollar12.;
        class region division type;
        var expenditures;
        table region*division,
              type*expenditures
              / rts=25;
        format region regfmt. division divfmt. type usetype.;
        title 'Energy Expenditures for Each Region';
        title2 '(millions of dollars)';
    run;

Example 17.12: Specifying which combinations of CLASS variables appear in the report.
    options nodate pageno=1 linesize=80 pagesize=60;
    proc tabulate data=ResDat.energy format=dollar12.
                  classdata=ResDat.classes exclusive;
        class region division type;
        var expenditures;
        table region*division,
              type*expenditures
              / rts=25;
        format region regfmt. division divfmt. type usetype.;
        title 'Energy Expenditures for Each Region';
        title2 '(millions of dollars)';
    run;

Example 17.14: Customizing row and column headings.

    options nodate pageno=1 linesize=80 pagesize=60;
    proc tabulate data=ResDat.energy format=dollar12.;
        class region division type;
        var expenditures;
        table region*division,
              type='Customer Base'*expenditures=' '*sum=' '
              / rts=25;
        format region regfmt. division divfmt. type usetype.;
        title 'Energy Expenditures for Each Region';
        title2 '(millions of dollars)';
    run;

Example 17.18: Creating a multipage report.
    options nodate pageno=1 linesize=80 pagesize=60;
    proc tabulate data=ResDat.energy format=dollar12.;
        class region division type;
        var expenditures;
        table region='Region: ' all='All Regions',
              division all='All Divisions',
              type='Customer Base'*expenditures=' '*sum=' '
              / rts=25 box=_page_ condense indent=1;
        format region regfmt. division divfmt. type usetype.;
        title 'Energy Expenditures for Each Region and All Regions';
        title2 '(millions of dollars)';
    run;

Example 17.21: Computing percentage statistics.

    options nodate pageno=1 linesize=105 pagesize=60;
    proc format;
        picture pctfmt low-high='009 %';
    run;
    title "Fundraiser Sales";
    proc tabulate data=ResDat.Fundrais format=7.;
        class team classrm;
        var sales;
        table (team all)*sales=' ',
              classrm='Classroom'*(sum
                                   colpctsum*f=pctfmt9.
                                   rowpctsum*f=pctfmt9.
                                   reppctsum*f=pctfmt9.)
              all
              / rts=20 row=float;
    run;

Example 17.23: Producing a report in HTML format.
    ods html body='d:\ResDat\html.htm';
    proc tabulate data=ResDat.energy style=[font_weight=bold];
        class region division type / style=[just=center];
        classlev region division type / style=[just=left];
        var expenditures / style=[font_size=3];
        keyword all sum / style=[font_width=wide];
        keylabel all="Total";
        table (region all)*(division all*[style=[background=yellow]]),
              (type all)*(expenditures*f=dollar10.)
              / style=[background=red]
                misstext=[label="Missing" style=[font_weight=light]]
                box=[label="Region by Division by Type" style=[font_style=italic]];
        format region regfmt. division divfmt. type usetype.;
        title 'Energy Expenditures';
        title2 '(millions of dollars)';
    run;
    ods html close;

The plotting procedure

Graphs are an important way to present data; their vividness and immediacy cannot be matched by tabular reports.
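SAS/GRAPH software has powerful graphing capabilities. The graphs it can produce include:
- scatter plots and line plots (PLOTS);
- charts (CHARTS);
- maps (MAPS);
- three-dimensional graphs (3D GRAPHICS);
- text slides (TEXT SLIDES).

This chapter introduces the two basic graphing procedures: the plotting procedure (GPLOT) and the charting procedure (GCHART). The GPLOT procedure produces high-resolution scatter plots and line plots.

Syntax of the plotting procedure

PROC GPLOT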
Options:

The SYMBOL statement

The SYMBOL statement specifies the characteristics of the lines and symbols in a plot.

Options:

The AXIS statement

The AXIS statement specifies how the axes of a graph appear.

Options:

Examples

Example 17.24: Creating a simple bubble plot.

    goptions reset=global gunit=pct border cback=white
             colors=(black blue green red)
             ftitle=swissb ftext=swiss htitle=6 htext=4;
    title1 'Member Profile';
    title2 'Salaries and Number of Member Engineers';
    footnote h=3 j=r 'GR21N01 ';
    axis1 offset=(5,5);
    proc gplot data=ResDat.jobs;
        format dollars dollar9.;
        bubble dollars*eng=num / haxis=axis1;
    run;
    quit;

Example 17.25: Specifying the size and labels of the bubbles.

    goptions reset=global gunit=pct border cback=white
             colors=(black blue green red)
             ftitle=swissb ftext=swiss htitle=6 htext=4;
    title1 'Member Profile';
    title2 h=4 'Salaries and Number of Member Engineers';
    footnote1 h=3 j=r 'GR21N02 ';
    axis1 label=none
          offset=(5,5)
          width=3
          value=(height=4);
    axis2 order=(0 to 40000 by 10000)
          label=none
          major=(height=1.5)
          minor=(height=1)
          width=3
          value=(height=4);
    proc gplot data=ResDat.jobs;
        format dollars dollar9. num comma7.0;
        bubble dollars*eng=num / haxis=axis1
                                 vaxis=axis2
                                 vminor=1
                                 bcolor=red
                                 blabel
                                 bfont=swissi
                                 bsize=12
                                 caxis=blue;
    run;
    quit;

Example 17.26: Adding a vertical axis on the right.
    goptions reset=global gunit=pct border cback=white
             colors=(black blue green red)
             ftitle=swissb ftext=swiss htitle=6 htext=3;
    /* Add a second response variable: salaries converted to yen. */
    data ResDat.jobs2;
        set ResDat.jobs;
        yen=dollars*125;
    run;
    title1 'Member Profile';
    title2 h=4 'Salaries and Number of Member Engineers';
    footnote j=r 'GR21N03 ';
    axis1 offset=(5,5) label=none width=3 value=(h=4);
    proc gplot data=ResDat.jobs2;
        format dollars dollar7. num yen comma9.0;
        bubble dollars*eng=num / haxis=axis1
                                 vaxis=10000 to 40000 by 10000
                                 hminor=0
                                 vminor=1
                                 blabel
                                 bfont=swissi
                                 bcolor=red
                                 bsize=12
                                 caxis=blue;
        bubble2 yen*eng=num / vaxis=1250000 to 5000000 by 1250000
                              vminor=1
                              bcolor=red
                              bsize=12
                              caxis=blue;
    run;
    quit;

Example 17.28: Overlaying plots.
    goptions reset=global gunit=pct border cback=white
             colors=(black blue green red)
             ftitle=swissb ftext=swiss htitle=6 htext=4;
    title1 'Dow Jones Yearly Highs and Lows';
    footnote1 h=3 j=l ' Source: 1997 World Almanac'
              j=r 'GR21N06 ';
    symbol1 color=red interpol=join value=dot height=3;
    symbol2 font=marker value=C color=blue interpol=join height=2;
    axis1 order=(1955 to 1995 by 5)
          offset=(2,2)
          label=none
          major=(height=2)
          minor=(height=1)
          width=3;
    axis2 order=(0 to 6000 by 1000)
          offset=(0,0)
          label=none
          major=(height=2)
          minor=(height=1)
          width=3;
    legend1 label=none
            shape=symbol(4,2)
            position=(top center inside)
            mode=share;
    proc gplot data=ResDat.djia;
        plot high*year low*year / overlay
                                  legend=legend1
                                  vref=1000 to 5000 by 1000
                                  lvref=2
                                  haxis=axis1
                                  hminor=4
                                  vaxis=axis2
                                  vminor=1;
    run;
    quit;

The charting procedure

The GCHART procedure produces high-resolution charts.
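With the GCHART procedure you can produce two-dimensional or three-dimensional bar charts and pie charts.

Syntax of the charting procedure

PROC GCHART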
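How the classification variable determines the categories

The GCHART procedure also provides options that allow classification to be carried out in different ways. When no analysis variable is supplied, the default statistic used for the chart is the frequency; when an analysis variable is specified, the default statistic is the sum. To control the order of the bars (or pie slices), and the number of classes when classifying by a numeric variable, use the GCHART options that relate to the classification variable.

Options relating to the classification variable:

Example 17.33: Classification variable options.

    vbar sales / levels=10;
    vbar sales / midpoints=1000 to 10000 by 1000;
    vbar year / discrete;
    hbar city / midpoints='BJ' 'SH' 'GZ';
    hbar city / ascending;

Choosing the analysis variable and statistic

When no analysis variable is chosen, the number of observations in each class is used as the output statistic by default.

The options for choosing the analysis variable and statistic are:
- SUMVAR= specifies the analysis variable;
- TYPE=FREQ|CFREQ|PERCENT|CPERCENT|MEAN|SUM sets the statistic to the frequency, cumulative frequency, percentage, cumulative percentage, mean, or sum, respectively.

When no analysis variable is specified, the default statistic is FREQ; when an analysis variable is specified, the default statistic is SUM.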
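As a rough illustration of the defaults just described, the sketch below draws three horizontal bar charts from the same data. It assumes the ResDat.totals data set used in this chapter's GCHART examples, with a classification variable site and a numeric variable sales; the choice of HBAR and of these variables is only for illustration.

```sas
/* Sketch only: assumes ResDat.totals contains the variables SITE and SALES. */
proc gchart data=ResDat.totals;
    /* No SUMVAR=: bar length is the number of observations per site (TYPE=FREQ). */
    hbar site;
    /* SUMVAR= without TYPE=: bar length is the total of SALES per site (TYPE=SUM). */
    hbar site / sumvar=sales;
    /* SUMVAR= with TYPE=MEAN: bar length is the average of SALES per site. */
    hbar site / sumvar=sales type=mean;
run;
quit;
```

Comparing the three charts side by side makes it easy to see which statistic a given combination of SUMVAR= and TYPE= actually plots.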
The MEAN and SUM statistics require an analysis variable to be specified.

Example 17.34: Specifying the analysis variable and statistic.

    hbar city / sumvar=sales type=mean;

Examples

Example 17.35: Chart of a sum statistic.

    goptions reset=global gunit=pct border cback=white
             ctext=black colors=(blue green red)
             ftext=swiss ftitle=swissb htitle=6 htext=3.5;
    title 'Total Sales';
    footnote j=r 'GR13N01';
    proc gchart data=ResDat.totals;
        format sales dollar8.;
        block site / sumvar=sales;
    run;
    quit;

Example 17.36: Grouped chart.

    goptions reset=global gunit=pct border cback=white
             colors=(blue green red) ctext=black
             ftitle=swissb ftext=swiss htitle=4 htext=3;
    title 'Average Sales by Department';
    footnote j=r 'GR13N02 ';
    legend1 cborder=black
            label=('Quarter:')
            position=(middle left outside)
            mode=protect
            across=1;
    proc gchart data=ResDat.totals;
        format quarter roman.;
        format sales dollar8.;
        label site='00'x dept='00'x;
        block site / sumvar=sales
                     type=mean
                     midpoints='Sydney' 'Atlanta'
                     group=dept
                     subgroup=quarter
                     legend=legend1
                     noheading
                     coutline=black
                     caxis=black;
    run;
    quit;

Example 17.38: 3D subgrouped vertical bar chart.
    goptions reset=global gunit=pct border cback=white
             colors=(black red green blue)
             ftitle=swissb ftext=swiss htitle=6 htext=4
             offshadow=(1.5,1.5);
    title1 'Total Sales by Site';
    footnote1 h=3 j=r 'GR13N04 ';
    axis1 label=none origin=(24,);
    axis2 label=none
          order=(0 to 100000 by 20000)
          minor=(number=1)
          offset=(,0);
    legend1 label=none
            shape=bar(3,3)
            cborder=black
            cblock=gray
            origin=(24,);
    pattern1 color=lipk;
    pattern2 color=cyan;
    pattern3 color=lime;
    proc gchart data=ResDat.totals;
        format quarter roman.;
        format sales dollar8.;
        vbar3d site / sumvar=sales
                      subgroup=dept
                      inside=subpct
                      outside=sum
                      width=9
                      space=4
                      maxis=axis1
                      raxis=axis2
                      cframe=gray
                      coutline=black
                      legend=legend1;
    run;
    quit;

Example 17.40: Pie chart of a sum statistic.
    goptions reset=global gunit=pct border cback=white
             colors=(blue green red) ctext=black
             ftitle=swissb ftext=swiss htitle=6 htext=4;
    title 'Total Sales';
    footnote j=r 'GR13N08(a) ';
    proc gchart data=ResDat.totals;
        format sales dollar8.;
        /* First chart: a two-dimensional pie. */
        pie site / sumvar=sales
                   coutline=black;
    run;
        /* Second chart: a 3D pie with the 'Paris' slice pulled out. */
        footnote j=r 'GR13N08(b) ';
        pie3d site / sumvar=sales
                     coutline=black
                     explode='Paris';
    run;
    quit;
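The totals that the pie chart displays can also be shown as a table, which connects the graphical half of this chapter back to the tabular half. The sketch below is one possible way to do that with PROC TABULATE. It is not taken from the original slides, and it assumes the same ResDat.totals data set with the variables site and sales; the labels 'All Sites' and 'Total Sales' are made up for illustration.

```sas
/* Sketch only: tabulates the per-site totals that PIE site / SUMVAR=sales charts. */
title 'Total Sales by Site';
footnote;                        /* clear the footnote left over from the chart */
proc tabulate data=ResDat.totals format=dollar12.;
    class site;                  /* classification variable: one row per site */
    var sales;                   /* analysis variable */
    table site all='All Sites',
          sales=' '*sum='Total Sales';
run;
```

Reading the table next to the chart gives a quick check that each slice corresponds to the expected sum.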





