数据分析.docx (Data Analysis)

Uploaded by: b****3; document ID: 4646377; uploaded: 2023-05-07; format: DOCX; pages: 20; size: 64.38 KB
Data Analysis

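Solution:

(1) Fit the linear regression model of y on x1, x2 and x3

Using the observed data on (x1, x2, x3, y), fit the linear regression model with the REG procedure (PROC REG) of the SAS system, and obtain the fitted values, the residuals and the studentized residuals of y.

Program:

Build the regression model and output the fitted values, residuals and studentized residuals of the dependent variable.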

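/* Read the 23 observations: predictors x1-x3 and response y */
data exercise2_9;
  input x1-x3 y;
  cards;
50 51 2.3 48
36 46 2.3 57
40 48 2.2 66
41 44 1.8 70
28 43 1.8 89
49 54 2.9 36
42 50 2.2 46
45 48 2.4 54
52 62 2.9 26
29 50 2.1 77
29 48 2.4 89
43 53 2.4 67
38 55 2.2 47
34 51 2.3 51
53 54 2.2 57
36 49 2.0 66
33 56 2.5 79
29 46 1.9 88
33 49 2.1 60
55 51 2.4 49
29 52 2.3 77
44 58 2.9 52
43 50 2.3 60
;
run;

/* Fit y on x1, x2, x3; save fitted values (precdit), residuals (resid)
   and studentized residuals (student) in data set a */
proc reg data=exercise2_9;
  model y=x1-x3;
  output out=a p=precdit r=resid student=student;
run;

proc print data=a;
run;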

 

The SAS System                          18:26 Sunday, October 10, 2004   1

The REG Procedure
Model: MODEL1
Dependent Variable: y

Analysis of Variance

                              Sum of          Mean
Source             DF        Squares        Square    F Value    Pr > F
Model               3     4133.63322    1377.87774      13.01    <.0001
Error              19     2011.58417     105.87285
Corrected Total    22     6145.21739

Root MSE          10.28945    R-Square    0.6727
Dependent Mean    61.34783    Adj R-Sq    0.6210
Coeff Var         16.77232

Parameter Estimates

                    Parameter     Standard
Variable     DF      Estimate        Error    t Value    Pr > |t|
Intercept     1     162.87590     25.77565       6.32      <.0001
x1            1      -1.21032      0.30145      -4.01      0.0007
x2            1      -0.66591      0.82100      -0.81      0.4274
x3            1      -8.61303     12.24125      -0.70
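The summary statistics in the listing above follow directly from the ANOVA sums of squares. The following is a minimal cross-check, written in Python rather than SAS purely for illustration; the variable names are our own and are not part of the original solution.

```python
import math

# Sums of squares and degrees of freedom copied from the ANOVA table above.
ss_model, df_model = 4133.63322, 3
ss_error, df_error = 2011.58417, 19
ss_total, df_total = 6145.21739, 22
y_mean = 61.34783  # "Dependent Mean" in the listing

ms_model = ss_model / df_model            # mean square for the model
ms_error = ss_error / df_error            # mean square error (estimate of sigma^2)
f_value = ms_model / ms_error             # overall F statistic
r_square = ss_model / ss_total            # coefficient of determination
adj_r_square = 1 - ms_error / (ss_total / df_total)
root_mse = math.sqrt(ms_error)
coeff_var = 100 * root_mse / y_mean       # coefficient of variation, in percent

print(f"F = {f_value:.2f}, R-Square = {r_square:.4f}, Adj R-Sq = {adj_r_square:.4f}")
print(f"Root MSE = {root_mse:.5f}, Coeff Var = {coeff_var:.5f}")
```

Up to rounding, each printed value should agree with the corresponding entry of the SAS listing.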

Obs    x1    x2     x3     y    precdit       resid     student
  1    50    51    2.3    48    48.5888     -0.5888    -0.06150
  2    36    46    2.3    57    68.8628    -11.8628    -1.28303
  3    40    48    2.2    66    63.5510      2.4490     0.24682
  4    41    44    1.8    70    68.4496      1.5504     0.17231
  5    28    43    1.8    89    84.8496      4.1504     0.45204
  6    49    54    2.9    36    42.6336     -6.6336    -0.78114
  7    42    50    2.2    46    59.7986    -13.7986    -1.38301
  8    45    48    2.4    54    55.7768     -1.7768    -0.18998
  9    52    62    2.9    26    33.6754     -7.6754    -0.91730
 10    29    50    2.1    77    76.3940      0.6060     0.06341
 11    29    48    2.4    89    75.1419     13.8581     1.54979
 12    43    53    2.4    67    54.8679     12.1321     1.21437
 13    38    55    2.2    47    61.3103    -14.3103    -1.58470
 14    34    51    2.3    51    67.9539    -16.9539    -1.71028
 15    53    54    2.2    57    43.8215     13.1785     1.54520
 16    36    49    2.0    66    69.4490     -3.4490    -0.35404
 17    33    56    2.5    79    64.1121     14.8879     1.62692
 18    29    46    1.9    88    80.7803      7.2197     0.75782
 19    33    49    2.1    60    72.2187    -12.2187    -1.23677
 20    55    51    2.4    49    41.6759      7.3241     0.81159
 21    29    52    2.3    77    73.3396      3.6604     0.38762
 22    44    58    2.9    52    46.0216      5.9784     0.66555
 23    43    50    2.3    60    57.7270      2.2730     0.22775

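Draw the normal Q-Q plot of the studentized residuals and the plots of the residuals against the fitted values of y (and against each predictor), and compute the correlation coefficient between the ordered studentized residuals and the corresponding normal quantiles.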

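/* Normal Q-Q plot of the studentized residuals */
proc capability graphics noprint data=a;
  qqplot student / normal(mu=0 sigma=1);
run;

/* Order the studentized residuals */
proc sort data=a;
  by student;
run;

/* Pair the ordered residuals with the normal quantiles
   q(i) = probit((i - 0.375) / (n + 0.25)), n = 23 */
proc iml;
  use a;
  read all var{student} into rr;
  do i=1 to 23;
    qi=probit((i-0.375)/23.25);
    q=q//qi;
  end;
  rq=rr||q;
  create correl var{r q};
  append from rq;
quit;

proc print data=correl;
run;

/* Correlation between ordered residuals (R) and normal quantiles (Q) */
proc corr data=correl;
run;

/* Refit and save fitted values and residuals for the residual plots */
proc reg data=exercise2_9;
  model y=x1-x3;
  output out=a p=fittedy r=residual;
run;

proc print data=a;
run;

/* Residuals against the fitted values and against each predictor */
proc gplot data=a;
  plot residual*fittedy residual*x1 residual*x2 residual*x3;
  symbol v=dot i=none;
run;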

The CORR Procedure

2 Variables:    R    Q

Simple Statistics

Variable     N       Mean    Std Dev        Sum     Minimum    Maximum
R           23    0.00954    1.02752    0.21944    -1.71028    1.62692
Q           23          0    0.96691          0    -1.92874    1.92874

Pearson Correlation Coefficients, N = 23
Prob > |r| under H0: Rho=0

           R          Q
R    1.00000    0.98357
                 <.0001
Q    0.98357    1.00000
      <.0001

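The points of the normal Q-Q plot lie roughly on a straight line, and the PROC CORR output shows that the estimated correlation coefficient between the ordered studentized residuals and the normal quantiles, 0.98357, is close to 1. The assumption that the error term of this linear regression model is normally distributed is therefore reasonable.

The residual plots show no obvious trend, which is a satisfactory pattern. Combined with the normality check on the error term, the linear regression model and the assumption of independent, identically normally distributed errors are reasonable and workable for the given data.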
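The correlation check above can be reproduced outside SAS. The sketch below, in Python for illustration only, sorts the 23 studentized residuals from the listing, pairs them with the quantiles probit((i - 0.375)/(n + 0.25)) used in the PROC IML step, and computes their Pearson correlation. The helper name `pearson` is our own; the residuals are the rounded values printed in the listing, so agreement is only up to rounding.

```python
import math
from statistics import NormalDist

# Studentized residuals copied from the "student" column of the listing.
student = [
    -0.06150, -1.28303, 0.24682, 0.17231, 0.45204, -0.78114, -1.38301,
    -0.18998, -0.91730, 0.06341, 1.54979, 1.21437, -1.58470, -1.71028,
    1.54520, -0.35404, 1.62692, 0.75782, -1.23677, 0.81159, 0.38762,
    0.66555, 0.22775,
]
n = len(student)

r_sorted = sorted(student)  # ordered residuals r(1) <= ... <= r(n)
# Normal quantiles, same formula as probit((i-0.375)/23.25) in the SAS code.
q = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)


rho = pearson(r_sorted, q)
print(f"smallest quantile = {q[0]:.5f}, correlation = {rho:.5f}")
```

A correlation close to 1 means the ordered residuals are nearly a linear function of the normal quantiles, which is what a straight Q-Q plot expresses graphically.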

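(2) Model selection by the adjusted multiple correlation coefficient (adjusted R-square) criterion and the C(p) criterion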

data exercise2_9;
  input x1-x3 y;
  cards;
50 51 2.3 48
36 46 2.3 57
40 48 2.2 66
41 44 1.8 70
28 43 1.8 89
49 54 2.9 36
42 50 2.2 46
45 48 2.4 54
52 62 2.9 26
29 50 2.1 77
29 48 2.4 89
43 53 2.4 67
38 55 2.2 47
34 51 2.3 51
53 54 2.2 57
36 49 2.0 66
33 56 2.5 79
29 46 1.9 88
33 49 2.1 60
55 51 2.4 49
29 52 2.3 77
44 58 2.9 52
43 50 2.3 60
;
run;

/* Rank all subsets by adjusted R-square */
proc reg data=exercise2_9;
  model y=x1-x3 / selection=adjrsq;
run;

/* Rank all subsets by Mallows' C(p) */
proc reg data=exercise2_9;
  model y=x1-x3 / selection=cp;
run;

The SAS System                          19:19 Sunday, October 10, 2004  14

The REG Procedure
Model: MODEL1
Dependent Variable: y

C(p) Selection Method

Number in
  Model         C(p)    R-Square    Variables in Model
      2       2.4951      0.6641    x1 x2
      2       2.6579      0.6613    x1 x3
      3       4.0000      0.6727    x1 x2 x3
      1       4.2995      0.5986    x1
      1      17.9865      0.3628    x3
      2      18.1200      0.3949    x2 x3
      1      19.0131      0.3451    x2

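Optimal model selected by the adjusted R-square criterion: y on x1 and x2 (with n = 23, the R-Square values above give the largest adjusted R-square, 0.6305, for this subset).

Optimal model selected by the C(p) criterion: y on x1 and x2 (smallest C(p) = 2.4951).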
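Both criteria can be recomputed from numbers already printed in the listings. The sketch below, in Python for illustration only, derives the adjusted R-square of every subset from the R-Square column (n = 23), and Mallows' C(p) for the three models whose error sums of squares appear in this document. The dictionary names are our own; the R-Square inputs are rounded to four decimals, so the adjusted values are approximate.

```python
n = 23                  # number of observations
mse_full = 105.87285    # mean square error of the full model (x1 x2 x3)

# R-Square of each subset, copied from the C(p) listing.
r_square = {
    ("x1",): 0.5986, ("x2",): 0.3451, ("x3",): 0.3628,
    ("x1", "x2"): 0.6641, ("x1", "x3"): 0.6613, ("x2", "x3"): 0.3949,
    ("x1", "x2", "x3"): 0.6727,
}

# Adjusted R-square: 1 - (1 - R^2)(n - 1)/(n - k - 1), k = number of predictors.
adj_r_square = {
    subset: 1 - (1 - r2) * (n - 1) / (n - len(subset) - 1)
    for subset, r2 in r_square.items()
}
best_adj = max(adj_r_square, key=adj_r_square.get)

# Error sums of squares printed in the ANOVA tables of this document.
sse = {
    ("x1",): 2466.78154,
    ("x1", "x2"): 2063.99790,
    ("x1", "x2", "x3"): 2011.58417,
}
# Mallows' C(p) = SSE_p / MSE_full - (n - 2p), p = k + 1 (intercept included).
cp = {s: e / mse_full - (n - 2 * (len(s) + 1)) for s, e in sse.items()}

for subset in sorted(adj_r_square, key=adj_r_square.get, reverse=True):
    print(" ".join(subset), round(adj_r_square[subset], 4))
print({" ".join(s): round(v, 4) for s, v in cp.items()})
```

A larger adjusted R-square and a smaller C(p) (ideally close to the number of parameters p) both favour a subset, so the two rankings can be compared directly.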

Selecting the optimal regression equation by the prediction sum of squares (PRESSp) criterion

data exercise2_9;
  input x1-x3 y;
  cards;
50 51 2.3 48
36 46 2.3 57
40 48 2.2 66
41 44 1.8 70
28 43 1.8 89
49 54 2.9 36
42 50 2.2 46
45 48 2.4 54
52 62 2.9 26
29 50 2.1 77
29 48 2.4 89
43 53 2.4 67
38 55 2.2 47
34 51 2.3 51
53 54 2.2 57
36 49 2.0 66
33 56 2.5 79
29 46 1.9 88
33 49 2.1 60
55 51 2.4 49
29 52 2.3 77
44 58 2.9 52
43 50 2.3 60
;
run;

/* For each subset: save the PRESS residuals, then sum their squares (USS) */
proc reg data=exercise2_9;
  model y=x1 / noprint;
  output out=a1 press=press;
run;
proc means uss data=a1;
  var press;
run;

proc reg data=exercise2_9;
  model y=x2 / noprint;
  output out=a2 press=press;
run;
proc means uss data=a2;
  var press;
run;

proc reg data=exercise2_9;
  model y=x3 / noprint;
  output out=a3 press=press;
run;
proc means uss data=a3;
  var press;
run;

proc reg data=exercise2_9;
  model y=x1 x2 / noprint;
  output out=a4 press=press;
run;
proc means uss data=a4;
  var press;
run;

proc reg data=exercise2_9;
  model y=x1 x3 / noprint;
  output out=a5 press=press;
run;
proc means uss data=a5;
  var press;
run;

proc reg data=exercise2_9;
  model y=x2 x3 / noprint;
  output out=a6 press=press;
run;
proc means uss data=a6;
  var press;
run;

proc reg data=exercise2_9;
  model y=x1 x2 x3 / noprint;
  output out=a7 press=press;
run;
proc means uss data=a7;
  var press;
run;

 

The MEANS Procedure

Analysis Variable : press Residual without Current Observation

The seven USS values, in the order the programs were run:

Data set    Model                    USS
a1          y = x1               3024.21
a2          y = x2               4853.28
a3          y = x3               4652.84
a4          y = x1 x2            2714.10
a5          y = x1 x3            2693.43
a6          y = x2 x3            4966.43
a7          y = x1 x2 x3         3046.29

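From the prediction sums of squares above, the model of y on x1 and x3 (the fifth run, output data set a5) has the smallest value, PRESSp = 2693.43, so it is the model finally selected under this criterion.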
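PRESS is the sum of squared leave-one-out prediction errors: each observation is predicted from a model fitted to the other n - 1 observations. The sketch below, in Python for illustration only, applies that definition to the simplest candidate, y on x1, by refitting the line 23 times. The helper name `fit_line` is our own, and the explicit refitting is a stand-in for the `press=` output of PROC REG, not the method SAS uses internally.

```python
# x1 and y columns of the data set exercise2_9.
x1 = [50, 36, 40, 41, 28, 49, 42, 45, 52, 29, 29, 43,
      38, 34, 53, 36, 33, 29, 33, 55, 29, 44, 43]
y = [48, 57, 66, 70, 89, 36, 46, 54, 26, 77, 89, 67,
     47, 51, 57, 66, 79, 88, 60, 49, 77, 52, 60]


def fit_line(xs, ys):
    """Least-squares intercept and slope of ys on xs."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (v - ybar) for x, v in zip(xs, ys))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


b0, b1 = fit_line(x1, y)  # full-data fit, for comparison with the listing

press = 0.0
for i in range(len(y)):
    # Fit without observation i, then predict observation i.
    b0_i, b1_i = fit_line(x1[:i] + x1[i + 1:], y[:i] + y[i + 1:])
    press += (y[i] - (b0_i + b1_i * x1[i])) ** 2

print(f"intercept = {b0:.5f}, slope = {b1:.5f}, PRESS = {press:.2f}")
```

The same loop extends to the other subsets once a multiple-regression fit replaces `fit_line`; the subset with the smallest PRESS is the one preferred for prediction.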

(3) Stepwise regression

data exercise2_9;
  input x1-x3 y;
  cards;
50 51 2.3 48
36 46 2.3 57
40 48 2.2 66
41 44 1.8 70
28 43 1.8 89
49 54 2.9 36
42 50 2.2 46
45 48 2.4 54
52 62 2.9 26
29 50 2.1 77
29 48 2.4 89
43 53 2.4 67
38 55 2.2 47
34 51 2.3 51
53 54 2.2 57
36 49 2.0 66
33 56 2.5 79
29 46 1.9 88
33 49 2.1 60
55 51 2.4 49
29 52 2.3 77
44 58 2.9 52
43 50 2.3 60
;
run;

/* Stepwise selection with entry and stay significance levels of 0.10 */
proc reg data=exercise2_9;
  model y=x1-x3 / selection=stepwise slentry=0.10 slstay=0.10 details;
run;

The SAS System                          19:19 Sunday, October 10, 2004  23

The REG Procedure
Model: MODEL1
Dependent Variable: y

Stepwise Selection: Step 1

Statistics for Entry
DF = 1,21

                           Model
Variable    Tolerance    R-Square    F Value    Pr > F
x1           1.000000      0.5986      31.31    <.0001
x2           1.000000      0.3451      11.07    0.0032
x3           1.000000      0.3628      11.96    0.0024

Variable x1 Entered: R-Square = 0.5986 and C(p) = 4.2995

Analysis of Variance

                              Sum of          Mean
Source             DF        Squares        Square    F Value    Pr > F
Model               1     3678.43585    3678.43585      31.31    <.0001
Error              21     2466.78154     117.46579
Corrected Total    22     6145.21739

              Parameter     Standard
Variable       Estimate        Error     Type II SS    F Value    Pr > F
Intercept     121.83182     11.04221          14299     121.73    <.0001
x1             -1.52704      0.27288     3678.43585      31.31    <.0001

Bounds on condition number: 1, 1

Stepwise Selection: Step 2

Statistics for Entry
DF = 1,20

                           Model
Variable    Tolerance    R-Square    F Value    Pr > F
x2           0.782276      0.6641       3.90    0.0622
x3           0.752300      0.6613       3.70    0.0686

Variable x2 Entered: R-Square = 0.6641 and C(p) = 2.4951

Analysis of Variance

                              Sum of          Mean
Source             DF        Squares        Square    F Value    Pr > F
Model               2     4081.21949    2040.60975      19.77    <.0001
Error              20     2063.99790     103.19989
Corrected Total    22     6145.21739

              Parameter     Standard
Variable       Estimate        Error     Type II SS    F Value    Pr > F
Intercept     166.59133     24.90844     4616.26752      44.73    <.0001
x1             -1.26046      0.28919     1960.56092      19.00    0.0003
x2             -1.08932      0.55139      402.78365       3.90    0.0622

Bounds on condition number: 1.2783, 5.1133

Stepwise Selection: Step 3

Statistics for Removal
DF = 1,20

             Partial       Model
Variable    R-Square    R-Square    F Value    Pr > F
x1            0.3190      0.3451      19.00    0.0003
x2            0.0655      0.5986       3.90    0.0622

Statistics for Entry
DF = 1,19

                           Model
Variable    Tolerance    R-Square    F Value    Pr > F
x3           0.348121      0.6727       0.50    0.4902

All variables left in the model are significant at the 0.1000 level.
No other variable met the 0.1000 significance level for entry into the model.

Summary of Stepwise Selection

        Variable    Variable    Number     Partial       Model
Step    Entered     Removed     Vars In   R-Square    R-Square      C(p)    F Value    Pr > F
   1    x1                            1     0.5986      0.5986    4.2995      31.31    <.0001
   2    x2                            2     0.0655      0.6641    2.4951       3.90    0.0622

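The optimal model contains x1 and x2; with the Step 2 estimates, the fitted equation is ŷ = 166.59133 - 1.26046 x1 - 1.08932 x2.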
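The F values that drive the stepwise decisions are partial F statistics: the drop in the error sum of squares when a variable is added, divided by the mean square error of the larger model. The sketch below, in Python for illustration only, recomputes the two statistics that matter here from the error sums of squares printed in the listings. The function name `partial_f` is our own, and the p-values are not recomputed because that would need an F distribution outside the Python standard library.

```python
# Error sums of squares copied from the ANOVA tables in this document.
sse_x1 = 2466.78154      # model with x1 only            (error df = 21)
sse_x1_x2 = 2063.99790   # model with x1 and x2          (error df = 20)
sse_full = 2011.58417    # model with x1, x2 and x3      (error df = 19)


def partial_f(sse_smaller_model, sse_larger_model, df_error_larger):
    """Partial F statistic for adding one variable to the smaller model."""
    mse_larger = sse_larger_model / df_error_larger
    return (sse_smaller_model - sse_larger_model) / mse_larger


f_enter_x2 = partial_f(sse_x1, sse_x1_x2, 20)    # Step 2: add x2 to {x1}
f_enter_x3 = partial_f(sse_x1_x2, sse_full, 19)  # Step 3: add x3 to {x1, x2}

print(f"F for entering x2 = {f_enter_x2:.2f}")
print(f"F for entering x3 = {f_enter_x3:.2f}")
```

In the listing, x2 enters because its Pr > F of 0.0622 is below the 0.10 entry level, while x3 stays out because its Pr > F of 0.4902 is above it.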

(4) Goodness-of-fit check of the optimal model

The SAS System                          19:19 Sunday, October 10, 2004  27

The REG Procedure
Model: MODEL1
Dependent Variable: y

Analysis of Variance

                              Sum of          Mean
Source             DF        Squares        Square    F Value    Pr > F
Model               2     4081.21949    2040.60975      19.77    <.0001
Error              20     2063.99790     103.19989
Corrected Total    22     6145.21739

Root MSE          10.15873    R-Square    0.6641
Dependent Mean    61.34783    Adj R-Sq    0.6305
Coeff Var         16.55924

Parameter Estimates

                    Parameter     Standard
Variable     DF      Estimate        Error    t Value    Pr > |t|
Intercept     1     166.59133     24.90844       6.69      <.0001
x1            1      -1.26046      0.28919      -4.36      0.0003
x2            1      -1.08932      0.55139      -1.98      0.0622

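The squared multiple correlation coefficient is R-Square = 0.6641. Compared with the earlier result of 0.6727 for the full model, the residual mean square, the regression coefficient estimates and the goodness-of-fit measure all change very little; that is, when x1 and x2 are in the model, the effect of x3 is very small. The optimal regression…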
