Saturday, December 5, 2015

Some Basic Stat Concepts from my notes

  • Correlation
    • Pearson: product-moment correlation
    • Spearman: rank correlation
      • Use it whenever we have ordinal data (a rank column, marks of students).
      • If two values tie, each gets the average of the two ranks they occupy, i.e. ranks 2 and 3 become (2+3)/2 = 2.5.
      • If three values tie, each gets the average of the three ranks: (rank + (rank+1) + (rank+2))/3, e.g. (2+3+4)/3 = 3.
      • And so on.
  • Hypothesis Test
    • Find t value
    • Find p value
    • Check if p value is less than or more than alpha
    • If p value is less than alpha, then reject the null hypothesis; otherwise fail to reject it
  • T Test: sample size is small (say, fewer than 30).
    • One Tail Test
      • When the hypothesis states that the mean is > or < a certain value
      • When do we fail to reject the Null Hypothesis?
        • If the p value from the test is > alpha (0.05, generally), then we fail to reject the Null Hypothesis (strictly speaking, it is never "accepted").
        • If the T from the experiment is less extreme than the critical T from the one-tail T table (looked up by degrees of freedom, i.e. number of observations - 1, and alpha, say 0.05), then we fail to reject the Null Hypothesis.
    • Two Tail Test
      • When the Null Hypothesis states that the mean = a certain value.
      • When do we fail to reject the Null Hypothesis?
        • If the p value from the test is > alpha (0.05, generally), then we fail to reject the Null Hypothesis.
        • If the T from the experiment falls between -T and +T from the two-tail T table (looked up by degrees of freedom, i.e. number of observations - 1, and alpha, say 0.05), then we fail to reject the Null Hypothesis. If it falls beyond that range, we reject it.
  • Normal Distribution
    • 68-95-99.7 rule: about 68%, 95% and 99.7% of the area lies within 1, 2 and 3 SD of the mean.
    • Standardization:
      • z(i) = (x(i) - mean)/(standard deviation)
  • Skewness is a measure of symmetry, or more precisely, the lack of symmetry. A distribution, or data set, is symmetric if it looks the same to the left and right of the center point.
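  • Kurtosis is a measure of whether the data are peaked or flat relative to a normal distribution.
  • Mean Absolute Deviation: the average of |x - mean| over all values
  • Upper Hinge: median of the upper half of the data (from the median to the max value)
  • Lower Hinge: median of the lower half of the data (from the min value to the median)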
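A few of the notes above can be tried out in plain R. This is only a minimal sketch: the marks, hours and heights vectors are made-up numbers for illustration, and alpha = 0.05 and the hypothesised mean of 170 are assumptions.

```r
# Hypothetical data, made up for illustration only.
marks   <- c(90, 85, 85, 70, 60, 60, 60)
hours   <- c(10, 9, 7, 6, 5, 3, 2)
heights <- c(172, 168, 175, 181, 169, 177, 174, 170)

# Tied values share the average of the ranks they occupy (rank 1 = highest mark):
# the two 85s get (2+3)/2 = 2.5, the three 60s get (5+6+7)/3 = 6.
ranks <- rank(-marks)

# Spearman correlation is the Pearson correlation computed on the ranks.
rho         <- cor(marks, hours, method = "spearman")
rho_by_hand <- cor(rank(marks), rank(hours))

# One-sample t tests against an assumed mean of 170.
alpha    <- 0.05
two_tail <- t.test(heights, mu = 170)                           # H0: mean = 170
one_tail <- t.test(heights, mu = 170, alternative = "greater")  # H1: mean > 170
reject_two <- two_tail$p.value < alpha

# Same decision from the t table: compare |t| with the two-tail critical value.
t_crit         <- qt(1 - alpha / 2, df = length(heights) - 1)
reject_two_tbl <- unname(abs(two_tail$statistic) > t_crit)

# Standardization, mean absolute deviation and hinges.
z         <- (heights - mean(heights)) / sd(heights)
mean_abs  <- mean(abs(heights - mean(heights)))
five      <- fivenum(heights)   # min, lower hinge, median, upper hinge, max
lower_hinge <- five[2]
upper_hinge <- five[4]
```

The p value rule and the t table rule always lead to the same decision, so reject_two and reject_two_tbl should agree for any data you plug in.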


Monday, October 20, 2014

ORE-OBIEE Integration

Let us say I have a table in Oracle with height and weight data, and I want predictions and linear model plots for it using ORE, so that the R code runs in server memory rather than client memory. The result should then be shown in the OBIEE reporting tool.

Prerequisites:

1. ORE is configured in a 12c PDB with a user, say RUSER2, that has the RQROLE and RQADMIN roles.
2. OBIEE 11.1.1.7.0 is configured.
3. Table PRED_TBL is created with HEIGHT, WEIGHT and ID columns.
4. I know R, ORE and OBIEE.

Let us start with scripts:

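-- olm is an ORE script which returns the input table with a predicted column added.
begin
  sys.rqScriptDrop('olm');  -- may raise an error if olm does not exist yet; skip on the first run
  sys.rqScriptCreate('olm', 'function(dat) {
    raw <- dat
    library(ORE)
    lmdl <- lm(WEIGHT ~ ., raw)      # WEIGHT against all other columns
    lpred <- predict(lmdl, raw)      # fitted values for the same rows
    pred_tbl <- cbind(raw, lpred)
    pred_tbl
  }');
end;
/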

-- Create a view which calls the olm script with PRED_TBL as input
create or replace view R_PREDICT as
select HEIGHT, WEIGHT, ID,LPRED from table( rqTableEval(
        cursor(select * from PRED_TBL),cursor(select 1 as "ore.connect" from dual),
        'select 1 HEIGHT,2 WEIGHT, 3 ID,4 LPRED from dual',
        'olm'));
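Before registering the script in the database, it can help to try the same function body locally in plain R. The sketch below does that under some assumptions: pred_tbl_local is a made-up stand-in for PRED_TBL, olm_local is a hypothetical name, and nothing here touches ORE or Oracle.

```r
# Hypothetical stand-in for PRED_TBL; the real data lives in Oracle.
pred_tbl_local <- data.frame(
  ID     = 1:6,
  HEIGHT = c(150, 160, 165, 170, 180, 185),
  WEIGHT = c(52, 58, 63, 68, 77, 82)
)

# Same logic as the olm script, without the ORE dependency.
olm_local <- function(dat) {
  lmdl  <- lm(WEIGHT ~ ., data = dat)  # "." pulls in every other column, ID included
  lpred <- predict(lmdl, dat)          # fitted values for the same rows
  cbind(dat, lpred)                    # original columns plus the prediction
}

res <- olm_local(pred_tbl_local)
print(res)
```

Note that WEIGHT ~ . uses ID as a predictor as well as HEIGHT. If ID is only a row key, a formula such as WEIGHT ~ HEIGHT may be closer to what is wanted.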
     
     

-- lmpng returns the diagnostic plots of the linear model.
begin
  sys.rqScriptDrop('lmpng');
  sys.rqScriptCreate('lmpng',
  'function(dat) {
    plot(lm(WEIGHT ~ ., dat))
  }');
end;
/

-- The PNG_TEST view will hold the plots as BLOBs.
create or replace view PNG_TEST as
select ID,image from table( rqTableEval(
        cursor(select * from PRED_TBL),cursor(select 1 as "ore.connect" from dual),
        'PNG',
        'lmpng'));    

The RPD used here is a simple, self-explanatory OBIEE RPD:
ORE.rpd

That's it! Create a table view in the presentation layer from each of the above views and have fun!

Don't forget to give a reference if you use this elsewhere, or leave a comment if this information helped :-)

Hari Prasad :-)

Saturday, September 6, 2014

Chi Square Distribution and Difference from T

Which question does the chi-square distribution answer?

Suppose you have a population of data such that:

1. It is normally distributed.
2. You know the standard deviation of the population.

Now you draw a sample from that population and compute the standard deviation of the sample.

What is the probability that the next sample of the same size will have a standard deviation less than or equal to the one just computed? (The greater-than case is simply 1 - (less-than case). By default it is the left tail.)

First step: find the chi-square statistic (called the critical value in the code below).

Χ² = [ (n - 1) * s² ] / σ²

R function:

chisqcv <- function(samplesize, samplestandarddeviation, populationstandarddeviation) {
  # (n - 1) * s^2 / sigma^2
  result <- (samplesize - 1) * samplestandarddeviation^2 / populationstandarddeviation^2
  return(result)
}

example:

chisquarecriticalvalue<-chisqcv(7,6,4)

Once you have this value, find the cumulative probability. The degrees of freedom are the sample size minus 1:
pchisq(chisquarecriticalvalue, df = 7 - 1)

The value you get is the answer to our question.

Good source:

http://stattrek.com/probability-distributions/chi-square.aspx
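As a sanity check on the calculation above, here is a small sketch that works the same example (sample size 7, sample standard deviation 6, population standard deviation 4) and compares the result with a simulation. The seed, the number of simulated samples and the population mean of 0 are arbitrary assumptions. The statistic comes out as 13.5 and the left-tail probability is about 0.96.

```r
set.seed(42)   # arbitrary seed, only for repeatability

n     <- 7     # sample size
s     <- 6     # sample standard deviation
sigma <- 4     # population standard deviation

# Chi-square statistic and both tails, with n - 1 degrees of freedom.
stat      <- (n - 1) * s^2 / sigma^2
p_less    <- pchisq(stat, df = n - 1)                      # P(next sample SD <= 6)
p_greater <- pchisq(stat, df = n - 1, lower.tail = FALSE)  # P(next sample SD > 6)

# Simulation: draw many samples of size n from a normal population
# (one sample per column) and count how often the sample SD is at most s.
samples     <- matrix(rnorm(n * 20000, mean = 0, sd = sigma), nrow = n)
sample_sd   <- apply(samples, 2, sd)
p_simulated <- mean(sample_sd <= s)

print(c(statistic = stat, less = p_less, greater = p_greater, simulated = p_simulated))
```

The simulated proportion should land close to p_less, which is a quick way to convince yourself that the formula answers the question as stated.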

In T: you work with the sample mean and the population mean. It gives the probability that the next sample mean is less than or equal to a given value. It says nothing about the variability of the standard deviation.
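For comparison, the T side can be sketched in the same way. The numbers below (population mean 300, sample of 15 with mean 290 and standard deviation 50) are hypothetical and only for illustration. The statistic used is the usual one-sample t, (sample mean - population mean) / (s / sqrt(n)).

```r
# Hypothetical numbers, for illustration only.
n    <- 15    # sample size
xbar <- 290   # sample mean
mu   <- 300   # population mean
s    <- 50    # sample standard deviation

# One-sample t statistic and its left-tail probability with n - 1 degrees of freedom.
t_stat    <- (xbar - mu) / (s / sqrt(n))
p_mean_le <- pt(t_stat, df = n - 1)   # P(sample mean <= 290) if the true mean is 300

print(c(t = t_stat, probability = p_mean_le))
```

Here the probability is about the sample mean, whereas pchisq in the chi-square case was about the sample standard deviation.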
