Using R to Compare Dates -
i've got 2 csv files.
one file lists when , why employee leaves.
employeeid,department,separation_type,separation_date,fyfq 119549,sales,retirement,09/30/2013 2629053,sales,termination,09/30/2013 120395,sales,retirement,11/01/2013 122450,sales,transfer,11/30/2013 123962,sales,transfer,11/30/2013 1041054,sales,resignation,12/01/2013 990962,sales,retirement,12/14/2013 135396,sales,retirement,01/11/2014
another file lookup table shows start , end dates of every fiscal quarter:
fyfq,start,end fy2014fq1,10/1/2013,12/31/2013 fy2014fq2,1/1/2014,3/31/2014 fy2014fq3,4/1/2014,6/30/2014 fy2014fq4,7/1/2014,9/30/2014 fy2015fq1,10/1/2014,12/31/2014 fy2015fq2,1/1/2015,3/31/2015
i'd r find fyfq separation_date occurred in , print fourth column in data.
input:
separations.csv: >employeeid,department,separation_type,separation_date,fyfq >990962,sales,retirement,12/14/2013 >135396,sales,retirement,01/11/2014
fiscalquarterdates.csv:
>fyfq,start,end >fy2013fq4,7/1/2013,9/30/2013 >fy2014fq1,10/1/2013,12/31/2013 >fy2014fq2,1/1/2014,3/31/2014
desired output:
output.csv:
>employeeid,department,separation_type,separation_date,fyfq >990962,sales,retirement,12/14/2013,fy2014fq1 >135396,sales,retirement,01/11/2014,fy2014fq2
i'm assuming there's function iterate through fiscalquarterdates.csv , evaluate if each separation date in fyfq, i'm not sure.
any thoughts on best way this?
this worked.
#read in csv , declare th3 4th column date separations <- read.csv(file="separations_dummydata.csv", head=true,sep=",",colclasses=c(na,na,na,"date")) #use zoo package (i installed it) convert separation_date quarter type , set quarter 1/4. construct variable fyyfqq. library(zoo) separations$fyfq <- format(as.yearqtr(separations$separation_date, "%m/%d/%y") + 1/4, "fy%yfq%q") #write out csv in working directory. write.csv(separations, file = "sepscomplete.csv", row.names = false)
you don't need second dataframe: simple function solve this:
yr<-with(firstdf,as.numeric(substr(seperation_date,7,10))) mth<-with(firstdf,as.numeric(substr(seperation_date,1,2))) firstdf$fyfq<-with(firstdf, ifelse(mth<=3,paste0("fy",yr,"fq2"), ifelse(mth>3 & mth<=6,paste0("fy",yr,"fq3"), ifelse(mth>7 & mth<=9,paste0("fy",yr,"fq4"), paste0("fy",yr+1,"fq1") ))))
Comments
Post a Comment